Jason Volk | 665eeb6cd7 | ircd::gpt::vocab: No-split mask for trailing punctuation. | 2021-04-22 12:27:57 -07:00
Jason Volk | aaced40d90 | ircd::gpt::vocab: Mask erroneous trailing character case; fix pretoken case. | 2021-04-22 12:27:57 -07:00
Jason Volk | b2f788e255 | ircd::gpt::vocab: Minor reorg pre-tokenize related. | 2021-04-22 12:27:57 -07:00
Jason Volk | 1e08339955 | ircd::gpt::vocab: Fixes for additional missing cases. | 2021-04-22 12:27:57 -07:00
Jason Volk | eeadc15319 | ircd::gpt::vocab: Fixes for additional mismatching cases. | 2021-04-22 12:27:57 -07:00
Jason Volk | 0a6be0efed | ircd::gpt::vocab: Fix string length accumulation. | 2021-04-22 12:27:57 -07:00
Jason Volk | 0a87754c99 | ircd::gpt::vocab: Fix token init missing null terminations. | 2021-04-22 12:27:57 -07:00
Jason Volk | 734948863f | ircd::gpt::vocab: Add token debug string tool. | 2021-03-09 04:50:19 -08:00
Jason Volk | 53c4260a21 | ircd::gpt: Add Basic Latin (lower) and C0 replacement LUT; various. | 2021-03-09 04:50:19 -08:00
Jason Volk | 29b99dcf4d | ircd::gpt: Split vocab related into separate unit. | 2021-03-02 11:13:59 -08:00