Jason Volk | 71b1b44a7f | ircd::utf: Rename encode() to encode_sparse(). | 2021-08-08 09:47:02 -07:00
Jason Volk | 4f97dcf456 | ircd: Vector initialization fixes for GCC. | 2021-05-14 05:57:47 -07:00
Jason Volk | 665eeb6cd7 | ircd::gpt::vocab: No-split mask for trailing punctuation. | 2021-04-22 12:27:57 -07:00
Jason Volk | aaced40d90 | ircd::gpt::vocab: Mask erroneous trailing character case; fix pretoken case. | 2021-04-22 12:27:57 -07:00
Jason Volk | b2f788e255 | ircd::gpt::vocab: Minor reorg pre-tokenize related. | 2021-04-22 12:27:57 -07:00
Jason Volk | 1e08339955 | ircd::gpt::vocab: Fixes for additional missing cases. | 2021-04-22 12:27:57 -07:00
Jason Volk | eeadc15319 | ircd::gpt::vocab: Fixes for additional mismatching cases. | 2021-04-22 12:27:57 -07:00
Jason Volk | 0a6be0efed | ircd::gpt::vocab: Fix string length accumulation. | 2021-04-22 12:27:57 -07:00
Jason Volk | 0a87754c99 | ircd::gpt::vocab: Fix token init missing null terminations. | 2021-04-22 12:27:57 -07:00
Jason Volk | 734948863f | ircd::gpt::vocab: Add token debug string tool. | 2021-03-09 04:50:19 -08:00
Jason Volk | 53c4260a21 | ircd::gpt: Add Basic Latin (lower) and C0 replacement LUT; various. | 2021-03-09 04:50:19 -08:00
Jason Volk | 29b99dcf4d | ircd::gpt: Split vocab related into separate unit. | 2021-03-02 11:13:59 -08:00