Commit 3a26dad
committed
fix(chunkers): prevent content loss in word boundary splitting
When splitAtWordBoundaries snaps end back to a word boundary, advance
pos from end (not pos + step) in non-overlapping mode. The step-based
advancement is preserved for the sliding window case (TokenChunker).1 parent 4c3508b commit 3a26dad
1 file changed
+10
-4
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
60 | 60 | | |
61 | 61 | | |
62 | 62 | | |
63 | | - | |
64 | 63 | | |
65 | 64 | | |
66 | 65 | | |
| |||
79 | 78 | | |
80 | 79 | | |
81 | 80 | | |
82 | | - | |
83 | | - | |
84 | | - | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
85 | 91 | | |
86 | 92 | | |
87 | 93 | | |
| |||
0 commit comments