Pull requests: Unstructured-IO/unstructured
Pull requests list
fix: restore double-newline row boundaries in Table.text (#4235)
#4299 opened Mar 25, 2026 by alvinttang
3 tasks done
mem: load only 15 common langdetect profiles to reduce memory
#4297 opened Mar 24, 2026 by KRRT7
mem: exclude unused spaCy pipeline components to reduce model memory
#4296 opened Mar 24, 2026 by KRRT7
refactor: don't import unstructured-inference via partition.pdf
#4284 opened Mar 16, 2026 by artdent
fix: improve multi-column layout sorting for academic papers (#4104)
#4283 opened Mar 16, 2026 by Gopesh111
refactor: replace deprecated decorators in partition_image with apply_metadata
#4271 opened Mar 2, 2026 by HemantSudarshan
fix: add 'el' and 'gr' as Greek language code aliases for Tesseract OCR
#4270 opened Feb 27, 2026 by s0wa48
fix: handle list output from group_bullet_paragraph in element apply()
#4253 opened Feb 21, 2026 by s0wa48
feat: add XLSM (Excel Macro-Enabled Workbook) parsing support
#4227 opened Feb 8, 2026 by longway-code
docs: fix redundant whitespace in pyenv command in README
#4224 opened Feb 3, 2026 by longway-code
Fix FutureWarning: Add test to verify bytes are wrapped in BytesIO for read_excel
#4213 opened Jan 27, 2026 by Achieve3318
⚡️ Speed up function merge_out_layout_with_ocr_layout by 30%
#4212 opened Jan 27, 2026 by aseembits93
feat: chunking by character and title now isolates tables
#4197 opened Jan 15, 2026 by badGarnet
fix: NameError: LayoutElements not defined in paddle_ocr.py
#4195 opened Jan 15, 2026 by mohansinghi
fix: None text attribute when normalizing Picture to Image element
#4083 opened Aug 22, 2025 by ishahroz