Tip: tree_sitter_typescript on aarch64 Android - how to compile for docling #241

@Manamama

Summary:

docling (more precisely, the chain docling-core[chunking] → semchunk → tree-sitter-typescript), at least as of today, pulls an old or otherwise incompatible tree-sitter-typescript version that fails to build on Termux/Android.

Breakdown of responsibility

| Component | Version pulled / attempted | Blame level | Reason |
|---|---|---|---|
| docling | 2.77.0 (latest at time) | Medium | Depends on docling-core[chunking], which requires semchunk |
| docling-core[chunking] | <3.0.0, >=2.66.0 | High | Explicitly requires semchunk<3.0.0,>=2.2.0 for code chunking |
| semchunk | 2.2.2 | High | Depends on tree-sitter-typescript (no version pin → pip picks latest) |
| tree-sitter-typescript | 0.23.2 (git HEAD) | Very High | Build fails on Termux because of missing tree_sitter/parser.h headers or clang/Android quirks in generated parser.c (TSFieldMapSlice unknown type) |
| tree-sitter core | Not installed as dev pkg | Medium | Termux has runtime tree-sitter but no -dev package → pip can't find headers for source build |
| pip / build env | Modern (26.0.1) | Low | Tries source build when no wheel exists for android_24_arm64_v8a |

Root cause chain:

  • docling → wants chunking → pulls semchunk 2.2.2
  • semchunk → wants tree-sitter-typescript
  • pip sees no pre-built wheel for tree-sitter-typescript on Android aarch64 → falls back to source build from git
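  • Source build → clang compile of tsx/src/parser.c → fails with unknown type name 'TSFieldMapSlice'. A missing header would normally stop the build at the #include line with a "file not found" error, so this more likely means the tree_sitter/parser.h that clang picked up does not match the generated parser.c, or that Termux's clang/Android sysroot interferes.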
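The wheel fallback in the chain above can be sketched roughly as follows. This is a simplified illustration, not pip's actual resolver: it looks only at the platform part of each wheel filename and ignores the Python and ABI tags. The function names and the example filenames are hypothetical; only the android_24_arm64_v8a tag comes from this report.

```python
def wheel_platforms(filename):
    """Return the platform tags encoded in a wheel filename.

    Wheel names end in ...-{python tag}-{abi tag}-{platform tag}.whl,
    and several platform tags can be joined with dots.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    stem = filename[: -len(".whl")]
    return stem.split("-")[-1].split(".")


def needs_source_build(wheel_filenames, platform_tag):
    """True if no listed wheel targets platform_tag (or 'any').

    Simplification: Python and ABI tags are not checked here.
    """
    for name in wheel_filenames:
        platforms = wheel_platforms(name)
        if "any" in platforms or platform_tag in platforms:
            return False
    return True


# Illustrative filenames only, not a listing of what PyPI actually serves.
example_wheels = [
    "tree_sitter_typescript-0.23.2-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl",
    "tree_sitter_typescript-0.23.2-cp39-abi3-macosx_11_0_arm64.whl",
]
print(needs_source_build(example_wheels, "android_24_arm64_v8a"))
```

A pure-Python wheel such as semchunk-2.2.2-py3-none-any.whl passes this check on any platform, which is why only the compiled tree-sitter grammar packages are affected.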

What to do

Clone the repository with git clone https://github.com/tree-sitter/tree-sitter-typescript.git, then run pip install . inside the checkout to install that version. I guess pip install git+https://github.com/tree-sitter/tree-sitter-typescript.git@0.20.2 --no-build-isolation --no-deps would also work. The compile flags may need some tinkering.
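For anyone who wants to script the workaround, here is a minimal sketch of the clone-and-install steps as a Python helper. The helper names (Step, workaround_steps, run_steps) and the default checkout directory are my own assumptions; the commands themselves are the ones described above. Nothing runs until run_steps is called explicitly.

```python
import shlex
import subprocess
import sys
from dataclasses import dataclass
from typing import List, Optional

REPO_URL = "https://github.com/tree-sitter/tree-sitter-typescript.git"


@dataclass
class Step:
    argv: List[str]            # command and arguments
    cwd: Optional[str] = None  # directory to run in (None = current directory)


def workaround_steps(checkout_dir="tree-sitter-typescript", python=sys.executable):
    """Build the command list: clone, install from the checkout, retry docling."""
    return [
        Step(["git", "clone", REPO_URL, checkout_dir]),
        # "pip install ." must run inside the checkout.
        Step([python, "-m", "pip", "install", "."], cwd=checkout_dir),
        Step([python, "-m", "pip", "install", "-U", "docling"]),
    ]


def run_steps(steps):
    """Run each step in order and stop at the first failure."""
    for step in steps:
        print("+", shlex.join(step.argv))
        subprocess.run(step.argv, cwd=step.cwd, check=True)


# Preview the commands without executing anything.
for step in workaround_steps():
    print(shlex.join(step.argv))
```

Calling run_steps(workaround_steps()) would then execute them; extra compile flags, if needed, are not covered by this sketch.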

Symptoms:

If you just run pip install -U docling:

    dist._finalize_license_expression()
  /data/data/com.termux/files/usr/lib/python3.13/site-packages/setuptools/dist.py:765: SetuptoolsDeprecationWarning: License classifiers are deprecated.
  !!

          ********************************************************************************
          Please consider removing the following classifiers in favor of a SPDX license expression:

          License :: OSI Approved :: MIT License

          See https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license for details.
          ********************************************************************************

  !!
    self._finalize_license_expression()
  running bdist_wheel
  running build
  running build_py
  creating build/lib.android-24-arm64_v8a-cpython-313/tree_sitter_typescript
  copying bindings/python/tree_sitter_typescript/__init__.py -> build/lib.android-24-arm64_v8a-cpython-313/tree_sitter_typescript
  running egg_info
  writing bindings/python/tree_sitter_typescript.egg-info/PKG-INFO
  writing dependency_links to bindings/python/tree_sitter_typescript.egg-info/dependency_links.txt
  writing requirements to bindings/python/tree_sitter_typescript.egg-info/requires.txt
  writing top-level names to bindings/python/tree_sitter_typescript.egg-info/top_level.txt
  [03/09/26 09:57:41] ERROR    listing git files failed - pretending     git.py:26
                               there aren't any
  reading manifest file 'bindings/python/tree_sitter_typescript.egg-info/SOURCES.txt'
  adding license file 'LICENSE'
  writing manifest file 'bindings/python/tree_sitter_typescript.egg-info/SOURCES.txt'
  copying bindings/python/tree_sitter_typescript/__init__.pyi -> build/lib.android-24-arm64_v8a-cpython-313/tree_sitter_typescript
  copying bindings/python/tree_sitter_typescript/binding.c -> build/lib.android-24-arm64_v8a-cpython-313/tree_sitter_typescript
  copying bindings/python/tree_sitter_typescript/py.typed -> build/lib.android-24-arm64_v8a-cpython-313/tree_sitter_typescript
  running build_ext
  building '_binding' extension
  creating build/temp.android-24-arm64_v8a-cpython-313/bindings/python/tree_sitter_typescript
  creating build/temp.android-24-arm64_v8a-cpython-313/tsx/src
  creating build/temp.android-24-arm64_v8a-cpython-313/typescript/src
  clang -Wno-error=implicit-function-declaration -Wno-c2y-extensions -Wno-unused-variable -Wno-error=c2y-extensions -fvisibility=default -fPIC -DPy_LIMITED_API=0x03090000 -DPY_SSIZE_T_CLEAN -DTREE_SITTER_HIDE_SYMBOLS -Itypescript/src -I/data/data/com.termux/files/usr/include/python3.13 -c bindings/python/tree_sitter_typescript/binding.c -o build/temp.android-24-arm64_v8a-cpython-313/bindings/python/tree_sitter_typescript/binding.o -std=c11 -fvisibility=hidden
  clang -Wno-error=implicit-function-declaration -Wno-c2y-extensions -Wno-unused-variable -Wno-error=c2y-extensions -fvisibility=default -fPIC -DPy_LIMITED_API=0x03090000 -DPY_SSIZE_T_CLEAN -DTREE_SITTER_HIDE_SYMBOLS -Itypescript/src -I/data/data/com.termux/files/usr/include/python3.13 -c tsx/src/parser.c -o build/temp.android-24-arm64_v8a-cpython-313/tsx/src/parser.o -std=c11 -fvisibility=hidden
  tsx/src/parser.c:2929:14: error: unknown type name 'TSFieldMapSlice'
   2929 | static const TSFieldMapSlice ts_field_map_slices[PRODUCTION_ID_COUNT] = {
        |              ^
  1 error generated.
  error: command '/data/data/com.termux/files/usr/bin/clang' failed with exit code 1
  error: subprocess-exited-with-error
  

Why git clone + local pip install worked:

  • I cloned the tree-sitter-typescript repo → local pip install . used my local copy (likely with the correct submodules or headers resolved)
  • Build succeeded because the repo includes everything needed (parser generator, bindings) → no external header fetch needed
  • Wheel built locally → pip happy
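Before retrying pip install -U docling, it may be worth confirming that the locally built package is really importable. A small sketch using only the standard library follows; the helper names are hypothetical. The module name tree_sitter_typescript is taken from the build log above, and I am assuming the distribution is registered as tree-sitter-typescript.

```python
import importlib.metadata
import importlib.util


def is_importable(module_name):
    """True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


def installed_version(distribution_name):
    """Return the installed version string, or None if not installed."""
    try:
        return importlib.metadata.version(distribution_name)
    except importlib.metadata.PackageNotFoundError:
        return None


print("importable:", is_importable("tree_sitter_typescript"))
print("version:", installed_version("tree-sitter-typescript"))
```

If the first line reports False, pip will most likely try the failing source build again when docling is installed.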

Why pip install docling failed initially:

  • Pulled 0.23.2 (or HEAD) from PyPI/git → no Android wheel → source build → header missing → error
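To tell a missing header apart from a mismatched one, a quick diagnostic along these lines might help. It is only a sketch: the helper names are hypothetical, and the check is a plain substring search rather than a real C parse. The directories to try would be the ones visible in the log, which I assume are tsx/src (the directory of parser.c, assuming it includes the header with quotes) and typescript/src (from the -I flag).

```python
from pathlib import Path


def find_header(include_dirs, relative_name):
    """Return the first existing header path, searching dirs in order."""
    for directory in include_dirs:
        candidate = Path(directory) / relative_name
        if candidate.is_file():
            return candidate
    return None


def header_mentions(include_dirs, relative_name, symbol):
    """None if the header is missing, else whether it mentions symbol.

    Plain substring search: it does not understand macros or comments.
    """
    header = find_header(include_dirs, relative_name)
    if header is None:
        return None
    text = header.read_text(encoding="utf-8", errors="replace")
    return symbol in text


# Run from the root of a tree-sitter-typescript checkout.
result = header_mentions(
    ["tsx/src", "typescript/src"], "tree_sitter/parser.h", "TSFieldMapSlice"
)
print(result)
```

A result of None means no header was found in those directories, while False means a header exists but does not mention the type, which would fit the unknown type name error.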

Now pip install -U docling works again, as it did for over a year:

Using cached docling-2.77.0-py3-none-any.whl (400 kB)
Using cached huggingface_hub-0.36.2-py3-none-any.whl (566 kB)
Using cached rapidocr-3.7.0-py3-none-any.whl (15.1 MB)
Using cached semchunk-2.2.2-py3-none-any.whl (10 kB)
Using cached transformers-4.57.6-py3-none-any.whl (12.0 MB)
Using cached mpire-2.10.2-py3-none-any.whl (272 kB)
Installing collected packages: tree-sitter-python, tree-sitter-javascript, tree-sitter-c, tree-sitter, pyclipper, mpire, rapidocr, pandas, huggingface_hub, semchunk, transformers, docling
  Attempting uninstall: pandas
    Found existing installation: pandas 3.0.1
    Uninstalling pandas-3.0.1:
      Successfully uninstalled pandas-3.0.1
  Attempting uninstall: huggingface_hub
    Found existing installation: huggingface_hub 1.6.0
    Uninstalling huggingface_hub-1.6.0:
      Successfully uninstalled huggingface_hub-1.6.0
  Attempting uninstall: transformers
    Found existing installation: transformers 5.3.0
    Uninstalling transformers-5.3.0:
      Successfully uninstalled transformers-5.3.0
  Attempting uninstall: docling
    Found existing installation: docling 2.76.0
    Uninstalling docling-2.76.0:
Successfully installed docling-2.77.0 huggingface_hub-0.36.2 mpire-2.10.2 pandas-2.3.3 pyclipper-1.4.0 rapidocr-3.7.0 semchunk-2.2.2 transformers-4.57.6 tree-sitter-0.25.2 tree-sitter-c-0.24.1 tree-sitter-javascript-0.25.0 tree-sitter-python-0.25.0
