Use version-hint.text for StaticTable #1887
Conversation
Fokko
left a comment
There was a problem hiding this comment.
Thanks for raising this @arnaudbriche 🙌
How would you feel about:
- Adding a test to make sure that this works, and we don't break it in the future
- Adding a line of two to the docs so people know that it is here
Sure. I'm not doing much Python these days, so I must miss something. I tried: poetry install
poetry run pytestBut I go the following error: |
|
@arnaudbriche That's no problem. Can you try: make install
make testWe use a |
On OSX: Without virtualenv: make install
/bin/sh: pip: command not found
Poetry version does not match required version 2.0.1. Updating...
/bin/sh: pip: command not found
make: *** [install-poetry] Error 127With virtualenv: make install
WARNING: Package(s) not found: poetry
Poetry version does not match required version 2.0.1. Updating...
[notice] A new release of pip is available: 25.0 -> 25.0.1
[notice] To update, run: pip install --upgrade pip
ERROR: Can not perform a '--user' install. User site-packages are not visible in this virtualenv.
make: *** [install-poetry] Error 1 |
|
Ok so I installed the right poetry version by hand. But now, while trying to run the tests, I get: make test
poetry run pytest tests/ -m "(unmarked or parametrize) and not integration"
/Users/arnaudbriche/Desktop/code/iceberg-python/venv/lib/python3.13/site-packages/_pytest/config/__init__.py:331: PluggyTeardownRaisedWarning: A plugin raised an exception during an old-style hookwrapper teardown.
Plugin: helpconfig, Hook: pytest_cmdline_parse
ConftestImportFailure: ModuleNotFoundError: No module named 'pyiceberg' (from /Users/arnaudbriche/Desktop/code/iceberg-python/tests/conftest.py)
For more information see https://pluggy.readthedocs.io/en/stable/api_reference.html#pluggy.PluggyTeardownRaisedWarning
config = pluginmanager.hook.pytest_cmdline_parse(
ImportError while loading conftest '/Users/arnaudbriche/Desktop/code/iceberg-python/tests/conftest.py'.
tests/conftest.py:51: in <module>
from pyiceberg.catalog import Catalog, load_catalog
E ModuleNotFoundError: No module named 'pyiceberg'
make: *** [test] Error 4 |
|
And now: PYTHONPATH=".:$PYTHONPATH" make test
poetry run pytest tests/ -m "(unmarked or parametrize) and not integration"
=========================================================================================== test session starts ===========================================================================================
platform darwin -- Python 3.13.2, pytest-7.4.4, pluggy-1.5.0
rootdir: /Users/arnaudbriche/Desktop/code/iceberg-python
configfile: pyproject.toml
plugins: checkdocs-2.13.0, mock-3.14.0, lazy-fixture-0.6.3, requests-mock-1.12.1
collected 3705 items / 3 errors / 884 deselected / 2821 selected
================================================================================================= ERRORS ==================================================================================================
_______________________________________________________________________________ ERROR collecting tests/catalog/test_hive.py _______________________________________________________________________________
ImportError while importing test module '/Users/arnaudbriche/Desktop/code/iceberg-python/tests/catalog/test_hive.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/opt/homebrew/Cellar/python@3.13/3.13.2/Frameworks/Python.framework/Versions/3.13/lib/python3.13/importlib/__init__.py:88: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
tests/catalog/test_hive.py:24: in <module>
from hive_metastore.ttypes import (
E ModuleNotFoundError: No module named 'hive_metastore'
____________________________________________________________________________ ERROR collecting tests/integration/test_reads.py _____________________________________________________________________________
ImportError while importing test module '/Users/arnaudbriche/Desktop/code/iceberg-python/tests/integration/test_reads.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/opt/homebrew/Cellar/python@3.13/3.13.2/Frameworks/Python.framework/Versions/3.13/lib/python3.13/importlib/__init__.py:88: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
tests/integration/test_reads.py:29: in <module>
from hive_metastore.ttypes import LockRequest, LockResponse, LockState, UnlockRequest
E ModuleNotFoundError: No module named 'hive_metastore'
______________________________________________________________________ ERROR collecting tests/integration/test_writes/test_writes.py ______________________________________________________________________
ImportError while importing test module '/Users/arnaudbriche/Desktop/code/iceberg-python/tests/integration/test_writes/test_writes.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/opt/homebrew/Cellar/python@3.13/3.13.2/Frameworks/Python.framework/Versions/3.13/lib/python3.13/importlib/__init__.py:88: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
tests/integration/test_writes/test_writes.py:40: in <module>
from pyiceberg.catalog.hive import HiveCatalog
pyiceberg/catalog/hive.py:35: in <module>
from hive_metastore.ThriftHiveMetastore import Client
E ModuleNotFoundError: No module named 'hive_metastore'
========================================================================================= short test summary info =========================================================================================
ERROR tests/catalog/test_hive.py
ERROR tests/integration/test_reads.py
ERROR tests/integration/test_writes/test_writes.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 3 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
=================================================================================== 884 deselected, 3 errors in 15.41s ====================================================================================
make: *** [test] Error 2 |
|
Hi @Fokko , Do you know if running the test suite on OSX is supported ? Or should I switch to working from a Linux box ? |
|
Hey @arnaudbriche OSX should work fine (using it myself). We vendor the docker run -v `pwd`:/pyiceberg -t -i python:3.12 bash
root@bdb0eca99544:/# cd /pyiceberg/
root@bdb0eca99544:/pyiceberg# pip install poetry --force
root@bdb0eca99544:/pyiceberg# make install
...
root@bdb0eca99544:/pyiceberg# make lint
...
root@bdb0eca99544:/pyiceberg# make test
... |
9b28cb2 to
9f7aa22
Compare
|
Hi @Fokko I have version |
Fokko
left a comment
There was a problem hiding this comment.
Thanks @arnaudbriche This looks great to me! 👍
This change allow making use of the `version-hint.text` file when a static table is instantiated with a `metadata_location` not ending with '.metadata.json'. User can just point to the table location, and metadata file path will be read from `version-hint.text`. Closes apache#763 # Rationale for this change `version-hint.text` is useful in context where you does not want or need a full-fledge catalog. Our use case is sharing datasets publicly as Iceberg tables on S3. # Are these changes tested? No yet. # Are there any user-facing changes? Yes. User can now points `StaticTable` to the table location rather than a specific version file.
| elif content.isnumeric(): | ||
| return os.path.join(metadata_location, "metadata", "v%s.metadata.json").format(content) | ||
| else: | ||
| return os.path.join(metadata_location, "metadata", "%s.metadata.json").format(content) |
There was a problem hiding this comment.
Here is a syntax error, should be "{}".format.
In pytest.fixture table_location, the case metadata_filename = f"{uuid.uuid4()}.metadata.json" does not cover the faulty code path.
There was a problem hiding this comment.
Good call @arnaudbriche 🙌 Maybe go with f-strings instead?
This change allow making use of the
version-hint.textfile when a static table is instantiated with ametadata_locationnot ending with '.metadata.json'.User can just point to the table location, and metadata file path will be read from
version-hint.text.Closes #763
Rationale for this change
version-hint.textis useful in context where you does not want or need a full-fledge catalog.Our use case is sharing datasets publicly as Iceberg tables on S3.
Are these changes tested?
No yet.
Are there any user-facing changes?
Yes. User can now points
StaticTableto the table location rather than a specific version file.