Handle missing column better in Parquet data loader #173

@wk9874

Description

Currently, if you request a column that doesn't exist, the server returns a 500 response with this debug output:

INFO:     127.0.0.1:41280 - "POST /projects/69934fffe7a6e72993ecbe09/samples/699351c7e7a6e72993ecbe0a/data HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 403, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
    return await self.app(scope, receive, send)
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/fastapi/applications.py", line 1134, in __call__
    await super().__call__(scope, receive, send)
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/starlette/applications.py", line 107, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/starlette/middleware/cors.py", line 93, in __call__
    await self.simple_response(scope, receive, send, request_headers=headers)
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/starlette/middleware/cors.py", line 144, in simple_response
    await self.app(scope, receive, send)
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 63, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/starlette/routing.py", line 716, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/starlette/routing.py", line 736, in app
    await route.handle(scope, receive, send)
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/starlette/routing.py", line 290, in handle
    await self.app(scope, receive, send)
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/fastapi/routing.py", line 125, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/fastapi/routing.py", line 111, in app
    response = await f(request)
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/fastapi/routing.py", line 391, in app
    raw_response = await run_endpoint_function(
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/fastapi/routing.py", line 290, in run_endpoint_function
    return await dependant.call(**values)
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/toktagger/api/routers/data.py", line 33, in get_data
    data = data_loader.get_sample(
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/pydantic/_internal/_validate_call.py", line 39, in wrapper_function
    return wrapper(*args, **kwargs)
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/pydantic/_internal/_validate_call.py", line 136, in __call__
    res = self.__pydantic_validator__.validate_python(pydantic_core.ArgsKwargs(args, kwargs))
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/toktagger/api/core/data_loaders.py", line 178, in get_sample
    df = pd.read_parquet(sample_data.file_name, columns=sample_data.signal_names)
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/pandas/io/parquet.py", line 669, in read_parquet
    return impl.read(
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/pandas/io/parquet.py", line 265, in read
    pa_table = self.api.parquet.read_table(
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/pyarrow/parquet/core.py", line 1824, in read_table
    return dataset.read(columns=columns, use_threads=use_threads,
  File "/home/wk9874/Documents/toktagger/.venv/lib/python3.10/site-packages/pyarrow/parquet/core.py", line 1475, in read
    table = self._dataset.to_table(
  File "pyarrow/_dataset.pyx", line 579, in pyarrow._dataset.Dataset.to_table
  File "pyarrow/_dataset.pyx", line 415, in pyarrow._dataset.Dataset.scanner
  File "pyarrow/_dataset.pyx", line 3676, in pyarrow._dataset.Scanner.from_dataset
  File "pyarrow/_dataset.pyx", line 3589, in pyarrow._dataset.Scanner._make_scan_options
  File "pyarrow/_dataset.pyx", line 3539, in pyarrow._dataset._populate_builder
  File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: No match for FieldRef.Name(Ip) in ip: double
power_nbi: double
density_gradient: double
t_e_core: double
n_e_core: double
n_e_line: double
dalpha: double
sxr: double
power_radiated: double
time: double
__fragment_index: int32
__batch_index: int32
__last_in_fragment: bool
__filename: string

The data loader should handle this more gracefully and raise an appropriate error message for display in the UI.
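
One possible shape for this, as a rough sketch rather than the project's actual fix: check the requested names against the file's columns before reading, and raise an error whose message the UI can show directly. `MissingColumnsError` and `check_columns` are hypothetical names invented for illustration; only the column names come from the traceback above.

```python
from difflib import get_close_matches


class MissingColumnsError(ValueError):
    """Requested columns are absent from the file (hypothetical class name)."""

    def __init__(self, missing, available):
        self.missing = list(missing)
        self.available = list(available)
        super().__init__(self._build_message())

    def _build_message(self):
        # Map lower-cased names to real names so "Ip" can suggest "ip".
        lowered = {col.lower(): col for col in self.available}
        parts = []
        for name in self.missing:
            hint = lowered.get(name.lower())
            if hint is None:
                # Fall back to a fuzzy match for other near misses.
                close = get_close_matches(name, self.available, n=1)
                hint = close[0] if close else None
            suffix = f" (did you mean '{hint}'?)" if hint else ""
            parts.append(f"'{name}'{suffix}")
        return "Column(s) not found: " + ", ".join(parts)


def check_columns(requested, available):
    """Raise MissingColumnsError if any requested name is not available."""
    missing = [name for name in requested if name not in available]
    if missing:
        raise MissingColumnsError(missing, available)
```

In `get_sample`, the available names could presumably be read cheaply with `pyarrow.parquet.read_schema(sample_data.file_name).names` before calling `pd.read_parquet`, and the router could catch the error and re-raise it as a `fastapi.HTTPException` with a 4xx status code and `detail=str(err)`. Which status code fits best (404 or 422, for example) is a design choice left open here.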
