Describe the bug
When constants are passed to a dataset view, the query fails.
This happens because the view method takes a constants argument and sends it under that key, whereas the specification and the Java side use constant.
Changing the view method in pathling/datasource.py to take constant fixes the issue.
To Reproduce
from pathling import PathlingContext

pc = PathlingContext.create(spark)
# `encoded` is a DataFrame of AuditEvent resources.
data_source = pc.read.datasets({"AuditEvent": encoded})
results = data_source.view(
    resource="AuditEvent",
    constants=[{"name": "requestor", "valueBoolean": "true"}],
    select=[
        {"column": [{"path": "id", "name": "event_id"}]},
        {
            "forEach": "agent.where(requestor = %requestor)",
            "column": [
                {"path": "who.display", "name": "agent_name"},
            ],
        },
    ],
)
This produces the following traceback (truncated for relevance):
  File "site-packages/pathling/datasource.py", line 102, in view
    return self._wrap_df(jquery.execute())
                         ~~~~~~~~~~~~~~^^
  File "site-packages/py4j/java_gateway.py", line 1362, in __call__
    return_value = get_return_value(
        answer, self.gateway_client, self.target_id, self.name)
  File "site-packages/pyspark/errors/exceptions/captured.py", line 185, in deco
    raise converted from None
pyspark.errors.exceptions.captured.IllegalArgumentException: Unknown variable: requestor
Expected behavior
The value of the constant should be substituted into the query wherever %requestor is referenced.
Fix
Change the argument name in DataSource.view from constants to constant, and change the corresponding key in the query arguments to constant:
def view(
    self,
    resource: Optional[str] = None,
    select: Optional[Sequence[Dict]] = None,
    constant: Optional[Sequence[Dict]] = None,
    where: Optional[Sequence[Dict]] = None,
    json: Optional[str] = None,
) -> DataFrame:
    """
    Executes a SQL on FHIR view definition and returns the result as a Spark DataFrame.

    :param resource: The FHIR resource that the view is based upon, e.g. 'Patient' or
        'Observation'.
    :param select: A list of columns and nested selects to include in the view.
    :param constant: A list of constants that can be used in FHIRPath expressions.
    :param where: A list of FHIRPath expressions that can be used to filter the view.
    :param json: A JSON string representing the view definition, as an alternative to providing
        the parameters as Python objects.
    :return: A Spark DataFrame containing the results of the view.
    """
    if json:
        query_json = json
        # Parse the JSON to recover the resource type.
        parsed = loads(json)
        resource = parsed.get("resource")
    else:
        args = locals()
        # Only include the arguments that were actually supplied.
        query = {
            key: args[key]
            for key in ["resource", "select", "constant", "where"]
            if args[key] is not None
        }
        query_json = dumps(query)
    jquery = self._jds.view(resource)
    jquery.json(query_json)
    return self._wrap_df(jquery.execute())
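To illustrate why the key name matters, here is a minimal standalone sketch that reproduces only the dictionary-building step of view, without Spark or Pathling. The helper build_view_query and its key_name parameter are hypothetical and exist only for this comparison; the sketch assumes, as reported above, that the Java side reads the singular constant key.

```python
from json import dumps, loads
from typing import Dict, Optional, Sequence


def build_view_query(
    key_name: str,
    resource: Optional[str] = None,
    select: Optional[Sequence[Dict]] = None,
    constant: Optional[Sequence[Dict]] = None,
    where: Optional[Sequence[Dict]] = None,
) -> str:
    """Hypothetical helper: serialise view arguments, storing constants under key_name."""
    fields = {"resource": resource, "select": select, key_name: constant, "where": where}
    # Drop unset arguments, mirroring the dictionary comprehension in view().
    return dumps({key: value for key, value in fields.items() if value is not None})


requestor = [{"name": "requestor", "valueBoolean": "true"}]

# Payload as built before the fix (plural key) and after it (singular key).
old_payload = loads(build_view_query("constants", resource="AuditEvent", constant=requestor))
new_payload = loads(build_view_query("constant", resource="AuditEvent", constant=requestor))

print(sorted(old_payload))
print(sorted(new_payload))
```

Only the second payload carries the constant key, which would explain why %requestor is reported as an unknown variable before the fix.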
Workaround
To work around the issue, pass the query as JSON:
import json

results = data_source.view(
    json=json.dumps(
        {
            "resource": "AuditEvent",
            "constant": [{"name": "requestor", "valueBoolean": "true"}],
            "select": [
                {"column": [{"path": "id", "name": "event_id"}]},
                {
                    "forEach": "agent.where(requestor = %requestor)",
                    "column": [
                        {"path": "who.display", "name": "agent_name"},
                    ],
                },
            ],
        }
    )
)
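If the workaround is needed in several places, it could be wrapped in a small helper. The sketch below is an assumption, not part of Pathling: view_with_constants is a hypothetical name, and it relies only on the json parameter of view shown in this report.

```python
from json import dumps
from typing import Any, Dict, Optional, Sequence


def view_with_constants(
    data_source: Any,
    resource: str,
    select: Sequence[Dict],
    constants: Optional[Sequence[Dict]] = None,
    where: Optional[Sequence[Dict]] = None,
):
    """Hypothetical wrapper: send constants under the singular `constant` key via json=."""
    query: Dict[str, Any] = {"resource": resource, "select": list(select)}
    if constants is not None:
        # Use the key name that the specification and the Java side expect.
        query["constant"] = list(constants)
    if where is not None:
        query["where"] = list(where)
    return data_source.view(json=dumps(query))
```

It would be called as view_with_constants(data_source, "AuditEvent", select=[...], constants=[...]) and could be dropped once the argument is renamed in DataSource.view.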