[BUG] SedonaContext initialization blocked by AWS Lake Formation Fine-Grained Access Control (FGAC) #2779

@nabilkazi27

Description

When running Apache Sedona in AWS Glue 5.0 with Fine-Grained Access Control (FGAC) enabled (via the --enable-lakeformation-fine-grained-access parameter), the initialization of the SedonaContext fails.
The error appears to be caused by Lake Formation's strict validation engine, which flags Sedona's User-Defined Types (UDTs), such as GeometryUDT, as "synthetic types" not natively supported by the Glue Data Catalog's restricted execution profile. This prevents the SedonaContext from being created even when using the recommended sedona-spark-shaded and geotools-wrapper jars.
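The "synthetic type" rejection is consistent with how Spark serializes a UserDefinedType into schema JSON: the column's type becomes an object with "type": "udt" rather than a catalog-native primitive. The sketch below is a hypothetical illustration of a strict validator rejecting such a schema; the NATIVE_TYPES whitelist and the UDT class/pyClass values are assumptions for illustration, not AWS's actual FGAC implementation.

```python
# Illustrative sketch only: a strict schema validator that whitelists
# catalog-native types, as the FGAC layer appears to do. Not AWS code.

NATIVE_TYPES = {"string", "integer", "long", "double", "boolean", "binary",
                "struct", "array", "map"}

def validate(field_type) -> bool:
    """Return True only if the field type uses catalog-native types."""
    if isinstance(field_type, str):
        return field_type in NATIVE_TYPES
    t = field_type.get("type")
    if t == "udt":  # user-defined type => "synthetic" to this validator
        return False
    if t == "struct":
        return all(validate(f["type"]) for f in field_type["fields"])
    return t in NATIVE_TYPES

# A geometry column roughly as Spark would serialize a UDT field
# (class/pyClass values here are placeholders for illustration):
geometry_field = {"type": "udt", "class": "GeometryUDT",
                  "pyClass": "GeometryType", "sqlType": "binary"}
schema = {"type": "struct", "fields": [
    {"name": "id", "type": "long", "nullable": False, "metadata": {}},
    {"name": "geom", "type": geometry_field, "nullable": True, "metadata": {}},
]}

print(validate(schema))  # False: the UDT column fails validation
```

Any recursive check of this shape fails the moment a geometry column appears, which matches the failure occurring at SedonaContext.create, the point where Sedona registers its UDTs and functions.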

Steps to Reproduce

  1. Set up an AWS Glue 5.0 job.
  2. Provide the following job parameters:
    --enable-lakeformation-fine-grained-access: true
    --extra-jars: s3://path/to/sedona-spark-shaded-3.5_2.12-1.5.1.jar,s3://path/to/geotools-wrapper-1.5.1-28.2.jar
  3. Attempt to initialize Sedona in the script:
from sedona.spark.SedonaContext import SedonaContext
from pyspark.sql import SparkSession


spark = SparkSession.builder.getOrCreate()
sedona = SedonaContext.create(spark) # Failure occurs here
  4. Observe the "synthetic type" error in the driver logs.

Environment

  • Sedona Version: 1.8.1
  • Spark Version: 3.5 (Glue 5.0)
  • Deployment: AWS Glue
  • Connector: spark-shaded and geotools-wrapper jars.
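One detail worth double-checking: the reproduction step passes sedona-spark-shaded-3.5_2.12-1.5.1.jar and geotools-wrapper-1.5.1-28.2.jar, while this section reports Sedona 1.8.1. A quick sanity check of what the driver actually has installed can rule out a version mismatch; the snippet below assumes the PyPI distribution name apache-sedona and uses only the standard library.

```python
from importlib import metadata

def installed_version(dist: str) -> str:
    """Report the installed version of a distribution, or a marker if absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return "not installed"

# The Python bindings are published on PyPI as "apache-sedona".
print(installed_version("apache-sedona"))
```

Pairing the Python package version with the jar versions on the classpath helps confirm whether the failure reproduces on matched artifacts.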

Expected Behavior

SedonaContext should initialize successfully even when Lake Formation FGAC is active, or there should be a supported method to register Sedona UDTs so they are not flagged as "synthetic" by the AWS security layer.

Actual Behavior

The job fails during context creation with a log entry indicating that the custom spatial types are unsupported in this restricted security mode.

Logs

INFO  2026-03-18T09:47:43,865  12581  com.amazonaws.services.glue.launch.helpers.OpenTableFormatManager$  [main]  32  Setup data lake format 'iceberg'.
INFO  2026-03-18T09:48:13,925  8703  org.apache.spark.emr.EMRParamSideChannel  [Thread-7]  177  Setting FGAC mode to true
WARN  2026-03-18T09:48:13,927  8705  org.apache.spark.SparkContext  [Thread-7]  72  When executing lakeformation enabled jobs, some user-provided spark configurations may not be applied. Please refer to AWS documentation for the full list of supported configurations.
INFO  2026-03-18T09:48:13,940  8718  org.apache.spark.SparkContext  [Thread-7]  60  Running Spark version 3.5.4-amzn-0
INFO  2026-03-18T09:48:13,941  8719  org.apache.spark.SparkContext  [Thread-7]  60  OS info Linux, 5.10.248-247.988.amzn2.x86_64, amd64
INFO  2026-03-18T09:48:13,942  8720  org.apache.spark.SparkContext  [Thread-7]  60  Java version 17.0.18
INFO  2026-03-18T09:48:14,471  9249  org.apache.spark.fgac.network.plugins.tls.SslOptions  [Thread-7]  60  createNettySslContext: SslOptions{enabled=true, keyStore=Some(/var/certificates/keystore.p12), keyStorePassword=Some(xxx), trustStore=Some(/var/certificates/truststore.p12), trustStorePassword=Some(xxx), crl=Some(/var/certificates/root.crl)protocol=None, enabledAlgorithms=Set()}, endpointIdentificationAlgorithm=None
INFO  2026-03-18T09:48:15,461  10239  org.apache.spark.fgac.network.plugins.tls.TlsTransportPlugin  [Thread-7]  60  Initializing TlsChannelInitializer with ssl context: org.apache.spark.fgac.network.plugins.tls.NettySslContext@21e95207
INFO  2026-03-18T09:48:15,623  10401  org.apache.spark.broadcast.BroadcastManager  [Thread-7]  60  BroadcastManager - initialized using FGAC client
INFO  2026-03-18T09:48:16,042  10820  org.apache.spark.fgac.client.SparkContextClientImpl  [Thread-7]  110  Created SparkContextClient ClientId = 123.
INFO  2026-03-18T09:48:18,749  13527  org.apache.spark.fgac.server.UserDriverServer  [Thread-7]  25  ==================== Starting User Driver Server ====================
INFO  2026-03-18T09:48:19,222  14000  org.apache.spark.fgac.server.UserDriverServer  [Thread-7]  50  ==================== User Driver Server started ====================
INFO  2026-03-18T09:48:19,250  14028  org.apache.spark.fgac.network.plugins.tls.SslOptions  [Thread-7]  60  createNettySslContext: SslOptions{enabled=true, keyStore=Some(/var/certificates/keystore.p12), keyStorePassword=Some(xxx), trustStore=Some(/var/certificates/truststore.p12), trustStorePassword=Some(xxx), crl=Some(/var/certificates/root.crl)protocol=None, enabledAlgorithms=Set()}, endpointIdentificationAlgorithm=None
INFO  2026-03-18T09:48:19,822  14600  org.apache.spark.fgac.network.plugins.tls.TlsTransportPlugin  [Thread-7]  60  Initializing TlsChannelInitializer with ssl context: org.apache.spark.fgac.network.plugins.tls.NettySslContext@64655827
INFO  2026-03-18T09:48:42,426  37204  org.apache.spark.fgac.network.plugins.tls.TlsChannelInitializer  [rpc-server-4-1]  60  initializeServer for connection with
                        client address: ip-XX-XXX-XX-XXX.ec2.internal
INFO  2026-03-18T09:49:01,511  56289  org.apache.spark.fgac.server.SystemDriverInfoReceiverImpl  [grpc-default-executor-0]  48  registerSystemDriver: Received registration of System Driver address: host=ip-XX-XXX-XX-XXX.ec2.internal, port=34119
INFO  2026-03-18T09:49:01,511  56289  org.apache.spark.fgac.client.SparkContextClientImpl  [Thread-7]  381  Creating channel for host ip-XX-XXX-XX-XXX.ec2.internal, port 34119
INFO  2026-03-18T09:49:01,802  56580  org.apache.spark.fgac.client.SparkSessionClientImpl  [Thread-7]  172  Created SparkSessionClient ClientId = 123.
INFO  2026-03-18T09:49:05,287  60065  org.apache.spark.fgac.client.SparkSessionClientImpl  [Thread-7]  277  [XXX] executeCollect.
ERROR  2026-03-18T09:49:59,465  114243  org.apache.spark.fgac.error.SparkFGACExceptionSanitizer  [Thread-7]  76  Client received error with id = XXXX-XXX-XXXX-XXXX-XXXXX, reason = SparkIllegalConfigModificationException, message = spark.sql.optimizer.nestedPredicatePushdown.supportedFileSources
WARN  2026-03-18T09:49:59,466  114244  org.apache.spark.sql.hive.SecureSQLConf  [Thread-7]  72  Config spark.sql.optimizer.nestedPredicatePushdown.supportedFileSources not allowed to be modified on system driver
INFO  2026-03-18T09:50:00,371  115149  org.apache.spark.fgac.client.SparkSessionClientImpl  [Thread-7]  359  [XXX] RegisterFunctionRequest (function=<function1>, expressionInfo=org.apache.spark.sql.catalyst.expressions.ExpressionInfo@4e0d57fd).
ERROR  2026-03-18T09:50:00,879  115657  com.amazonaws.services.glue.ProcessLauncher  [main]  76  Error from Python:Traceback (most recent call last):
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/tmp/glue-job-14352982340482138070/Icerberg Sedona Testing.py", line 54, in <module>
    raise e
  File "/tmp/glue-job-14352982340482138070/Icerberg Sedona Testing.py", line 50, in <module>
    sedona = SedonaContext.create(spark)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hadoop/.local/lib/python3.11/site-packages/sedona/spark/SedonaContext.py", line 48, in create
    spark._jvm.SedonaContext.create(spark._jsparkSession, "python")
  File "/usr/lib/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1322, in __call__
    return_value = get_return_value(
                   ^^^^^^^^^^^^^^^^^
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py", line 185, in deco
    raise converted from None
pyspark.errors.exceptions.captured.IllegalArgumentException: Cannot parse synthetic types.
ERROR  2026-03-18T09:50:00,918  115696  com.amazonaws.services.glueexceptionanalysis.GlueExceptionAnalysisListener  [main]  9  [Glue Exception Analysis] {
    "Event": "GlueETLJobExceptionEvent",
    "Timestamp": 1773827400916,
    "Failure Reason": "Traceback (most recent call last):\n  File \"<frozen runpy>\", line 291, in run_path\n  File \"<frozen runpy>\", line 98, in _run_module_code\n  File \"<frozen runpy>\", line 88, in _run_code\n  File \"/tmp/glue-job-14352982340482138070/Icerberg Sedona Testing.py\", line 54, in <module>\n    raise e\n  File \"/tmp/glue-job-14352982340482138070/Icerberg Sedona Testing.py\", line 50, in <module>\n    sedona = SedonaContext.create(spark)\n             ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/home/hadoop/.local/lib/python3.11/site-packages/sedona/spark/SedonaContext.py\", line 48, in create\n    spark._jvm.SedonaContext.create(spark._jsparkSession, \"python\")\n  File \"/usr/lib/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py\", line 1322, in __call__\n    return_value = get_return_value(\n                   ^^^^^^^^^^^^^^^^^\n  File \"/usr/lib/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py\", line 185, in deco\n    raise converted from None\npyspark.errors.exceptions.captured.IllegalArgumentException: Cannot parse synthetic types.",
    "Stack Trace": [
        {
            "Declaring Class": "deco",
            "Method Name": "raise converted from None",
            "File Name": "/usr/lib/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py",
            "Line Number": 185
        },
        {
            "Declaring Class": "__call__",
            "Method Name": "return_value = get_return_value(",
            "File Name": "/usr/lib/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py",
            "Line Number": 1322
        },
        {
            "Declaring Class": "create",
            "Method Name": "spark._jvm.SedonaContext.create(spark._jsparkSession, \"python\")",
            "File Name": "/home/hadoop/.local/lib/python3.11/site-packages/sedona/spark/SedonaContext.py",
            "Line Number": 48
        },
        {
            "Declaring Class": "<module>",
            "Method Name": "sedona = SedonaContext.create(spark)",
            "File Name": "/tmp/glue-job-14352982340482138070/Icerberg Sedona Testing.py",
            "Line Number": 50
        },
        {
            "Declaring Class": "<module>",
            "Method Name": "raise e",
            "File Name": "/tmp/glue-job-14352982340482138070/Icerberg Sedona Testing.py",
            "Line Number": 54
        }
    ],
    "Last Executed Line number": 54,
    "script": "Icerberg Sedona Testing.py"
}
ERROR  2026-03-18T09:50:00,919  115697  com.amazonaws.services.glueexceptionanalysis.GlueExceptionAnalysisListener  [main]  9  [Glue Exception Analysis] Last Executed Line number from script Icerberg Sedona Testing.py: 54
INFO  2026-03-18T09:50:01,012  115790  org.apache.spark.fgac.client.SparkContextClientImpl  [shutdown-hook-0]  308  [XXX] shutdownServer client starts.
