Search before asking
Motivation
Hi Paimon community, I'm currently working on integrating Apache Paimon into the Apache Polaris Spark catalog.
During this integration, we noticed an architectural difference in how Paimon handles Spark catalog operations compared to other formats like Delta and Hudi.
In unified catalogs like Polaris, Delta and Hudi achieve seamless integration by extending Spark's DelegatingCatalogExtension.
This allows them to delegate general catalog operations (such as namespace management and table listing) to the underlying unified catalog, while continuing to handle format-specific operations themselves.
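To make the delegation model concrete, here is a minimal, language-agnostic sketch of the pattern. It is a toy model, not Spark's actual API: `UnifiedCatalog`, `DelegatingFormatCatalog`, `set_delegate`, and the `initialized` list are hypothetical names invented for illustration. It only shows the division of labor: generic operations are forwarded to the delegate, and the format catalog adds its own step for tables of its format.

```python
class UnifiedCatalog:
    """Hypothetical stand-in for a unified catalog such as Polaris."""

    def __init__(self):
        self.namespaces = set()
        self.tables = {}  # (namespace, table name) -> table format

    def create_namespace(self, namespace):
        self.namespaces.add(namespace)

    def create_table(self, namespace, name, fmt):
        if namespace not in self.namespaces:
            raise KeyError(f"unknown namespace: {namespace}")
        self.tables[(namespace, name)] = fmt

    def list_tables(self, namespace):
        return sorted(n for (ns, n) in self.tables if ns == namespace)


class DelegatingFormatCatalog:
    """Hypothetical format catalog that forwards generic calls to a delegate."""

    def __init__(self, fmt):
        self.fmt = fmt
        self.delegate = None
        self.initialized = []  # tables that received format-specific handling

    def set_delegate(self, delegate):
        # The unified catalog is injected here (assumed wiring step).
        self.delegate = delegate

    # Generic operations are forwarded unchanged.
    def create_namespace(self, namespace):
        self.delegate.create_namespace(namespace)

    def list_tables(self, namespace):
        return self.delegate.list_tables(namespace)

    # Format-specific handling runs first, then the delegate records the
    # table, so it shows up in the unified view with no extra registration.
    def create_table(self, namespace, name, fmt):
        if fmt == self.fmt:
            self.initialized.append((namespace, name))
        self.delegate.create_table(namespace, name, fmt)


polaris = UnifiedCatalog()
catalog = DelegatingFormatCatalog("delta")
catalog.set_delegate(polaris)

catalog.create_namespace("db")
catalog.create_table("db", "events", "delta")
print(catalog.list_tables("db"))
```

Because the table is recorded by the delegate in the same call, no separate registration step is needed afterwards.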
The Current Challenge:
Because Paimon's SparkCatalog does not implement this delegation model and directly manages its own warehouse, we cannot configure Polaris as a delegate catalog for Paimon. Instead, we have to rely on a complex hybrid workaround:
- We route Paimon operations through a separate catalog instance to handle operations directly against a segregated Paimon warehouse.
- After a table is created, we must manually register it as an external table back into Polaris so it appears in the unified catalog view.
The Request:
Would the Paimon community be open to supporting DelegatingCatalogExtension (or a similar delegation architecture) in its SparkCatalog?
I'd love to hear your thoughts on the feasibility of this and whether it could be considered for the roadmap.
Thanks!
Reference:
Add Apache Paimon support to Polaris Spark Catalog
However, while working on the integration tests, I found out that the delegating catalog does not work.
Solution
No response
Anything else?
No response
Are you willing to submit a PR?