Search before asking
Description
Currently, the Lakehouse-related documentation is deeply nested, and core features like Union Read are scattered across different sections. This makes it difficult for users to understand how to set up and use Fluss as a Streaming Lakehouse.
We need to reorganize the documentation structure to provide a clear, step-by-step guide for Lakehouse deployment and consolidate core concept introductions.
Proposed Changes
- Reorganize "Installation & Deployment"
Installation & Deployment
+-- Overview
+-- Deploying Fluss Cluster
|   +-- Deploying Local Cluster
|   +-- Deploying Distributed Cluster
|   +-- Deploying with Docker
|   +-- Deploying with Helm Charts
+-- Deploying Streaming Lakehouse
Add a new page, "Deploying Streaming Lakehouse". It should cover:
  - How to set up a Fluss cluster that supports a data lake. Note that data lake support can also be enabled dynamically via the set_cluster_configs procedure; see https://fluss.apache.org/docs/next/engine-flink/procedures/#set_cluster_configs
  - How to start the tiering service.
- Introduce "Streaming Lakehouse" Core Concepts
Add a top-level or otherwise prominent section that explains the mechanics and the supported integrations.
Streaming Lakehouse
+-- Lakehouse Overview
+-- Tiering Service
+-- Union Read
+-- DataLake Formats
|   +-- Iceberg
|   +-- Paimon
|   +-- Lance
+-- DataLake Catalogs
- Refine "Maintenance" Section
  - Tiered Storage > Lakehouse Storage: Since the deployment tutorials will move to the "Deploying" page, this section should be slimmed down. It should focus on component internals and configuration parameters rather than "how-to" steps.
  - Filesystems: Move this under Tiered Storage, as filesystems are primarily used as the remote storage abstraction.
Willingness to contribute