diff --git a/DEVELOPMENT-zh.md b/DEVELOPMENT-zh.md
new file mode 100644
index 000000000..6c096d68f
--- /dev/null
+++ b/DEVELOPMENT-zh.md
@@ -0,0 +1,251 @@
+# 开发指南
+
+本文档为 DataMate 提供全面的本地开发环境搭建和工作流程指南,涵盖 Java、Python、React 三种语言。
+
+## 概述
+
+DataMate 是由多语言(Java 后端、Python 运行时、React 前端)组成的微服务项目,通过 Docker Compose 进行本地开发协调。
+
+## 前置条件
+
+- Git (用于拉取源码)
+- Make (用于构建和安装)
+- Docker (用于构建镜像和部署服务)
+- Docker Compose (用于部署服务 - docker 方式)
+- Kubernetes (用于部署服务 - k8s 方式)
+- Helm (用于部署服务 - k8s 方式)
+
+注意:
+- 确保 Java 和 Python 环境在系统 PATH 中(如适用)
+- Docker Compose 将编排本地开发栈
+
+## 快速开始
+
+### 1. 克隆仓库并安装依赖
+```bash
+git clone git@github.com:ModelEngine-Group/DataMate.git
+cd DataMate
+```
+
+### 2. 启动基础服务
+```bash
+make install
+```
+
+本项目支持 docker-compose 和 helm 两种方式部署,请在执行命令后输入部署方式对应的编号,命令回显如下所示:
+```shell
+Choose a deployment method:
+1. Docker/Docker-Compose
+2. Kubernetes/Helm
+Enter choice:
+```
+
+若您使用的机器没有 make,您也可以执行如下命令部署:
+```bash
+REGISTRY=ghcr.io/modelengine-group/ docker compose -f deployment/docker/datamate/docker-compose.yml --profile milvus up -d
+```
+
+当容器运行后,请在浏览器打开 http://localhost:30000 查看前端界面。
+
+### 3. 本地开发部署
+本地代码修改后,请执行以下命令构建镜像并使用本地镜像部署:
+```bash
+make build
+make install dev=true
+```
+
+### 4. 
卸载服务 +```bash +make uninstall +``` + +在运行 `make uninstall` 时,卸载流程会只询问一次是否删除卷(数据),该选择会应用到所有组件。卸载顺序为:milvus -> label-studio -> datamate,确保在移除 datamate 网络前,所有使用该网络的服务已先停止。 + +## 项目结构 + +``` +DataMate/ +├── backend/ # Java 后端 +│ ├── api-gateway/ # API Gateway +│ ├── services/ # 核心服务 +│ └── shared/ # 共享库 +├── runtime/ # Python 运行时 +│ ├── datamate-python/ # FastAPI 后端 +│ ├── python-executor/ # Ray 执行器 +│ ├── ops/ # 算子生态 +│ ├── datax/ # DataX 框架 +│ └── deer-flow # DeerFlow 服务 +├── frontend/ # React 前端 +├── deployment/ # 部署配置 +└── docs/ # 文档 +``` + +## 开发工作流程 + +### Java 后端开发 +```bash +# 构建 +cd backend +mvn clean install + +# 运行测试 +mvn test + +# 运行特定服务 +cd backend/services/main-application +mvn spring-boot:run +``` + +### Python 运行时开发 +```bash +# 安装依赖 +cd runtime/datamate-python +poetry install + +# 运行服务 +poetry run uvicorn app.main:app --reload --port 18000 + +# 运行测试 +poetry run pytest +``` + +### React 前端开发 +```bash +# 安装依赖 +cd frontend +npm ci + +# 运行开发服务器 +npm run dev + +# 构建生产版本 +npm run build +``` + +### Docker Compose 开发 +```bash +# 启动所有服务 +docker compose up -d + +# 查看日志 +docker compose logs -f [service-name] + +# 停止所有服务 +docker compose down +``` + +## 环境配置 + +每个组件可以有自己的环境变量文件。不要提交包含密钥的 .env 文件。 + +### 后端(Java) +- **路径**: `backend/.env` +- **典型密钥**: + - `DB_URL`: 数据库连接字符串 + - `DB_USER`: 数据库用户名 + - `DB_PASSWORD`: 数据库密码 + - `REDIS_URL`: Redis 连接字符串 + - `REDIS_PASSWORD`: Redis 密码 + - `JWT_SECRET`: JWT 密钥 + +### 运行时(Python) +- **路径**: `runtime/datamate-python/.env` +- **典型密钥**: + - `DATABASE_URL`: PostgreSQL 连接字符串 + - `RAY_ENABLED`: 是否启用 Ray 执行器 + - `RAY_ADDRESS`: Ray 集群地址 + - `LABEL_STUDIO_BASE_URL`: Label Studio 基础 URL + +### 前端(React) +- **路径**: `frontend/.env` +- **典型密钥**: + - `VITE_API_BASE_URL`: API 基础 URL + - `VITE_RUNTIME_API_URL`: 运行时 API 基础 URL + +## 测试 + +### Java(JUnit 5) +```bash +cd backend +mvn test +``` + +### Python(pytest) +```bash +cd runtime/datamate-python +poetry run pytest +``` + +### 前端 +当前未配置测试框架。 + +## 调试 + +### Java 后端 +```bash +# 启用 JDWP 调试端口 
5005 +export JAVA_TOOL_OPTIONS='-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005' +java -jar backend/main-application/target/*.jar +``` + +### Python 运行时 +```bash +# 启用 debugpy 监听端口 5678 +cd runtime/datamate-python +python -m debugpy --listen 5678 --wait-for-client -m uvicorn app.main:app --reload --port 18000 --host 0.0.0.0 +``` + +### React 前端 +使用浏览器开发者工具或 VS Code 调试器。 + +## 常见问题 + +### 端口冲突 +检查哪个进程正在使用端口: +```bash +lsof -i TCP:8080 +lsof -i TCP:18000 +lsof -i TCP:5173 +``` +停止或重新配置冲突的服务。 + +### 数据库连接失败 +确保 `.env` 包含正确的 `DATABASE_URL` 和凭据;确保数据库服务在 Docker Compose 中已启动。 + +### Ray 集群问题 +确保 Ray 已正确启动;检查 Ray 工作进程日志;确保 `RAY_ADDRESS` 配置正确。 + +## 文档 + +- **核心文档**: + - [ARCHITECTURE.md](./ARCHITECTURE.md) - 系统架构、微服务通信、数据流 + - [DEVELOPMENT.md](./DEVELOPMENT.md) - 本地开发环境搭建和工作流程 + - [AGENTS.md](./AGENTS.md) - AI 助手指南和代码规范 + +- **后端文档**: + - [backend/README.md](./backend/README.md) - 后端架构、服务和技术栈 + - [backend/api-gateway/README.md](./backend/api-gateway/README.md) - API Gateway 配置和路由 + - [backend/services/main-application/README.md](./backend/services/main-application/README.md) - 主应用模块 + - [backend/shared/README.md](./backend/shared/README.md) - 共享库(domain-common, security-common) + +- **运行时文档**: + - [runtime/README.md](./runtime/README.md) - 运行时架构和组件 + - [runtime/datamate-python/README.md](./runtime/datamate-python/README.md) - FastAPI 后端服务 + - [runtime/python-executor/README.md](./runtime/python-executor/README.md) - Ray 执行器框架 + - [runtime/ops/README.md](./runtime/ops/README.md) - 算子生态 + - [runtime/datax/README.md](./runtime/datax/README.md) - DataX 数据框架 + - [runtime/deer-flow/README.md](./runtime/deer-flow/README.md) - DeerFlow LLM 服务 + +- **前端文档**: + - [frontend/README.md](./frontend/README.md) - React 前端应用 + +## 贡献指南 + +感谢您对本项目的关注!我们非常欢迎社区的贡献,无论是提交 Bug 报告、提出功能建议,还是直接参与代码开发,都能帮助项目变得更好。 + +• 📮 [GitHub Issues](../../issues):提交 Bug 或功能建议。 +• 🔧 [GitHub Pull Requests](../../pulls):贡献代码改进。 + +## 许可证 + +DataMate 基于 [MIT](LICENSE) 
开源,您可以在遵守许可证条款的前提下自由使用、修改和分发本项目的代码。
diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md
new file mode 100644
index 000000000..bf509f71f
--- /dev/null
+++ b/DEVELOPMENT.md
@@ -0,0 +1,181 @@
+# DEVELOPMENT GUIDE for DataMate
+
+This document provides a comprehensive development guide for DataMate, a polyglot, microservices-based project consisting of Java, Python, and React components. It describes how to set up, build, test, run, and contribute in a local Docker Compose-based environment without committing secrets.
+
+## Overview
+
+DataMate is composed of multiple services (Java backend, Python runtime, and React frontend) coordinated via Docker Compose for local development. The guide below covers prerequisites, quick-start steps, project structure, development workflow, environment configuration, testing, debugging, common issues, documentation, contribution workflow, and licensing.
+
+Refer to the component READMEs for detailed implementation notes:
+- Backend: backend/README.md
+- Runtime: runtime/datamate-python/README.md
+- Frontend: frontend/README.md
+
+For code style guidelines, see AGENTS.md in the repository root.
+
+## Prerequisites
+
+- Java Development: JDK 21 and Maven
+- Python: Python 3.12 and Poetry
+- Node.js: Node.js 18
+- Docker and Docker Compose
+- Optional: Make (for convenience)
+
+Notes:
+- Ensure Java and Python environments are on the system PATH where applicable.
+- Docker Compose will orchestrate the local development stack.
+
+## Quick Start
+
+1) Clone the repository and install dependencies:
+- git clone https://github.com/ModelEngine-Group/DataMate.git
+- cd DataMate
+- (Optional) Create and activate a Python virtual environment if not using Poetry-managed envs.
+- Build dependencies per component as described below. 
+ +2) Start the local stack with Docker Compose: +- docker compose up -d +- This brings up the Java backend, Python runtime, and React frontend services along with any required databases and caches as defined in the docker-compose.yml. + +3) Start individual components (if you prefer not to use the Docker stack): +- Java backend + - mvn -f backend/pom.xml -DskipTests package + - Run the main application (path may vary): java -jar backend/main-application/target/*.jar +- Python runtime + - cd runtime/datamate-python + - poetry install + - uvicorn app.main:app --reload --port 18000 --host 0.0.0.0 +- React frontend + - cd frontend + - npm ci + - npm run dev + +4) Stop the stack: +- docker compose down + +> Tip: In a team setting, prefer Docker Compose for consistency across development environments. + +## Project Structure + +- backend/ +- frontend/ +- runtime/ +- deployment/ +- docs/ +- AGENTS.md (code style guidelines) +- docker/ (docker-related tooling) +- .env* files (per-component configurations, see Environment Configuration section) + +This is a polyglot project with the following language footprints: +- Java for the backend services under backend/ +- Python for the runtime under runtime/datamate-python/ +- React/TypeScript for the frontend under frontend/ + +## Development Workflow + +Language-specific workflows: + +- Java (Backend) + - Build: mvn -f backend/pom.xml -DskipTests package + - Test: mvn -f backend/pom.xml test + - Run: mvn -f backend/pom.xml -Dexec.mainClass=... spring-boot:run (or run the packaged jar) +- Python (Runtime) + - Install: cd runtime/datamate-python && poetry install + - Test: pytest + - Run: uvicorn app.main:app --reload --port 18000 --host 0.0.0.0 +- Frontend (React) + - Install: cd frontend && npm ci + - Test: No frontend tests configured + - Build: npm run build + - Run: npm run dev + +General tips: +- Use Docker Compose for a repeatable local stack. +- Run linters and tests before creating PRs. 
+- Keep dependencies in sync across environments. + +## Environment Configuration + +Each component can have its own environment file(s). Do not commit secrets. Use sample/.env.example files as references when available. + +- Backend + - Path: backend/.env (example keys below) + - Typical keys: DB_URL, DB_USER, DB_PASSWORD, JWT_SECRET, REDIS_URL, CLOUD_STORAGE_ENDPOINT +- Runtime (Python) + - Path: runtime/datamate-python/.env + - Typical keys: DATABASE_URL, RAY_ADDRESS, CELERY_BROKER_URL, APP_SETTINGS +- Frontend + - Path: frontend/.env + - Typical keys: VITE_API_BASE_URL, VITE_DEFAULT_LOCALE, NODE_ENV + +Notes: +- Copy the corresponding .env.example to .env and fill in values as needed. +- Do not commit .env files containing secrets. + +## Testing + +- Java: JUnit 5 tests run via Maven (mvn test). +- Python: pytest in runtime/datamate-python/test or relevant tests. +- Frontend: No frontend tests configured in this repo. + +## Code Style + +Code style follows the repository-wide guidelines described in AGENTS.md. See: +- AGENTS.md (root): Code style guidelines for all languages. +- Java: Follow Java conventions in backend/ and accordance with project conventions. +- Python: Follow PEP 8 and project-specific conventions in runtime/datamate-python. +- React: Follow the frontend conventions in frontend/ (TypeScript/TSX). + +Link to guidelines: AGENTS.md + +## Debugging + +- Java (Backend): Enable JPDA debugging by starting the JVM with a debug port and attach a debugger. + - Example (local): export JAVA_TOOL_OPTIONS='-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005' && java -jar path/to/app.jar + - Attach with IDE on port 5005 after launch. +- Python (Runtime): Run with debugpy listening on port 5678 to attach from IDEs. 
+ - Example: cd runtime/datamate-python && poetry install + python -m debugpy --listen 5678 --wait-for-client -m uvicorn app.main:app --reload --port 18000 --host 0.0.0.0 +- Frontend (React): Use Node inspector to debug front-end code in dev server. + - Example: npm run dev -- --inspect-brk=9229 + +Tips: Use your preferred IDEs (IntelliJ/VSCode/WebStorm) to attach to the running processes on their respective ports. + +## Common Issues + +- Port conflicts: Check which process is using a port with lsof -i TCP: or ss -ltnp. Stop or reconfigure conflicting services. +- Database connection errors: Ensure .env contains correct DATABASE_URL and credentials; ensure the database service is up in Docker Compose. +- Ray cluster issues (Python runtime): Ensure Ray is started and accessible at the configured RAY_ADDRESS; check logs for worker failures and bootstrap status. + +## Documentation + +Component READMEs provide detailed usage and design decisions. See: +- backend/README.md +- runtime/datamate-python/README.md +- frontend/README.md +- deployment/README.md + +## Contributing + +Contributions follow a PR workflow: +- Create a feature/bugfix branch from main (e.g., feature/new-action) +- Implement changes with tests where applicable +- Run unit tests for the changed components +- Open a PR with a clear description of the changes and the rationale +- Ensure CI checks pass (build, unit tests, lint) +- Obtain reviews and address feedback +- Merge to main after approval + +## License + +Apache 2.0 + +--- + +References: +- AGENTS.md for code style guidelines: AGENTS.md +- Java dependencies: backend/pom.xml +- Node dependencies: frontend/package.json +- Python dependencies: runtime/datamate-python/pyproject.toml diff --git a/README-zh.md b/README-zh.md index 91e443d3a..1d4987ed0 100644 --- a/README-zh.md +++ b/README-zh.md @@ -110,6 +110,29 @@ make uninstall 在运行 `make uninstall` 时,卸载流程会只询问一次是否删除卷(数据),该选择会应用到所有组件。卸载顺序为:milvus -> label-studio -> datamate,确保在移除 datamate 
网络前,所有使用该网络的服务已先停止。 +## 📚 文档 + +### 核心文档 +- **[DEVELOPMENT.md](./DEVELOPMENT.md)** - 本地开发环境搭建和工作流程 +- **[AGENTS.md](./AGENTS.md)** - AI 助手指南和代码规范 + +### 后端文档 +- **[backend/README-zh.md](./backend/README-zh.md)** - 后端架构、服务和技术栈 +- **[backend/api-gateway/README-zh.md](./backend/api-gateway/README-zh.md)** - API Gateway 配置和路由 +- **[backend/services/main-application/README-zh.md](./backend/services/main-application/README-zh.md)** - 主应用模块 +- **[backend/shared/README-zh.md](./backend/shared/README-zh.md)** - 共享库(domain-common, security-common) + +### 运行时文档 +- **[runtime/README-zh.md](./runtime/README-zh.md)** - 运行时架构和组件 +- **[runtime/datamate-python/README-zh.md](./runtime/datamate-python/README-zh.md)** - FastAPI 后端服务 +- **[runtime/python-executor/README-zh.md](./runtime/python-executor/README-zh.md)** - Ray 执行器框架 +- **[runtime/ops/README.md](./runtime/ops/README.md)** - 算子生态 +- **[runtime/datax/README-zh.md](./runtime/datax/README-zh.md)** - DataX 数据框架 +- **[runtime/deer-flow/README-zh.md](./runtime/deer-flow/README-zh.md)** - DeerFlow LLM 服务 + +### 前端文档 +- **[frontend/README-zh.md](./frontend/README-zh.md)** - React 前端应用 + ## 🤝 贡献指南 感谢您对本项目的关注!我们非常欢迎社区的贡献,无论是提交 Bug 报告、提出功能建议,还是直接参与代码开发,都能帮助项目变得更好。 diff --git a/README.md b/README.md index 8b30c5973..97ee80593 100644 --- a/README.md +++ b/README.md @@ -113,10 +113,33 @@ make uninstall When running make uninstall, the installer will prompt once whether to delete volumes; that single choice is applied to all components. The uninstall order is: milvus -> label-studio -> datamate, which ensures the datamate network is removed cleanly after services that use it have stopped. 
+## 📚 Documentation
+
+### Core Documentation
+- **[DEVELOPMENT.md](./DEVELOPMENT.md)** - Local development environment setup and workflow
+- **[AGENTS.md](./AGENTS.md)** - AI assistant guidelines and code style
+
+### Backend Documentation
+- **[backend/README.md](./backend/README.md)** - Backend architecture, services, and technology stack
+- **[backend/api-gateway/README.md](./backend/api-gateway/README.md)** - API Gateway configuration and routing
+- **[backend/services/main-application/README.md](./backend/services/main-application/README.md)** - Main application modules
+- **[backend/shared/README.md](./backend/shared/README.md)** - Shared libraries (domain-common, security-common)
+
+### Runtime Documentation
+- **[runtime/README.md](./runtime/README.md)** - Runtime architecture and components
+- **[runtime/datamate-python/README.md](./runtime/datamate-python/README.md)** - FastAPI backend service
+- **[runtime/python-executor/README.md](./runtime/python-executor/README.md)** - Ray executor framework
+- **[runtime/ops/README.md](./runtime/ops/README.md)** - Operator ecosystem
+- **[runtime/datax/README.md](./runtime/datax/README.md)** - DataX data framework
+- **[runtime/deer-flow/README.md](./runtime/deer-flow/README.md)** - DeerFlow LLM service
+
+### Frontend Documentation
+- **[frontend/README.md](./frontend/README.md)** - React frontend application
+
 ## 🤝 Contribution Guidelines
 
 Thank you for your interest in this project! We warmly welcome contributions from the community. Whether it's submitting
 bug reports, suggesting new features, or directly participating in code development, all forms of help make the project
 better.
 
 • 📮 [GitHub Issues](../../issues): Submit bugs or feature suggestions. 
diff --git a/backend/README-zh.md b/backend/README-zh.md
new file mode 100644
index 000000000..cdf749b63
--- /dev/null
+++ b/backend/README-zh.md
@@ -0,0 +1,137 @@
+# DataMate 后端
+
+## 概述
+
+DataMate 后端是基于 Spring Boot 3.5 + Java 21 的微服务架构,提供数据管理、RAG 索引、API 网关等核心功能。
+
+## 架构
+
+```
+backend/
+├── api-gateway/          # API Gateway + 认证
+├── services/
+│   ├── data-management-service/  # 数据集管理
+│   ├── rag-indexer-service/      # RAG 索引
+│   └── main-application/         # 主应用入口
+└── shared/
+    ├── domain-common/    # DDD 构建块、异常处理
+    └── security-common/  # JWT 工具
+```
+
+## 服务
+
+| 服务 | 端口 | 描述 |
+|---------|-------|-------------|
+| **main-application** | 8080 | 主应用,包含数据管理、数据清洗、算子市场等模块 |
+| **api-gateway** | 8080 | API Gateway,路由转发和认证 |
+
+## 技术栈
+
+- **框架**: Spring Boot 3.5.6, Spring Cloud 2025.0.0
+- **语言**: Java 21
+- **数据库**: PostgreSQL + MyBatis-Plus 3.5.14
+- **缓存**: Redis
+- **向量数据库**: Milvus (via SDK 2.6.6)
+- **文档**: SpringDoc OpenAPI 2.2.0
+- **构建**: Maven
+
+## 依赖
+
+### 外部服务
+- **PostgreSQL**: `datamate-database:5432`
+- **Redis**: `datamate-redis:6379`
+- **Milvus**: 向量数据库(RAG 索引)
+
+### 共享库
+- **domain-common**: 业务异常、系统参数、领域实体基类
+- **security-common**: JWT 工具、认证辅助
+
+## 快速开始
+
+### 前置条件
+- JDK 21+
+- Maven 3.8+
+- PostgreSQL 12+
+- Redis 6+
+
+### 构建
+```bash
+cd backend
+mvn clean install
+```
+
+### 运行主应用
+```bash
+cd backend/services/main-application
+mvn spring-boot:run
+```
+
+### 运行 API Gateway
+```bash
+cd backend/api-gateway
+mvn spring-boot:run
+```
+
+## 开发
+
+### 模块结构 (DDD)
+```
+com.datamate.{module}/
+├── interfaces/
+│   ├── rest/           # Controllers
+│   ├── dto/            # Request/Response DTOs
+│   ├── converter/      # MapStruct converters
+│   └── validation/     # Custom validators
+├── application/        # Application services
+├── domain/
+│   ├── model/          # Entities
+│   └── repository/     # Repository interfaces
+└── infrastructure/
+    ├── persistence/    # Repository implementations
+    ├── client/         # External API clients
+    └── config/         # Service configuration
+```
+
+### 代码约定
+- **实体**: Extend 
`BaseEntity`, use `@TableName("t_*")` +- **控制器**: `@RestController` + `@RequiredArgsConstructor` +- **服务**: `@Service` + `@Transactional` +- **错误处理**: `throw BusinessException.of(ErrorCode.XXX)` +- **MapStruct**: `@Mapper(componentModel = "spring")` + +## 测试 + +```bash +# 运行所有测试 +mvn test + +# 运行特定测试 +mvn test -Dtest=ClassName#methodName + +# 运行特定模块测试 +mvn -pl services/data-management-service -am test +``` + +## 配置 + +### 环境变量 +- `DB_USERNAME`: 数据库用户名 +- `DB_PASSWORD`: 数据库密码 +- `REDIS_PASSWORD`: Redis 密码 +- `JWT_SECRET`: JWT 密钥 + +### 配置文件 +- `application.yml`: 默认配置 +- `application-dev.yml`: 开发环境覆盖 + +## 文档 + +- **API 文档**: http://localhost:8080/api/swagger-ui.html +- **AGENTS.md**: 见 `backend/shared/AGENTS.md` 获取共享库文档 +- **服务文档**: 见各服务 README + +## 相关链接 + +- [Spring Boot 文档](https://docs.spring.io/spring-boot/) +- [MyBatis-Plus 文档](https://baomidou.com/) +- [PostgreSQL 文档](https://www.postgresql.org/docs/) diff --git a/backend/README.md b/backend/README.md new file mode 100644 index 000000000..fb5bb4727 --- /dev/null +++ b/backend/README.md @@ -0,0 +1,137 @@ +# DataMate Backend + +## Overview + +DataMate Backend is a microservices architecture based on Spring Boot 3.5 + Java 21, providing core functions such as data management, RAG indexing, and API gateway. 
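+
+The environment variables listed in the Configuration section below are normally injected through Spring property placeholders rather than read directly in code. A minimal sketch of what that wiring can look like (the property paths are illustrative assumptions — the authoritative mappings live in each service's `application.yml`):
+
+```yaml
+spring:
+  datasource:
+    username: ${DB_USERNAME}   # supplied by the container environment
+    password: ${DB_PASSWORD}
+  data:
+    redis:
+      password: ${REDIS_PASSWORD}
+
+datamate:
+  jwt:
+    secret: ${JWT_SECRET}      # never hard-code; inject via env
+```
+
+With this pattern the values come from the Docker Compose or Helm environment, so no secret ever needs to be committed to the repository.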
+
+## Architecture
+
+```
+backend/
+├── api-gateway/          # API Gateway + Authentication
+├── services/
+│   ├── data-management-service/  # Dataset management
+│   ├── rag-indexer-service/      # RAG indexing
+│   └── main-application/         # Main application entry
+└── shared/
+    ├── domain-common/    # DDD building blocks, exception handling
+    └── security-common/  # JWT utilities
+```
+
+## Services
+
+| Service | Port | Description |
+|---------|-------|-------------|
+| **main-application** | 8080 | Main application, includes data management, data cleaning, operator marketplace modules |
+| **api-gateway** | 8080 | API Gateway, route forwarding and authentication |
+
+## Technology Stack
+
+- **Framework**: Spring Boot 3.5.6, Spring Cloud 2025.0.0
+- **Language**: Java 21
+- **Database**: PostgreSQL + MyBatis-Plus 3.5.14
+- **Cache**: Redis
+- **Vector DB**: Milvus (via SDK 2.6.6)
+- **Documentation**: SpringDoc OpenAPI 2.2.0
+- **Build**: Maven
+
+## Dependencies
+
+### External Services
+- **PostgreSQL**: `datamate-database:5432`
+- **Redis**: `datamate-redis:6379`
+- **Milvus**: Vector database (RAG indexing)
+
+### Shared Libraries
+- **domain-common**: Business exceptions, system parameters, domain entity base classes
+- **security-common**: JWT utilities, auth helpers
+
+## Quick Start
+
+### Prerequisites
+- JDK 21+
+- Maven 3.8+
+- PostgreSQL 12+
+- Redis 6+
+
+### Build
+```bash
+cd backend
+mvn clean install
+```
+
+### Run Main Application
+```bash
+cd backend/services/main-application
+mvn spring-boot:run
+```
+
+### Run API Gateway
+```bash
+cd backend/api-gateway
+mvn spring-boot:run
+```
+
+## Development
+
+### Module Structure (DDD)
+```
+com.datamate.{module}/
+├── interfaces/
+│   ├── rest/           # Controllers
+│   ├── dto/            # Request/Response DTOs
+│   ├── converter/      # MapStruct converters
+│   └── validation/     # Custom validators
+├── application/        # Application services
+├── domain/
+│   ├── model/          # Entities
+│   └── repository/     # Repository interfaces
+└── 
infrastructure/ + ├── persistence/ # Repository implementations + ├── client/ # External API clients + └── config/ # Service configuration +``` + +### Code Conventions +- **Entities**: Extend `BaseEntity`, use `@TableName("t_*")` +- **Controllers**: `@RestController` + `@RequiredArgsConstructor` +- **Services**: `@Service` + `@Transactional` +- **Error Handling**: `throw BusinessException.of(ErrorCode.XXX)` +- **MapStruct**: `@Mapper(componentModel = "spring")` + +## Testing + +```bash +# Run all tests +mvn test + +# Run specific test +mvn test -Dtest=ClassName#methodName + +# Run specific module tests +mvn -pl services/data-management-service -am test +``` + +## Configuration + +### Environment Variables +- `DB_USERNAME`: Database username +- `DB_PASSWORD`: Database password +- `REDIS_PASSWORD`: Redis password +- `JWT_SECRET`: JWT secret key + +### Profiles +- `application.yml`: Default configuration +- `application-dev.yml`: Development overrides + +## Documentation + +- **API Docs**: http://localhost:8080/api/swagger-ui.html +- **AGENTS.md**: See `backend/shared/AGENTS.md` for shared libraries documentation +- **Service Docs**: See individual service READMEs + +## Related Links + +- [Spring Boot Documentation](https://docs.spring.io/spring-boot/) +- [MyBatis-Plus Documentation](https://baomidou.com/) +- [PostgreSQL Documentation](https://www.postgresql.org/docs/) diff --git a/backend/api-gateway/README-zh.md b/backend/api-gateway/README-zh.md new file mode 100644 index 000000000..a300f7f74 --- /dev/null +++ b/backend/api-gateway/README-zh.md @@ -0,0 +1,130 @@ +# API Gateway + +## 概述 + +API Gateway 是 DataMate 的统一入口,基于 Spring Cloud Gateway 实现,负责路由转发、JWT 认证和限流。 + +## 架构 + +``` +backend/api-gateway/ +├── src/main/java/com/datamate/gateway/ +│ ├── config/ # Gateway 配置 +│ ├── filter/ # JWT 认证过滤器 +│ └── route/ # 路由定义 +└── src/main/resources/ + └── application.yml # Gateway 配置 +``` + +## 配置 + +### 端口 +- **默认**: 8080 +- **Nacos 发现端口**: 30000 + +### 关键配置 +```yaml +spring: 
+ application: + name: datamate-gateway + cloud: + nacos: + discovery: + port: 30000 + server-addr: ${NACOS_ADDR} + username: consul + password: +datamate: + jwt: + secret: ${JWT_SECRET} + expiration-seconds: 3600 +``` + +## 功能 + +### 1. 路由转发 +- 将前端请求转发到对应的后端服务 +- 支持负载均衡 +- 路径重写 + +### 2. JWT 认证 +- 基于 JWT Token 的认证 +- Token 验证和过期检查 +- 用户上下文传递 + +### 3. 限流 +- (如配置)请求频率限制 +- 防止 API 滥用 + +## 快速开始 + +### 前置条件 +- JDK 21+ +- Maven 3.8+ +- Nacos 服务(如果使用服务发现) + +### 构建 +```bash +cd backend/api-gateway +mvn clean install +``` + +### 运行 +```bash +cd backend/api-gateway +mvn spring-boot:run +``` + +## 开发 + +### 添加新路由 +在 `application.yml` 或通过 Nacos 配置路由规则: + +```yaml +spring: + cloud: + gateway: + routes: + - id: data-management + uri: lb://data-management-service + predicates: + - Path=/api/data-management/** + filters: + - StripPrefix=3 +``` + +### 添加自定义过滤器 +创建 `GlobalFilter` 或 `GatewayFilter`: + +```java +@Component +public class AuthFilter implements GlobalFilter { + @Override + public Mono filter(ServerWebExchange exchange, GatewayFilterChain chain) { + // 过滤逻辑 + return chain.filter(exchange); + } +} +``` + +## 测试 + +### 测试路由转发 +```bash +curl http://localhost:8080/api/data-management/datasets +``` + +### 测试 JWT 认证 +```bash +curl -H "Authorization: Bearer " http://localhost:8080/api/protected-endpoint +``` + +## 文档 + +- **Spring Cloud Gateway 文档**: https://docs.spring.io/spring-cloud-gateway/ +- **Nacos 发现**: https://nacos.io/ + +## 相关链接 + +- [后端 README](../README.md) +- [主应用 README](../services/main-application/README.md) diff --git a/backend/api-gateway/README.md b/backend/api-gateway/README.md new file mode 100644 index 000000000..23ef8fbf5 --- /dev/null +++ b/backend/api-gateway/README.md @@ -0,0 +1,130 @@ +# API Gateway + +## Overview + +API Gateway is DataMate's unified entry point, built on Spring Cloud Gateway, responsible for route forwarding, JWT authentication, and rate limiting. 
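+
+The three responsibilities above usually meet in a single route definition. A hedged sketch combining the route style shown later in this README with Spring Cloud Gateway's built-in `RequestRateLimiter` filter (that filter additionally requires the reactive Redis starter and a `KeyResolver` bean; the rate numbers are illustrative, not project defaults):
+
+```yaml
+spring:
+  cloud:
+    gateway:
+      routes:
+        - id: data-management
+          uri: lb://data-management-service
+          predicates:
+            - Path=/api/data-management/**
+          filters:
+            - StripPrefix=3
+            - name: RequestRateLimiter
+              args:
+                redis-rate-limiter.replenishRate: 10   # tokens refilled per second
+                redis-rate-limiter.burstCapacity: 20   # max requests in one burst
+```
+
+Because the limiter state lives in Redis, the limits are shared across all gateway replicas rather than enforced per instance.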
+ +## Architecture + +``` +backend/api-gateway/ +├── src/main/java/com/datamate/gateway/ +│ ├── config/ # Gateway configuration +│ ├── filter/ # JWT authentication filter +│ └── route/ # Route definitions +└── src/main/resources/ + └── application.yml # Gateway configuration +``` + +## Configuration + +### Port +- **Default**: 8080 +- **Nacos Discovery Port**: 30000 + +### Key Configuration +```yaml +spring: + application: + name: datamate-gateway + cloud: + nacos: + discovery: + port: 30000 + server-addr: ${NACOS_ADDR} + username: consul + password: +datamate: + jwt: + secret: ${JWT_SECRET} + expiration-seconds: 3600 +``` + +## Features + +### 1. Route Forwarding +- Forward frontend requests to corresponding backend services +- Support for load balancing +- Path rewriting + +### 2. JWT Authentication +- JWT Token-based authentication +- Token validation and expiration checking +- User context propagation + +### 3. Rate Limiting +- Request rate limiting (if configured) +- Prevent API abuse + +## Quick Start + +### Prerequisites +- JDK 21+ +- Maven 3.8+ +- Nacos service (if using service discovery) + +### Build +```bash +cd backend/api-gateway +mvn clean install +``` + +### Run +```bash +cd backend/api-gateway +mvn spring-boot:run +``` + +## Development + +### Adding New Routes +Configure route rules in `application.yml` or via Nacos: + +```yaml +spring: + cloud: + gateway: + routes: + - id: data-management + uri: lb://data-management-service + predicates: + - Path=/api/data-management/** + filters: + - StripPrefix=3 +``` + +### Adding Custom Filters +Create a `GlobalFilter` or `GatewayFilter`: + +```java +@Component +public class AuthFilter implements GlobalFilter { + @Override + public Mono filter(ServerWebExchange exchange, GatewayFilterChain chain) { + // Filter logic + return chain.filter(exchange); + } +} +``` + +## Testing + +### Test Route Forwarding +```bash +curl http://localhost:8080/api/data-management/datasets +``` + +### Test JWT Authentication 
+```bash +curl -H "Authorization: Bearer " http://localhost:8080/api/protected-endpoint +``` + +## Documentation + +- **Spring Cloud Gateway Docs**: https://docs.spring.io/spring-cloud-gateway/ +- **Nacos Discovery**: https://nacos.io/ + +## Related Links + +- [Backend README](../README.md) +- [Main Application README](../services/main-application/README.md) diff --git a/backend/services/main-application/README-zh.md b/backend/services/main-application/README-zh.md new file mode 100644 index 000000000..1568c5a20 --- /dev/null +++ b/backend/services/main-application/README-zh.md @@ -0,0 +1,112 @@ +# 主应用 + +## 概述 + +主应用是 DataMate 的核心 Spring Boot 服务,包含数据管理、数据清洗、算子市场、数据收集等主要功能模块。 + +## 架构 + +``` +backend/services/main-application/ +├── src/main/java/com/datamate/main/ +│ ├── interfaces/ +│ │ ├── rest/ # Controllers +│ │ ├── dto/ # Request/Response DTOs +│ │ └── converter/ # MapStruct converters +│ ├── application/ # Application services +│ ├── domain/ +│ │ ├── model/ # Entities +│ │ └── repository/ # Repository interfaces +│ └── infrastructure/ +│ ├── persistence/ # Repository implementations +│ ├── client/ # External API clients +│ └── config/ # Service configuration +└── src/main/resources/ + ├── application.yml # 主配置 + ├── config/application-datamanagement.yml # 数据管理配置 + └── config/application-datacollection.yml # 数据收集配置 +``` + +## 模块 + +### 1. 数据管理 +- 数据集 CRUD 操作 +- 文件上传/下载 +- 标签管理 +- 数据集版本控制 + +### 2. 数据收集 +- 数据源配置 +- 定时数据收集任务 +- 数据同步 +- 数据导入/导出 + +## 配置 + +### 端口 +- **默认**: 8080 +- **上下文路径**: `/api` + +### 关键配置 +```yaml +server: + port: 8080 + servlet: + context-path: /api + +datamate: + data-management: + base-path: /dataset +``` + +## 快速开始 + +### 前置条件 +- JDK 21+ +- Maven 3.8+ +- PostgreSQL 12+ +- Redis 6+ + +### 构建 +```bash +cd backend/services/main-application +mvn clean install +``` + +### 运行 +```bash +cd backend/services/main-application +mvn spring-boot:run +``` + +## 开发 + +### 添加新模块 +1. 在 `domain/model/` 创建实体类 +2. 
在 `domain/repository/` 创建 repository 接口 +3. 在 `infrastructure/persistence/` 实现 repository +4. 在 `application/` 创建 application service +5. 在 `interfaces/rest/` 创建 controller + +## 测试 + +### 运行测试 +```bash +cd backend/services/main-application +mvn test +``` + +### 运行特定测试 +```bash +mvn test -Dtest=DatasetControllerTest +``` + +## 文档 + +- **Spring Boot 文档**: https://docs.spring.io/spring-boot/ +- [AGENTS.md](../../shared/AGENTS.md) + +## 相关链接 + +- [后端 README](../../README.md) +- [API Gateway README](../../api-gateway/README.md) diff --git a/backend/services/main-application/README.md b/backend/services/main-application/README.md new file mode 100644 index 000000000..51b4c65c5 --- /dev/null +++ b/backend/services/main-application/README.md @@ -0,0 +1,112 @@ +# Main Application + +## Overview + +The Main Application is DataMate's core Spring Boot service, containing major functional modules including data management, data cleaning, operator marketplace, and data collection. + +## Architecture + +``` +backend/services/main-application/ +├── src/main/java/com/datamate/main/ +│ ├── interfaces/ +│ │ ├── rest/ # Controllers +│ │ ├── dto/ # Request/Response DTOs +│ │ └── converter/ # MapStruct converters +│ ├── application/ # Application services +│ ├── domain/ +│ │ ├── model/ # Entities +│ │ └── repository/ # Repository interfaces +│ └── infrastructure/ +│ ├── persistence/ # Repository implementations +│ ├── client/ # External API clients +│ └── config/ # Service configuration +└── src/main/resources/ + ├── application.yml # Main configuration + ├── config/application-datamanagement.yml # Data management config + └── config/application-datacollection.yml # Data collection config +``` + +## Modules + +### 1. Data Management +- Dataset CRUD operations +- File upload/download +- Tag management +- Dataset versioning + +### 2. 
Data Collection +- Data source configuration +- Scheduled data collection tasks +- Data synchronization +- Data import/export + +## Configuration + +### Port +- **Default**: 8080 +- **Context Path**: `/api` + +### Key Configuration +```yaml +server: + port: 8080 + servlet: + context-path: /api + +datamate: + data-management: + base-path: /dataset +``` + +## Quick Start + +### Prerequisites +- JDK 21+ +- Maven 3.8+ +- PostgreSQL 12+ +- Redis 6+ + +### Build +```bash +cd backend/services/main-application +mvn clean install +``` + +### Run +```bash +cd backend/services/main-application +mvn spring-boot:run +``` + +## Development + +### Adding a New Module +1. Create entity class in `domain/model/` +2. Create repository interface in `domain/repository/` +3. Implement repository in `infrastructure/persistence/` +4. Create application service in `application/` +5. Create controller in `interfaces/rest/` + +## Testing + +### Run Tests +```bash +cd backend/services/main-application +mvn test +``` + +### Run Specific Test +```bash +mvn test -Dtest=DatasetControllerTest +``` + +## Documentation + +- **Spring Boot Docs**: https://docs.spring.io/spring-boot/ +- [AGENTS.md](../../shared/AGENTS.md) + +## Related Links + +- [Backend README](../../README.md) +- [API Gateway README](../../api-gateway/README.md) diff --git a/backend/shared/README-zh.md b/backend/shared/README-zh.md new file mode 100644 index 000000000..d2dc48abf --- /dev/null +++ b/backend/shared/README-zh.md @@ -0,0 +1,144 @@ +# 共享库 + +## 概述 + +共享库包含所有后端服务共用的代码和工具,包括领域构建块、异常处理、JWT 工具等。 + +## 架构 + +``` +backend/shared/ +├── domain-common/ # DDD 构建块、异常处理 +│ └── src/main/java/com/datamate/common/ +│ ├── infrastructure/exception/ # BusinessException, ErrorCode +│ ├── setting/ # 系统参数、模型配置 +│ └── domain/ # Base entities, repositories +└── security-common/ # JWT 工具、认证辅助 + └── src/main/java/com/datamate/security/ +``` + +## 库 + +### 1. 
domain-common + +#### BusinessException +统一的业务异常处理机制: + +```java +// 抛出业务异常 +throw BusinessException.of(ErrorCode.DATASET_NOT_FOUND) + .withDetail("dataset_id", datasetId); + +// 带上下文的异常 +throw BusinessException.of(ErrorCode.VALIDATION_FAILED) + .withDetail("field", "email") + .withDetail("reason", "Invalid format"); +``` + +#### ErrorCode +错误码枚举接口: + +```java +public interface ErrorCode { + String getCode(); + String getMessage(); + HttpStatus getHttpStatus(); +} + +// 示例 +public enum CommonErrorCode implements ErrorCode { + SUCCESS("0000", "Success", HttpStatus.OK), + DATABASE_NOT_FOUND("4001", "Database not found", HttpStatus.NOT_FOUND); +} +``` + +#### BaseEntity +所有实体的基类,包含审计字段: + +```java +@Data +@EqualsAndHashCode(callSuper = true) +public class BaseEntity implements Serializable { + @TableId(type = IdType.ASSIGN_ID) + private String id; + + @TableField(fill = FieldFill.INSERT) + private LocalDateTime createdAt; + + @TableField(fill = FieldFill.INSERT_UPDATE) + private LocalDateTime updatedAt; + + @TableField(fill = FieldFill.INSERT) + private String createdBy; + + @TableField(fill = FieldFill.INSERT_UPDATE) + private String updatedBy; +} +``` + +### 2. 
security-common
+
+#### JWT 工具
+JWT Token 生成和验证:
+
+```java
+// 生成 Token
+String token = JwtUtil.generateToken(userId, secret, expiration);
+
+// 验证 Token
+Claims claims = JwtUtil.validateToken(token, secret);
+String userId = claims.getSubject();
+```
+
+## 使用
+
+### 在服务中使用共享库
+
+#### Maven 依赖
+```xml
+<dependency>
+    <groupId>com.datamate</groupId>
+    <artifactId>domain-common</artifactId>
+    <version>1.0.0-SNAPSHOT</version>
+</dependency>
+<dependency>
+    <groupId>com.datamate</groupId>
+    <artifactId>security-common</artifactId>
+    <version>1.0.0-SNAPSHOT</version>
+</dependency>
+```
+
+#### 使用 BusinessException
+```java
+@RestController
+@RequiredArgsConstructor
+public class DatasetController {
+
+    public ResponseEntity<DatasetResponse> getDataset(String id) {
+        Dataset dataset = datasetService.findById(id);
+        if (dataset == null) {
+            throw BusinessException.of(ErrorCode.DATASET_NOT_FOUND);
+        }
+        return ResponseEntity.ok(DatasetResponse.from(dataset));
+    }
+}
+```
+
+## 快速开始
+
+### 构建共享库
+```bash
+cd backend
+mvn clean install
+```
+
+### 在服务中使用
+共享库会自动被所有后端服务继承。
+
+## 文档
+
+- [AGENTS.md](./AGENTS.md)
+
+## 相关链接
+
+- [后端 README](../README.md)
diff --git a/backend/shared/README.md b/backend/shared/README.md
new file mode 100644
index 000000000..eb8c13630
--- /dev/null
+++ b/backend/shared/README.md
@@ -0,0 +1,144 @@
+# Shared Libraries
+
+## Overview
+
+Shared Libraries contain code and utilities shared across all backend services, including domain building blocks, exception handling, JWT utilities, and more.
+
+## Architecture
+
+```
+backend/shared/
+├── domain-common/                    # DDD building blocks, exception handling
+│   └── src/main/java/com/datamate/common/
+│       ├── infrastructure/exception/ # BusinessException, ErrorCode
+│       ├── setting/                  # System params, model configs
+│       └── domain/                   # Base entities, repositories
+└── security-common/                  # JWT utilities, auth helpers
+    └── src/main/java/com/datamate/security/
+```
+
+## Libraries
+
+### 1.
domain-common + +#### BusinessException +Unified business exception handling mechanism: + +```java +// Throw business exception +throw BusinessException.of(ErrorCode.DATASET_NOT_FOUND) + .withDetail("dataset_id", datasetId); + +// Exception with context +throw BusinessException.of(ErrorCode.VALIDATION_FAILED) + .withDetail("field", "email") + .withDetail("reason", "Invalid format"); +``` + +#### ErrorCode +Error code enumeration interface: + +```java +public interface ErrorCode { + String getCode(); + String getMessage(); + HttpStatus getHttpStatus(); +} + +// Example +public enum CommonErrorCode implements ErrorCode { + SUCCESS("0000", "Success", HttpStatus.OK), + DATABASE_NOT_FOUND("4001", "Database not found", HttpStatus.NOT_FOUND); +} +``` + +#### BaseEntity +Base class for all entities, including audit fields: + +```java +@Data +@EqualsAndHashCode(callSuper = true) +public class BaseEntity implements Serializable { + @TableId(type = IdType.ASSIGN_ID) + private String id; + + @TableField(fill = FieldFill.INSERT) + private LocalDateTime createdAt; + + @TableField(fill = FieldFill.INSERT_UPDATE) + private LocalDateTime updatedAt; + + @TableField(fill = FieldFill.INSERT) + private String createdBy; + + @TableField(fill = FieldFill.INSERT_UPDATE) + private String updatedBy; +} +``` + +### 2. 
security-common
+
+#### JWT Utilities
+JWT Token generation and validation:
+
+```java
+// Generate Token
+String token = JwtUtil.generateToken(userId, secret, expiration);
+
+// Validate Token
+Claims claims = JwtUtil.validateToken(token, secret);
+String userId = claims.getSubject();
+```
+
+## Usage
+
+### Using Shared Libraries in Services
+
+#### Maven Dependencies
+```xml
+<dependency>
+    <groupId>com.datamate</groupId>
+    <artifactId>domain-common</artifactId>
+    <version>1.0.0-SNAPSHOT</version>
+</dependency>
+<dependency>
+    <groupId>com.datamate</groupId>
+    <artifactId>security-common</artifactId>
+    <version>1.0.0-SNAPSHOT</version>
+</dependency>
+```
+
+#### Using BusinessException
+```java
+@RestController
+@RequiredArgsConstructor
+public class DatasetController {
+
+    public ResponseEntity<DatasetResponse> getDataset(String id) {
+        Dataset dataset = datasetService.findById(id);
+        if (dataset == null) {
+            throw BusinessException.of(ErrorCode.DATASET_NOT_FOUND);
+        }
+        return ResponseEntity.ok(DatasetResponse.from(dataset));
+    }
+}
+```
+
+## Quick Start
+
+### Build Shared Libraries
+```bash
+cd backend
+mvn clean install
+```
+
+### Use in Services
+Shared libraries are automatically inherited by all backend services.
+
+## Documentation
+
+- [AGENTS.md](./AGENTS.md)
+
+## Related Links
+
+- [Backend README](../README.md)
diff --git a/runtime/README-zh.md b/runtime/README-zh.md
new file mode 100644
index 000000000..5aa180ddd
--- /dev/null
+++ b/runtime/README-zh.md
@@ -0,0 +1,146 @@
+# DataMate 运行时
+
+## 概述
+
+DataMate 运行时提供数据处理、算子执行、数据收集等核心功能,基于 Python 3.12+ 和 FastAPI 框架。
+
+## 架构
+
+```
+runtime/
+├── datamate-python/   # FastAPI 后端服务(端口 18000)
+├── python-executor/   # Ray 分布式执行器
+├── ops/               # 算子生态
+├── datax/             # DataX 数据读写框架
+└── deer-flow/         # DeerFlow 服务
+```
+
+## 组件
+
+### 1.
datamate-python (FastAPI 后端) +**端口**: 18000 + +核心 Python 服务,提供以下功能: +- **数据合成**: QA 生成、文档处理 +- **数据标注**: Label Studio 集成、自动标注 +- **数据评估**: 模型评估、质量检查 +- **数据清洗**: 数据清洗管道 +- **算子市场**: 算子管理、上传 +- **RAG 索引**: 向量索引、知识库管理 +- **数据收集**: 定时任务、数据源集成 + +**技术栈**: +- FastAPI 0.124+ +- SQLAlchemy 2.0+ (async) +- Pydantic 2.12+ +- PostgreSQL (via asyncpg) +- Milvus (via pymilvus) +- APScheduler (定时任务) + +### 2. python-executor (Ray 执行器) +Ray 分布式执行框架,负责: +- **算子执行**: 执行数据处理算子 +- **任务调度**: 异步任务管理 +- **分布式计算**: 多节点并行处理 + +**技术栈**: +- Ray 2.7.0 +- FastAPI (执行器 API) +- Data-Juicer (数据处理) + +### 3. ops (算子生态) +算子生态,包含: +- **filter**: 数据过滤(去重、敏感内容、质量过滤) +- **mapper**: 数据转换(清洗、归一化) +- **slicer**: 数据切片(文本分割、幻灯片提取) +- **formatter**: 格式转换(PDF → text, slide → JSON) +- **llms**: LLM 算子(质量评估、条件检查) +- **annotation**: 标注算子(目标检测、分割) + +**见**: `runtime/ops/README.md` 获取算子开发指南 + +### 4. datax (DataX 框架) +DataX 数据读写框架,支持多种数据源: +- **Readers**: MySQL, PostgreSQL, Oracle, MongoDB, Elasticsearch, HDFS, S3, NFS, GlusterFS, API, 等 +- **Writers**: 同上,支持写入目标 + +**技术栈**: Java (Maven 构建) + +### 5. 
deer-flow (DeerFlow 服务) +DeerFlow 服务(配置见 `conf.yaml`)。 + +## 快速开始 + +### 前置条件 +- Python 3.12+ +- Poetry (for datamate-python) +- Ray 2.7.0+ (for python-executor) + +### 运行 datamate-python +```bash +cd runtime/datamate-python +poetry install +poetry run uvicorn app.main:app --reload --port 18000 +``` + +### 运行 python-executor +```bash +cd runtime/python-executor +poetry install +ray start --head +``` + +## 开发 + +### datamate-python 模块结构 +``` +app/ +├── core/ # 日志、异常、配置 +├── db/ +│ ├── models/ # SQLAlchemy 模型 +│ └── session.py # 异步会话 +├── module/ +│ ├── annotation/ # Label Studio 集成 +│ ├── collection/ # 数据收集 +│ ├── cleaning/ # 数据清洗 +│ ├── dataset/ # 数据集管理 +│ ├── evaluation/ # 模型评估 +│ ├── generation/ # QA 合成 +│ ├── operator/ # 算子市场 +│ ├── rag/ # RAG 索引 +│ └── shared/ # 共享 schemas +└── main.py # FastAPI 入口 +``` + +### 代码约定 +- **路由**: `APIRouter` 在 `interface/*.py` +- **依赖注入**: `Depends(get_db)` 获取会话 +- **错误**: `raise BusinessError(ErrorCode.XXX, context)` +- **事务**: `async with transaction(db):` +- **模型**: Extend `BaseEntity` (审计字段自动填充) + +## 测试 + +```bash +cd runtime/datamate-python +poetry run pytest +``` + +## 配置 + +### 环境变量 +- `DATABASE_URL`: PostgreSQL 连接字符串 +- `LABEL_STUDIO_BASE_URL`: Label Studio URL +- `RAY_ENABLED`: 启用 Ray 执行器 +- `RAY_ADDRESS`: Ray 集群地址 + +## 文档 + +- **API 文档**: http://localhost:18000/redoc +- **算子指南**: 见 `runtime/ops/README.md` 获取算子开发 + +## 相关链接 + +- [FastAPI 文档](https://fastapi.tiangolo.com/) +- [Ray 文档](https://docs.ray.io/) +- [SQLAlchemy 文档](https://docs.sqlalchemy.org/) diff --git a/runtime/README.md b/runtime/README.md new file mode 100644 index 000000000..8d3a5621c --- /dev/null +++ b/runtime/README.md @@ -0,0 +1,146 @@ +# DataMate Runtime + +## Overview + +DataMate Runtime provides core functionality for data processing, operator execution, and data collection, built on Python 3.12+ and the FastAPI framework. 
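Among this README's code conventions is wrapping writes in `async with transaction(db):`. A minimal sketch of that unit-of-work shape — note that `FakeSession` is a hypothetical stand-in for the runtime's real async SQLAlchemy session, and `transaction` here is an illustrative reimplementation, not the project's actual helper:

```python
import asyncio
from contextlib import asynccontextmanager

class FakeSession:
    """Hypothetical stand-in for an async SQLAlchemy session."""
    def __init__(self):
        self.committed = False
        self.rolled_back = False

    async def commit(self):
        self.committed = True

    async def rollback(self):
        self.rolled_back = True

@asynccontextmanager
async def transaction(db):
    # Unit-of-work: commit when the block succeeds, roll back on any error.
    try:
        yield db
        await db.commit()
    except Exception:
        await db.rollback()
        raise

async def demo():
    db = FakeSession()
    async with transaction(db):
        pass  # do some work with db here
    return db.committed

print(asyncio.run(demo()))  # True
```

The point of the convention is that route handlers never call `commit`/`rollback` directly; the context manager owns the transaction boundary.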
+ +## Architecture + +``` +runtime/ +├── datamate-python/ # FastAPI backend service (port 18000) +├── python-executor/ # Ray distributed executor +├── ops/ # Operator ecosystem +├── datax/ # DataX data read/write framework +└── deer-flow/ # DeerFlow service +``` + +## Components + +### 1. datamate-python (FastAPI Backend) +**Port**: 18000 + +Core Python service providing: +- **Data Synthesis**: QA generation, document processing +- **Data Annotation**: Label Studio integration, auto-annotation +- **Data Evaluation**: Model evaluation, quality checks +- **Data Cleaning**: Data cleaning pipelines +- **Operator Marketplace**: Operator management, upload +- **RAG Indexing**: Vector indexing, knowledge base management +- **Data Collection**: Scheduled tasks, data source integration + +**Technology Stack**: +- FastAPI 0.124+ +- SQLAlchemy 2.0+ (async) +- Pydantic 2.12+ +- PostgreSQL (via asyncpg) +- Milvus (via pymilvus) +- APScheduler (scheduled tasks) + +### 2. python-executor (Ray Executor) +Ray distributed execution framework responsible for: +- **Operator Execution**: Execute data processing operators +- **Task Scheduling**: Async task management +- **Distributed Computing**: Multi-node parallel processing + +**Technology Stack**: +- Ray 2.7.0 +- FastAPI (executor API) +- Data-Juicer (data processing) + +### 3. ops (Operator Ecosystem) +Operator ecosystem including: +- **filter**: Data filtering (deduplication, sensitive content, quality filtering) +- **mapper**: Data transformation (cleaning, normalization) +- **slicer**: Data slicing (text splitting, slide extraction) +- **formatter**: Format conversion (PDF → text, slide → JSON) +- **llms**: LLM operators (quality evaluation, condition checking) +- **annotation**: Annotation operators (object detection, segmentation) + +**See**: `runtime/ops/README.md` for operator development guide + +### 4. 
datax (DataX Framework) +DataX data read/write framework supporting multiple data sources: +- **Readers**: MySQL, PostgreSQL, Oracle, MongoDB, Elasticsearch, HDFS, S3, NFS, GlusterFS, API, etc. +- **Writers**: Same as above, supports writing to targets + +**Technology Stack**: Java (Maven build) + +### 5. deer-flow (DeerFlow Service) +DeerFlow service (see `conf.yaml` for configuration). + +## Quick Start + +### Prerequisites +- Python 3.12+ +- Poetry (for datamate-python) +- Ray 2.7.0+ (for python-executor) + +### Run datamate-python +```bash +cd runtime/datamate-python +poetry install +poetry run uvicorn app.main:app --reload --port 18000 +``` + +### Run python-executor +```bash +cd runtime/python-executor +poetry install +ray start --head +``` + +## Development + +### datamate-python Module Structure +``` +app/ +├── core/ # Logging, exception, config +├── db/ +│ ├── models/ # SQLAlchemy models +│ └── session.py # Async session +├── module/ +│ ├── annotation/ # Label Studio integration +│ ├── collection/ # Data collection +│ ├── cleaning/ # Data cleaning +│ ├── dataset/ # Dataset management +│ ├── evaluation/ # Model evaluation +│ ├── generation/ # QA synthesis +│ ├── operator/ # Operator marketplace +│ ├── rag/ # RAG indexing +│ └── shared/ # Shared schemas +└── main.py # FastAPI entry +``` + +### Code Conventions +- **Routes**: `APIRouter` in `interface/*.py` +- **Dependency Injection**: `Depends(get_db)` for session +- **Error Handling**: `raise BusinessError(ErrorCodes.XXX, context)` +- **Transactions**: `async with transaction(db):` +- **Models**: Extend `BaseEntity` (audit fields auto-filled) + +## Testing + +```bash +cd runtime/datamate-python +poetry run pytest +``` + +## Configuration + +### Environment Variables +- `DATABASE_URL`: PostgreSQL connection string +- `LABEL_STUDIO_BASE_URL`: Label Studio URL +- `RAY_ENABLED`: Enable Ray executor +- `RAY_ADDRESS`: Ray cluster address + +## Documentation + +- **API Docs**: http://localhost:18000/redoc +- 
**Operator Guide**: See `runtime/ops/README.md` for operator development + +## Related Links + +- [FastAPI Documentation](https://fastapi.tiangolo.com/) +- [Ray Documentation](https://docs.ray.io/) +- [SQLAlchemy Documentation](https://docs.sqlalchemy.org/) diff --git a/runtime/datax/README-zh.md b/runtime/datax/README-zh.md new file mode 100644 index 000000000..40d3c8e0a --- /dev/null +++ b/runtime/datax/README-zh.md @@ -0,0 +1,151 @@ +# DataX 框架 + +## 概述 + +DataX 是一个数据传输框架,支持多种数据源和数据目标之间的数据传输,用于数据收集和同步。 + +## 架构 + +``` +runtime/datax/ +├── core/ # DataX 核心组件 +├── transformer/ # 数据转换器 +├── readers/ # 数据读取器 +│ ├── mysqlreader/ +│ ├── postgresqlreader/ +│ ├── oracleReader/ +│ ├── mongodbreader/ +│ ├── hdfsreader/ +│ ├── s3rader/ +│ ├── nfsreader/ +│ ├── glusterfsreader/ +│ └── apireader/ +└── writers/ # 数据写入器 + ├── mysqlwriter/ + ├── postgresqlwriter/ + ├── oraclewriter/ + ├── mongodbwriter/ + ├── hdfswriter/ + ├── s3writer/ + ├── nfswriter/ + ├── glusterfswriter/ + └── txtfilewriter/ +``` + +## 支持的数据源 + +### 关系型数据库 +- MySQL +- PostgreSQL +- Oracle +- SQL Server +- DB2 +- KingbaseES +- GaussDB + +### NoSQL 数据库 +- MongoDB +- Elasticsearch +- Cassandra +- HBase +- Redis + +### 文件系统 +- HDFS +- S3 (AWS S3, MinIO, 阿里云 OSS) +- NFS +- GlusterFS +- 本地文件系统 + +### 其他 +- API 接口 +- Kafka +- Pulsar +- DataHub +- LogHub + +## 使用 + +### 基本配置 +```json +{ + "job": { + "content": [ + { + "reader": { + "name": "mysqlreader", + "parameter": { + "username": "root", + "password": "password", + "column": ["id", "name", "email"], + "connection": [ + { + "jdbcUrl": "jdbc:mysql://localhost:3306/database", + "table": ["users"] + } + ] + } + }, + "writer": { + "name": "txtfilewriter", + "parameter": { + "path": "/output/users.txt", + "fileName": "users", + "writeMode": "truncate" + } + } + } + ] + } +} +``` + +### 运行 DataX +```bash +# 构建 DataX +cd runtime/datax +mvn clean package + +# 运行 +python datax.py -j job.json +``` + +## 快速开始 + +### 前置条件 +- JDK 8+ +- Maven 3.8+ +- Python 3.6+ + +### 构建 
+```bash +cd runtime/datax +mvn clean package +``` + +### 运行示例 +```bash +python datax.py -j examples/mysql2text.json +``` + +## 开发 + +### 添加新的 Reader +1. 在 `readers/` 创建新模块 +2. 实现 Reader 接口 +3. 配置 reader 参数 +4. 添加到 package.xml + +### 添加新的 Writer +1. 在 `writers/` 创建新模块 +2. 实现 Writer 接口 +3. 配置 writer 参数 +4. 添加到 package.xml + +## 文档 + +- [DataX 官方文档](https://github.com/alibaba/DataX) + +## 相关链接 + +- [运行时 README](../README.md) diff --git a/runtime/datax/README.md b/runtime/datax/README.md new file mode 100644 index 000000000..af2366255 --- /dev/null +++ b/runtime/datax/README.md @@ -0,0 +1,151 @@ +# DataX Framework + +## Overview + +DataX is a data transfer framework that supports data transmission between various data sources and targets, used for data collection and synchronization. + +## Architecture + +``` +runtime/datax/ +├── core/ # DataX core components +├── transformer/ # Data transformers +├── readers/ # Data readers +│ ├── mysqlreader/ +│ ├── postgresqlreader/ +│ ├── oracleReader/ +│ ├── mongodbreader/ +│ ├── hdfsreader/ +│ ├── s3rader/ +│ ├── nfsreader/ +│ ├── glusterfsreader/ +│ └── apireader/ +└── writers/ # Data writers + ├── mysqlwriter/ + ├── postgresqlwriter/ + ├── oraclewriter/ + ├── mongodbwriter/ + ├── hdfswriter/ + ├── s3writer/ + ├── nfswriter/ + ├── glusterfswriter/ + └── txtfilewriter/ +``` + +## Supported Data Sources + +### Relational Databases +- MySQL +- PostgreSQL +- Oracle +- SQL Server +- DB2 +- KingbaseES +- GaussDB + +### NoSQL Databases +- MongoDB +- Elasticsearch +- Cassandra +- HBase +- Redis + +### File Systems +- HDFS +- S3 (AWS S3, MinIO, Alibaba Cloud OSS) +- NFS +- GlusterFS +- Local file system + +### Others +- API interfaces +- Kafka +- Pulsar +- DataHub +- LogHub + +## Usage + +### Basic Configuration +```json +{ + "job": { + "content": [ + { + "reader": { + "name": "mysqlreader", + "parameter": { + "username": "root", + "password": "password", + "column": ["id", "name", "email"], + "connection": [ + { + "jdbcUrl": 
"jdbc:mysql://localhost:3306/database", + "table": ["users"] + } + ] + } + }, + "writer": { + "name": "txtfilewriter", + "parameter": { + "path": "/output/users.txt", + "fileName": "users", + "writeMode": "truncate" + } + } + } + ] + } +} +``` + +### Run DataX +```bash +# Build DataX +cd runtime/datax +mvn clean package + +# Run +python datax.py -j job.json +``` + +## Quick Start + +### Prerequisites +- JDK 8+ +- Maven 3.8+ +- Python 3.6+ + +### Build +```bash +cd runtime/datax +mvn clean package +``` + +### Run Example +```bash +python datax.py -j examples/mysql2text.json +``` + +## Development + +### Adding a New Reader +1. Create new module in `readers/` +2. Implement Reader interface +3. Configure reader parameters +4. Add to package.xml + +### Adding a New Writer +1. Create new module in `writers/` +2. Implement Writer interface +3. Configure writer parameters +4. Add to package.xml + +## Documentation + +- [DataX Official Documentation](https://github.com/alibaba/DataX) + +## Related Links + +- [Runtime README](../README.md) diff --git a/runtime/deer-flow/README-zh.md b/runtime/deer-flow/README-zh.md new file mode 100644 index 000000000..209d436de --- /dev/null +++ b/runtime/deer-flow/README-zh.md @@ -0,0 +1,97 @@ +# DeerFlow 服务 + +## 概述 + +DeerFlow 是一个 LLM 驱动的服务,用于规划和推理任务,支持多种 LLM 提供商。 + +## 架构 + +``` +runtime/deer-flow/ +├── conf.yaml # DeerFlow 配置文件 +├── .env # 环境变量 +└── (其他源代码) +``` + +## 配置 + +### 基本配置 (conf.yaml) + +```yaml +# 基础模型配置 +BASIC_MODEL: + base_url: https://api.example.com/v1 + model: "model-name" + api_key: your_api_key + max_retries: 3 + verify_ssl: false # 如果使用自签名证书,设为 false + +# 推理模型配置(可选) +REASONING_MODEL: + base_url: https://api.example.com/v1 + model: "reasoning-model-name" + api_key: your_api_key + max_retries: 3 + +# 搜索引擎配置(可选) +SEARCH_ENGINE: + engine: tavily + include_domains: + - example.com + - trusted-news.com + exclude_domains: + - spam-site.com + search_depth: "advanced" + include_raw_content: true + include_images: true + 
include_image_descriptions: true
+  min_score_threshold: 0.0
+  max_content_length_per_page: 4000
+```
+
+## 支持的 LLM 提供商
+
+#### OpenAI
+```yaml
+BASIC_MODEL:
+  base_url: https://api.openai.com/v1
+  model: "gpt-4"
+  api_key: sk-...
+```
+
+#### Ollama (本地部署)
+```yaml
+BASIC_MODEL:
+  base_url: "http://localhost:11434/v1"
+  model: "qwen2:7b"
+  api_key: "ollama"
+  verify_ssl: false
+```
+
+#### Google AI Studio
+```yaml
+BASIC_MODEL:
+  platform: "google_aistudio"
+  model: "gemini-2.5-flash"
+  api_key: your_gemini_api_key
+```
+
+## 开发
+
+### 添加新的 LLM 提供商
+1. 在 `conf.yaml` 添加新的模型配置
+2. 实现对应的 API 调用逻辑
+3. 测试连接和推理
+
+### 自定义提示词模板
+1. 创建提示词模板文件
+2. 在 `conf.yaml` 引用模板
+3. 测试提示词效果
+
+## 文档
+
+- [DeerFlow 官方文档](https://github.com/ModelEngine-Group/DeerFlow)
+
+## 相关链接
+
+- [运行时 README](../README.md)
diff --git a/runtime/deer-flow/README.md b/runtime/deer-flow/README.md
new file mode 100644
index 000000000..ee7642ab2
--- /dev/null
+++ b/runtime/deer-flow/README.md
@@ -0,0 +1,97 @@
+# DeerFlow Service
+
+## Overview
+
+DeerFlow is an LLM-driven service for planning and reasoning tasks, supporting multiple LLM providers.
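The provider examples in this README all share the same handful of `BASIC_MODEL` keys. A sketch of loading one such block into a typed config, assuming `conf.yaml` has already been parsed into a dict (e.g. with `yaml.safe_load`) — the `ModelConfig` dataclass and its defaults are illustrative, not DeerFlow's actual internals:

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    base_url: str
    model: str
    api_key: str
    max_retries: int = 3       # assumed default, matching the sample conf
    verify_ssl: bool = True    # assumed default

    @classmethod
    def from_conf(cls, conf: dict, section: str = "BASIC_MODEL") -> "ModelConfig":
        block = conf.get(section)
        if not block:
            raise KeyError(f"{section} missing from conf.yaml")
        return cls(
            base_url=block["base_url"],
            model=block["model"],
            api_key=block["api_key"],
            max_retries=block.get("max_retries", 3),
            verify_ssl=block.get("verify_ssl", True),
        )

# As if conf.yaml (Ollama example) had been parsed into a dict:
conf = {
    "BASIC_MODEL": {
        "base_url": "http://localhost:11434/v1",
        "model": "qwen2:7b",
        "api_key": "ollama",
        "verify_ssl": False,
    }
}
cfg = ModelConfig.from_conf(conf)
print(cfg.model, cfg.max_retries)  # qwen2:7b 3
```

Because every provider block has the same shape, switching providers is a config change only — the consuming code reads one `ModelConfig` regardless of backend.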
+ +## Architecture + +``` +runtime/deer-flow/ +├── conf.yaml # DeerFlow configuration file +├── .env # Environment variables +└── (other source code) +``` + +## Configuration + +### Basic Configuration (conf.yaml) + +```yaml +# Basic model configuration +BASIC_MODEL: + base_url: https://api.example.com/v1 + model: "model-name" + api_key: your_api_key + max_retries: 3 + verify_ssl: false # Set to false if using self-signed certificates + +# Reasoning model configuration (optional) +REASONING_MODEL: + base_url: https://api.example.com/v1 + model: "reasoning-model-name" + api_key: your_api_key + max_retries: 3 + +# Search engine configuration (optional) +SEARCH_ENGINE: + engine: tavily + include_domains: + - example.com + - trusted-news.com + exclude_domains: + - spam-site.com + search_depth: "advanced" + include_raw_content: true + include_images: true + include_image_descriptions: true + min_score_threshold: 0.0 + max_content_length_per_page: 4000 +``` + +## Supported LLM Providers + +#### OpenAI +```yaml +BASIC_MODEL: + base_url: https://api.openai.com/v1 + model: "gpt-4" + api_key: sk-... +``` + +#### Ollama (Local Deployment) +```yaml +BASIC_MODEL: + base_url: "http://localhost:11434/v1" + model: "qwen2:7b" + api_key: "ollama" + verify_ssl: false +``` + +#### Google AI Studio +```yaml +BASIC_MODEL: + platform: "google_aistudio" + model: "gemini-2.5-flash" + api_key: your_gemini_api_key +``` + +## Development + +### Adding a New LLM Provider +1. Add new model configuration in `conf.yaml` +2. Implement corresponding API call logic +3. Test connection and inference + +### Customizing Prompt Templates +1. Create a prompt template file +2. Reference the template in `conf.yaml` +3. 
Test prompt effectiveness + +## Documentation + +- [DeerFlow Official Documentation](https://github.com/ModelEngine-Group/DeerFlow) + +## Related Links + +- [Runtime README](../README.md) diff --git a/runtime/python-executor/README-zh.md b/runtime/python-executor/README-zh.md new file mode 100644 index 000000000..b833aece6 --- /dev/null +++ b/runtime/python-executor/README-zh.md @@ -0,0 +1,221 @@ +# Ray 执行器 + +## 概述 + +Ray 执行器是基于 Ray 的分布式执行框架,负责执行数据处理算子、任务调度和分布式计算。 + +## 架构 + +``` +runtime/python-executor/ +└── datamate/ + ├── core/ + │ ├── base_op.py # BaseOp, Mapper, Filter, Slicer, LLM + │ ├── dataset.py # Dataset 处理 + │ └── constant.py # 常量定义 + ├── scheduler/ + │ ├── scheduler.py # TaskScheduler, Task, TaskStatus + │ ├── func_task_scheduler.py # 函数任务调度 + │ └── cmd_task_scheduler.py # 命令任务调度 + ├── wrappers/ + │ ├── executor.py # Ray 执行器入口 + │ ├── datamate_wrapper.py # DataMate 任务包装 + │ └── data_juicer_wrapper.py # DataJuicer 集成 + └── common/utils/ # 工具函数 + ├── bytes_transform.py + ├── file_scanner.py + ├── lazy_loader.py + └── text_splitter.py +``` + +## 组件 + +### 1. Base 类 + +#### BaseOp +所有算子的基类: + +```python +class BaseOp: + def __init__(self, *args, **kwargs): + self.accelerator = kwargs.get('accelerator', "cpu") + self.text_key = kwargs.get('text_key', "text") + # ... 其他配置 + + def execute(self, sample): + raise NotImplementedError +``` + +#### Mapper +数据转换算子基类(1:1): + +```python +class Mapper(BaseOp): + def execute(self, sample: Dict) -> Dict: + # 转换逻辑 + return processed_sample +``` + +#### Filter +数据过滤算子基类(返回 bool): + +```python +class Filter(BaseOp): + def execute(self, sample: Dict) -> bool: + # 过滤逻辑 + return True # 保留或过滤 +``` + +#### Slicer +数据切片算子基类(1:N): + +```python +class Slicer(BaseOp): + def execute(self, sample: Dict) -> List[Dict]: + # 切片逻辑 + return [sample1, sample2, ...] 
+```
+
+#### LLM
+LLM 算子基类:
+
+```python
+class LLM(Mapper):
+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+        self.llm = self.get_llm(*args, **kwargs)
+
+    def build_llm_prompt(self, *args, **kwargs):
+        raise NotImplementedError
+```
+
+### 2. Task Scheduler
+
+异步任务调度器:
+
+```python
+class TaskScheduler:
+    def __init__(self, max_concurrent: int = 10):
+        self.tasks: Dict[str, Task] = {}
+        self.semaphore = asyncio.Semaphore(max_concurrent)
+
+    async def submit(self, task_id, task, *args, **kwargs):
+        # 提交任务
+        pass
+
+    def get_task_status(self, task_id: str) -> Optional[TaskResult]:
+        # 获取任务状态
+        pass
+
+    def cancel_task(self, task_id: str) -> bool:
+        # 取消任务
+        pass
+```
+
+### 3. 算子执行
+
+#### 算子注册
+```python
+from datamate.core.base_op import OPERATORS
+
+OPERATORS.register_module(
+    module_name='YourOperatorName',
+    module_path="ops.user.operator_package.process"
+)
+```
+
+#### 执行算子
+```python
+from datamate.core.base_op import Mapper
+
+class MyMapper(Mapper):
+    def execute(self, sample):
+        text = sample.get('text', '')
+        processed = text.upper()
+        sample['text'] = processed
+        return sample
+```
+
+## 快速开始
+
+### 前置条件
+- Python 3.11+
+- Ray 2.7.0+
+- Poetry
+
+### 安装
+```bash
+cd runtime/python-executor
+poetry install
+```
+
+### 启动 Ray Head
+```bash
+ray start --head
+```
+
+### 启动 Ray Worker
+```bash
+ray start --address=<head-node-ip>:6379
+```
+
+## 使用
+
+### 提交任务到 Ray
+```python
+import ray
+
+@ray.remote
+def execute_operator(sample, operator_config):
+    # 执行算子逻辑
+    return result
+
+# 提交任务
+result_ref = execute_operator.remote(sample, config)
+result = ray.get(result_ref)
+```
+
+### 使用 Task Scheduler
+```python
+from datamate.scheduler.scheduler import TaskScheduler
+
+scheduler = TaskScheduler(max_concurrent=10)
+task_id = "task-001"
+# submit 是协程,需在 async 上下文中 await
+await scheduler.submit(task_id, my_function, arg1, arg2)
+status = scheduler.get_task_status(task_id)
+```
+
+## 开发
+
+### 添加新算子
+1. 在 `runtime/ops/` 创建算子目录
+2. 实现 `process.py` 和 `__init__.py`
+3.
在 `__init__.py` 注册算子 +4. 测试算子 + +### 调试算子 +```bash +# 本地测试 +python -c "from ops.user.operator_package.process import YourOperatorName; op = YourOperatorName(); print(op.execute({'text': 'test'}))" +``` + +## 性能 + +### 并行执行 +Ray 自动处理并行执行和资源分配。 + +### 容错 +Ray 提供自动任务重试和故障转移。 + +### 资源管理 +Ray 动态分配 CPU、GPU、内存资源。 + +## 文档 + +- [Ray 文档](https://docs.ray.io/) +- [AGENTS.md](./AGENTS.md) + +## 相关链接 + +- [运行时 README](../README.md) +- [算子生态](../ops/README.md) diff --git a/runtime/python-executor/README.md b/runtime/python-executor/README.md new file mode 100644 index 000000000..9cee6c708 --- /dev/null +++ b/runtime/python-executor/README.md @@ -0,0 +1,221 @@ +# Ray Executor + +## Overview + +Ray Executor is a Ray-based distributed execution framework responsible for executing data processing operators, task scheduling, and distributed computing. + +## Architecture + +``` +runtime/python-executor/ +└── datamate/ + ├── core/ + │ ├── base_op.py # BaseOp, Mapper, Filter, Slicer, LLM + │ ├── dataset.py # Dataset processing + │ └── constant.py # Constant definitions + ├── scheduler/ + │ ├── scheduler.py # TaskScheduler, Task, TaskStatus + │ ├── func_task_scheduler.py # Function task scheduling + │ └── cmd_task_scheduler.py # Command task scheduling + ├── wrappers/ + │ ├── executor.py # Ray executor entry point + │ ├── datamate_wrapper.py # DataMate task wrapper + │ └── data_juicer_wrapper.py # DataJuicer integration + └── common/utils/ # Utility functions + ├── bytes_transform.py + ├── file_scanner.py + ├── lazy_loader.py + └── text_splitter.py +``` + +## Components + +### 1. Base Classes + +#### BaseOp +Base class for all operators: + +```python +class BaseOp: + def __init__(self, *args, **kwargs): + self.accelerator = kwargs.get('accelerator', "cpu") + self.text_key = kwargs.get('text_key', "text") + # ... 
other configuration + + def execute(self, sample): + raise NotImplementedError +``` + +#### Mapper +Base class for data transformation operators (1:1): + +```python +class Mapper(BaseOp): + def execute(self, sample: Dict) -> Dict: + # Transformation logic + return processed_sample +``` + +#### Filter +Base class for data filtering operators (returns bool): + +```python +class Filter(BaseOp): + def execute(self, sample: Dict) -> bool: + # Filtering logic + return True # Keep or filter out +``` + +#### Slicer +Base class for data slicing operators (1:N): + +```python +class Slicer(BaseOp): + def execute(self, sample: Dict) -> List[Dict]: + # Slicing logic + return [sample1, sample2, ...] +``` + +#### LLM +Base class for LLM operators: + +```python +class LLM(Mapper): + def __init__(self, *args, **kwargs): + super().__init__(*args, **kwargs) + self.llm = self.get_llm(*args, **kwargs) + + def build_llm_prompt(self, *args, **kwargs): + raise NotImplementedError +``` + +### 2. Task Scheduler + +Async task scheduler: + +```python +class TaskScheduler: + def __init__(self, max_concurrent: int = 10): + self.tasks: Dict[str, Task] = {} + self.semaphore = asyncio.Semaphore(max_concurrent) + + async def submit(self, task_id, task, *args, **kwargs): + # Submit task + pass + + def get_task_status(self, task_id: str) -> Optional[TaskResult]: + # Get task status + pass + + def cancel_task(self, task_id: str) -> bool: + # Cancel task + pass +``` + +### 3. 
Operator Execution
+
+#### Operator Registration
+```python
+from datamate.core.base_op import OPERATORS
+
+OPERATORS.register_module(
+    module_name='YourOperatorName',
+    module_path="ops.user.operator_package.process"
+)
+```
+
+#### Execute Operator
+```python
+from datamate.core.base_op import Mapper
+
+class MyMapper(Mapper):
+    def execute(self, sample):
+        text = sample.get('text', '')
+        processed = text.upper()
+        sample['text'] = processed
+        return sample
+```
+
+## Quick Start
+
+### Prerequisites
+- Python 3.11+
+- Ray 2.7.0+
+- Poetry
+
+### Installation
+```bash
+cd runtime/python-executor
+poetry install
+```
+
+### Start Ray Head
+```bash
+ray start --head
+```
+
+### Start Ray Worker
+```bash
+ray start --address=<head-node-ip>:6379
+```
+
+## Usage
+
+### Submit Task to Ray
+```python
+import ray
+
+@ray.remote
+def execute_operator(sample, operator_config):
+    # Execute operator logic
+    return result
+
+# Submit task
+result_ref = execute_operator.remote(sample, config)
+result = ray.get(result_ref)
+```
+
+### Use Task Scheduler
+```python
+from datamate.scheduler.scheduler import TaskScheduler
+
+scheduler = TaskScheduler(max_concurrent=10)
+task_id = "task-001"
+# submit is a coroutine; await it from an async context
+await scheduler.submit(task_id, my_function, arg1, arg2)
+status = scheduler.get_task_status(task_id)
+```
+
+## Development
+
+### Adding a New Operator
+1. Create operator directory in `runtime/ops/`
+2. Implement `process.py` and `__init__.py`
+3. Register operator in `__init__.py`
+4. Test the operator
+
+### Debugging Operators
+```bash
+# Local test
+python -c "from ops.user.operator_package.process import YourOperatorName; op = YourOperatorName(); print(op.execute({'text': 'test'}))"
+```
+
+## Performance
+
+### Parallel Execution
+Ray automatically handles parallel execution and resource allocation.
+
+### Fault Tolerance
+Ray provides automatic task retry and failover.
+
+### Resource Management
+Ray dynamically allocates CPU, GPU, and memory resources.
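The Mapper/Filter contracts described in this README make operators easy to unit-test without a Ray cluster. A self-contained sketch of chaining them over a list of samples — the tiny base classes below only mirror the shapes shown above; they are stand-ins, not the real `datamate.core.base_op` imports:

```python
from typing import Dict, List

class Mapper:
    """1:1 transform, mirroring the Mapper shape above."""
    def execute(self, sample: Dict) -> Dict:
        raise NotImplementedError

class Filter:
    """Keep/drop decision, mirroring the Filter shape above."""
    def execute(self, sample: Dict) -> bool:
        raise NotImplementedError

class UpperMapper(Mapper):
    def execute(self, sample):
        sample = dict(sample)  # avoid mutating the caller's dict
        sample["text"] = sample.get("text", "").upper()
        return sample

class NonEmptyFilter(Filter):
    def execute(self, sample):
        return bool(sample.get("text", "").strip())

def run_pipeline(samples: List[Dict], ops) -> List[Dict]:
    # Apply operators in order; a Filter returning False drops the sample.
    out = []
    for sample in samples:
        keep = True
        for op in ops:
            if isinstance(op, Filter):
                if not op.execute(sample):
                    keep = False
                    break
            else:
                sample = op.execute(sample)
        if keep:
            out.append(sample)
    return out

samples = [{"text": "hello"}, {"text": "   "}]
result = run_pipeline(samples, [NonEmptyFilter(), UpperMapper()])
print(result)  # [{'text': 'HELLO'}]
```

In production Ray parallelizes this loop across workers, but the per-sample semantics are the same, so operators verified this way behave identically on the cluster.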
+ +## Documentation + +- [Ray Documentation](https://docs.ray.io/) +- [AGENTS.md](./AGENTS.md) + +## Related Links + +- [Runtime README](../README.md) +- [Operator Ecosystem](../ops/README.md)