Commit 85ff51c

[update] update readme.
- overview update
- add Reasoning pipeline demo
1 parent f614e36 commit 85ff51c

2 files changed

Lines changed: 14 additions & 5 deletions

README.md

Lines changed: 14 additions & 5 deletions
@@ -13,15 +13,21 @@
 [简体中文](./README.zh-CN.md) | English
 
 
-**[Features](#Features)[Quick Start](#Quick_Start)[Documentation]()[Contributing](#贡献)[License](#许可证)**
+**[Features](#Features)[Quick Start](#Quick_Start)[Documentation](https://open-dataflow.github.io/DataFlow-Doc/)[Contributing](#贡献)[License](#许可证)**
 
 
 </div>
 
 ## Overview
-DataFlow-Eval-Process is a data evaluation and processing system designed to evaluate data quality from multiple dimensions and filter out high-quality data. We mainly support SOTA algorithms within academic papers with strong theoretical support.
+DataFlow is a data evaluation and processing system designed to (1) evaluate data quality from multiple dimensions, (2) filter for high-quality data, and (3) generate chain-of-thought or other types of augmentation. We mainly support state-of-the-art (SOTA) algorithms from academic papers with strong theoretical support.
+
+<!-- We now support text, image, video, and multimodality data types. -->
+Specifically, we first build various `operators` based on rules, LLMs, and LLM APIs, which are then assembled into six `pipelines`. These pipelines form the complete `Dataflow` system. We also build an `agent` that can flexibly compose new pipelines from existing `operators` on demand.
+
+The current pipelines in DataFlow are as follows:
+- **Reasoning Pipeline**: Enhances existing question–answer pairs with (1) extended chain-of-thought, (2) category classification, and (3) difficulty estimation.
+- **Text2SQL Pipeline**: Translates natural language questions into SQL queries, supplemented with explanations, chain-of-thought reasoning, and contextual schema information.
 
-We now support text, image, video, and multimodality data types.
 
 ## News
 - [2025-07-25] 🎉 We release the dataflow-agent.
@@ -35,13 +41,16 @@ We now support text, image, video, and multimodality data types.
 ## Installation
 For environment setup, please use the following commands👇
 
-```
+```shell
 conda create -n dataflow python=3.10
 conda activate dataflow
 pip install -e .
 ```
 
 ## Features
 ### 1. Reasoning Pipeline
+![](./static/images/demo_reasoning.png)
+
+For demo inputs and outputs, you can reference our [Reasoning Pipeline sample](https://huggingface.co/datasets/Open-Dataflow/dataflow-demo-Reasonning/) on Hugging Face.
 
-### 2. NL2SQL
+### 2. Text2SQL
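
The updated Overview describes `operators` that are assembled into `pipelines`, with the Reasoning Pipeline adding chain-of-thought, a category, and a difficulty estimate to each question–answer pair. A minimal sketch of that composition pattern might look like the following. It is not DataFlow's actual API: the `Pipeline` class, the three operator functions, and the record field names are all hypothetical, and the simple rules stand in for what would presumably be LLM-backed operators.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A record is one question-answer pair plus whatever fields operators add.
Record = Dict[str, object]
# Hypothetical operator signature: take a record, return an enriched copy.
Operator = Callable[[Record], Record]


def add_chain_of_thought(record: Record) -> Record:
    """Hypothetical stand-in for an LLM-backed operator: attach a reasoning trace."""
    out = dict(record)
    out["chain_of_thought"] = f"Restate the question, then derive: {record['answer']}"
    return out


def classify_category(record: Record) -> Record:
    """Hypothetical rule-based operator: tag the question with a coarse category."""
    out = dict(record)
    question = str(record["question"])
    out["category"] = "arithmetic" if any(ch.isdigit() for ch in question) else "general"
    return out


def estimate_difficulty(record: Record) -> Record:
    """Hypothetical rule-based operator: use question length as a difficulty proxy."""
    out = dict(record)
    out["difficulty"] = "hard" if len(str(record["question"]).split()) > 12 else "easy"
    return out


@dataclass
class Pipeline:
    """Hypothetical pipeline: applies its operators to each record in order."""
    operators: List[Operator] = field(default_factory=list)

    def run(self, records: List[Record]) -> List[Record]:
        results = []
        for record in records:
            for op in self.operators:
                record = op(record)
            results.append(record)
        return results


reasoning_pipeline = Pipeline([add_chain_of_thought, classify_category, estimate_difficulty])
enriched = reasoning_pipeline.run([{"question": "What is 2 + 3?", "answer": "5"}])
print(enriched[0])
```

Because each operator returns a new record rather than mutating its input, the same operators can be reused in a different order or in another pipeline, which is roughly the kind of recomposition the Overview attributes to the `agent`.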

static/images/demo_reasoning.png

33.3 KB