A curated list of awesome projects, research work, and applications built with or on top of DataFlow, an LLM-driven framework for data preparation, synthesis, evaluation, and workflow automation in the era of Data-Centric AI.
Full repositories that directly use DataFlow as a core dependency.
- Project Name
Short description of the project.
e.g., Uses DataFlow pipelines to curate high-quality SFT and RLHF data.
- Fork this repository
- Add your project to the appropriate section
- Keep descriptions concise (1–2 lines)
- Submit a Pull Request
- The project must use DataFlow (the core framework or an ecosystem component)
- Open-source repositories are preferred
- Avoid promotional or closed-source-only entries
- Keep descriptions objective and factual
- DataFlow: https://github.com/OpenDCAI/DataFlow
- DataFlow Documentation: https://opendcai.github.io/DataFlow-Doc/
- DataFlow Ecosystem Guide: https://opendcai.github.io/DataFlow-Doc/en/guide/df_ecosystem/
This list is released under the CC0 1.0 license (public domain dedication).