An interactive, agentic data analysis application that leverages advanced LLM reasoning to help users explore, visualize, and understand their data using NVIDIA Llama-3.1-Nemotron-Ultra-253B-v1 and NVIDIA Llama-3.3-Nemotron-Super-49B-v1.5.
This repository contains a Streamlit application that demonstrates a complete workflow for data analysis:
- Data Upload: Upload CSV files for analysis
- Natural Language Queries: Ask questions about your data in plain English
- Automated Visualization: Generate relevant plots and charts
- Transparent Reasoning: Get detailed explanations of the analysis process
The implementation leverages the powerful Llama-3.1-Nemotron-Ultra-253B-v1 and Llama-3.3-Nemotron-Super-49B-v1.5 models through NVIDIA's API, enabling sophisticated data analysis and reasoning.
Learn more about the models here.
- Agentic Architecture: Modular agents for data insight, code generation, execution, and reasoning
- Natural Language Queries: Ask questions about your data—no coding required
- Automated Visualization: Instantly generate and display relevant plots
- Transparent Reasoning: Get clear, LLM-generated explanations for every result
- Powered by NVIDIA Llama-3.1-Nemotron-Ultra-253B-v1 and NVIDIA Llama-3.3-Nemotron-Super-49B-v1.5: State-of-the-art reasoning and interpretability
- Python 3.10+
- Streamlit
- NVIDIA API Key (see Installation section for setup instructions)
- Required Python packages:
- pandas
- matplotlib
- streamlit
- requests
-
Clone this repository:
git clone https://github.com/NVIDIA/GenerativeAIExamples.git cd GenerativeAIExamples/community/data-analysis-agent -
Install dependencies:
pip install -r requirements.txt
-
Set up your NVIDIA API key:
- Sign up or log in at NVIDIA Build
- Generate an API key
- Set the API key in your environment:
export NVIDIA_API_KEY=your_nvidia_api_key_here - Or add it to your
.envfile if you use one
-
Run the Streamlit app:
streamlit run data_analysis_agent.py
-
Download example dataset (optional):
wget https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv
-
Use the application:
- Select a model from the dropdown menu
- Upload a CSV file (e.g., the Titanic dataset)
- Ask questions in natural language
- View results, visualizations, and detailed reasoning
The Llama-3.1-Nemotron-Ultra-253B-v1 model used in this project has the following specifications:
- Parameters: 253B
- Features: Advanced reasoning capabilities
- Use Cases: Complex data analysis, multi-agent systems
- Enterprise Ready: Optimized for production deployment
The Llama-3.3-Nemotron-Super-49B-v1.5 model used in this project has the following specifications:
- Parameters: 49B
- Features: Efficient reasoning and and chat model
- Use Cases: AI Agent systems, chatbots, RAG systems, and other AI-powered applications. Also suitable for typical instruction-following tasks
- Enterprise Ready: Optimized for production deployment
- NVIDIA Llama-3.1-Nemotron-Ultra-253B-v1
- NVIDIA Llama-3.3-Nemotron-Super-49B-v1.5
- Streamlit
- Pandas
- Matplotlib
Contributions are welcome! Please open an issue or submit a pull request.

