A full-stack analytics assistant for e-commerce support teams, built on Next.js. The app accepts natural-language questions and dynamically routes each request to either:
- SQL mode for structured analytics and aggregations from Databricks tables.
- RAG mode for semantic, context-driven answers from Databricks vector search.
The system also captures user feedback, tracks request-level telemetry in MLflow, and preserves conversational context in Redis to make follow-up questions more useful.
## Contents

- Project Goals
- Feature Set
- System Architecture
- Repository Structure
- Tech Stack
- End-To-End Request Lifecycle
- API Surface
- Environment Variables
- Setup And Run
- Monitoring And Observability
- Data Sources And Query Routing
- Documentation Index
- Known Gaps And Future Improvements
## Project Goals

- Give business users a single conversational interface for order analytics.
- Improve response reliability by classifying each question as SQL-first or retrieval-first.
- Keep the UX responsive with streaming responses and guided follow-up prompts.
- Collect explicit quality feedback and run telemetry for continuous improvement.
## Feature Set

- Hybrid SQL and RAG routing with confidence-based decisioning.
- Databricks SQL execution for structured analytics.
- Databricks vector retrieval across order and analytics indexes.
- Session-aware context carryover through Redis.
- Automatic chat title generation from first user message.
- Streaming assistant output for retrieval responses.
- Per-message like/dislike feedback with optional comments.
- MLflow request instrumentation for timing, route, and quality metrics.
- Local chat history with search, pin, rename, and delete in sidebar.
## System Architecture

High-level flow:
- User sends a question from the chat interface.
- API classifies query intent and confidence.
- Router decides SQL or RAG, with confidence thresholds and fallback logic.
- SQL path executes Databricks query and summarizes results with LLM.
- RAG path retrieves vectors and synthesizes answer with LLM.
- API returns structured JSON (SQL) or streamed envelope (RAG).
- UI renders answer, table (if available), and suggested follow-ups.
- Metrics and metadata are logged to MLflow; context persisted in Redis.
Detailed architecture and sequence diagrams are in docs/02-architecture-and-flow.md.
## Repository Structure

Core directories and what they contain:

- `app`: App Router pages, global layout/styles, and API routes.
- `components`: Chat UI primitives for conversation, message rendering, and sidebar history.
- `lib`: Integration layer for Databricks, OpenAI, Redis, and MLflow.
- `prompts`: Routing/classifier prompt templates.
- `public`: Static assets.
- `docs`: Full technical documentation set (added in this project).
Current API route files:

- `app/api/chat/route.ts`
- `app/api/feedback/route.ts`
- `app/api/generate-title/route.ts`

Ingest route folders currently exist as placeholders and do not yet include handlers:

- `app/api/ingest/databricks-sync`
- `app/api/ingest/eventbridge`
## Tech Stack

Frontend:
- Next.js 16 App Router
- React 19
- Tailwind CSS 4
Backend and integrations:
- Databricks SQL Statement API
- Databricks Vector Search API
- OpenAI Chat Completions and Embeddings APIs
- Upstash Redis
- Databricks MLflow API
Utilities:
- Axios for HTTP clients
- TypeScript strict mode
- ESLint 9 flat config
## End-To-End Request Lifecycle

For each incoming question:

- Session context and recent messages are loaded from Redis.
- The classifier prompt from `prompts/sql_rag_classifier.txt` is filled with the history and the question.
- Classifier output is parsed into a route, a confidence score, reasoning, and optional SQL.
- Route decision rules are applied:
  - High confidence favors the classifier's route.
  - Mid-confidence SQL can execute with a fallback.
  - Low confidence defaults to retrieval.
- If the SQL path is chosen:
  - The SQL is cleaned and executed in Databricks.
  - Results are summarized by the LLM.
  - Top rows are returned as a table payload.
- If the RAG path is chosen:
  - A query embedding is created or read from cache.
  - Vector search runs against two indexes.
  - Retrieved context is passed to the LLM for answer generation.
  - A streaming envelope is returned with follow-ups and metadata.
- Context extracted from SQL rows is saved for follow-up questions.
- Timings and route metrics are logged to MLflow.
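The decision rules above can be sketched as a small pure function. The `ClassifierResult` shape and the `0.8` / `0.5` thresholds here are illustrative assumptions, not the exact values used in `app/api/chat/route.ts`:

```typescript
// Illustrative sketch of the confidence-based router described above.
type Route = "sql" | "rag";

interface ClassifierResult {
  route: Route;
  confidence: number; // 0..1
  reasoning: string;
  sql?: string; // present when the classifier proposes a query
}

interface Decision {
  mode: Route;
  allowFallback: boolean; // mid-confidence SQL may fall back to retrieval
}

function decideRoute(c: ClassifierResult): Decision {
  if (c.confidence >= 0.8) {
    // High confidence: trust the classifier's route directly.
    return { mode: c.route, allowFallback: false };
  }
  if (c.confidence >= 0.5 && c.route === "sql" && c.sql) {
    // Mid confidence: try SQL, but keep retrieval as a fallback.
    return { mode: "sql", allowFallback: true };
  }
  // Low confidence: default to retrieval.
  return { mode: "rag", allowFallback: false };
}
```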
## API Surface

Quick reference:

- `POST /api/chat`
  - Input: `question`, `history`, `sessionId`
  - Output: JSON in SQL mode, a text-stream envelope in RAG mode
- `POST /api/feedback`
  - Input: `messageId`, `sessionId`, `rating`, optional comment and message texts
  - Output: success or error JSON
- `POST /api/generate-title`
  - Input: `message`
  - Output: a short plain-text title
Full request and response schemas are in docs/03-api-reference.md.
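A client can tell the two response shapes apart by content type. The sketch below is a hedged illustration: the request field names match the quick reference above, but the streamed-envelope handling is an assumption about the RAG path and may differ from the real payload format:

```typescript
// Sketch of a client call to POST /api/chat.
interface ChatRequest {
  question: string;
  history: { role: "user" | "assistant"; content: string }[];
  sessionId: string;
}

function buildChatRequest(
  question: string,
  sessionId: string,
  history: ChatRequest["history"] = []
): ChatRequest {
  return { question, history, sessionId };
}

async function askChat(req: ChatRequest): Promise<string> {
  const res = await fetch("/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req),
  });
  const contentType = res.headers.get("content-type") ?? "";
  if (contentType.includes("application/json")) {
    // SQL mode: a single structured JSON payload.
    return JSON.stringify(await res.json());
  }
  // RAG mode: read the text stream incrementally.
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let answer = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    answer += decoder.decode(value, { stream: true });
  }
  return answer;
}
```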
## Environment Variables

Create `.env.local` with the following values:
- OPENAI_API_KEY
- DATABRICKS_HOST
- DATABRICKS_TOKEN
- DATABRICKS_WAREHOUSE_ID
- DATABRICKS_SQL_URL
- DATABRICKS_MLFLOW_EXPERIMENT_ID
- UPSTASH_REDIS_REST_URL
- UPSTASH_REDIS_REST_TOKEN
Detailed variable meanings, defaults, and failure behavior are in docs/05-operations-and-deployment.md.
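An illustrative `.env.local` is shown below. Every value is a placeholder, not a real default or required format; consult the operations doc for the authoritative meanings:

```bash
# Placeholder values — substitute your own workspace details.
OPENAI_API_KEY=sk-...
DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
DATABRICKS_TOKEN=dapi...
DATABRICKS_WAREHOUSE_ID=your-warehouse-id
DATABRICKS_SQL_URL=https://your-workspace.cloud.databricks.com/api/2.0/sql/statements
DATABRICKS_MLFLOW_EXPERIMENT_ID=your-experiment-id
UPSTASH_REDIS_REST_URL=https://your-db.upstash.io
UPSTASH_REDIS_REST_TOKEN=your-rest-token
```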
## Setup And Run

1. Install dependencies: `npm install`
2. Add environment variables in `.env.local`.
3. Start the development server: `npm run dev`
4. Open http://localhost:3000.

Production commands:

- `npm run build`
- `npm run start`

Linting:

- `npm run lint`
## Monitoring And Observability

The app records request telemetry in MLflow when configured:
- Route metadata: SQL vs RAG, confidence, forced route flags.
- Timings: classifier, SQL execution, retrieval, summarization, total latency.
- Outcomes: fallback events, row count, failure flags, answer length.
- Feedback signals: like/dislike and optional comment presence.
MLflow failures do not crash the request path; logging degrades gracefully.
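The fail-open behavior can be sketched as a wrapper that swallows logging errors. `MetricLogger` and `makeSafeLogger` are hypothetical names standing in for the real MLflow client in `lib/`:

```typescript
// Sketch of fail-open telemetry: a logging error must never break the
// request path. The names here are illustrative, not the real lib/ API.
type MetricLogger = (metrics: Record<string, number | string>) => Promise<void>;

function makeSafeLogger(log: MetricLogger): MetricLogger {
  return async (metrics) => {
    try {
      await log(metrics);
    } catch (err) {
      // Degrade gracefully: note the failure locally and keep serving.
      console.warn("MLflow logging failed; continuing request", err);
    }
  };
}
```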
## Data Sources And Query Routing

Primary structured table:

- `customer_suppport_agent.raw.orders`

Vector indexes:

- `customer_suppport_agent.raw.orders_index`
- `customer_suppport_agent.raw.analytics_index`

Feedback table:

- `customer_suppport_agent.raw.message_feedback`
## Documentation Index

Additional details about schema usage and classifier behavior are in:
- docs/01-project-overview.md
- docs/02-architecture-and-flow.md
- docs/03-api-reference.md
- docs/04-frontend-and-state.md
- docs/05-operations-and-deployment.md
## Known Gaps And Future Improvements

- Ingest API folders are present but currently unimplemented.
- The classifier prompt file contains legacy sections and duplicated instruction blocks that should be consolidated.
- Chat history persists in browser local storage only; there is no server-side archival.
- SQL result pagination is not exposed in the UI beyond the first 10 displayed rows.
This repository currently has no explicit open-source license file. Treat it as internal unless a license is added.