Workflow:
- Code Discovery 📁
  - walkDirectory() recursively scans your CODE_ROOT directory
  - Filters for .ts, .tsx, .js, .jsx files
  - Skips directories like node_modules, dist, .git
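A minimal sketch of what walkDirectory() might look like, assuming the filter and skip lists above (the constant names `CODE_EXTENSIONS` and `SKIP_DIRS` are illustrative, not from the source):

```typescript
import * as fs from "fs";
import * as path from "path";

// Assumed constants -- the actual names in the project may differ.
const CODE_EXTENSIONS = new Set([".ts", ".tsx", ".js", ".jsx"]);
const SKIP_DIRS = new Set(["node_modules", "dist", ".git"]);

// Recursively collect paths of code files under `root`,
// skipping dependency and build-output directories.
function walkDirectory(root: string): string[] {
  const files: string[] = [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    const full = path.join(root, entry.name);
    if (entry.isDirectory()) {
      if (!SKIP_DIRS.has(entry.name)) files.push(...walkDirectory(full));
    } else if (CODE_EXTENSIONS.has(path.extname(entry.name))) {
      files.push(full);
    }
  }
  return files;
}
```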
- Code Chunking ✂️
  - Each file is split into chunks of max 3000 characters
  - chunkContent() splits by lines to keep code contextually intact
  - This prevents hitting OpenAI's token limits
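A sketch of how chunkContent() could split on line boundaries while respecting the character cap (the real signature may differ):

```typescript
// Split file content into chunks of at most `maxChars` characters,
// breaking only on line boundaries so each chunk stays readable code.
// (A single line longer than `maxChars` becomes its own chunk.)
function chunkContent(content: string, maxChars = 3000): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const line of content.split("\n")) {
    // +1 accounts for the newline that would rejoin this line.
    if (current.length + line.length + 1 > maxChars && current.length > 0) {
      chunks.push(current);
      current = "";
    }
    current += (current ? "\n" : "") + line;
  }
  if (current) chunks.push(current);
  return chunks;
}
```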
- Embedding Generation 🧬
  - For each chunk, calls OpenAI's text-embedding-3-small model
  - Converts code text → vector of numbers (embedding)
  - These embeddings capture semantic meaning of the code
  - Rate limited with 100ms delays between API calls
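The loop above could look roughly like this. The actual code presumably calls OpenAI's embeddings endpoint directly; here the API call is passed in as a function so the sequencing and 100 ms pacing are visible (and testable) in isolation — the names are assumptions:

```typescript
const sleep = (ms: number) => new Promise<void>(r => setTimeout(r, ms));

// Embed each chunk sequentially, pausing `delayMs` between calls to
// respect rate limits. `embed` stands in for the OpenAI call, e.g.
// (text) => client.embeddings.create({ model: "text-embedding-3-small", input: text })
async function embedChunks(
  chunks: string[],
  embed: (text: string) => Promise<number[]>,
  delayMs = 100
): Promise<number[][]> {
  const vectors: number[][] = [];
  for (let i = 0; i < chunks.length; i++) {
    vectors.push(await embed(chunks[i]));
    if (i < chunks.length - 1) await sleep(delayMs); // 100 ms between API calls
  }
  return vectors;
}
```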
- Index Storage 💾
  - All chunks are saved to code_index.json as structured records
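The notes don't show the exact shape of code_index.json, but a plausible record — assumed here, not taken from the source — pairs each chunk with its file path and embedding:

```typescript
import * as fs from "fs";

// Assumed record shape for code_index.json -- the real fields may differ.
interface IndexedChunk {
  file: string;        // path of the source file
  content: string;     // the raw code chunk
  embedding: number[]; // vector from text-embedding-3-small
}

function saveIndex(indexPath: string, chunks: IndexedChunk[]): void {
  fs.writeFileSync(indexPath, JSON.stringify(chunks, null, 2));
}

function loadIndex(indexPath: string): IndexedChunk[] {
  return JSON.parse(fs.readFileSync(indexPath, "utf8"));
}
```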
- Query Embedding 🔍
  - User's issue description (e.g., "login button not working") is sent to OpenAI
  - Generates an embedding vector for the query using the same model
- Similarity Calculation 📊
  - Loads all code chunks from code_index.json
  - Computes cosine similarity between the query embedding and each code chunk embedding
  - Cosine similarity measures how "similar" two vectors are (mathematically -1 to 1; in practice these embedding scores land between 0 and 1)
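The similarity function itself is a few lines — dot product over the product of vector lengths:

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|).
// Ranges from -1 (opposite) to 1 (identical direction); pairs of
// OpenAI text embeddings usually score between 0 and 1 in practice.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```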
- Ranking 🥇
  - Sorts all chunks by similarity score (highest first)
  - Returns top 5 most relevant code sections
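The ranking step can be sketched as a map-sort-slice (the chunk shape and helper names are assumptions consistent with the earlier sketches):

```typescript
interface Chunk { file: string; content: string; embedding: number[]; }
interface Scored { file: string; content: string; score: number; }

// Local copy of cosine similarity so this sketch is self-contained.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Score every chunk against the query vector, sort descending, keep top k.
function topKChunks(query: number[], chunks: Chunk[], k = 5): Scored[] {
  return chunks
    .map(c => ({ file: c.file, content: c.content, score: cosine(query, c.embedding) }))
    .sort((a, b) => b.score - a.score) // highest similarity first
    .slice(0, k);
}
```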
- Prompt Construction 📝
  - Takes user's issue text
  - Adds the top 5 relevant code chunks (with file names & similarity scores)
  - Creates a structured prompt asking GPT-4o-mini to:
    - Analyze the code
    - Find root cause
    - Suggest fixes
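Prompt construction could be a simple template over the issue text and ranked chunks — the exact wording and field list here are assumptions based on these notes:

```typescript
// Build the analysis prompt from the issue text and the top-ranked chunks.
function buildPrompt(
  issue: string,
  chunks: { file: string; content: string; score: number }[]
): string {
  const context = chunks
    .map(c => `// File: ${c.file} (similarity: ${c.score.toFixed(2)})\n${c.content}`)
    .join("\n\n");
  return [
    `Issue: ${issue}`,
    `Relevant code:\n${context}`,
    "Analyze the code, find the root cause, and suggest fixes.",
    "Respond as JSON with keys: summary, thinking, root_cause, files_mentioned, suggested_fix, confidence.",
  ].join("\n\n");
}
```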
- LLM Call 🤖
  - Sends the constructed prompt to GPT-4o-mini
  - The LLM reads the issue plus the relevant code chunks
- Response Processing ✅
  - Returns JSON with:
    - summary: Brief overview
    - thinking: Analysis process
    - root_cause: What's wrong
    - files_mentioned: Which files are involved
    - suggested_fix: Code patch or fix instructions
    - confidence: 0-100 score
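The fields above map directly onto a response type; a sketch of parsing with a light sanity check (real code would also handle malformed JSON and stray markdown fences):

```typescript
// Expected shape of the model's JSON reply, per the fields listed above.
interface AnalysisResponse {
  summary: string;
  thinking: string;
  root_cause: string;
  files_mentioned: string[];
  suggested_fix: string;
  confidence: number; // 0-100
}

// Parse and lightly validate the LLM's raw text reply.
function parseResponse(raw: string): AnalysisResponse {
  const parsed = JSON.parse(raw) as AnalysisResponse;
  if (
    typeof parsed.confidence !== "number" ||
    parsed.confidence < 0 ||
    parsed.confidence > 100
  ) {
    throw new Error("confidence must be a number between 0 and 100");
  }
  return parsed;
}
```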
QQ: Does the LLM read the embeddings, then? Answer: No, the LLM never sees the embeddings. Their only purpose is search — finding the relevant chunks and keeping the context window short. We then build the prompt from the filtered code plus the issue text and hand that to the LLM to act on.
QQ: What is cosine similarity? A math formula that measures how similar two vectors are: -1 means opposite, 1 means identical direction, and for these embeddings scores typically fall between 0 and 1, with higher meaning more similar.

Query vector: [0.12, 0.45, -0.33, ...]
- Chunk 1 (github.ts): [0.15, 0.43, -0.31, ...] → Similarity: 0.92 ✓ High!
- Chunk 2 (slack.ts): [0.78, -0.22, 0.55, ...] → Similarity: 0.34 ✗ Low
- Chunk 3 (github.ts): [0.11, 0.47, -0.35, ...] → Similarity: 0.89 ✓ High!