This is a simple Python project I made that lets you ask questions about your PDF files.
It reads a PDF, breaks the text into smaller chunks, and uses an LLM to answer your questions. The best part is that it uses ONLY the information inside the PDF to answer. If the answer isn't in the PDF, it tells you it doesn't know!
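
One common way to keep an LLM tied to the PDF is to put the retrieved text into the prompt and tell the model to refuse when the answer is missing. The sketch below shows that idea only. The function name `build_prompt` and the exact instruction wording are assumptions for illustration and may differ from what the script actually sends.

```python
def build_prompt(context_chunks: list[str], question: str) -> str:
    """Assemble a prompt that restricts the model to the supplied PDF excerpts.

    Hypothetical helper: the name and the wording are illustrative assumptions.
    """
    # Separate the excerpts with blank lines so the model can tell them apart.
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using ONLY the context below. "
        "If the answer is not in the context, reply exactly: I don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

Calling `build_prompt(["chunk one", "chunk two"], "What is this about?")` returns a single string that can be sent as the user message to the model.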
- Reads the PDF: Uses `pymupdf` to grab all the text from the file.
- Splits the text: Chops the text into smaller blocks so the LLM can process it easily.
- Creates Embeddings: Converts the text chunks into numbers using `sentence-transformers`.
- Searches: Uses `faiss` to find the most relevant chunks of text based on your question.
- Answers: Sends the relevant text to a large language model (Llama 3 via the Groq API) to generate the final answer.
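
The split and search steps above can be sketched with the standard library alone. This is a simplified stand-in, not the project's code: `embed` uses plain word counts instead of a `sentence-transformers` model, `top_chunks` does a linear scan instead of a `faiss` index search, and the chunk size and overlap values are assumptions.

```python
import math
from collections import Counter


def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap slightly."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text: str) -> Counter:
    """Toy embedding: word counts (stand-in for a sentence-transformers model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(chunks: list[str], question: str, k: int = 3) -> list[str]:
    """Return the k chunks most similar to the question (stand-in for faiss)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)
    return ranked[:k]
```

The overlap keeps a sentence that straddles a chunk boundary from being cut off in both chunks. With real embeddings the ranking reflects meaning rather than shared words, but the flow is the same: embed every chunk once, embed the question, and keep the closest matches.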
Make sure you have a PDF file you want to ask questions about. Rename your PDF file to context.pdf and place it in the exact same folder as the app.py script.
This project uses Groq to run the AI model. You will need an API key from them.
- Rename the `.env.example` file to `.env`.
- Open the `.env` file and replace `YOUR_GROQ_API_KEY` with your actual Groq API key. It should look like this:

`GROQ_API_KEY=YOUR_GROQ_API_KEY`

Open your terminal or command prompt, make sure you are in the project folder, and run this command to download all the necessary tools:

`pip install -r requirements.txt`

Once everything is set up, just run the Python script:

`python main.py`

It will load the PDF, process it, and then ask you to enter a question. When you are done chatting, just type exit or quit to end the program.
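
The question loop described above might look roughly like this. It is a sketch under assumptions: `chat_loop` and `EXIT_WORDS` are hypothetical names, the `answer` callable stands in for the retrieval and LLM call, and matching exit or quit case-insensitively is a guess at the script's behavior.

```python
from typing import Callable, Iterable

# Words that end the session (matching is case-insensitive here by assumption).
EXIT_WORDS = {"exit", "quit"}


def chat_loop(questions: Iterable[str], answer: Callable[[str], str]) -> list[str]:
    """Answer questions one at a time until the user types exit or quit."""
    replies = []
    for raw in questions:
        question = raw.strip()
        if question.lower() in EXIT_WORDS:
            break  # stop the program on exit or quit
        if not question:
            continue  # skip empty input instead of calling the model
        replies.append(answer(question))
    return replies
```

Taking the questions as an iterable keeps the loop easy to try without a terminal. In an interactive script they would come from repeated `input()` calls instead.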