From b12bb8ff0b8accecb621fd58d97c60aaf363246a Mon Sep 17 00:00:00 2001
From: Manuel Moreno Delgado <manuj243@gmail.com>
Date: Tue, 20 Jan 2026 23:31:52 +0100
Subject: [PATCH 1/2] Add pdf2md-ai: AI-powered PDF to Markdown converter

---
 src/pdf2md-ai/README.md | 156 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 156 insertions(+)
 create mode 100644 src/pdf2md-ai/README.md

diff --git a/src/pdf2md-ai/README.md b/src/pdf2md-ai/README.md
new file mode 100644
index 0000000000..5a988f4c9f
--- /dev/null
+++ b/src/pdf2md-ai/README.md
@@ -0,0 +1,156 @@
+# PDF to Markdown (pdf2md-ai)
+
+AI-powered PDF to Markdown converter using advanced AI. Preserves document structure, tables, and formatting with intelligent content extraction.
+
+## Features
+
+- **Intelligent Extraction**: Uses advanced AI (Gemini) for accurate content extraction
+- **Structure Preservation**: Maintains headings, tables, lists, and formatting
+- **Multi-language Support**: Processes documents in any language
+- **Credit-based System**: Transparent usage tracking
+- **Fast Processing**: Typical 1-page PDF converted in seconds
+
+## Installation
+
+### Via NPX (Recommended)
+
+Add to your MCP settings file:
+
+#### Claude Desktop
+
+On macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
+On Windows: `%APPDATA%\Claude\claude_desktop_config.json`
+
+```json
+{
+  "mcpServers": {
+    "pdf2md-ai": {
+      "command": "npx",
+      "args": ["-y", "pdf2md-ai"],
+      "env": {
+        "PDF_TO_MARKDOWN_API_KEY": "your-api-key-here"
+      }
+    }
+  }
+}
+```
+
+#### Cursor
+
+Add to your Cursor MCP settings:
+
+```json
+{
+  "mcpServers": {
+    "pdf2md-ai": {
+      "command": "npx",
+      "args": ["-y", "pdf2md-ai"],
+      "env": {
+        "PDF_TO_MARKDOWN_API_KEY": "your-api-key-here"
+      }
+    }
+  }
+}
+```
+
+### Getting an API Key
+
+1. Visit [pdf-to-markdown-pro.onrender.com](https://pdf-to-markdown-pro.onrender.com)
+2. Sign up for a free account
+3. Copy your API key from the dashboard
+4. Add it to your MCP configuration as shown above
+
+## Usage
+
+Once configured, simply ask your AI assistant:
+
+```
+Convert this PDF to markdown: /path/to/your/document.pdf
+```
+
+The server will:
+1. Read the PDF file from your local system
+2. Process it using advanced AI
+3. Return formatted Markdown with statistics
+
+### Example
+
+**Request:**
+```
+Convert this contract: C:\Documents\agreement.pdf
+```
+
+**Response:**
+```
+✅ Conversion Completed Successfully
+
+📊 Statistics:
+- Pages processed: 8
+- Credits used: 8
+- Credits remaining: 492
+
+## Contract Content:
+
+[Full markdown content here with preserved structure, tables, and formatting...]
+```
+
+## Tools
+
+### convert_pdf_to_markdown
+
+Converts a PDF file to Markdown format.
+
+**Arguments:**
+- `filePath` (string, required): Absolute path to the PDF file on your local system
+
+**Returns:**
+- Markdown-formatted content
+- Document statistics (pages, file size)
+- Credit usage information
+
+## Configuration
+
+### Environment Variables
+
+- `PDF_TO_MARKDOWN_API_KEY` (required): Your API key from the service
+- `PDF_API_URL` (optional): Custom API endpoint (defaults to production)
+
+## Use Cases
+
+- **Document Analysis**: Extract text from contracts, reports, invoices
+- **RAG Pipelines**: Convert PDFs to Markdown for vector databases and embeddings
+- **Content Migration**: Batch convert PDF documentation to Markdown format
+- **Research**: Extract academic papers and technical documents
+- **Data Extraction**: Pull structured data from forms and tables
+- **Archiving**: Create searchable text versions of PDF archives
+
+## Requirements
+
+- Node.js 18 or higher
+- Internet connection for API access
+- Valid API key with available credits
+
+## Limitations
+
+- Recommended maximum file size: 50 MB
+- Request timeout: 5 minutes per file
+- Credit-based: Each page consumes 1 credit
+- Requires network access to processing API
+
+## Pricing
+
+- Free tier available with limited credits
+- Pay-as-you-go model: 1 credit per page
+- Enterprise plans available for high-volume usage
+
+Visit [pdf-to-markdown-pro.onrender.com](https://pdf-to-markdown-pro.onrender.com) for current pricing.
+
+## Links
+
+- [NPM Package](https://www.npmjs.com/package/pdf2md-ai)
+- [Get API Key](https://pdf-to-markdown-pro.onrender.com)
+- [GitHub Issues](https://github.com/MANUJ243/pdf2md-ai/issues)
+
+## License
+
+MIT

From ca3a7fc4905478cacfe6a9545b95dec95ec8df59 Mon Sep 17 00:00:00 2001
From: Manuel Moreno Delgado <manuj243@gmail.com>
Date: Wed, 21 Jan 2026 00:01:28 +0100
Subject: [PATCH 2/2] Update README: emphasize context preservation (images,
 tables, code)

---
 src/pdf2md-ai/README.md | 55 +++++++++++++++++++++++++----------------
 1 file changed, 34 insertions(+), 21 deletions(-)

diff --git a/src/pdf2md-ai/README.md b/src/pdf2md-ai/README.md
index 5a988f4c9f..d50ac1f151 100644
--- a/src/pdf2md-ai/README.md
+++ b/src/pdf2md-ai/README.md
@@ -1,14 +1,20 @@
-# PDF to Markdown (pdf2md-ai)
+# PDF to Markdown (pdf2md-ai)
 
-AI-powered PDF to Markdown converter using advanced AI. Preserves document structure, tables, and formatting with intelligent content extraction.
+AI-powered PDF to Markdown converter that **preserves complete context**: images (analyzed and described with AI), complex tables (including merged cells), code blocks (with original formatting), and document structure. Uses Gemini and LlamaParse for intelligent processing.
 
-## Features
+## Key Features
 
-- **Intelligent Extraction**: Uses advanced AI (Gemini) for accurate content extraction
-- **Structure Preservation**: Maintains headings, tables, lists, and formatting
-- **Multi-language Support**: Processes documents in any language
-- **Credit-based System**: Transparent usage tracking
-- **Fast Processing**: Typical 1-page PDF converted in seconds
+This is not just a simple PDF text extractor. pdf2md-ai **preserves complete visual and structural context**:
+
+- 📸 **Images with Context**: Each image is analyzed with AI (Gemini) and described in detail, maintaining its context within the document
+- 📊 **Complex Tables**: Preserves complete table structure including merged cells, alignment, and formatting
+- 💻 **Source Code**: Maintains code blocks with original syntax and formatting intact
+- 📝 **Document Structure**: Hierarchies, lists, quotes, and special formatting preserved
+- 🌍 **Multi-language Support**: Processes documents in any language
+- ⚡ **Fast Processing**: Typical 1-page PDF converted in seconds
+- 💳 **Credit-based System**: Transparent usage tracking (1 credit per page)
+
+As a result, when you convert a technical PDF, a report with graphics, or documentation with code examples, **its visual and structural information is kept** in the Markdown output.
 
 ## Installation
 
@@ -70,14 +76,16 @@ Convert this PDF to markdown: /path/to/your/document.pdf
 
 The server will:
 1. Read the PDF file from your local system
-2. Process it using advanced AI
-3. Return formatted Markdown with statistics
+2. Analyze images with AI and extract descriptions
+3. Preserve complete table structures
+4. Maintain code blocks with original formatting
+5. Return formatted Markdown with full context preserved
 
 ### Example
 
 **Request:**
 ```
-Convert this contract: C:\Documents\agreement.pdf
+Convert this technical document: C:\Documents\api-guide.pdf
 ```
 
 **Response:**
@@ -89,22 +97,26 @@ Convert this contract: C:\Documents\agreement.pdf
 - Credits used: 8
 - Credits remaining: 492
 
-## Contract Content:
+## API Guide Content:
 
-[Full markdown content here with preserved structure, tables, and formatting...]
+[Full markdown content here with:
+ - Image descriptions in context
+ - Complex tables fully preserved
+ - Code examples with original formatting
+ - Complete document structure maintained...]
 ```
 
 ## Tools
 
 ### convert_pdf_to_markdown
 
-Converts a PDF file to Markdown format.
+Converts a PDF file to Markdown format preserving complete context: images, tables, code blocks, and structure.
 
 **Arguments:**
 - `filePath` (string, required): Absolute path to the PDF file on your local system
 
 **Returns:**
-- Markdown-formatted content
+- Markdown-formatted content with complete context preservation
 - Document statistics (pages, file size)
 - Credit usage information
 
@@ -117,12 +129,13 @@ Converts a PDF file to Markdown format.
 
 ## Use Cases
 
-- **Document Analysis**: Extract text from contracts, reports, invoices
-- **RAG Pipelines**: Convert PDFs to Markdown for vector databases and embeddings
-- **Content Migration**: Batch convert PDF documentation to Markdown format
-- **Research**: Extract academic papers and technical documents
-- **Data Extraction**: Pull structured data from forms and tables
-- **Archiving**: Create searchable text versions of PDF archives
+- **Technical Documentation**: Convert docs with diagrams, tables, and code while preserving all context
+- **Research Papers**: Convert academic papers with figures, complex tables, and references
+- **RAG Pipelines**: Create context-rich markdown for vector databases and embeddings
+- **Contract Analysis**: Process legal documents with tables and structured information
+- **Data Extraction**: Pull structured data from forms and complex tables
+- **Code Documentation**: Extract programming guides with code examples intact
+- **Report Processing**: Convert business reports maintaining charts and table context
 
 ## Requirements