18 changes: 13 additions & 5 deletions README.md
@@ -47,26 +47,34 @@ This repository is organized into a core framework, a registry of skills, and do

```text
Skillware/
Skillware/
├── skillware/ # Core Framework Package
│ └── core/
│ ├── base_skill.py # Abstract Base Class for skills
│ ├── loader.py # Universal Skill Loader & Model Adapter
│ └── env.py # Environment Management
├── skills/ # Skill Registry (Domain-driven)
│ └── finance/
│ └── wallet_screening/
│ └── wallet_screening/
│ ├── skill.py # Logic
│ ├── manifest.yaml # Metadata & Constitution
│ ├── instructions.md # Cognitive Map
│ ├── card.json # UI Presentation
│ ├── data/ # Integrated Knowledge Base
│ └── maintenance/ # Maintenance Tools
│ └── office/
│ └── pdf_form_filler/
│ ├── skill.py # Logic
│ ├── manifest.yaml # Metadata
│ ├── instructions.md # Cognitive Map
│ ├── utils.py # PDF Processing
│ └── card.json # UI Presentation
├── templates/ # New Skill Templates
│ └── python_skill/ # Standard Python Skill Template
├── examples/ # Reference Implementations
│ ├── gemini_wallet_check.py # Google Gemini Integration
│ └── claude_wallet_check.py # Anthropic Claude Integration
│ ├── claude_wallet_check.py # Anthropic Claude Integration
│ ├── gemini_pdf_form_filler.py
│ └── claude_pdf_form_filler.py
├── docs/ # Comprehensive Documentation
│ ├── introduction.md # Philosophy & Design
│ ├── usage/ # Integration Guides
@@ -152,12 +160,12 @@ Skillware differs from the Model Context Protocol (MCP) or Anthropic's Skills re
For questions, suggestions, or contributions, please open an issue or reach out to us:

* **Email**: [skillware-os@arpacorp.net](mailto:skillware-os@arpacorp.net)
* **Issues**: [GitHub Issues](https://github.com/arpa/skillware/issues)
* **Issues**: [GitHub Issues](https://github.com/arpahls/skillware/issues)

---

<div align="center">
<img src="assets/arpalogo.png" alt="ARPA Logo" width="50px" />
<br/>
Built & Maintained by ARPA Hellenic Logical Systems
Built & Maintained by ARPA Hellenic Logical Systems & the Community
</div>
9 changes: 8 additions & 1 deletion docs/skills/README.md
@@ -2,7 +2,14 @@

Welcome to the official catalog of Skillware capabilities.

## 💳 Finance & Compliance
## Office
Skills for document processing, email automation, and productivity.

| Skill | ID | Description |
| :--- | :--- | :--- |
| **[PDF Form Filler](pdf_form_filler.md)** | `office/pdf_form_filler` | Fills AcroForm-based PDFs by mapping user instructions to detected form fields using LLM-based semantic understanding. |

## Finance
Tools for financial analysis, blockchain interaction, and regulatory compliance.

| Skill | ID | Description |
79 changes: 79 additions & 0 deletions docs/skills/pdf_form_filler.md
@@ -0,0 +1,79 @@
# PDF Form Filler Skill

**ID**: `office/pdf_form_filler`

A productivity skill that fills AcroForm-based PDFs by mapping natural language instructions to detected form fields using semantic understanding.

## 📋 Capabilities

* **Smart Field Detection**: Automatically identifies text fields, checkboxes, radio buttons, and dropdowns in standard PDFs.
* **Semantic Mapping**: Uses an internal LLM (Claude) to understand user instructions (e.g., "Sign me up for the newsletter") and map them to the correct field (e.g., `checkbox_subscribe_newsletter`).
* **Context Awareness**: Extracts nearby text labels to ensure accurate mapping, even if field names are obscure (e.g., `field_123` vs label "First Name").
* **Type Safety**: Automatically converts values to the correct format (booleans for checkboxes, specific options for dropdowns).
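
The type-conversion step can be pictured with a small sketch. This is not the skill's actual code: the function name `coerce_value`, the field-type labels (`"checkbox"`, `"dropdown"`, `"text"`), and the accepted truthy/falsy words are assumptions chosen for illustration.

```python
from typing import Optional, Sequence, Union

FieldValue = Union[str, bool]

# Hypothetical vocab for checkbox states; the real skill may accept other words.
TRUTHY = {"true", "yes", "y", "1", "on", "checked"}
FALSY = {"false", "no", "n", "0", "off", "unchecked"}


def coerce_value(field_type: str, value: object,
                 options: Optional[Sequence[str]] = None) -> FieldValue:
    """Convert a raw mapped value into the form the field type expects."""
    if field_type == "checkbox":
        if isinstance(value, bool):
            return value
        text = str(value).strip().lower()
        if text in TRUTHY:
            return True
        if text in FALSY:
            return False
        raise ValueError(f"Cannot interpret {value!r} as a checkbox state")

    if field_type == "dropdown":
        text = str(value).strip()
        for option in options or ():
            # Match case-insensitively, but return the option's exact spelling.
            if option.lower() == text.lower():
                return option
        raise ValueError(f"{value!r} is not one of {list(options or ())}")

    # Text fields (and any unrecognised type) are stored as plain strings.
    return str(value)
```

Raising on an unknown dropdown value, rather than writing it anyway, follows the "precision over guessing" stance described for the mapping prompt.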

## 📂 Internal Architecture

The skill is self-contained in `skills/office/pdf_form_filler/`.

### 1. The Mind (`instructions.md`)
The system prompt teaches the internal mapping engine to:
* Analyze the provided "User Instructions".
* Review the list of "Detected Fields" (ID, Type, Context, Options).
* Output a strict JSON mapping of `Field ID -> Value`.
* Handle ambiguities by preferring precision over guessing.
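
As a rough illustration of consuming that strict JSON mapping, the sketch below parses a model reply and keeps only IDs that were actually detected in the PDF. The helper name `parse_field_mapping` and the tolerance for a fenced reply are assumptions, not the skill's documented behavior.

```python
import json
import re
from typing import Any, Dict, Iterable

# Build the fence marker programmatically so this snippet can itself
# live inside a fenced Markdown block.
TICKS = "`" * 3
FENCE = re.compile(TICKS + r"(?:json)?\s*(.*?)" + TICKS, re.DOTALL)


def parse_field_mapping(reply: str, known_ids: Iterable[str]) -> Dict[str, Any]:
    """Extract a field_id -> value mapping from an LLM reply."""
    match = FENCE.search(reply)
    payload = match.group(1) if match else reply
    data = json.loads(payload)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object of field_id -> value")
    allowed = set(known_ids)
    # Drop any key the model invented; only detected fields may be filled.
    return {key: value for key, value in data.items() if key in allowed}
```

Filtering against the detected IDs is one way to keep a hallucinated field name from reaching the PDF-writing step.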

### 2. The Body (`skill.py` & `utils.py`)
* **PDF Processing**: Uses `PyMuPDF` (fitz) for high-fidelity rendering and widget manipulation.
* **LLM Integration**: Wraps the Anthropic SDK to perform the semantic reasoning step.
* **Validation**: Ensures values match the field type (e.g., selecting a valid option from a dropdown).

## 💻 Integration Guide

### Environment Variables
You must provide an Anthropic API key for the semantic mapping engine.

```bash
ANTHROPIC_API_KEY="sk-ant-..."
```
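
A caller may want to fail fast when the key is missing instead of discovering it at the mapping step. The following is a minimal sketch; `get_anthropic_key` is a hypothetical helper, not part of Skillware.

```python
import os
from typing import Mapping, Optional


def get_anthropic_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Anthropic API key or raise a clear error if it is absent."""
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; the semantic mapping step needs it."
        )
    return key
```

Passing `env` explicitly keeps the check easy to test without touching the real process environment.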

### Usage (Skillware Loader)

```python
from skillware.core.loader import SkillLoader

# 1. Load the Skill
skill_bundle = SkillLoader.load_skill("office/pdf_form_filler")
PDFFormFillerSkill = skill_bundle['module'].PDFFormFillerSkill

# 2. Initialize
filler = PDFFormFillerSkill()

# 3. Execute
result = filler.execute({
    "pdf_path": "/absolute/path/to/form.pdf",
    "instructions": "Name: John Doe. Check the terms of service box."
})

print(f"Filled PDF saved to: {result['output_path']}")
```

## 📊 Data Schema

The skill returns a JSON object with the result of the operation.

```json
{
  "status": "success",
  "output_path": "/path/to/form_filled.pdf",
  "filled_fields": [
    "page0_full_name",
    "page0_terms_check"
  ],
  "message": "Successfully filled 2 fields."
}
```
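
A consumer of this result might branch on `status` before touching the other keys. The sketch below assumes that a failed run reports a non-`"success"` status with a `message` or `error` key; only the success shape is shown above, so the failure shape is a guess, and `describe_result` is a hypothetical helper.

```python
from typing import Any, Dict, List


def describe_result(result: Dict[str, Any]) -> str:
    """Turn a skill result dict into a one-line, human-readable summary."""
    if result.get("status") != "success":
        # Assumed failure shape: a "message" or "error" key carries the detail.
        detail = result.get("message") or result.get("error") or "unknown error"
        return f"Form filling failed: {detail}"
    fields: List[str] = result.get("filled_fields", [])
    return f"Filled {len(fields)} field(s) -> {result.get('output_path')}"
```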

## ⚠️ Limitations

* **AcroForms Only**: Does not support XFA forms or non-interactive "flat" PDFs.
* **LLM Dependency**: Requires an active internet connection and valid API key for the semantic mapping step.
91 changes: 91 additions & 0 deletions examples/claude_pdf_form_filler.py
@@ -0,0 +1,91 @@
import os
import sys
import json
# Add repo root to path
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))

import anthropic
from skillware.core.loader import SkillLoader
from skillware.core.env import load_env_file

# Load Env (Requires ANTHROPIC_API_KEY for both Agent and Skill)
load_env_file()

# 1. Load the Skill
skill_bundle = SkillLoader.load_skill("office/pdf_form_filler")
print(f"Loaded Skill: {skill_bundle['manifest']['name']}")

# 2. Instantiate Skill
PDFFormFillerSkill = skill_bundle['module'].PDFFormFillerSkill
pdf_skill = PDFFormFillerSkill()

# 3. Setup Claude Client
client = anthropic.Anthropic(
    api_key=os.environ.get("ANTHROPIC_API_KEY"),
)

tools = [SkillLoader.to_claude_tool(skill_bundle)]

# 4. Run Agent Loop
pdf_path = os.path.abspath("test_form.pdf")
user_query = f"Please fill out the form at {pdf_path}. My name is John Smith and I want to enable notifications."

print(f"User: {user_query}")

message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    system=skill_bundle['instructions'],
    messages=[
        {"role": "user", "content": user_query}
    ],
    tools=tools,
)

if message.stop_reason == "tool_use":
    tool_use = next(block for block in message.content if block.type == "tool_use")
    tool_name = tool_use.name
    tool_input = tool_use.input

    print(f"\nClaude requested tool: {tool_name}")
    print(f"Input: {tool_input}")

    if tool_name == "pdf_form_filler":
        # Check file
        if not os.path.exists(tool_input.get('pdf_path', '')):
            print(f"⚠️ Warning: File {tool_input.get('pdf_path')} does not exist. Execution might fail.")

        # Execute
        print("⚙️ Executing skill...")
        try:
            result = pdf_skill.execute(tool_input)
            print("✅ Skill Execution Result:")
            print(json.dumps(result, indent=2))
        except Exception as e:
            result = {"error": str(e)}
            print(f"❌ Error: {e}")

        # Feed back to Claude
        response = client.messages.create(
            model="claude-3-opus-20240229",
            max_tokens=1024,
            system=skill_bundle['instructions'],
            tools=tools,
            messages=[
                {"role": "user", "content": user_query},
                {"role": "assistant", "content": message.content},
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "tool_result",
                            "tool_use_id": tool_use.id,
                            "content": json.dumps(result)
                        }
                    ],
                },
            ],
        )

        print("\nAgent Final Response:")
        print(response.content[0].text)
93 changes: 93 additions & 0 deletions examples/gemini_pdf_form_filler.py
@@ -0,0 +1,93 @@
import os
import sys
# Add repo root to path
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))

import google.generativeai as genai
from skillware.core.loader import SkillLoader
from skillware.core.env import load_env_file

# Load Env (Requires GOOGLE_API_KEY for the Agent, and ANTHROPIC_API_KEY for the Skill's internal logic)
load_env_file()

# 1. Load the Skill
skill_bundle = SkillLoader.load_skill("office/pdf_form_filler")
print(f"Loaded Skill: {skill_bundle['manifest']['name']}")

# 2. Instantiate the Skill
# The skill needs ANTHROPIC_API_KEY in env to perform semantic mapping
PDFFormFillerSkill = skill_bundle['module'].PDFFormFillerSkill
pdf_skill = PDFFormFillerSkill()

# 3. Setup Gemini Agent
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))

# Define tool for Gemini
tools = [SkillLoader.to_gemini_tool(skill_bundle)]

model = genai.GenerativeModel(
    'gemini-2.0-flash-exp',
    tools=tools,
    system_instruction=skill_bundle['instructions']  # Inject skill's cognitive map
)

chat = model.start_chat(enable_automatic_function_calling=True)

# 4. Run the Agent Loop
# Note: You need a real PDF file for this to work.
pdf_path = os.path.abspath("test_form.pdf")
user_query = f"Fill out the form at {pdf_path}. Set the name to 'Jane Doe' and check the 'Subscribe' box."

print(f"User: {user_query}")

# Map tool names to callables for manual dispatch. The Python SDK can usually
# call tools automatically; the map is kept here for completeness.
function_map = {
    'pdf_form_filler': pdf_skill.execute
}

response = chat.send_message(user_query)

# Simple manual tool execution loop (if auto-calling isn't fully handled or we want to inspect)
# Note: Recent genai SDKs handle this better, but explicit loops are safer for demos.
while response.candidates and response.candidates[0].content.parts:
    part = response.candidates[0].content.parts[0]

    if part.function_call:
        fn_name = part.function_call.name
        fn_args = dict(part.function_call.args)

        print(f"🤖 Agent wants to call: {fn_name}")
        print(f" Args: {fn_args}")

        if fn_name == 'pdf_form_filler':
            try:
                # Check if file exists before running
                if not os.path.exists(fn_args.get('pdf_path', '')):
                    print(f"⚠️ Error: PDF file not found at {fn_args.get('pdf_path')}")
                    result = {"error": "PDF file not found."}
                else:
                    print("⚙️ Executing skill...")
                    result = pdf_skill.execute(fn_args)
                    print(f"✅ Result: {result}")
            except Exception as e:
                result = {"error": str(e)}

            # Send result back
            response = chat.send_message(
                [
                    {
                        "function_response": {
                            "name": fn_name,
                            "response": {'result': result}
                        }
                    }
                ]
            )
        else:
            break
    else:
        break

print("\n💬 Agent Final Response:")
print(response.text)
25 changes: 25 additions & 0 deletions skills/office/pdf_form_filler/card.json
@@ -0,0 +1,25 @@
{
  "name": "PDF Form Filler",
  "description": "Smartly fills PDF forms from natural language instructions.",
  "icon": "document-text",
  "color": "rose",
  "ui_schema": {
    "type": "card",
    "fields": [
      {
        "key": "message",
        "label": "Status"
      },
      {
        "key": "filled_fields",
        "label": "Filled Fields",
        "type": "tags"
      },
      {
        "key": "output_path",
        "label": "Download",
        "type": "file_path"
      }
    ]
  }
}
21 changes: 21 additions & 0 deletions skills/office/pdf_form_filler/instructions.md
@@ -0,0 +1,21 @@
You are an expert form-filling assistant. Your goal is to map user instructions to specific form fields in a PDF.

You will be given:
1. A list of detected form fields from a PDF, including their ID, type, and nearby text context.
2. User instructions describing what values to fill in.

Your Task:
- Analyze the user instructions and match them to the correct form fields based on the field context.
- Output a JSON object whose keys are `field_id`s and whose values are the content to fill.
- Only include fields that the user has provided information for.
- For Checkboxes: Use boolean `true` or `false`.
- For Dropdowns: Use the exact string from the options list if available.
- If a user instruction is ambiguous or doesn't match a field, ignore it.

Output Format:
```json
{
  "page0_field_name": "Value",
  "page0_checkbox_1": true
}
```