| layout | default |
|---|---|
| title | Browser Use Tutorial - Chapter 4: Form Automation |
| nav_order | 4 |
| has_children | false |
| parent | Browser Use Tutorial |
Welcome to Chapter 4: Form Automation - Intelligent Form Filling and Submission. In this part of Browser Use Tutorial: AI-Powered Web Automation Agents, you will build an intuitive mental model first, then move into concrete implementation details and practical production tradeoffs.
Automate form filling, validation, and submission with intelligent field detection and data handling.
Form automation is a key use case for web automation. Browser Use provides intelligent form detection, field mapping, and data entry capabilities. This chapter covers automated form filling, validation handling, and complex form workflows.
# basic_form_filling.py
import asyncio
from browser_use import Agent
from langchain_openai import ChatOpenAI
async def basic_form_filling():
"""Demonstrate basic form filling operations"""
agent = Agent(
task="""
Fill out the form at https://httpbin.org/forms/post:
Personal Information:
- Customer name: Sarah Johnson
- Telephone: 555-0124
- Email: sarah.johnson@example.com
Order Details:
- Pizza Size: Large
- Toppings: Pepperoni, Mushrooms, Extra cheese
- Delivery Instructions: Please knock loudly on door
Payment Information:
- Payment Method: Credit Card
Submit the form and describe the response you receive.
""",
llm=ChatOpenAI(model="gpt-4o"),
max_steps=20
)
result = await agent.run()
print(f"Form Submission Result: {result}")
if __name__ == "__main__":
asyncio.run(basic_form_filling())# form_field_detection.py
import asyncio
from browser_use import Agent
from langchain_openai import ChatOpenAI
async def form_field_detection():
"""Demonstrate intelligent form field detection"""
agent = Agent(
task="""
Analyze the form structure at https://httpbin.org/forms/post:
1. Identify all form fields and their types
2. Determine which fields are required vs optional
3. Understand the field labels and expected data formats
4. Note any validation rules or constraints
5. Describe the overall form layout and organization
Don't submit the form - just analyze its structure.
""",
llm=ChatOpenAI(model="gpt-4o"),
max_steps=15,
use_vision=True # Enable vision for form layout analysis
)
result = await agent.run()
print(f"Form Analysis: {result}")
if __name__ == "__main__":
asyncio.run(form_field_detection())# multi_page_forms.py
import asyncio
from browser_use import Agent
from langchain_openai import ChatOpenAI
async def multi_page_form_handling():
"""Handle complex multi-page form workflows"""
agent = Agent(
task="""
Handle a complex multi-step form process:
1. Go to a site with multi-page forms (like registration wizards)
2. Start filling out the first page of the form
3. Navigate through all form steps/pages
4. Handle any conditional fields that appear based on previous answers
5. Complete the entire form submission process
6. Verify successful submission
If you can't find a perfect multi-page form, adapt this approach to any complex form you can find.
""",
llm=ChatOpenAI(model="gpt-4o"),
max_steps=35,
use_vision=True
)
result = await agent.run()
print(f"Multi-Page Form Result: {result}")
if __name__ == "__main__":
asyncio.run(multi_page_form_handling())# dynamic_forms.py
import asyncio
from browser_use import Agent
from langchain_openai import ChatOpenAI
async def dynamic_form_handling():
"""Handle forms with dynamic fields and conditional logic"""
agent = Agent(
task="""
Handle a form with dynamic/conditional fields:
1. Find a form that shows/hides fields based on selections
2. Start with basic field selections
3. Observe how the form changes based on your inputs
4. Fill in the dynamically revealed fields
5. Complete the form submission
Look for forms with features like:
- Country selection that shows province/state fields
- Checkbox options that reveal additional input fields
- Radio buttons that change available options
- Multi-step wizards with conditional steps
""",
llm=ChatOpenAI(model="gpt-4o"),
max_steps=30,
use_vision=True
)
result = await agent.run()
print(f"Dynamic Form Result: {result}")
if __name__ == "__main__":
asyncio.run(dynamic_form_handling())# structured_data_input.py
import asyncio
from browser_use import Agent
from langchain_openai import ChatOpenAI
async def structured_data_entry():
"""Handle structured data entry from various sources"""
# Sample data to enter
customer_data = {
"name": "Michael Chen",
"email": "michael.chen@company.com",
"phone": "555-0199",
"company": "Tech Solutions Inc",
"address": {
"street": "123 Business St",
"city": "San Francisco",
"state": "CA",
"zip": "94105"
}
}
agent = Agent(
task=f"""
Fill out a comprehensive form using this structured data:
Customer Information:
- Name: {customer_data['name']}
- Email: {customer_data['email']}
- Phone: {customer_data['phone']}
- Company: {customer_data['company']}
Address Details:
- Street: {customer_data['address']['street']}
- City: {customer_data['address']['city']}
- State: {customer_data['address']['state']}
- ZIP: {customer_data['address']['zip']}
Find an appropriate form to fill out with this data.
If you can't find a real form, use https://httpbin.org/forms/post
and adapt the data to fit the available fields.
Verify all data is entered correctly before submitting.
""",
llm=ChatOpenAI(model="gpt-4o"),
max_steps=25
)
result = await agent.run()
print(f"Structured Data Entry Result: {result}")
if __name__ == "__main__":
asyncio.run(structured_data_entry())# bulk_data_entry.py
import asyncio
from browser_use import Agent
from langchain_openai import ChatOpenAI
async def bulk_data_entry():
"""Handle bulk data entry operations"""
# Sample bulk data
products = [
{"name": "Laptop", "price": 1299.99, "category": "Electronics"},
{"name": "Mouse", "price": 29.99, "category": "Accessories"},
{"name": "Keyboard", "price": 79.99, "category": "Accessories"},
{"name": "Monitor", "price": 399.99, "category": "Electronics"},
]
agent = Agent(
task=f"""
Perform bulk data entry for these products:
{chr(10).join(f"- {p['name']}: ${p['price']} ({p['category']})" for p in products)}
1. Find a form or system that accepts product data entry
2. Enter each product one by one
3. Verify each entry is saved correctly
4. Keep track of successful vs failed entries
If you can't find a real product entry system, simulate the process
by documenting what you would do for each product.
Report the results of the bulk entry operation.
""",
llm=ChatOpenAI(model="gpt-4o"),
max_steps=40
)
result = await agent.run()
print(f"Bulk Data Entry Result: {result}")
if __name__ == "__main__":
asyncio.run(bulk_data_entry())# validation_handling.py
import asyncio
from browser_use import Agent
from langchain_openai import ChatOpenAI
async def form_validation_handling():
"""Handle form validation errors and corrections"""
agent = Agent(
task="""
Test form validation and error handling:
1. Find a form with validation requirements
2. Try to submit the form with invalid data first:
- Missing required fields
- Invalid email format
- Incorrect data types
- Values outside acceptable ranges
3. Observe the validation errors
4. Correct the invalid data
5. Resubmit the form successfully
Document the validation rules you discover and how you handle errors.
Use https://httpbin.org/forms/post if you need a simple test form.
""",
llm=ChatOpenAI(model="gpt-4o"),
max_steps=30,
use_vision=True # Help detect validation messages
)
result = await agent.run()
print(f"Validation Handling Result: {result}")
if __name__ == "__main__":
asyncio.run(form_validation_handling())# captcha_handling.py
import asyncio
from browser_use import Agent
from langchain_openai import ChatOpenAI
async def captcha_handling():
"""Handle CAPTCHA and bot protection systems"""
agent = Agent(
task="""
Test interaction with CAPTCHA and bot protection:
1. Find a form that has CAPTCHA protection
2. Attempt to fill out the form
3. Encounter the CAPTCHA challenge
4. Describe the type of CAPTCHA (image, reCAPTCHA, etc.)
5. Explain the limitations of automated solving
Note: Most CAPTCHAs cannot be solved automatically by browsers.
Focus on understanding the limitations and alternative approaches.
Use a site like https://www.google.com/recaptcha/admin if you can find a test form.
""",
llm=ChatOpenAI(model="gpt-4o"),
max_steps=20,
use_vision=True
)
result = await agent.run()
print(f"CAPTCHA Handling Analysis: {result}")
if __name__ == "__main__":
asyncio.run(captcha_handling())# file_upload_forms.py
import asyncio
from browser_use import Agent
from langchain_openai import ChatOpenAI
async def file_upload_handling():
"""Handle file upload forms and operations"""
agent = Agent(
task="""
Handle file upload functionality:
1. Find a form that accepts file uploads
2. Prepare or create a sample file to upload
3. Fill out any accompanying form fields
4. Upload the file successfully
5. Verify the upload was successful
Look for forms that accept:
- Image uploads
- Document uploads
- CSV/data file uploads
- Multiple file uploads
If you can't find a real upload form, explain the process you would follow.
""",
llm=ChatOpenAI(model="gpt-4o"),
max_steps=25
)
result = await agent.run()
print(f"File Upload Result: {result}")
if __name__ == "__main__":
asyncio.run(file_upload_handling())# form_data_extraction.py
import asyncio
from browser_use import Agent
from langchain_openai import ChatOpenAI
async def form_data_extraction():
"""Extract data from existing forms"""
agent = Agent(
task="""
Extract form data and structure:
1. Find a comprehensive form (registration, application, etc.)
2. Analyze all form fields and their properties:
- Field types (text, select, radio, checkbox, etc.)
- Field labels and help text
- Validation rules and requirements
- Default values and placeholders
3. Document the complete form structure
4. Extract any example data or instructions provided
Focus on understanding form design patterns and user experience considerations.
""",
llm=ChatOpenAI(model="gpt-4o"),
max_steps=20,
use_vision=True
)
result = await agent.run()
print(f"Form Data Extraction: {result}")
if __name__ == "__main__":
asyncio.run(form_data_extraction())# multi_form_workflows.py
import asyncio
from browser_use import Agent
from langchain_openai import ChatOpenAI
async def multi_form_workflow():
"""Handle complex workflows involving multiple forms"""
agent = Agent(
task="""
Execute a complex multi-form workflow:
1. Start with a registration or signup form
2. Complete the initial registration
3. Handle email verification if required
4. Fill out a profile completion form
5. Set up additional account preferences
6. Complete any onboarding wizards
This simulates a complete user onboarding process.
Document each step and any challenges encountered.
If you can't find a real multi-step process, break down a complex
single form into logical steps and explain your approach.
""",
llm=ChatOpenAI(model="gpt-4o"),
max_steps=50,
use_vision=True
)
result = await agent.run()
print(f"Multi-Form Workflow Result: {result}")
if __name__ == "__main__":
asyncio.run(multi_form_workflow())# form_testing.py
import asyncio
from browser_use import Agent
from langchain_openai import ChatOpenAI
async def automated_form_testing():
"""Automate form testing scenarios"""
test_scenarios = [
"Test form with valid data - should submit successfully",
"Test form with missing required fields - should show validation errors",
"Test form with invalid email format - should show email validation error",
"Test form with data exceeding length limits - should show length validation",
"Test form with special characters in text fields - should handle encoding properly"
]
for scenario in test_scenarios:
print(f"\n--- Testing: {scenario} ---")
agent = Agent(
task=f"""
{scenario}
1. Find an appropriate test form
2. Execute the test scenario
3. Document the results and any error messages
4. Verify expected behavior occurred
Use https://httpbin.org/forms/post as your primary test form.
""",
llm=ChatOpenAI(model="gpt-4o"),
max_steps=20
)
result = await agent.run()
print(f"Test Result: {result}")
if __name__ == "__main__":
asyncio.run(automated_form_testing())# enterprise_integration.py
import asyncio
from browser_use import Agent
from langchain_openai import ChatOpenAI
import json
async def enterprise_form_integration():
"""Integrate form automation with enterprise systems"""
# Sample enterprise data
employee_data = {
"employee_id": "EMP001",
"name": "John Smith",
"department": "Engineering",
"start_date": "2024-01-15",
"manager": "Jane Doe",
"access_levels": ["email", "vpn", "development_tools"]
}
agent = Agent(
task=f"""
Automate enterprise employee onboarding:
Employee Details:
- ID: {employee_data['employee_id']}
- Name: {employee_data['name']}
- Department: {employee_data['department']}
- Start Date: {employee_data['start_date']}
- Manager: {employee_data['manager']}
- Access Levels: {', '.join(employee_data['access_levels'])}
1. Access HR system or employee management portal
2. Create new employee record
3. Fill out all required information
4. Set up department and manager assignments
5. Configure access permissions
6. Submit and verify successful creation
This demonstrates enterprise form automation for HR processes.
If you can't access real systems, simulate the process with available forms.
""",
llm=ChatOpenAI(model="gpt-4o"),
max_steps=35
)
result = await agent.run()
print(f"Enterprise Integration Result: {result}")
if __name__ == "__main__":
asyncio.run(enterprise_form_integration())# compliance_automation.py
import asyncio
from browser_use import Agent
from langchain_openai import ChatOpenAI
async def compliance_form_automation():
"""Handle compliance-related form automation"""
compliance_data = {
"report_type": "GDPR Data Processing",
"data_controller": "Example Corp",
"processing_purpose": "Customer relationship management",
"data_categories": ["Personal identification", "Contact information", "Purchase history"],
"legal_basis": "Contract performance",
"retention_period": "7 years",
"security_measures": ["Encryption", "Access controls", "Audit logging"]
}
agent = Agent(
task=f"""
Complete a compliance reporting form:
Report Details:
- Type: {compliance_data['report_type']}
- Data Controller: {compliance_data['data_controller']}
- Processing Purpose: {compliance_data['processing_purpose']}
- Data Categories: {', '.join(compliance_data['data_categories'])}
- Legal Basis: {compliance_data['legal_basis']}
- Retention Period: {compliance_data['retention_period']}
- Security Measures: {', '.join(compliance_data['security_measures'])}
1. Access compliance reporting system
2. Select appropriate report type
3. Fill out all compliance fields accurately
4. Attach any required documentation
5. Submit report and obtain confirmation
This demonstrates regulatory compliance automation.
Maintain audit trail and accuracy throughout the process.
""",
llm=ChatOpenAI(model="gpt-4o"),
max_steps=30,
save_conversation_path="./compliance_audit.json" # Audit trail
)
result = await agent.run()
print(f"Compliance Automation Result: {result}")
if __name__ == "__main__":
asyncio.run(compliance_form_automation())flowchart TD
A[Agent navigates to form page]
B[DOM analysis identifies form fields]
C[Field types detected: text select checkbox radio]
D[Agent maps input data to fields]
E[input_text actions fill text fields]
F[select_option actions choose dropdowns]
G[click actions select checkboxes]
H[Submit button clicked]
I[Success or error response validated]
A --> B
B --> C
C --> D
D --> E
D --> F
D --> G
E --> H
F --> H
G --> H
H --> I
In this chapter, we've covered:
- Basic Form Filling: Simple form automation with structured data
- Form Field Detection: Intelligent analysis of form structure and requirements
- Complex Form Handling: Multi-page and dynamic form workflows
- Data Entry Strategies: Structured and bulk data input operations
- Validation Handling: Error detection and correction in form submissions
- Advanced Techniques: File uploads, data extraction, and CAPTCHA handling
- Workflow Automation: Multi-form processes and testing automation
- Enterprise Integration: Business system integration and compliance automation
- Intelligent Field Detection: AI understands form structure and field purposes
- Dynamic Form Handling: Adapts to conditional fields and multi-page forms
- Validation Management: Handles errors gracefully and corrects invalid data
- Structured Data Input: Efficiently processes bulk and complex data entry
- Workflow Orchestration: Manages complex multi-step form processes
- Enterprise Integration: Connects with business systems and compliance requirements
- Error Recovery: Robust handling of form submission failures and retries
Now that you can automate form filling and submission, let's explore data extraction techniques for gathering information from websites.
Ready for Chapter 5? Data Extraction
Generated for Awesome Code Docs
Most teams struggle here because the hard part is not writing more code, but deciding clear boundaries for form, asyncio, Agent so behavior stays predictable as complexity grows.
In practical terms, this chapter helps you avoid three common failures:
- coupling core logic too tightly to one implementation path
- missing the handoff boundaries between setup, execution, and validation
- shipping changes without clear rollback or observability strategy
After working through this chapter, you should be able to reason about Chapter 4: Form Automation - Intelligent Form Filling and Submission as an operating subsystem inside Browser Use Tutorial: AI-Powered Web Automation Agents, with explicit contracts for inputs, state transitions, and outputs.
Use the implementation notes around ChatOpenAI, agent, result as your checklist when adapting these patterns to your own repository.
Under the hood, Chapter 4: Form Automation - Intelligent Form Filling and Submission usually follows a repeatable control path:
- Context bootstrap: initialize runtime config and prerequisites for
form. - Input normalization: shape incoming data so
asyncioreceives stable contracts. - Core execution: run the main logic branch and propagate intermediate state through
Agent. - Policy and safety checks: enforce limits, auth scopes, and failure boundaries.
- Output composition: return canonical result payloads for downstream consumers.
- Operational telemetry: emit logs/metrics needed for debugging and performance tuning.
When debugging, walk this sequence in order and confirm each stage has explicit success/failure conditions.
Use the following upstream sources to verify implementation details while reading this chapter:
- Browser Use Repository
Why it matters: authoritative reference on
Browser Use Repository(github.com). - Browser Use Releases
Why it matters: authoritative reference on
Browser Use Releases(github.com). - Browser Use Docs
Why it matters: authoritative reference on
Browser Use Docs(docs.browser-use.com). - Browser Use Cloud
Why it matters: authoritative reference on
Browser Use Cloud(cloud.browser-use.com).
Suggested trace strategy:
- search upstream code for
formandasyncioto map concrete implementation paths - compare docs claims against actual runtime/config code before reusing patterns in production