
Experiment: DOMShell vs Raw HTML interface comparison #29

@apireno


Summary

Run an apples-to-apples comparison of DOMShell's AX-tree filesystem interface vs raw HTML scraping, using the same model (Qwen3-4B) on the same tasks.

Design

Core matrix — [nexa, ollama] x [domshell, html]:

Backend      DOMShell            Raw HTML
Nexa serve   agent.py via MCP    raw_html_agent.py via requests+BS4
Ollama       agent.py via MCP    raw_html_agent.py via requests+BS4

All 4 cells use the same Qwen3-4B weights. The only variables are the interface and the backend.

Tasks (simplified for 4B model)

  1. Page title — extract the page title from a Wikipedia article
  2. First paragraph — extract the opening paragraph
  3. List headings — list all section headings

12 trials total (3 tasks x 4 matrix cells), max 10 turns each.
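
A minimal sketch of how the trial list could be enumerated, to make the 3 x 4 arithmetic concrete. The `Trial` dataclass, the task identifiers, and `build_trials` are hypothetical names chosen for illustration; the actual runner script may structure this differently.

```python
from dataclasses import dataclass
from itertools import product

# Matrix axes taken from the design above.
BACKENDS = ["nexa", "ollama"]
INTERFACES = ["domshell", "html"]
# Hypothetical task identifiers for the three tasks listed above.
TASKS = ["page_title", "first_paragraph", "list_headings"]
MAX_TURNS = 10  # per-trial turn cap from the design


@dataclass(frozen=True)
class Trial:
    backend: str
    interface: str
    task: str
    max_turns: int = MAX_TURNS


def build_trials():
    """Return one Trial per (backend, interface, task) combination."""
    return [Trial(b, i, t) for b, i, t in product(BACKENDS, INTERFACES, TASKS)]


trials = build_trials()
print(len(trials))  # 2 backends x 2 interfaces x 3 tasks = 12
```

Keeping the axes as plain lists means a task or backend can be added later without touching the enumeration logic.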

Goal

Validate whether DOMShell's structured interface actually helps small models extract web content, compared to feeding them raw HTML. This tests the interface design, not the model capability.
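
One way to read out the comparison is to tally the success rate per interface (or per backend) across trials. The sketch below assumes each trial produces a record with `backend`, `interface`, `task`, and `success` fields; that record shape and the function name `success_rate_by` are assumptions, and the sample records are synthetic placeholders, not experiment results.

```python
from collections import defaultdict


def success_rate_by(records, key):
    """Group trial records by `key` and return the success fraction per group."""
    totals = defaultdict(lambda: [0, 0])  # group value -> [successes, trials]
    for record in records:
        bucket = totals[record[key]]
        bucket[0] += 1 if record["success"] else 0
        bucket[1] += 1
    return {group: wins / count for group, (wins, count) in totals.items()}


# Synthetic records for illustration only; these are NOT experiment results.
demo_records = [
    {"backend": "nexa", "interface": "domshell", "task": "page_title", "success": True},
    {"backend": "nexa", "interface": "html", "task": "page_title", "success": False},
    {"backend": "ollama", "interface": "domshell", "task": "page_title", "success": False},
    {"backend": "ollama", "interface": "html", "task": "page_title", "success": True},
]

by_interface = success_rate_by(demo_records, "interface")
by_backend = success_rate_by(demo_records, "backend")
print(by_interface)
print(by_backend)
```

Grouping by `interface` isolates the variable this experiment targets, while grouping by `backend` checks whether Nexa and Ollama behave differently with identical weights.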

Files

  • experiments/nexa_interface/ — experiment infrastructure, prompts, runner script
  • experiments/nexa_interface/raw_html_agent.py — baseline agent using requests + BeautifulSoup
  • integrations/nexa/agent.py — DOMShell agent (already exists)
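
To score either agent, it helps to have reference answers for the three tasks. The sketch below extracts the title, first non-empty paragraph, and section headings from an HTML string. It deliberately uses the standard library's `html.parser` instead of the requests + BeautifulSoup stack named above, so that it runs without dependencies or network access. The class name, the choice of `h2`/`h3` as "section headings", and the sample HTML are all assumptions for illustration.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h2", "h3"}  # assumption: these count as section headings


class ReferenceExtractor(HTMLParser):
    """Collect the title, the first non-empty <p>, and all section headings."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.first_paragraph = ""
        self.headings = []
        self._capture = None  # tag currently being captured, if any
        self._buf = []
        self._seen_paragraph = False

    def handle_starttag(self, tag, attrs):
        wants_p = tag == "p" and not self._seen_paragraph
        if self._capture is None and (tag == "title" or tag in HEADING_TAGS or wants_p):
            self._capture = tag
            self._buf = []

    def handle_data(self, data):
        if self._capture is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != self._capture:
            return  # closing tag of a nested element such as <b> or <a>
        text = " ".join("".join(self._buf).split())  # normalize whitespace
        if tag == "title":
            self.title = text
        elif tag == "p":
            if text:  # skip empty placeholder paragraphs
                self.first_paragraph = text
                self._seen_paragraph = True
        else:
            self.headings.append(text)
        self._capture = None


# Made-up sample page, standing in for a fetched article.
SAMPLE_HTML = """
<html><head><title>Example topic - Wiki</title></head>
<body>
<p> </p>
<p><b>Example topic</b> is a placeholder article.</p>
<h2>History</h2>
<p>Later text.</p>
<h2>See also</h2>
</body></html>
"""

extractor = ReferenceExtractor()
extractor.feed(SAMPLE_HTML)
print(extractor.title)
print(extractor.first_paragraph)
print(extractor.headings)
```

A real article page has navigation and infobox markup around the body, so a production reference extractor would likely need to scope itself to the main content container first.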

Related

  • Previous experiment: experiments/nexa_claude/ (model size comparison, 0/12 tasks completed)
  • Follows up on roadmap item in README.md

Labels: enhancement
