crawl4ai version
0.8.6 (installed via pip install crawl4ai==0.8.6 as a dep of mcp-crawl4ai 0.3.1).
Expected Behavior
When crawl4ai is used inside any program that uses stdout as a structured
data channel — e.g. an MCP server using the stdio transport — log output
should go to stderr, leaving stdout clean for the host protocol.
Current Behavior
crawl4ai/async_logger.py:139 constructs self.console = Console() with no
file= argument, so rich.console.Console defaults to sys.stdout.
Every log line — including the progress markers from url_status() such as
[FETCH] ↓ ..., https://... | ✓ | ⏱: 1.52s, [SCRAPE] ◆, [COMPLETE] ● —
is written to stdout.
When crawl4ai is wrapped by an MCP stdio server (e.g.
wyattowalsh/mcp-crawl4ai),
the MCP transport spec requires stdout to contain only newline-delimited
JSON-RPC messages. crawl4ai's log lines corrupt the JSON-RPC stream and the
client's JSONRPCMessage.model_validate_json raises Pydantic validation
errors for every leaked line:
ERROR mcp.client.stdio: Failed to parse JSONRPC message from server
pydantic_core._pydantic_core.ValidationError: 1 validation error for JSONRPCMessage
Invalid JSON: expected value at line 1 column 2 [type=json_invalid,
input_value='[FETCH]... ↓ ', input_type=str]
...
input_value='https://www.example.com...path/to/page', input_type=str
input_value='age-infrastructure/ | ✓ | ⏱: 1.52s ', input_type=str
input_value='[SCRAPE].. ◆ ', input_type=str
input_value='[COMPLETE] ● ', input_type=str
Functionally the tool call still succeeds (the real JSON-RPC response is
also written to stdout and parses fine), but every scrape produces 6–10
spurious ERROR lines in the host's log and risks confusing some MCP
clients into closing the connection.
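To illustrate why a single leaked log line is enough to trigger one of these errors, here is a minimal sketch of newline-delimited JSON-RPC parsing, using the stdlib json module in place of pydantic's parser (the error text differs slightly, but the failure mode and even the "line 1 column 2" position are the same, since `[` opens a JSON array and the following letter is invalid):

```python
import json

# A stdio stream as the MCP client sees it: one leaked crawl4ai log line
# interleaved with the real JSON-RPC response.
stream = (
    '[FETCH] https://example.com | 1.52s\n'
    '{"jsonrpc": "2.0", "id": 1, "result": {"ok": true}}\n'
)

for line in stream.splitlines():
    try:
        msg = json.loads(line)
        print("parsed message id:", msg["id"])
    except json.JSONDecodeError as exc:
        # '[' starts a JSON array, then 'F' is invalid:
        # "Expecting value: line 1 column 2 (char 1)"
        print("parse error:", exc)
```

The real JSON-RPC line still parses, which matches the observed behavior: the tool call succeeds, but every log line produces a parse error first.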
Note: even when downstream code passes BrowserConfig(verbose=False),
url_status() still emits these lines — verbose=False only gates a
subset of log calls.
Root cause
crawl4ai/async_logger.py:
from rich.console import Console
...
class AsyncLogger(AsyncLoggerBase):
...
def __init__(self, ...):
...
self.console = Console() # ← defaults to sys.stdout
Proposed fix
Default the logger console to sys.stderr:
import sys
from rich.console import Console
...
self.console = Console(file=sys.stderr)
This is the universal convention for library logging and matches what
logging.StreamHandler defaults to. Programs that genuinely want logs
on stdout can override by passing a custom Console via the existing
constructor.
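The stdlib convention this aligns with can be checked directly: logging.StreamHandler falls back to sys.stderr when constructed without a stream argument.

```python
import logging
import sys

# StreamHandler() with no argument writes to sys.stderr, not sys.stdout --
# the same default proposed here for crawl4ai's Console.
handler = logging.StreamHandler()
print(handler.stream is sys.stderr)  # -> True

# Callers who want stdout must opt in explicitly.
stdout_handler = logging.StreamHandler(sys.stdout)
print(stdout_handler.stream is sys.stdout)  # -> True
```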
Alternatively, expose a stream / console parameter on AsyncLogger
so downstream wrappers (mcp-crawl4ai, FastMCP integrations) can force
stderr without monkey-patching.
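A minimal sketch of what such a constructor parameter could look like. This class body and the url_status signature are illustrative only, not crawl4ai's actual code:

```python
import sys

class AsyncLogger:
    """Illustrative sketch only -- not crawl4ai's real AsyncLogger."""

    def __init__(self, stream=None):
        # Default to stderr so stdout stays clean for host protocols;
        # callers that genuinely want stdout pass it explicitly.
        self.stream = stream if stream is not None else sys.stderr

    def url_status(self, url, success, timing):
        mark = "OK" if success else "FAIL"
        print(f"[FETCH] {url} | {mark} | {timing:.2f}s", file=self.stream)

# An MCP stdio wrapper would construct it with no argument (logs -> stderr),
# or pass sys.stdout to restore the current behavior.
```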
Reproduction
- pip install crawl4ai==0.8.6 mcp-crawl4ai==0.3.1 mcp
- Run any MCP client (e.g. Claude Desktop, mcp-inspector) against
crawl4ai_mcp.server over stdio.
- Call the scrape tool on any URL.
- Observe Failed to parse JSONRPC message from server errors in the
  client log for each scrape, with input_value matching crawl4ai's
  progress markers.
Environment
- crawl4ai 0.8.6
- mcp-crawl4ai 0.3.1
- mcp (python-sdk) 1.x
- Python 3.14 / 3.11
- macOS (also reproduces on Linux per stdio spec)