[Bug]: Browser hangs indefinitely on WSL when system proxy is required #1930

@nihaoWX

Description

crawl4ai version

0.8.6

Expected Behavior

When proxy or proxy_config is set in BrowserConfig, crawl4ai should
successfully launch Chromium with the proxy and complete the crawl.

Current Behavior

On WSL2 where a proxy is required for internet access, AsyncWebCrawler hangs
indefinitely after printing "[INIT].... → Crawl4AI 0.8.6" and never completes.
Setting proxy via BrowserConfig(proxy=...), BrowserConfig(proxy_config=...),
or extra_args=["--proxy-server=..."] all result in the same hang.
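While the hang is being investigated, one way to make it diagnosable is to put a deadline around the crawl so the script fails instead of blocking forever. The sketch below is only an illustration: `run_with_deadline`, `fake_hanging_crawl`, and `fake_quick_crawl` are hypothetical names, and the two fake coroutines stand in for the real `AsyncWebCrawler` calls. It assumes the hang happens inside awaitable code; if the browser launch blocks synchronously, `asyncio.wait_for` cannot interrupt it.

```python
import asyncio


async def run_with_deadline(make_coro, seconds):
    """Await make_coro(), returning None if it does not finish in time."""
    try:
        return await asyncio.wait_for(make_coro(), timeout=seconds)
    except asyncio.TimeoutError:
        # The awaited coroutine has been cancelled at this point.
        return None


async def fake_hanging_crawl():
    # Hypothetical stand-in for a crawl that never returns.
    await asyncio.sleep(3600)
    return "done"


async def fake_quick_crawl():
    # Hypothetical stand-in for a crawl that completes normally.
    await asyncio.sleep(0)
    return "done"


if __name__ == "__main__":
    print(asyncio.run(run_with_deadline(fake_hanging_crawl, 0.1)))
    print(asyncio.run(run_with_deadline(fake_quick_crawl, 1.0)))
```

In a real script, `make_coro` would be a function that opens the crawler and calls `arun`, so that both the browser start-up and the page load fall under the same deadline.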

Is this reproducible?

Yes

Inputs Causing the Bug

Any URL requiring internet access (e.g. https://example.com) when running on 
WSL2 where Chrome cannot connect to the internet directly.

Steps to Reproduce

1. Install crawl4ai on WSL2:
   pip install crawl4ai
   crawl4ai-setup

2. Verify that Python requests can reach the internet through the proxy (Chrome cannot reach the internet directly):
   HOST_IP=$(ip route | awk '/default/ {print $3}')
   http_proxy=http://$HOST_IP:7892 https_proxy=http://$HOST_IP:7892 python -c "import requests; print(requests.get('https://www.google.com', timeout=15).status_code)"
   # Output: 200

3. Verify that plain Playwright can launch Chromium (this check only launches the browser; it sets no proxy and loads no page):
   python -c "
   from playwright.sync_api import sync_playwright
   p = sync_playwright().start()
   b = p.chromium.launch()
   print('OK')
   b.close()
   p.stop()
   "
   # Output: OK

4. Run a basic crawl4ai crawl (no proxy):
   python -c "
   import asyncio
   from crawl4ai import AsyncWebCrawler
   async def test():
       async with AsyncWebCrawler() as crawler:
           result = await crawler.arun('https://example.com')
           print(result.success)
   asyncio.run(test())
   "
   # Hangs forever at: [INIT].... → Crawl4AI 0.8.6

5. Try with proxy set in BrowserConfig:
   python -c "
   import asyncio
   from crawl4ai import AsyncWebCrawler, BrowserConfig
   async def test():
       config = BrowserConfig(proxy='http://172.29.240.1:7892')
       async with AsyncWebCrawler(config=config) as crawler:
           result = await crawler.arun('https://example.com')
           print(result.success)
   asyncio.run(test())
   "
   # Also hangs forever at: [INIT].... → Crawl4AI 0.8.6

6. Confirmed root cause: crawl4ai uses subprocess.Popen to launch Chrome 
   without passing proxy env vars, so Chrome cannot reach the internet on WSL2.
   Plain Playwright with proxy={"server": ...} works correctly.
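To illustrate the mechanism claimed in step 6, here is a small standard-library sketch of how a child process only sees proxy variables that are present in the environment it is started with. It does not use crawl4ai or Chrome: the child is a Python one-liner, and `child_sees_proxy` and `env_with_proxy` are hypothetical helper names. Whether Chromium then honours these variables depends on the platform and launch flags, so treat this as a demonstration of environment passing only.

```python
import os
import subprocess
import sys

PROXY_NAMES = ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY")


def env_with_proxy(proxy_url, base=None):
    """Return a copy of the environment with the proxy variables set."""
    env = dict(os.environ if base is None else base)
    for name in PROXY_NAMES:
        env[name] = proxy_url
    return env


def env_without_proxy(base=None):
    """Return a copy of the environment with the proxy variables removed."""
    env = dict(os.environ if base is None else base)
    for name in PROXY_NAMES:
        env.pop(name, None)
    return env


def child_sees_proxy(env=None):
    """Start a child process and return the http_proxy value it observes."""
    probe = "import os; print(os.environ.get('http_proxy', ''))"
    done = subprocess.run(
        [sys.executable, "-c", probe],
        env=env,  # None means: inherit the parent's environment
        capture_output=True,
        text=True,
        check=True,
    )
    return done.stdout.strip()


if __name__ == "__main__":
    print(repr(child_sees_proxy(env_without_proxy())))
    print(repr(child_sees_proxy(env_with_proxy("http://172.29.240.1:7892"))))
```

`subprocess.run` is a thin wrapper over `subprocess.Popen`, so the same `env` behaviour applies to a direct `Popen` call: a proxy configured only through `BrowserConfig` never reaches the child unless it is also passed as a command-line flag or placed in `env`.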

Code snippets

# Hangs forever
import asyncio
from crawl4ai import AsyncWebCrawler, BrowserConfig, CrawlerRunConfig

async def main():
    config = BrowserConfig(proxy="http://172.29.240.1:7892")
    async with AsyncWebCrawler(config=config) as crawler:
        result = await crawler.arun("https://example.com")

asyncio.run(main())

# Workaround: launch browser manually via Playwright and connect via CDP
import asyncio
import subprocess
from playwright.async_api import async_playwright
from crawl4ai import AsyncWebCrawler, BrowserConfig, CrawlerRunConfig

HOST_IP = subprocess.check_output(
    "ip route | awk '/default/ {print $3}'", shell=True).decode().strip()

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            headless=True,
            args=["--no-sandbox", "--disable-dev-shm-usage",
                  "--remote-debugging-port=9222"],
            proxy={"server": f"http://{HOST_IP}:7892"}
        )
        config = BrowserConfig(cdp_url="http://localhost:9222")
        async with AsyncWebCrawler(config=config) as crawler:
            result = await crawler.arun("https://example.com",
                config=CrawlerRunConfig(page_timeout=15000))
            print(result.success)
        await browser.close()

asyncio.run(main())
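The workaround finds the Windows host address with a shell pipeline (`ip route | awk ...`). A possible alternative, sketched below, runs `ip route` without `shell=True` and parses the output in Python, then builds the same `proxy={"server": ...}` dictionary used above. The helper names (`read_routes`, `default_gateway`, `proxy_settings`) are hypothetical, and the sketch assumes the default route's gateway is the host running the proxy on port 7892, as in this report.

```python
import subprocess


def read_routes():
    """Return the text printed by `ip route` (Linux/WSL only)."""
    done = subprocess.run(
        ["ip", "route"], capture_output=True, text=True, check=True
    )
    return done.stdout


def default_gateway(route_text):
    """Return the gateway of the first `default via <ip>` line, or None."""
    for line in route_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "default" and fields[1] == "via":
            return fields[2]
    return None


def proxy_settings(route_text, port=7892):
    """Build a Playwright-style proxy dict from `ip route` output."""
    host = default_gateway(route_text)
    if host is None:
        raise ValueError("no default route found")
    return {"server": f"http://{host}:{port}"}


if __name__ == "__main__":
    sample = (
        "default via 172.29.240.1 dev eth0 proto kernel\n"
        "172.29.240.0/20 dev eth0 proto kernel scope link src 172.29.250.7\n"
    )
    print(proxy_settings(sample))
```

On a real WSL2 system, `proxy_settings(read_routes())` would replace the hard-coded sample, and the result can be passed as the `proxy` argument of `p.chromium.launch(...)`.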

OS

OS: Windows 11 + WSL2 (Ubuntu 24.04)

Python version

3.12

Browser

Chromium (crawl4ai managed)

Browser version

Google Chrome for Testing 145.0.7632.6

Error logs & Screenshots (if applicable)

No response
