scrapy-playwright tries to create the same context for the http and https handlers and fails #355

@Vitalii-Kh95

I am not sure whether this is a bug or just unexpected behavior. When both the "http" and "https" handlers are enabled, scrapy-playwright tries to create the same persistent context once for each handler, even though the target only uses the https protocol. The second launch then fails because the profile directory is already in use.
Here are the settings I used in settings.py:

DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

PLAYWRIGHT_LAUNCH_OPTIONS = {
    "headless": False,
    "timeout": 60 * 1000,
}

PLAYWRIGHT_CONTEXTS = {
    "persistent": {
        "user_data_dir": "context",
        "channel": "chrome",
        "headless": False,
        "no_viewport": True,
    }
}

The spider is basic:

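from pathlib import Path

import scrapy
from playwright.async_api import Page
from scrapy.http import Response
from scrapy_playwright.page import PageMethod

TEST_TARGETS = {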
    "quotes.toscrape": "https://quotes.toscrape.com/"
}


class botSpider(scrapy.Spider):
    name = "bot"

    async def start(self):
        for label, url in TEST_TARGETS.items():
            yield scrapy.Request(
                url,
                callback=self.parse_result,
                cb_kwargs={"label": label},
                errback=self.close_context_on_error,
                meta={
                    "playwright": True,
                    "playwright_context": "persistent",
                    "playwright_include_page": True,
                    "playwright_page_methods": [
                        PageMethod("wait_for_load_state", "load"),
                    ],
                },
            )

    async def parse_result(self, response: Response, label: str):
        page: Page = response.meta["playwright_page"]
        await page.screenshot(path=f"{Path(label)}-screenshot.png")
        await page.close()
        await page.context.close()

    async def close_context_on_error(self, failure):
        # The page may be missing if the request failed before it was created.
        page = failure.request.meta.get("playwright_page")
        if page is not None:
            await page.close()
            await page.context.close()
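
For comparison, here is a minimal sketch of declaring the persistent context per request instead of in PLAYWRIGHT_CONTEXTS. It relies on the `playwright_context_kwargs` meta key that scrapy-playwright documents for creating a context on first use. The helper `persistent_meta` and the constant `PERSISTENT_CONTEXT_KWARGS` are hypothetical names introduced for illustration, and whether this sidesteps the startup collision in this setup is an untested assumption.

```python
from typing import Any

# Hypothetical constant: the same values as the "persistent" entry that the
# settings above place in PLAYWRIGHT_CONTEXTS.
PERSISTENT_CONTEXT_KWARGS: dict[str, Any] = {
    "user_data_dir": "context",
    "channel": "chrome",
    "headless": False,
    "no_viewport": True,
}


def persistent_meta(context_name: str = "persistent") -> dict[str, Any]:
    """Build request meta that asks for the context to be created on demand.

    Assumption: with PLAYWRIGHT_CONTEXTS left empty, only the handler that
    actually serves the request creates the context, so the profile
    directory is opened once.
    """
    return {
        "playwright": True,
        "playwright_context": context_name,
        # Copy so that callers cannot mutate the shared defaults.
        "playwright_context_kwargs": dict(PERSISTENT_CONTEXT_KWARGS),
        "playwright_include_page": True,
    }
```

In the spider, this would be passed as `meta=persistent_meta()` in `scrapy.Request`, with the `playwright_page_methods` entry added as before.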

And here's the error stack I get:

2025-12-01 05:25:35 [scrapy.utils.signal] ERROR: Error caught on signal handler: <bound method ScrapyPlaywrightDownloadHandler._engine_started of <scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler object at 0x76ebda1e4590>>
Traceback (most recent call last):
  File "/home/vitalii/projects/learning/python/web scraping/playwright_check/.venv/lib/python3.13/site-packages/twisted/internet/defer.py", line 1257, in adapt
    extracted: _SelfResultT | Failure = result.result()
                                        ~~~~~~~~~~~~~^^
  File "/home/vitalii/projects/learning/python/web scraping/playwright_check/.venv/lib/python3.13/site-packages/scrapy_playwright/handler.py", line 192, in _launch
    await asyncio.gather(
    ...<4 lines>...
    )
  File "/home/vitalii/projects/learning/python/web scraping/playwright_check/.venv/lib/python3.13/site-packages/scrapy_playwright/handler.py", line 247, in _create_browser_context
    context = await self.browser_type.launch_persistent_context(**context_kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vitalii/projects/learning/python/web scraping/playwright_check/.venv/lib/python3.13/site-packages/playwright/async_api/_generated.py", line 14795, in launch_persistent_context
    await self._impl_obj.launch_persistent_context(
    ...<51 lines>...
    )
  File "/home/vitalii/projects/learning/python/web scraping/playwright_check/.venv/lib/python3.13/site-packages/playwright/_impl/_browser_type.py", line 166, in launch_persistent_context
    result = await self._channel.send_return_as_dict(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        "launchPersistentContext", TimeoutSettings.launch_timeout, params
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/vitalii/projects/learning/python/web scraping/playwright_check/.venv/lib/python3.13/site-packages/playwright/_impl/_connection.py", line 83, in send_return_as_dict
    return await self._connection.wrap_api_call(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<3 lines>...
    )
    ^
  File "/home/vitalii/projects/learning/python/web scraping/playwright_check/.venv/lib/python3.13/site-packages/playwright/_impl/_connection.py", line 559, in wrap_api_call
    raise rewrite_error(error, f"{parsed_st['apiName']}: {error}") from None
playwright._impl._errors.Error: BrowserType.launch_persistent_context: Failed to create a ProcessSingleton for your profile directory. This usually means that the profile is already in use by another instance of Chromium.
Call log:
  - <launching> /opt/google/chrome/chrome --disable-field-trial-config --disable-background-networking --disable-background-timer-throttling --disable-backgrounding-occluded-windows --disable-back-forward-cache --disable-breakpad --disable-client-side-phishing-detection --disable-component-extensions-with-background-pages --disable-component-update --no-default-browser-check --disable-default-apps --disable-dev-shm-usage --disable-extensions --disable-features=AcceptCHFrame,AvoidUnnecessaryBeforeUnloadCheckSync,DestroyProfileOnBrowserClose,DialMediaRouteProvider,GlobalMediaControls,HttpsUpgrades,LensOverlay,MediaRouter,PaintHolding,ThirdPartyStoragePartitioning,Translate,AutoDeElevate,RenderDocument --enable-features=CDPScreenshotNewSurface --allow-pre-commit-input --disable-hang-monitor --disable-ipc-flooding-protection --disable-popup-blocking --disable-prompt-on-repost --disable-renderer-backgrounding --force-color-profile=srgb --metrics-recording-only --no-first-run --password-store=basic --use-mock-keychain --no-service-autorun --export-tagged-pdf --disable-search-engine-choice-screen --unsafely-disable-devtools-self-xss-warnings --edge-skip-compat-layer-relaunch --enable-automation --no-sandbox --user-data-dir=/home/vitalii/projects/learning/python/web scraping/playwright_check/context --remote-debugging-pipe about:blank
  - <launched> pid=480700
  - [pid=480700][err] [1201/052535.504352:WARNING:chrome/app/chrome_main_linux.cc:82] Read channel stable from /opt/google/chrome/CHROME_VERSION_EXTRA
  - [pid=480700][err] [480700:480700:1201/052535.927008:ERROR:chrome/browser/process_singleton_posix.cc:340] Failed to create /home/vitalii/projects/learning/python/web scraping/playwright_check/context/SingletonLock: File exists (17)
  - [pid=480700][err] [480700:480700:1201/052535.948926:ERROR:chrome/app/chrome_main_delegate.cc:514] Failed to create a ProcessSingleton for your profile directory. This means that running multiple instances would start multiple browser processes rather than opening a new window in the existing process. Aborting now to avoid profile corruption.
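
The `SingletonLock: File exists (17)` line suggests that two launches raced for the same profile directory. The sketch below is a simplified analogy in plain Python, not Chrome's or scrapy-playwright's actual code: it assumes one handler instance per scheme and models the profile lock as an exclusively created file. The function names are hypothetical.

```python
import os
import tempfile


def acquire_profile_lock(profile_dir: str) -> int:
    """Create the lock file exclusively; raises FileExistsError if it exists."""
    lock_path = os.path.join(profile_dir, "SingletonLock")
    return os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)


def simulate_two_handlers() -> list[str]:
    """Let an "http" and an "https" handler open the same profile directory."""
    results = []
    with tempfile.TemporaryDirectory() as profile_dir:
        held = []
        for scheme in ("http", "https"):
            try:
                held.append(acquire_profile_lock(profile_dir))
                results.append(f"{scheme}: launched")
            except FileExistsError as exc:
                # Same condition as "File exists" in the Chrome log above.
                results.append(f"{scheme}: failed (errno {exc.errno})")
        # Release the descriptors before the temporary directory is removed.
        for fd in held:
            os.close(fd)
    return results


if __name__ == "__main__":
    for line in simulate_two_handlers():
        print(line)
```

The first handler takes the lock and the second one fails, which mirrors the single `_launch` failure in the traceback.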

Labels: documentation (improvements or additions to documentation)