I'm having trouble getting Scrapy + Playwright to use the browser's HTTP cache when crawling with a persistent context. I've reduced it to a minimal example, which you can see here:
https://github.com/pjlsergeant/scrapy-playwright-cache-bug
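For context, scrapy-playwright treats a context as persistent when it is given a user_data_dir. A minimal sketch of the kind of settings involved is below. It is not copied from the repo: the context name "persistent" and the directory path are placeholders.

```python
# settings.py (sketch; values are placeholders, not the repo's actual settings)

# Route HTTP(S) downloads through scrapy-playwright.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright requires the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

PLAYWRIGHT_BROWSER_TYPE = "chromium"

# A context that specifies user_data_dir is launched as a persistent context.
PLAYWRIGHT_CONTEXTS = {
    "persistent": {
        "user_data_dir": "./user-data",  # placeholder path
    },
}
```

Requests then opt in with meta={"playwright": True, "playwright_context": "persistent"}.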
app.py is a minimal Flask app to demonstrate the problem. If you start it (flask run) and then run the scrape (scrapy crawl crawl), you can see that the PNG at /pixel is never served from the cache. This shows up both in the Flask logs and in the final body output, <html><head></head><body>count:6</body></html>, which indicates 6 hits.
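For anyone who doesn't want to open the repo, the server side boils down to something like the following. This is a stdlib-only stand-in, not the actual app.py (which uses Flask): the /count route, the Cache-Control value, and the names respond, Handler and make_server are my assumptions for illustration.

```python
import base64
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Tiny PNG payload served at /pixel.
PIXEL = base64.b64decode(
    "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNkYPhfDwAChwGA"
    "60e6kgAAAABJRU5ErkJggg=="
)

hits = 0  # number of times /pixel actually reached the server
hits_lock = threading.Lock()


def respond(path):
    """Return (body, content_type, cache_control) for a request path."""
    global hits
    if path == "/pixel":
        with hits_lock:
            hits += 1
        # Cacheable response: a browser honouring its cache should fetch this once.
        return PIXEL, "image/png", "public, max-age=3600"
    if path == "/count":
        # Hypothetical route that reports how many /pixel requests arrived.
        return f"count:{hits}".encode(), "text/html", "no-store"
    # Any other path: a page that embeds the pixel.
    return b'<html><body><img src="/pixel"></body></html>', "text/html", "no-store"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body, content_type, cache_control = respond(self.path)
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.send_header("Cache-Control", cache_control)
        self.end_headers()
        self.wfile.write(body)


def make_server(port=5000):
    """Build (but do not start) the server on localhost."""
    return ThreadingHTTPServer(("127.0.0.1", port), Handler)
```

Running make_server().serve_forever() starts it. With a working cache I'd expect the count to stay at 1 however many pages embed the pixel.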
Interestingly, if you then manually load up Playwright with the same persistent config (something like browser_context = chromium.launch_persistent_context(userDataDir)), you'll see the image is already cached. So the image is being written to the cache during the Scrapy + Playwright run; it just isn't being read from the cache when Playwright is driven by Scrapy.
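The manual check looks roughly like this with Playwright's sync Python API. It's a sketch rather than the exact script I ran: the function name is made up, and the profile directory and URL are placeholders.

```python
def load_with_persistent_context(user_data_dir, url):
    """Open url in Chromium with a persistent profile and return the rendered HTML."""
    # Imported here so the module can be loaded without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # Reuses whatever cache the profile directory already holds.
        context = p.chromium.launch_persistent_context(user_data_dir, headless=True)
        try:
            page = context.new_page()
            page.goto(url)
            return page.content()
        finally:
            context.close()
```

Calling load_with_persistent_context("./user-data", "http://127.0.0.1:5000/") after the crawl (the URL assumes flask run's default address) and watching the Flask log for a new /pixel request is the check I mean: no new request shows up.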
Any help gratefully received.