Internal MemoryAdaptiveDispatcher ignores max_session_permit #1927

@DennisFederico

Hi unclecode,

I've been struggling with a lot of 429 error codes and noticed that DeepCrawlStrategy ignores my dispatcher settings and creates an internal MemoryAdaptiveDispatcher instead. That internal dispatcher only controls the delays, so by default it spawns 20 tasks right off the bat.

dispatcher = MemoryAdaptiveDispatcher(

I added this change on my side to control the concurrency, and it works well for me.

+++ b/.venv/lib/python3.13/site-packages/crawl4ai/async_webcrawler.py
@@
         if dispatcher is None:
             primary_cfg = config[0] if isinstance(config, list) else config
             mean_delay = getattr(primary_cfg, "mean_delay", 0.1)
             max_range = getattr(primary_cfg, "max_range", 0.3)
+            max_session_permit = max(1, int(getattr(primary_cfg, "semaphore_count", 20) or 20))
             dispatcher = MemoryAdaptiveDispatcher(
+                max_session_permit=max_session_permit,
                 rate_limiter=RateLimiter(
                     base_delay=(mean_delay, mean_delay + max_range),
                     max_delay=60.0,
                     max_retries=3,
                 ),
             )
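
To make the effect of the added line easier to check in isolation, here is a minimal, stdlib-only sketch. It is not crawl4ai code: `resolve_session_permit` and `peak_concurrency` are hypothetical helper names, the config object is a plain `SimpleNamespace`, and `asyncio.Semaphore` is only a stand-in for whatever the dispatcher does internally with `max_session_permit`. It just shows how the fallback expression from the patch behaves and why a permit of 3 should keep 20 queued tasks from starting at once.

```python
import asyncio
from types import SimpleNamespace


def resolve_session_permit(config, default=20):
    """Mirror the patched line: read semaphore_count, fall back to the default
    when it is missing, None, or 0, and never return less than 1."""
    primary_cfg = config[0] if isinstance(config, list) else config
    return max(1, int(getattr(primary_cfg, "semaphore_count", default) or default))


async def peak_concurrency(num_tasks, permit):
    """Run num_tasks dummy jobs behind a semaphore and return the highest
    number of jobs that were in flight at the same time."""
    semaphore = asyncio.Semaphore(permit)
    in_flight = 0
    peak = 0

    async def job():
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for a network request
            in_flight -= 1

    await asyncio.gather(*(job() for _ in range(num_tasks)))
    return peak


if __name__ == "__main__":
    cfg = SimpleNamespace(semaphore_count=3)  # hypothetical config object
    permit = resolve_session_permit(cfg)
    print(permit, asyncio.run(peak_concurrency(20, permit)))
```

With a missing or zero `semaphore_count` the helper falls back to 20, which matches the current default behaviour, so the patch should only change anything for configs that set a smaller value.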

Labels

⚙ Done: Bug fix, enhancement, FR that's completed pending release
🐞 Bug: Something isn't working
📌 Root caused: identified the root cause of bug
