Rate Limiting
Rate limiter implementation for pyfetcher.
- Purpose:
Provide configurable per-domain and global rate limiting using pyrate_limiter. The DomainRateLimiter maintains separate rate limit buckets per domain while optionally enforcing a global request rate.
- Design:
Rate limits are defined via RateLimitPolicy, which specifies requests-per-second and an optional burst allowance.
The limiter uses an in-memory bucket store by default.
Both synchronous and asynchronous acquire methods are provided.
Domain extraction is performed automatically from URLs.
Examples
>>> policy = RateLimitPolicy(requests_per_second=10.0)
>>> limiter = DomainRateLimiter(default_policy=policy)
>>> limiter.acquire("https://example.com/page1")
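The design notes say the domain is extracted from each URL automatically. The source does not show how pyfetcher does this, so the following is only a minimal sketch of one common approach using the standard library; `extract_domain` is a hypothetical helper name, not part of pyfetcher.

```python
from urllib.parse import urlsplit


def extract_domain(url: str) -> str:
    """Return the lowercase hostname of a URL, or "" if it has none.

    Hypothetical helper for illustration; not pyfetcher's own function.
    """
    # urlsplit().hostname drops any port and userinfo and lowercases the host.
    return urlsplit(url).hostname or ""


print(extract_domain("https://example.com/page1"))
print(extract_domain("https://API.Example.com:8443/data"))
```

Keying buckets on the hostname rather than the full netloc means `example.com` and `example.com:443` would share one bucket; whether pyfetcher makes the same choice is an assumption here.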
- class pyfetcher.ratelimit.limiter.RateLimitPolicy(requests_per_second=10.0, burst=1, per_domain=True)
Rate limiting policy configuration.
Defines the rate at which requests may be made, with an optional burst allowance for short traffic spikes.
- Parameters:
requests_per_second (float) – Maximum sustained request rate. Set to 0 to disable rate limiting.
burst (int) – Maximum number of requests that can be made in a burst before throttling kicks in. Defaults to 1 (no burst).
per_domain (bool) – Whether this limit applies per-domain (True) or globally (False).
Examples
>>> policy = RateLimitPolicy(requests_per_second=5.0, burst=10)
>>> policy.interval
0.2
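The example above implies that `interval` is the reciprocal of `requests_per_second` (5 requests per second gives 0.2 seconds between requests). A rough sketch of that relationship follows; `PolicySketch` is a hypothetical stand-in rather than the library class, and returning `0.0` when the rate is `0` is an assumption about how "disabled" might be represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicySketch:
    """Hypothetical stand-in for RateLimitPolicy; not the library class."""

    requests_per_second: float = 10.0
    burst: int = 1
    per_domain: bool = True

    @property
    def interval(self) -> float:
        # Assumption: a rate of 0 disables limiting, so no spacing is needed.
        if self.requests_per_second <= 0:
            return 0.0
        # Minimum spacing between requests at the sustained rate.
        return 1.0 / self.requests_per_second


policy = PolicySketch(requests_per_second=5.0, burst=10)
print(policy.interval)  # 0.2
```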
- class pyfetcher.ratelimit.limiter.DomainRateLimiter(*, default_policy=None, domain_policies=None, global_policy=None)
Per-domain rate limiter with optional global rate limiting.
Maintains separate token buckets for each domain encountered, throttling requests to stay within the configured rate limits. An optional global limiter can enforce an overall request rate across all domains.
- Parameters:
default_policy (RateLimitPolicy | None) – The default rate limit policy for domains without a specific override.
domain_policies (dict[str, RateLimitPolicy] | None) – Optional mapping of domain names to specific RateLimitPolicy instances.
global_policy (RateLimitPolicy | None) – Optional global rate limit applied across all domains in addition to per-domain limits.
Examples
>>> limiter = DomainRateLimiter(
...     default_policy=RateLimitPolicy(requests_per_second=5.0),
...     domain_policies={"api.example.com": RateLimitPolicy(requests_per_second=1.0)},
... )
>>> limiter.acquire("https://api.example.com/data")
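The parameter descriptions suggest that a domain-specific policy takes precedence over the default. One way that lookup could work is sketched below. `resolve_policy` is a hypothetical function, plain floats stand in for policy objects, and exact hostname matching is an assumption; the library may match domains differently (for example, by treating subdomains specially).

```python
from urllib.parse import urlsplit


def resolve_policy(url, domain_policies, default_policy):
    """Pick the override for the URL's domain, else the default.

    Hypothetical helper for illustration; not pyfetcher's own logic.
    """
    domain = urlsplit(url).hostname or ""
    # Assumption: overrides are keyed by exact hostname.
    return domain_policies.get(domain, default_policy)


# Plain floats (requests per second) stand in for policy objects here.
overrides = {"api.example.com": 1.0}
print(resolve_policy("https://api.example.com/data", overrides, 5.0))
print(resolve_policy("https://example.com/page", overrides, 5.0))
```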
- acquire(url)
Acquire permission to make a request, blocking if rate-limited.
Checks both the per-domain rate limit and the optional global rate limit. Blocks the calling thread until a token is available.
- Parameters:
url (str) – The target URL (domain is extracted automatically).
- Returns:
Total time in seconds spent waiting for rate limit tokens.
- Return type:
float
Examples
>>> limiter = DomainRateLimiter()
>>> wait = limiter.acquire("https://example.com/page")
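To illustrate what "blocking until a token is available" and "returning the time spent waiting" can look like, here is a minimal token-bucket sketch. It is not pyfetcher's implementation (which delegates to pyrate_limiter); `TokenBucketSketch` is a hypothetical class, and it assumes a rate greater than zero.

```python
import threading
import time


class TokenBucketSketch:
    """Illustrative blocking token bucket; not pyfetcher's implementation."""

    def __init__(self, rate: float, burst: int = 1) -> None:
        self.rate = rate              # tokens added per second (assumed > 0)
        self.capacity = float(burst)  # maximum stored tokens
        self.tokens = float(burst)    # start with a full bucket
        self.updated = time.monotonic()
        self.lock = threading.Lock()

    def acquire(self) -> float:
        """Block until one token is available; return seconds slept."""
        waited = 0.0
        while True:
            with self.lock:
                now = time.monotonic()
                # Refill in proportion to elapsed time, capped at capacity.
                refill = (now - self.updated) * self.rate
                self.tokens = min(self.capacity, self.tokens + refill)
                self.updated = now
                if self.tokens >= 1.0:
                    self.tokens -= 1.0
                    return waited
                # Time needed for the missing fraction of a token to refill.
                delay = (1.0 - self.tokens) / self.rate
            # Sleep outside the lock so other threads are not held up.
            time.sleep(delay)
            waited += delay


bucket = TokenBucketSketch(rate=20.0, burst=2)
waits = [bucket.acquire() for _ in range(3)]
print(waits)
```

With a burst of 2, the first two calls return immediately and the third has to sleep for roughly one refill interval; the exact figure depends on timing.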
- async aacquire(url)
Acquire permission to make a request asynchronously.
Checks both the per-domain rate limit and the optional global rate limit. Yields control while waiting for tokens.
- Parameters:
url (str) – The target URL (domain is extracted automatically).
- Returns:
Total time in seconds spent waiting for rate limit tokens.
- Return type:
float
Examples
>>> import asyncio
>>> limiter = DomainRateLimiter()
>>> # await limiter.aacquire("https://example.com/page")
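The difference from the synchronous method is that waiting yields to the event loop instead of blocking a thread. A minimal sketch of that idea follows, using fixed-interval spacing rather than pyfetcher's actual bucket logic; `AsyncIntervalLimiterSketch` is a hypothetical class, and it assumes a rate greater than zero.

```python
import asyncio
import time


class AsyncIntervalLimiterSketch:
    """Illustrative async limiter spacing calls by a fixed interval."""

    def __init__(self, requests_per_second: float) -> None:
        self.interval = 1.0 / requests_per_second  # assumes a rate > 0
        self.next_slot = time.monotonic()
        self.lock = asyncio.Lock()

    async def aacquire(self) -> float:
        """Wait for the next free slot; return seconds spent waiting."""
        async with self.lock:
            now = time.monotonic()
            wait = max(0.0, self.next_slot - now)
            # Reserve the slot that follows the one this caller takes.
            self.next_slot = max(now, self.next_slot) + self.interval
        if wait > 0:
            # asyncio.sleep yields to the event loop instead of blocking it.
            await asyncio.sleep(wait)
        return wait


async def main() -> list:
    limiter = AsyncIntervalLimiterSketch(requests_per_second=50.0)
    # Three concurrent callers are spaced out by the limiter.
    return await asyncio.gather(*(limiter.aacquire() for _ in range(3)))


waits = asyncio.run(main())
print(waits)
```

Because each caller reserves its slot while holding the lock and sleeps only after releasing it, concurrent tasks queue up in order without serialising their sleeps.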