Understand rate limits, headers, and how to monitor customer usage when integrating with the Mapademics Embedded API.
The Mapademics Embedded API enforces rate limits to ensure reliability and fair usage across all customers.
Rate limits apply per customer, not per API key or per endpoint. There are no plan tiers or usage levels—rate limiting behavior is consistent across all integrations.
How rate limits work
Rate limits are enforced based on:
The customer associated with the request
The request volume over time
The type of request (read vs processing-heavy operations)
All requests include standard HTTP headers that describe the current rate-limit state.
Rate limit headers
Every API response includes rate-limit headers similar to the following:
| Header | Description |
| --- | --- |
| X-RateLimit-Limit | The maximum number of requests allowed in the current window |
| X-RateLimit-Remaining | The number of requests remaining in the current window |
| X-RateLimit-Reset | The UNIX timestamp at which the current window resets |
You should treat these headers as the source of truth for remaining capacity.
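As a sketch of what "source of truth" means in practice, the headers above can be parsed into a small state object before deciding whether to send the next request. This is an illustrative Python snippet; the header names come from the table above, but the sample values and the `RateLimitState` type are assumptions, not part of the API.

```python
from dataclasses import dataclass

@dataclass
class RateLimitState:
    limit: int       # X-RateLimit-Limit
    remaining: int   # X-RateLimit-Remaining
    reset_at: int    # X-RateLimit-Reset (UNIX timestamp)

def parse_rate_limit_headers(headers: dict) -> RateLimitState:
    """Read the rate-limit headers off any API response."""
    return RateLimitState(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        reset_at=int(headers["X-RateLimit-Reset"]),
    )

# Example with illustrative values:
state = parse_rate_limit_headers({
    "X-RateLimit-Limit": "100",
    "X-RateLimit-Remaining": "4",
    "X-RateLimit-Reset": "1735689600",
})
```

Checking `state.remaining` before issuing bursty traffic lets you throttle proactively instead of discovering the limit via a 429.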
When you exceed the rate limit
If a request exceeds the allowed rate, the API returns:
HTTP 429 — Too Many Requests
Example response:
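The exact response body is not reproduced here; a 429 might look like the following, where the header names match the table above but the values and body fields are illustrative only:

```
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1735689600

{
  "error": "rate_limit_exceeded",
  "message": "Too many requests. Retry after the reset time."
}
```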
The X-RateLimit-Reset header indicates when it is safe to retry.
Recommended retry behavior
When you receive a 429 response:
Do not retry immediately
Wait until the reset time indicated by X-RateLimit-Reset
Resume requests gradually
Exponential backoff (recommended)
If you are retrying programmatically:
Use exponential backoff
Add jitter to avoid synchronized retries
Never retry non-idempotent requests unless explicitly safe
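The first two points above combine naturally into "full jitter" backoff: the delay grows exponentially with each attempt, but each client picks a random point within that window so retries do not synchronize. A sketch, with assumed defaults:

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Full-jitter exponential backoff.

    Returns a random delay in [0, min(cap, base * 2**attempt)],
    so attempt 0 waits up to 1s, attempt 1 up to 2s, and so on,
    capped at `cap` seconds.
    """
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```

A retry loop would sleep for `backoff_delay(attempt)` after each 429, giving up after a bounded number of attempts, and would skip non-idempotent requests entirely unless they are known to be safe to repeat.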
Monitoring customer usage
You can inspect a customer's current rate-limit usage state using the following endpoint:

GET /customers/{customerId}/usage
This endpoint returns the customer's current usage metrics and rate-limit status.
When to use this endpoint
This is especially useful for:
Multi-tenant platforms
Admin dashboards
Proactive throttling in your own system
Debugging unexpected 429 responses
Recommended usage
Call this endpoint server-side only
Use it for visibility and diagnostics, not on every request
Cache responses briefly if polling for monitoring purposes
Rate limits and integration patterns
Syllabus skills extraction
Extraction requests are more resource-intensive than standard read requests.
You should:
Avoid re-extracting unchanged syllabi
Persist and reuse extractionId
Batch requests thoughtfully for admin or bulk workflows
Labor market intelligence
Labor market data is highly cacheable.
You should:
Cache responses by (cipCodes, regionType, region)
Use long TTLs for catalog and discovery pages
Avoid repeated calls on every page render
Caching is the primary way to reduce rate-limit pressure for labor market use cases.
Common causes of rate-limit issues
Re-extracting the same syllabus repeatedly
Calling labor market endpoints on every page view
Not caching program-level labor market responses
Retrying immediately after a 429 response
Using a single customer key for background batch jobs without throttling
Best practices summary
Read and respect rate-limit headers on every response
Cache labor market data aggressively
Persist and reuse syllabus extraction jobs
Use exponential backoff for retries
Monitor usage via GET /customers/{customerId}/usage