This page describes practical, real-world patterns for integrating the Mapademics Embedded API into a product. It focuses on the parts that matter in production: ingestion, persistence, caching, and multi-tenant scoping.
Mapademics supports two primary capabilities:
Syllabus skills extraction (asynchronous job workflow)
Labor market intelligence (synchronous query workflow)
They’re typically integrated independently, but can be combined in certain products.
Core principle: keep your own durable records
Mapademics returns durable identifiers (e.g., an extractionId) and rich response objects. In production, you should:
Persist identifiers and results in your own database
Treat Mapademics as the system that computes the output, while you remain the system that stores the reference and serves the output in your product
This is especially important for syllabus skills extraction.
Pattern 1 — Syllabus skills extraction as an asynchronous job
Best for: authoring platforms, curriculum workflows, assessment tooling, admin/batch processing
Core idea: treat extraction as a job lifecycle: upload → processing → results
Common ingestion scenarios (no UI assumptions)
Most partners integrate syllabus extraction via one (or more) of these flows:
A) Platform publish flow
Your platform generates or stores the syllabus PDF
You call Mapademics extraction when the PDF is finalized or published
You attach extractionId back onto your internal syllabus/course record
B) Admin/batch flow
An admin selects a set of course syllabi (e.g., for a term or program)
Your backend loops through PDFs and starts extractions
You track jobs, progress, and results in your system
C) End-user upload flow (optional)
A user uploads a PDF in your UI
Your backend starts extraction and tracks the job
The key is that the ingestion trigger can be anything. The integration pattern stays the same.
Job lifecycle (recommended)
Create/identify a syllabus PDF in your system (authored, uploaded, or generated)
Start extraction with Mapademics (Upload step)
Persist the returned extractionId on your syllabus/course record
Poll or fetch results when needed (Retrieve step)
Persist extracted skills in your database for downstream use
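The lifecycle above can be sketched as a pair of record updates. The `SyllabusRecord` shape and field names here are illustrative, not the Mapademics schema:

```typescript
// Illustrative record your system persists; only extractionId comes from Mapademics.
type SyllabusRecord = {
  courseId: string;
  pdfUrl: string;
  extractionId?: string; // durable reference returned by Mapademics
  status: "pending" | "processing" | "complete";
  skills: string[];
};

// Steps 2-3: start the extraction, then persist the returned extractionId.
function attachExtraction(record: SyllabusRecord, extractionId: string): SyllabusRecord {
  return { ...record, extractionId, status: "processing" };
}

// Steps 4-5: when results arrive, persist the extracted skills for downstream use.
function attachResults(record: SyllabusRecord, skills: string[]): SyllabusRecord {
  return { ...record, skills, status: "complete" };
}
```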
Persistence requirement: save and reuse extractionId
extractionId should be treated as a durable, reusable reference.
You should:
Store extractionId in your database alongside your syllabus record
Reuse the same extractionId for as long as the underlying syllabus PDF has not changed
Only start a new extraction when the syllabus changes materially
This prevents unnecessary cost, rate-limit pressure, and duplicate extraction work for syllabi that haven't changed.
Practical implementation tip
Compute a syllabusContentHash (or stable file fingerprint) in your system.
If the fingerprint hasn’t changed, reuse the existing extractionId.
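A minimal fingerprint check, assuming the syllabus bytes are available server-side. SHA-256 is one reasonable choice of stable fingerprint:

```typescript
import { createHash } from "node:crypto";

// Stable fingerprint of the syllabus PDF bytes.
function syllabusContentHash(pdfBytes: Buffer): string {
  return createHash("sha256").update(pdfBytes).digest("hex");
}

// Reuse the stored extractionId whenever the fingerprint is unchanged;
// only start a new extraction when this returns true.
function needsNewExtraction(storedHash: string | undefined, pdfBytes: Buffer): boolean {
  return storedHash !== syllabusContentHash(pdfBytes);
}
```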
Polling strategy (pragmatic defaults)
Extraction is asynchronous. Recommended polling pattern:
Poll every 3–5 seconds initially for a short period (e.g., up to ~30 seconds)
Then back off to 10–20 seconds
Stop polling after a reasonable timeout and allow a manual refresh or background job to complete later
If you support batch extraction, polling should be handled by backend workers rather than the UI.
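One way to encode those defaults as a pure schedule function your worker can consult between polls. The 5-minute timeout is an assumed placeholder, not a documented limit:

```typescript
// Returns the delay before the next poll in ms, or null to stop polling
// and hand the job off to a background refresh.
function nextPollDelayMs(elapsedMs: number, timeoutMs = 5 * 60_000): number | null {
  if (elapsedMs >= timeoutMs) return null; // stop; allow manual refresh or background completion
  if (elapsedMs < 30_000) return 4_000;    // ~3-5 s during the first ~30 s
  return 15_000;                           // then back off to 10-20 s
}
```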
Pattern 2 — Labor market intelligence as a synchronous lookup
Best for: catalogs, program pages, discovery, advising, planning tools
Core idea: labor market data is ideal for read-time lookups, but should be cached aggressively.
Recommended flow
Your program entity stores one or more CIP codes (required)
Your backend queries Mapademics using:
regionType (national | state | msa)
region (optional for national; required for state and msa)
You display matched occupations, demand signals, and skill requirements
You cache results keyed by CIP + region selection
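A sketch of the parameter validation implied by this flow. The field names mirror the parameters above (`cip`, `regionType`, `region`), but the exact wire format is an assumption:

```typescript
type RegionType = "national" | "state" | "msa";

// Build a labor market query, enforcing that region is required
// for state/msa and ignored for national.
function buildQuery(cip: string, regionType: RegionType, region?: string) {
  if (regionType !== "national" && !region) {
    throw new Error(`region is required when regionType is "${regionType}"`);
  }
  return { cip, regionType, region: regionType === "national" ? undefined : region };
}
```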
Pattern 3 — Aggressive caching for labor market data (highly recommended)
Labor market responses are:
highly reusable across users
stable relative to request frequency
expensive to fetch repeatedly on catalog pages
Caching is not an optimization here — it’s the default integration pattern.
Cache the full response object for a given request key:
Cache key components:
cip
regionType (national | state | msa)
region (normalized; empty/default when national)
Example cache keys:
cip:11.0701|regionType:national|region:(default)
cip:11.0701|regionType:state|region:California
cip:11.0701|regionType:msa|region:San Francisco-Oakland-Berkeley, CA
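A key builder that reproduces the example keys above. The normalization rule (collapsing national requests to one `(default)` key) is a choice, not a documented requirement:

```typescript
// Build a cache key from CIP + region selection. National requests share
// one key regardless of any region input, which maximizes reuse.
function cacheKey(cip: string, regionType: string, region?: string): string {
  const r = regionType === "national" || !region ? "(default)" : region.trim();
  return `cip:${cip}|regionType:${regionType}|region:${r}`;
}
```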
Recommended TTLs (starting point)
Pick based on your UX and volume. Practical defaults:
Catalog / public pages: 7–30 days
Advising / internal tools: 1–7 days
Planning dashboards: 1–7 days, optionally with a manual refresh
If you need a single default: cache 7 days and add a manual refresh capability for admins.
Warming the cache (power move)
If you have a known program catalog:
Precompute and cache labor market data for all CIP codes nightly/weekly
This makes your product feel instant and avoids bursty traffic
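A warming pass might look like this sketch, with `fetchLaborMarket` standing in for your proxied Mapademics call and a `Map` standing in for your real cache:

```typescript
// Nightly/weekly warm-up: iterate the known catalog and populate the cache
// for the default (national) view. Skips keys that are already warm.
async function warmCache(
  cips: string[],
  fetchLaborMarket: (cip: string) => Promise<unknown>,
  cache: Map<string, unknown>,
): Promise<void> {
  for (const cip of cips) {
    const key = `cip:${cip}|regionType:national|region:(default)`;
    if (!cache.has(key)) cache.set(key, await fetchLaborMarket(cip));
  }
}
```

Running the loop sequentially (rather than in parallel) also keeps the warm-up from producing the bursty traffic the cache is meant to avoid.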
Persist vs cache
For most customers:
caching alone is sufficient
For high-scale catalogs:
persist the response to your database (and treat it like a cached artifact with refresh)
Pattern 4 — Server-side proxy (recommended default)
Core idea: your backend acts as a thin proxy between your frontend and Mapademics.
Frontend → Your API → Mapademics API → Your API → Frontend
Why this is the default
Keeps platform and customer keys out of the browser
Centralizes caching, retries, and logging
Lets you enforce tenant scoping
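A dependency-injected sketch of the proxy's core logic, with `callMapademics` as a stand-in for the real upstream call made from your backend:

```typescript
// Thin proxy: serve from cache when possible, otherwise call Mapademics
// with server-held keys (the API key never reaches the browser).
async function proxyLaborMarket(
  key: string,
  cache: Map<string, unknown>,
  callMapademics: () => Promise<unknown>,
): Promise<unknown> {
  const hit = cache.get(key);
  if (hit !== undefined) return hit;    // cache hit: no upstream traffic
  const fresh = await callMapademics(); // miss: fetch once, then cache
  cache.set(key, fresh);
  return fresh;
}
```

Because every frontend request funnels through this one function, it is also the natural place to hang retries, logging, and tenant scoping.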
Pattern 5 — Multi-tenant scoping with customer keys
If you serve multiple institutions/customers:
Store customer keys securely in your backend
Resolve the correct key based on the authenticated tenant
Apply customer scoping consistently to all Mapademics calls
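A minimal key-resolution sketch; the storage mechanism and key values are hypothetical (load from a secrets store or environment, never literals, in production):

```typescript
// Hypothetical tenant -> customer key mapping held only in backend config.
const customerKeys = new Map<string, string>([
  ["tenant-a", "ck_example_aaa"],
  ["tenant-b", "ck_example_bbb"],
]);

// Resolve the customer key from the authenticated tenant; failing loudly
// here prevents silently making calls under the wrong customer scope.
function resolveCustomerKey(tenantId: string): string {
  const key = customerKeys.get(tenantId);
  if (!key) throw new Error(`No customer key configured for tenant ${tenantId}`);
  return key;
}
```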
Pattern 6 — Region selection UX (labor market)
Region selection affects caching and user experience.
Recommended UX:
Default to regionType: national (region optional)
Offer selectors for state and msa
Only require a region value when the user selects state or msa
When using msa regions, projection data (growth and openings) may not be available for all metro areas. Set fallbackFromMsaToState: true in your request to automatically fall back to state-level projections. When this fallback is used, the response includes a MSA_TO_STATE_PROJECTION_FALLBACK warning.
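A request sketch using the fallback flag and warning code described above; the exact placement of the fields in the request body is an assumption:

```typescript
// Build an MSA-level request that falls back to state projections
// when metro-level growth/openings data is unavailable.
function buildMsaRequest(cip: string, msa: string) {
  return {
    cip,
    regionType: "msa" as const,
    region: msa,
    fallbackFromMsaToState: true, // opt in to state-level projection fallback
  };
}

// Detect whether the response used the fallback, e.g. to show a
// "state-level projections" badge in your UI.
function usedStateFallback(warnings: string[]): boolean {
  return warnings.includes("MSA_TO_STATE_PROJECTION_FALLBACK");
}
```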
Caching implication
Because region is part of the cache key, your region UX directly impacts cache hit rate.
Defaulting to national improves reuse dramatically.
Operational patterns (high leverage)
Logging & debuggability
Log response meta.requestId (when present)
Persist timestamps and the request key used (CIP + region + tenant)
Make it easy to replay a request during debugging
Error handling (practical)
401 typically indicates a platform key issue
403 typically indicates a customer key issue
Use exponential backoff for retryable failures
Avoid retrying non-idempotent operations unless explicitly safe
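These rules can be captured in two small helpers. The retryability classification, base delay, and cap are assumed defaults; consider adding jitter in production:

```typescript
// 401/403 are key problems (fix configuration, don't retry);
// 429 and 5xx are worth retrying with backoff.
function isRetryable(status: number): boolean {
  return status === 429 || status >= 500;
}

// Exponential backoff: 500 ms, 1 s, 2 s, 4 s, ... capped at 30 s.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```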
Using percentiles for ranking
When displaying multiple occupations, use growthPercentile and openingsPercentile to rank or compare occupations. These values indicate how an occupation compares to all other occupations nationally (0–100).
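A simple blended ranking using both percentiles. Equal weighting is a product choice, not a Mapademics recommendation:

```typescript
type Occupation = {
  name: string;
  growthPercentile: number;   // 0-100, relative to all occupations nationally
  openingsPercentile: number; // 0-100, relative to all occupations nationally
};

// Sort occupations best-first by the sum of the two percentiles,
// without mutating the input list.
function rankOccupations(list: Occupation[]): Occupation[] {
  return [...list].sort(
    (a, b) =>
      b.growthPercentile + b.openingsPercentile -
      (a.growthPercentile + a.openingsPercentile),
  );
}
```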