Integration patterns

Practical integration patterns for embedding Mapademics syllabus skills extraction and labor market intelligence into your product.

This page describes practical, real-world patterns for integrating the Mapademics Embedded API into a product. It focuses on the parts that matter in production: ingestion, persistence, caching, and multi-tenant scoping.

Mapademics supports two primary capabilities:

  • Syllabus skills extraction (asynchronous job workflow)

  • Labor market intelligence (synchronous query workflow)

They’re typically integrated independently, but can be combined in certain products.


Core principle: keep your own durable records

Mapademics returns durable identifiers (e.g., an extractionId) and rich response objects. In production, you should:

  • Persist identifiers and results in your own database

  • Treat Mapademics as the system that computes the output, while you remain the system that stores the reference and serves the output in your product

This is especially important for syllabus skills extraction.


Pattern 1 — Syllabus skills extraction as an asynchronous job

Best for: authoring platforms, curriculum workflows, assessment tooling, admin/batch processing

Core idea: treat extraction as a job lifecycle: upload → processing → results

Common ingestion scenarios (no UI assumptions)

Most partners integrate syllabus extraction via one (or more) of these flows:

A) Authoring platform flow (most common for partners)

  • Your platform generates or stores the syllabus PDF

  • You call Mapademics extraction when the PDF is finalized or published

  • You attach extractionId back onto your internal syllabus/course record

B) Admin/batch flow

  • An admin selects a set of course syllabi (e.g., for a term or program)

  • Your backend loops through PDFs and starts extractions

  • You track jobs, progress, and results in your system

C) End-user upload flow (optional)

  • A user uploads a PDF in your UI

  • Your backend starts extraction and tracks the job

The key is that the ingestion trigger can be anything. The integration pattern stays the same.


High-level flow

  1. Create/identify a syllabus PDF in your system (authored, uploaded, or generated)

  2. Start extraction with Mapademics (Upload step)

  3. Persist the returned extractionId on your syllabus/course record

  4. Poll or fetch results when needed (Retrieve step)

  5. Persist extracted skills in your database for downstream use
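
The five steps above can be sketched as a single backend routine. The client interface, method names, and record shapes here are illustrative assumptions, not the actual Mapademics SDK; only the extractionId concept and the upload/retrieve split come from this guide.

```typescript
// Illustrative client interface standing in for your Mapademics HTTP wrapper.
interface ExtractionClient {
  startExtraction(pdf: Uint8Array): Promise<{ extractionId: string }>;
  getResults(extractionId: string): Promise<{ status: string; skills?: string[] }>;
}

// Your own durable record for the syllabus/course (shape is hypothetical).
interface SyllabusRecord {
  courseId: string;
  extractionId?: string;
  skills?: string[];
}

// Start extraction, persist the durable extractionId, then persist results.
async function runExtraction(
  client: ExtractionClient,
  record: SyllabusRecord,
  pdf: Uint8Array,
): Promise<SyllabusRecord> {
  const { extractionId } = await client.startExtraction(pdf); // Upload step
  record.extractionId = extractionId; // persist the identifier on your record
  const result = await client.getResults(extractionId); // Retrieve step
  if (result.status === "completed") {
    record.skills = result.skills; // persist skills for downstream use
  }
  return record;
}
```

In a real integration, the retrieve step would run in a poller or worker rather than immediately after upload.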


Persistence requirement: save and reuse extractionId

extractionId should be treated as a durable, reusable reference.

You should:

  • Store extractionId in your database alongside your syllabus record

  • Reuse the same extractionId for as long as the underlying syllabus PDF has not changed

  • Only start a new extraction when the syllabus changes materially

This prevents:

  • Duplicative processing

  • Slower UX

  • Unnecessary costs / rate-limit pressure

Practical implementation tip

  • Compute a syllabusContentHash (or stable file fingerprint) in your system.

  • If the fingerprint hasn’t changed, reuse the existing extractionId.
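
A minimal sketch of the fingerprint check, assuming Node.js and SHA-256 as the hashing choice (the field names are illustrative):

```typescript
import { createHash } from "node:crypto";

// Compute a stable fingerprint of the syllabus PDF bytes.
function syllabusContentHash(pdfBytes: Uint8Array): string {
  return createHash("sha256").update(pdfBytes).digest("hex");
}

// Only start a new extraction when the fingerprint has actually changed.
function shouldReextract(storedHash: string | undefined, currentHash: string): boolean {
  return storedHash !== currentHash;
}
```

If `shouldReextract` returns false, reuse the extractionId already saved on your record.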


Polling strategy (pragmatic defaults)

Extraction is asynchronous. Recommended polling pattern:

  • Poll every 3–5 seconds initially for a short period (e.g., up to ~30 seconds)

  • Then back off to 10–20 seconds

  • Stop polling after a reasonable timeout and allow a manual refresh or background job to complete later

If you support batch extraction, polling should be handled by backend workers rather than the UI.
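
The polling defaults above can be expressed as a small delay schedule. The exact numbers (4s fast phase, 15s backed-off phase, 5-minute hard timeout) are this guide's suggestions, not API requirements:

```typescript
// Returns the next polling delay in milliseconds, or null to stop polling
// and hand the job off to a background worker or manual refresh.
function nextPollDelayMs(elapsedMs: number, timeoutMs = 5 * 60_000): number | null {
  if (elapsedMs >= timeoutMs) return null; // stop after the hard timeout
  if (elapsedMs < 30_000) return 4_000;    // fast phase: every 3–5 seconds
  return 15_000;                           // backed-off phase: every 10–20 seconds
}
```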


Pattern 2 — Labor market intelligence as a synchronous lookup

Best for: catalogs, program pages, discovery, advising, planning tools

Core idea: labor market data is ideal for read-time lookups, but should be cached aggressively.

  1. Your program entity stores one or more CIP codes (required)

  2. Your backend queries Mapademics using:

    • cipCodes (required)

    • regionType (national | state | msa)

    • region (optional for national; required for state and msa)

  3. You display matched occupations, demand signals, and skill requirements

  4. You cache results keyed by CIP + region selection


Why caching matters

Labor market responses are:

  • highly reusable across users

  • stable relative to request frequency

  • expensive to fetch repeatedly on catalog pages

Caching is not an optimization here — it’s the default integration pattern.

What to cache

Cache the full response object for a given request key:

Cache key

  • cipCodes (sorted)

  • regionType

  • region (normalized; empty/default when national)

Example cache keys:

  • cip:11.0701|regionType:national|region:(default)

  • cip:11.0701|regionType:state|region:California

  • cip:11.0701|regionType:msa|region:San Francisco-Oakland-Berkeley, CA
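
The key construction above can be sketched as a small helper (the function name is illustrative; the key format matches the examples):

```typescript
// Build a cache key from sorted CIP codes, the region type, and a normalized
// region. National requests use "(default)" so all users share one entry.
function buildCacheKey(cipCodes: string[], regionType: string, region?: string): string {
  const cips = [...cipCodes].sort().join(",");
  const reg = regionType === "national" ? "(default)" : (region ?? "").trim();
  return `cip:${cips}|regionType:${regionType}|region:${reg}`;
}
```

Sorting the CIP codes ensures that requests for the same set of programs hit the same cache entry regardless of input order.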

How long to cache

Pick based on your UX and volume. Practical defaults:

  • Catalog / public pages: 7–30 days

  • Advising / internal tools: 1–7 days

  • Planning dashboards: 1–7 days, optionally with a manual refresh

If you need a single default: cache 7 days and add a manual refresh capability for admins.
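
A minimal in-memory sketch of that 7-day default, with an injected clock for testability; production systems would typically use Redis or another shared cache instead:

```typescript
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Minimal TTL cache: entries expire after ttlMs; the time source is
// injectable so expiry behavior is easy to verify.
class TtlCache<V> {
  private store = new Map<string, { value: V; expiresAt: number }>();
  constructor(private ttlMs = SEVEN_DAYS_MS, private now: () => number = Date.now) {}

  get(key: string): V | undefined {
    const entry = this.store.get(key);
    if (!entry || entry.expiresAt <= this.now()) return undefined; // miss or expired
    return entry.value;
  }

  set(key: string, value: V): void {
    this.store.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  delete(key: string): void {
    this.store.delete(key); // supports the admin "manual refresh" capability
  }
}
```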

Warming the cache (power move)

If you have a known program catalog:

  • Precompute and cache labor market data for all CIP codes nightly/weekly

  • This makes your product feel instant and avoids bursty traffic

Persist vs cache

For most customers:

  • caching alone is sufficient

For high-scale catalogs:

  • persist the response to your database (and treat it like a cached artifact with refresh)


Pattern 3 — Backend proxy

Core idea: your backend acts as a thin proxy between your frontend and Mapademics.

Frontend → Your API → Mapademics API → Your API → Frontend

Why this is the default

  • Keeps platform and customer keys out of the browser

  • Centralizes caching, retries, and logging

  • Lets you enforce tenant scoping
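
A sketch of the proxy's read path, with the Mapademics call abstracted behind an injected fetcher (the handler shape and names are illustrative assumptions):

```typescript
// Stand-in for the actual HTTP call to Mapademics, made with server-side keys.
type Fetcher = (cacheKey: string) => Promise<unknown>;

// Returns a handler your API exposes to the frontend: check the cache first,
// fetch on a miss, and store the result. Retries and logging would also live
// here, centralized in one place.
function makeLaborMarketHandler(fetcher: Fetcher, cache: Map<string, unknown>) {
  return async (cacheKey: string): Promise<unknown> => {
    const hit = cache.get(cacheKey);
    if (hit !== undefined) return hit; // served from cache, no upstream call
    const data = await fetcher(cacheKey);
    cache.set(cacheKey, data);
    return data;
  };
}
```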


Pattern 5 — Multi-tenant scoping with customer keys

If you serve multiple institutions/customers:

  • Store customer keys securely in your backend

  • Resolve the correct key based on the authenticated tenant

  • Apply customer scoping consistently to all Mapademics calls
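
Tenant-to-key resolution can be as simple as the sketch below. The `Map` stands in for whatever secure store you use; in production, customer keys belong in a secrets manager, not in code or the browser:

```typescript
// Resolve the customer key for the authenticated tenant, failing loudly if
// no key is configured rather than falling through to another tenant's key.
function resolveCustomerKey(keys: Map<string, string>, tenantId: string): string {
  const key = keys.get(tenantId);
  if (!key) throw new Error(`no customer key configured for tenant ${tenantId}`);
  return key; // attach this key to every Mapademics call for this tenant
}
```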


Pattern 6 — Region selection UX (labor market)

Region selection affects caching and user experience.

Recommended UX:

  • Default to regionType: national (region optional)

  • Offer selectors for state and msa

  • Only require a region value when the user selects state or msa

MSA fallback

When using msa regions, projection data (growth and openings) may not be available for all metro areas. Set fallbackFromMsaToState: true in your request to automatically fall back to state-level projections. When this fallback is used, the response includes a MSA_TO_STATE_PROJECTION_FALLBACK warning.
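
A sketch of the request shape and the warning check. Only the `fallbackFromMsaToState` flag and the `MSA_TO_STATE_PROJECTION_FALLBACK` warning code come from this guide; the surrounding field and response shapes are simplified assumptions:

```typescript
// Simplified request shape for a labor market query with MSA fallback enabled.
interface LaborMarketRequest {
  cipCodes: string[];
  regionType: "national" | "state" | "msa";
  region?: string;
  fallbackFromMsaToState?: boolean;
}

const exampleRequest: LaborMarketRequest = {
  cipCodes: ["11.0701"],
  regionType: "msa",
  region: "San Francisco-Oakland-Berkeley, CA",
  fallbackFromMsaToState: true, // fall back to state-level projections if needed
};

// Detect whether the response used state-level projections instead of MSA data,
// e.g. to show a "state-level estimate" badge in your UI.
function usedMsaFallback(response: { warnings?: { code: string }[] }): boolean {
  return (response.warnings ?? []).some(
    (w) => w.code === "MSA_TO_STATE_PROJECTION_FALLBACK",
  );
}
```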

Caching implication

Because region is part of the cache key, your region UX directly impacts cache hit rate.

  • Defaulting to national improves reuse dramatically.


Operational patterns (high leverage)

Logging & debuggability

  • Log response meta.requestId (when present)

  • Persist timestamps and the request key used (CIP + region + tenant)

  • Make it easy to replay a request during debugging

Error handling (practical)

  • 401 typically indicates a platform key issue

  • 403 typically indicates a customer key issue

  • Use exponential backoff for retryable failures

  • Avoid retrying non-idempotent operations unless explicitly safe
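
The bullets above boil down to two small helpers. The status-code categories are this guide's heuristics, not API guarantees, and the backoff constants are illustrative:

```typescript
// Map an HTTP status to a likely cause / retry decision.
function classifyError(status: number): string {
  if (status === 401) return "platform-key issue";
  if (status === 403) return "customer-key issue";
  if (status === 429 || status >= 500) return "retryable";
  return "non-retryable";
}

// Exponential backoff with a cap: 500ms, 1s, 2s, 4s, ... up to 30s.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30_000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}
```

Only apply the backoff loop to operations classified as retryable; a 401 or 403 will not fix itself and should surface a configuration error instead.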


Using percentiles for ranking

When displaying multiple occupations, use growthPercentile and openingsPercentile to rank or compare occupations. These values indicate how an occupation compares to all other occupations nationally (0–100).
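
A sketch of ranking by those fields. Only the field names `growthPercentile` and `openingsPercentile` come from this guide; the occupation shape and the choice to rank by their sum are illustrative assumptions:

```typescript
// Minimal occupation shape with the two percentile fields (0–100, relative
// to all occupations nationally).
interface Occupation {
  name: string;
  growthPercentile: number;
  openingsPercentile: number;
}

// Rank occupations highest-first by combined growth and openings percentile.
function rankOccupations(occupations: Occupation[]): Occupation[] {
  return [...occupations].sort(
    (a, b) =>
      b.growthPercentile + b.openingsPercentile -
      (a.growthPercentile + a.openingsPercentile),
  );
}
```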

