Integration patterns

Practical integration patterns for embedding Mapademics syllabus skills extraction and labor market intelligence into your product.

This page describes practical, real-world patterns for integrating the Mapademics Embedded API into a product. It focuses on the parts that matter in production: ingestion, persistence, caching, and multi-tenant scoping.

Mapademics supports two primary capabilities:

  • Syllabus skills extraction (asynchronous job workflow)

  • Labor market intelligence (synchronous query workflow)

They’re typically integrated independently, but can be combined in certain products.


Core principle: keep your own durable records

Mapademics returns durable identifiers (e.g., an extractionId) and rich response objects. In production, you should:

  • Persist identifiers and results in your own database

  • Treat Mapademics as the system that computes the output, while you remain the system that stores the reference and serves the output in your product

This is especially important for syllabus skills extraction.


Pattern 1 — Syllabus skills extraction as an asynchronous job

Best for: authoring platforms, curriculum workflows, assessment tooling, admin/batch processing

Core idea: treat extraction as a job lifecycle: upload → processing → results

Common ingestion scenarios (no UI assumptions)

Most partners integrate syllabus extraction via one (or more) of these flows:

A) Authoring platform flow (most common for partners)

  • Your platform generates or stores the syllabus PDF

  • You call Mapademics extraction when the PDF is finalized or published

  • You attach extractionId back onto your internal syllabus/course record

B) Admin/batch flow

  • An admin selects a set of course syllabi (e.g., for a term or program)

  • Your backend loops through PDFs and starts extractions

  • You track jobs, progress, and results in your system

C) End-user upload flow (optional)

  • A user uploads a PDF in your UI

  • Your backend starts extraction and tracks the job

The key is that the ingestion trigger can be anything. The integration pattern stays the same.


High-level flow

  1. Create/identify a syllabus PDF in your system (authored, uploaded, or generated)

  2. Start extraction with Mapademics (Upload step)

  3. Persist the returned extractionId on your syllabus/course record

  4. Poll or fetch results when needed (Retrieve step)

  5. Persist extracted skills in your database for downstream use
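
The five steps above can be sketched as a single backend routine. The client interface, method names, and record shapes here are illustrative assumptions, not the actual Mapademics SDK; only the extractionId concept and the upload/retrieve split come from this guide.

```typescript
// Illustrative client interface standing in for your Mapademics HTTP wrapper.
interface ExtractionClient {
  startExtraction(pdf: Uint8Array): Promise<{ extractionId: string }>;
  getResults(extractionId: string): Promise<{ status: string; skills?: string[] }>;
}

// Your own durable record for the syllabus/course (shape is hypothetical).
interface SyllabusRecord {
  courseId: string;
  extractionId?: string;
  skills?: string[];
}

// Start extraction, persist the durable extractionId, then persist results.
async function runExtraction(
  client: ExtractionClient,
  record: SyllabusRecord,
  pdf: Uint8Array,
): Promise<SyllabusRecord> {
  const { extractionId } = await client.startExtraction(pdf); // Upload step
  record.extractionId = extractionId; // persist the identifier on your record
  const result = await client.getResults(extractionId); // Retrieve step
  if (result.status === "completed") {
    record.skills = result.skills; // persist skills for downstream use
  }
  return record;
}
```

In a real integration, the retrieve step would run in a poller or worker rather than immediately after upload.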


Persistence requirement: save and reuse extractionId

extractionId should be treated as a durable, reusable reference.

You should:

  • Store extractionId in your database alongside your syllabus record

  • Reuse the same extractionId for as long as the underlying syllabus PDF has not changed

  • Only start a new extraction when the syllabus changes materially

This prevents:

  • Duplicative processing

  • Slower UX

  • Unnecessary costs / rate-limit pressure

Practical implementation tip

  • Compute a syllabusContentHash (or stable file fingerprint) in your system.

  • If the fingerprint hasn’t changed, reuse the existing extractionId.
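
A minimal sketch of the fingerprint check, assuming Node.js and SHA-256 as the hashing choice (the field names are illustrative):

```typescript
import { createHash } from "node:crypto";

// Compute a stable fingerprint of the syllabus PDF bytes.
function syllabusContentHash(pdfBytes: Uint8Array): string {
  return createHash("sha256").update(pdfBytes).digest("hex");
}

// Only start a new extraction when the fingerprint has actually changed.
function shouldReextract(storedHash: string | undefined, currentHash: string): boolean {
  return storedHash !== currentHash;
}
```

If `shouldReextract` returns false, reuse the extractionId already saved on your record.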


Polling strategy (pragmatic defaults)

Extraction is asynchronous. Recommended polling pattern:

  • Poll every 3–5 seconds initially for a short period (e.g., up to ~30 seconds)

  • Then back off to 10–20 seconds

  • Stop polling after a reasonable timeout and allow a manual refresh or background job to complete later

If you support batch extraction, polling should be handled by backend workers rather than the UI.
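
The polling defaults above can be expressed as a small delay schedule. The exact numbers (4s fast phase, 15s backed-off phase, 5-minute hard timeout) are this guide's suggestions, not API requirements:

```typescript
// Returns the next polling delay in milliseconds, or null to stop polling
// and hand the job off to a background worker or manual refresh.
function nextPollDelayMs(elapsedMs: number, timeoutMs = 5 * 60_000): number | null {
  if (elapsedMs >= timeoutMs) return null; // stop after the hard timeout
  if (elapsedMs < 30_000) return 4_000;    // fast phase: every 3–5 seconds
  return 15_000;                           // backed-off phase: every 10–20 seconds
}
```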


Pattern 2 — Labor market intelligence as a synchronous lookup

Best for: catalogs, program pages, discovery, advising, planning tools

Core idea: labor market data is ideal for read-time lookups, but should be cached aggressively.

  1. Your program entity stores one or more CIP codes (required)

  2. Your backend queries Mapademics using:

    • cipCodes (required)

    • regionType (national | state | msa)

    • region (optional for national; required for state and msa)

  3. You display matched occupations, demand signals, and skill requirements

  4. You cache results keyed by CIP + region selection


Why caching matters

Labor market responses are:

  • highly reusable across users

  • stable relative to request frequency

  • expensive to fetch repeatedly on catalog pages

Caching is not an optimization here — it’s the default integration pattern.

What to cache

Cache the full response object for a given request key:

Cache key

  • cipCodes (sorted)

  • regionType

  • region (normalized; empty/default when national)

Example cache keys:

  • cip:11.0701|regionType:national|region:(default)

  • cip:11.0701|regionType:state|region:California

  • cip:11.0701|regionType:msa|region:San Francisco-Oakland-Berkeley, CA
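
The key construction above can be sketched as a small helper (the function name is illustrative; the key format matches the examples):

```typescript
// Build a cache key from sorted CIP codes, the region type, and a normalized
// region. National requests use "(default)" so all users share one entry.
function buildCacheKey(cipCodes: string[], regionType: string, region?: string): string {
  const cips = [...cipCodes].sort().join(",");
  const reg = regionType === "national" ? "(default)" : (region ?? "").trim();
  return `cip:${cips}|regionType:${regionType}|region:${reg}`;
}
```

Sorting the CIP codes ensures that requests for the same set of programs hit the same cache entry regardless of input order.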

How long to cache

Pick based on your UX and volume. Practical defaults:

  • Catalog / public pages: 7–30 days

  • Advising / internal tools: 1–7 days

  • Planning dashboards: 1–7 days, optionally with a manual refresh

If you need a single default: cache 7 days and add a manual refresh capability for admins.
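
A minimal in-memory sketch of that 7-day default, with an injected clock for testability; production systems would typically use Redis or another shared cache instead:

```typescript
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Minimal TTL cache: entries expire after ttlMs; the time source is
// injectable so expiry behavior is easy to verify.
class TtlCache<V> {
  private store = new Map<string, { value: V; expiresAt: number }>();
  constructor(private ttlMs = SEVEN_DAYS_MS, private now: () => number = Date.now) {}

  get(key: string): V | undefined {
    const entry = this.store.get(key);
    if (!entry || entry.expiresAt <= this.now()) return undefined; // miss or expired
    return entry.value;
  }

  set(key: string, value: V): void {
    this.store.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  delete(key: string): void {
    this.store.delete(key); // supports the admin "manual refresh" capability
  }
}
```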

Warming the cache (power move)

If you have a known program catalog:

  • Precompute and cache labor market data for all CIP codes nightly/weekly

  • This makes your product feel instant and avoids bursty traffic

Persist vs cache

For most customers:

  • caching alone is sufficient

For high-scale catalogs:

  • persist the response to your database (and treat it like a cached artifact with refresh)


Pattern 3 — Backend proxy

Core idea: your backend acts as a thin proxy between your frontend and Mapademics.

Frontend → Your API → Mapademics API → Your API → Frontend

Why this is the default

  • Keeps platform and customer keys out of the browser

  • Centralizes caching, retries, and logging

  • Lets you enforce tenant scoping
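
A sketch of the proxy's read path, with the Mapademics call abstracted behind an injected fetcher (the handler shape and names are illustrative assumptions):

```typescript
// Stand-in for the actual HTTP call to Mapademics, made with server-side keys.
type Fetcher = (cacheKey: string) => Promise<unknown>;

// Returns a handler your API exposes to the frontend: check the cache first,
// fetch on a miss, and store the result. Retries and logging would also live
// here, centralized in one place.
function makeLaborMarketHandler(fetcher: Fetcher, cache: Map<string, unknown>) {
  return async (cacheKey: string): Promise<unknown> => {
    const hit = cache.get(cacheKey);
    if (hit !== undefined) return hit; // served from cache, no upstream call
    const data = await fetcher(cacheKey);
    cache.set(cacheKey, data);
    return data;
  };
}
```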


Pattern 5 — Multi-tenant scoping with customer keys

If you serve multiple institutions/customers:

  • Store customer keys securely in your backend

  • Resolve the correct key based on the authenticated tenant

  • Apply customer scoping consistently to all Mapademics calls
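
Tenant-to-key resolution can be as simple as the sketch below. The `Map` stands in for whatever secure store you use; in production, customer keys belong in a secrets manager, not in code or the browser:

```typescript
// Resolve the customer key for the authenticated tenant, failing loudly if
// no key is configured rather than falling through to another tenant's key.
function resolveCustomerKey(keys: Map<string, string>, tenantId: string): string {
  const key = keys.get(tenantId);
  if (!key) throw new Error(`no customer key configured for tenant ${tenantId}`);
  return key; // attach this key to every Mapademics call for this tenant
}
```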


Pattern 6 — Region selection UX (labor market)

Region selection affects caching and user experience.

Recommended UX:

  • Default to regionType: national (region optional)

  • Offer selectors for state and msa

  • Only require a region value when the user selects state or msa

MSA fallback

When using msa regions, projection data (growth and openings) may not be available for all metro areas. Set fallbackFromMsaToState: true in your request to automatically fall back to state-level projections. When this fallback is used, the response includes a MSA_TO_STATE_PROJECTION_FALLBACK warning.
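
A sketch of the request shape and the warning check. Only the `fallbackFromMsaToState` flag and the `MSA_TO_STATE_PROJECTION_FALLBACK` warning code come from this guide; the surrounding field and response shapes are simplified assumptions:

```typescript
// Simplified request shape for a labor market query with MSA fallback enabled.
interface LaborMarketRequest {
  cipCodes: string[];
  regionType: "national" | "state" | "msa";
  region?: string;
  fallbackFromMsaToState?: boolean;
}

const exampleRequest: LaborMarketRequest = {
  cipCodes: ["11.0701"],
  regionType: "msa",
  region: "San Francisco-Oakland-Berkeley, CA",
  fallbackFromMsaToState: true, // fall back to state-level projections if needed
};

// Detect whether the response used state-level projections instead of MSA data,
// e.g. to show a "state-level estimate" badge in your UI.
function usedMsaFallback(response: { warnings?: { code: string }[] }): boolean {
  return (response.warnings ?? []).some(
    (w) => w.code === "MSA_TO_STATE_PROJECTION_FALLBACK",
  );
}
```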

Caching implication

Because region is part of the cache key, your region UX directly impacts cache hit rate.

  • Defaulting to national improves reuse dramatically.


Operational patterns (high leverage)

Logging & debuggability

  • Log response meta.requestId (when present)

  • Persist timestamps and the request key used (CIP + region + tenant)

  • Make it easy to replay a request during debugging

Error handling (practical)

  • 401 typically indicates a platform key issue

  • 403 typically indicates a customer key issue

  • Use exponential backoff for retryable failures

  • Avoid retrying non-idempotent operations unless explicitly safe
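
The bullets above boil down to two small helpers. The status-code categories are this guide's heuristics, not API guarantees, and the backoff constants are illustrative:

```typescript
// Map an HTTP status to a likely cause / retry decision.
function classifyError(status: number): string {
  if (status === 401) return "platform-key issue";
  if (status === 403) return "customer-key issue";
  if (status === 429 || status >= 500) return "retryable";
  return "non-retryable";
}

// Exponential backoff with a cap: 500ms, 1s, 2s, 4s, ... up to 30s.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30_000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}
```

Only apply the backoff loop to operations classified as retryable; a 401 or 403 will not fix itself and should surface a configuration error instead.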


Using percentiles for ranking

When displaying multiple occupations, use growthPercentile and openingsPercentile to rank or compare occupations. These values indicate how an occupation compares to all other occupations nationally (0–100).
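
A sketch of ranking by those fields. Only the field names `growthPercentile` and `openingsPercentile` come from this guide; the occupation shape and the choice to rank by their sum are illustrative assumptions:

```typescript
// Minimal occupation shape with the two percentile fields (0–100, relative
// to all occupations nationally).
interface Occupation {
  name: string;
  growthPercentile: number;
  openingsPercentile: number;
}

// Rank occupations highest-first by combined growth and openings percentile.
function rankOccupations(occupations: Occupation[]): Occupation[] {
  return [...occupations].sort(
    (a, b) =>
      b.growthPercentile + b.openingsPercentile -
      (a.growthPercentile + a.openingsPercentile),
  );
}
```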

