# idantitem — full reference for LLMs

> Commercial eKYC platform for Haiti and the Caribbean diaspora.
> This file is a flat-text reference designed for LLM consumption: every
> endpoint, every field, every integration recipe in one Markdown blob.
> Live OpenAPI spec is at `/openapi.json`.
>
> **Pricing**: 500 eKYC verifications / month free on individual accounts;
> business / enterprise pricing on request via <https://idantitem.com>.

## Principle

idantitem follows an "80% deterministic, 20% AI" doctrine:

- **Deterministic**: ICAO 9303 MRZ checksums, Luhn (cards), QR / CODE128 /
  PDF417 decoders, structured OCR with bounding-box label matching, YAML
  rules engine, SHA-256 chained audit log.
- **AI** is used only where no deterministic method replaces it: face
  recognition (SFace), liveness (MediaPipe FaceLandmarker blink
  detection), optional GOT-OCR 2.0 fallback for degraded scans.
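
As an illustration of the deterministic side, here is a minimal sketch of two of the checks named above: the ICAO 9303 check digit (weights 7, 3, 1) and the Luhn checksum. The function names `mrz_check_digit` and `luhn_valid` are hypothetical and do not describe idantitem's internal code.

```python
def mrz_check_digit(field: str) -> int:
    """ICAO 9303 check digit: repeating weights 7, 3, 1, sum modulo 10."""
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if "0" <= ch <= "9":
            value = ord(ch) - ord("0")
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10  # A=10 ... Z=35
        elif ch == "<":
            value = 0  # filler character counts as zero
        else:
            raise ValueError(f"invalid MRZ character: {ch!r}")
        total += value * weights[i % 3]
    return total % 10


def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right."""
    digits = [int(c) for c in number if "0" <= c <= "9"]
    if not digits:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0
```

Both checks are pure functions of the extracted text, which is why they can run before any AI component is involved.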

Country / document coverage at v0.1:

| Kind | Detector | Structured parser | Source of truth |
|---|---|---|---|
| HT permis (DGCR) | yes | yes | `UU-NNNNN-XX` regex + label-aware OCR |
| HT CIN (Dermalog) | yes | yes | NINU regex + bilingual labels + QR |
| QC permis (SAAQ) | yes | yes | `[A-Z]NNNN-NNNNNN-NN` + positional names |
| FR CNI | yes | yes | TD2 MRZ + 9d+2l number + labels |
| DR Cédula | yes | yes | `3-7-1` regex + Spanish labels |
| US driver license | yes | yes | AAMVA-inspired generic labels |
| Passport (any) | yes | MRZ TD3 | first-line `P<` prefix + 2x 44-char run |
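
The number shapes in the table can be approximated with regular expressions. The sketch below assumes that `N` stands for a digit and that `3-7-1` means digit groups of length 3, 7 and 1; the pattern names and `find_document_numbers` are hypothetical, and the real detectors combine these shapes with label-aware OCR.

```python
import re

# Assumed shapes, derived from the "Source of truth" column above.
QC_PERMIS = re.compile(r"\b[A-Z]\d{4}-\d{6}-\d{2}\b")  # [A-Z]NNNN-NNNNNN-NN
DR_CEDULA = re.compile(r"\b\d{3}-\d{7}-\d\b")          # 3-7-1 digit groups


def find_document_numbers(ocr_text: str) -> dict:
    """Return candidate document numbers found in raw OCR text, per kind."""
    return {
        "qc_permis": QC_PERMIS.findall(ocr_text),
        "dr_cedula": DR_CEDULA.findall(ocr_text),
    }
```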

## Authentication

Two API key kinds are issued from the dashboard:

- `pk_live_…` / `pk_test_…` — **publishable**, sent from the browser
  widget. Bound to a project; origin allowlist enforced.
- `sk_live_…` / `sk_test_…` — **secret**, server-to-server. Required for
  cross-checking a verification id (`Authorization: Bearer sk_…`).

Both kinds are accepted via either header:
- `X-Idantitem-Key: pk_live_…` (preferred for the widget)
- `Authorization: Bearer sk_…`
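
A small helper can pick the header from the key prefix. This is a sketch only: `auth_headers` is a hypothetical name, and it simply follows the preference stated above (publishable keys in `X-Idantitem-Key`, secret keys as a Bearer token).

```python
def auth_headers(api_key: str) -> dict:
    """Build the request header for a publishable or secret key."""
    if api_key.startswith(("pk_live_", "pk_test_")):
        return {"X-Idantitem-Key": api_key}
    if api_key.startswith(("sk_live_", "sk_test_")):
        return {"Authorization": f"Bearer {api_key}"}
    raise ValueError("unrecognised API key prefix")
```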

## Auto-detect + structured extraction

### `POST /api/v1/tools/document/autodetect`
Multipart upload (`file`). Returns the detected kind + ranked candidates.

```json
{
  "kind": "qc_permis",
  "score": 0.95,
  "reasons": ["saaq_number_shape", "quebec_marker"],
  "ranked": [
    { "kind": "qc_permis", "score": 0.95, "reasons": ["saaq_number_shape"] }
  ],
  "ocr_lines_count": 22
}
```

### `POST /api/v1/tools/document/structured`
Multipart upload (`file`). Auto-detects then dispatches to the matching
parser. Response:

```json
{
  "kind": "qc_permis",
  "detection_score": 0.95,
  "detection_reasons": ["saaq_number_shape", "quebec_marker"],
  "parsed": {
    "status": "ok",
    "confidence_percent": 100,
    "numero_permis": "J2105-241299-02",
    "surname": "DOE",
    "given_names": "JOHN MARIE",
    "birth_date": "1990-01-01",
    "sex": "M",
    "issue_date": "2023-11-09",
    "expiry_date": "2031-12-24",
    "categories": ["5"],
    "address": "70A AV LEGRAND, MONTREAL, (QC) H7N 3T1"
  }
}
```

### Per-country dedicated endpoints

Each parser is also reachable directly (useful for tests / debugging):

- `POST /api/v1/tools/permis/structured` — HT or QC permis (auto)
- `POST /api/v1/tools/cin/structured` — HT Dermalog CIN
- `POST /api/v1/tools/fr-cni/structured` — French CNI
- `POST /api/v1/tools/dr-cedula/structured` — Dominican Cédula
- `POST /api/v1/tools/us-dl/structured` — US driver license

All accept `file` (multipart) and return `{status, confidence_percent,
fields…, fields_raw}`.

## OCR primitives (low-level)

Useful when you want raw OCR text without structured parsing.

- `POST /api/v1/tools/ocr/smart` — cascade Tesseract → RapidOCR → GOT-OCR.
  Set `min_chars=N` to force a higher threshold; default 20.
- `POST /api/v1/tools/ocr/rapid` — RapidOCR direct, with line bboxes +
  per-line confidence (best for layout reasoning).
- `POST /api/v1/tools/ocr/got` — GOT-OCR 2.0 (requires sidecar /
  `GOT_OCR_ENABLED=1`).
- `POST /api/v1/tools/mrz/ocr` — Tesseract tuned for MRZ band, returns
  parsed TD1/TD2/TD3 + ICAO 9303 checksum results.
- `POST /api/v1/tools/qr/decode` — QR + CODE128 + PDF417 reader (pyzbar).

## Sanctions / PEP screening

### `POST /api/v1/screening`
```json
{
  "name": "Jimmy Cherizier",
  "birth_date": "1977-02-23",
  "nationality": "HT"
}
```

Returns:

```json
{
  "verdict": "HARD_BLOCK",
  "top_score": 0.97,
  "hits": [
    {
      "source": "opensanctions",
      "source_id": "ofac-…",
      "canonical_name": "Jimmy Cherizier",
      "score": 0.97,
      "name_sim": 1.0,
      "dob_sim": 0.9,
      "nat_sim": 1.0,
      "sanctions": [{"program": "GLOMAG", "list_name": "OpenSanctions"}]
    }
  ]
}
```

Verdict thresholds:
- `CLEAR` — score < 0.5
- `ENRICH_INVESTIGATION` — 0.5 ≤ score < 0.8 (collect DOB / nationality)
- `MANUAL_REVIEW` — 0.8 ≤ score < 0.95
- `HARD_BLOCK` — score ≥ 0.95
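
The thresholds above translate directly into a lookup. The function name `screening_verdict` is hypothetical; the boundaries are the ones listed.

```python
def screening_verdict(score: float) -> str:
    """Map a screening score (0 to 1) to the documented verdict bands."""
    if score >= 0.95:
        return "HARD_BLOCK"
    if score >= 0.8:
        return "MANUAL_REVIEW"
    if score >= 0.5:
        return "ENRICH_INVESTIGATION"
    return "CLEAR"
```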

### Hit resolution workflow

- `POST /api/v1/screening/decisions` — persist a hit as `pending`
- `GET /api/v1/screening/decisions?status=pending&project_id=N` — list
- `PATCH /api/v1/screening/decisions/{id}` — body `{status, decided_by, notes}`
  with status in `false_positive | true_match | escalated`

### Per-project auto-screening
Toggle `screening_enabled` on a project (`PATCH
/api/v1/accounts/projects/{id}/screening` body `{screening_enabled:
true}`). When ON, every successful widget verification fires an async
screening and persists hits with score ≥ 0.7 as pending decisions.

### Sources indexed
- `ofac_sdn` (US Treasury — XML)
- `un_sc` (UN Security Council Consolidated)
- `eu_cfsp` (EU financial sanctions)
- `uk_ofsi` (UK Office of Financial Sanctions)
- `opensanctions` (CC-BY-4.0 — covers PEPs + the four lists above)

Sync via `python scripts/sync_sanctions.py --source opensanctions`. The
in-process APScheduler job runs daily at 03:00 UTC when `SCHEDULER_ENABLED=1`.

## Widget — session lifecycle

The widget posts to its own backend; the merchant only ships the
publishable key.

1. `POST /api/v1/widget/sessions`
   Headers: `X-Idantitem-Key: pk_live_…`, `Origin: https://shop.example`
   Body: `{document_type, ruleset?, reference?, country?}`
   Response: `{session_token, session_id, document_type, ruleset, expires_at,
   project_name, branding}` where `branding` is `{primary?, background?}` —
   the partner-configured widget colors (omitted keys = use defaults).

2. `POST /api/v1/widget/sessions/{id}/document`
   Multipart `file=…`, form `side=front|back`
   Header: `X-Idantitem-Session: sek_…`

3. `POST /api/v1/widget/sessions/{id}/selfie`
   Multipart `file=…`
   Form fields propagating MediaPipe liveness signals:
   - `liveness_passed` (`1` / `0`)
   - `liveness_blink_count` (int)
   - `liveness_duration_ms` (int)
   When the user picks the file-fallback path (camera blocked or
   unavailable), the widget still calls this endpoint with the still
   image but omits the liveness fields; the orchestrator records
   `liveness_passed=false` and the ruleset decides whether to reject
   or to keep the verdict on `MANUAL_REVIEW`.

4. `POST /api/v1/widget/sessions/{id}/finalize`
   Body: `{expose_identity: bool}` (default false — production: hide
   identity from the browser; dev / partner-server: include it)
   Response includes `verdict`, `confidence_percent`, `masked`, `signals`,
   `reasons`, `contributions`. With `expose_identity=true` the response
   also carries `identity` (full extracted fields). When the project has
   a `webhook_url` configured, finalize delivers a signed POST whose
   payload is documented in the "Webhook payload" section below. The partner
   backend receives the same `identity` in that webhook regardless of
   `expose_identity`, which gates only the finalize response and the
   cross-check endpoint described in step 5.

5. `GET /api/v1/widget/verifications/{id}` — partner server cross-check.
   Authentication: `Authorization: Bearer sk_…` (publishable keys are
   refused — never call this from the browser). Always returns
   `verification_id`, `session_id`, `reference`, `status`, `verdict`,
   `confidence_percent`, `message`, `document_type`, `ruleset`,
   `created_at`, `completed_at`, the `masked` block, and
   `reasons` / `signals` (operational, no PII).

   When the matching session was finalized with `expose_identity=true`,
   the response *additionally* carries:

   - `identity` — `{surname, given_names, birth_date, document_number,
     country, sex, document_type}` extracted from the document.
   - `media` — short-lived HMAC-signed URLs the partner backend MUST
     re-download into its own storage, of the form
     `https://idantitem.com/api/v1/widget/sessions/{id}/files/{kind}?signature=…&expires=…`
     where `kind ∈ {doc_front, doc_back, selfie}`. URLs are valid for
     1 hour by default (configurable via `MEDIA_URL_TTL_SECONDS`,
     clamped to 60s–24h). They are minted fresh on every call to
     `/verifications/{id}` — a previously-issued URL stops working
     after expiry, so do NOT cache them in your DB. Hitting the URL
     returns `image/jpeg` bytes with `Cache-Control: private, no-store`.

   Each call that returns `identity` is recorded in the merchant audit
   log as `widget.partner.identity_accessed` so a leaked secret key
   surfaces in the dashboard.

   When `expose_identity=false`, the `identity` and `media` keys are
   simply absent — the partner still receives the verdict and the
   masked + signals + reasons blocks, enough for fraud-scoring and
   gating without ever holding raw PII.

### Bootstrap call: `GET /api/v1/widget/branding`

Headers: `X-Idantitem-Key: pk_live_…`, `Origin: https://shop.example`

Returns `{branding: {primary?, background?}}` where each value is a
`#rrggbb` hex string saved by the partner in the dashboard. The widget
JS hits this endpoint right after opening the modal so the partner
palette is applied from the first frame, with no flash of the default
green / white. Origin allowlist is enforced exactly like
`POST /sessions`. Either key may be absent (= keep the default).

### Widget JS SDK

`window.idantitem.verify({...})` accepts:

- `publicKey` (required) — `pk_live_…` / `pk_test_…`
- `ruleset` — overrides the project default
- `reference` — partner-side correlation id, echoed in webhook + dashboard
- `language` — `ht` / `fr` / `en` (defaults to localStorage > navigator > `ht`)
- `country` — ISO-3 preselect for the issuing-country picker
- `preselectDocType` — `passport | national_id | permis`
- `liveGuidance` (default true) — when false the widget skips the
  MediaPipe-driven document/selfie auto-capture and falls back to file
  upload immediately. Useful for test environments.
- `primaryColor`, `backgroundColor` — `#rrggbb` overrides applied to the
  modal at load time. Server-stored colors fetched via `/branding` take
  precedence; these JS options are mainly for previews / one-off styling.
- `exposeIdentity` (default false) — passed through to the finalize
  call, also gates whether the result screen displays raw identity
  fields. Treat as a dev / staging flag.
- `onComplete(result)` — called with the masked verdict object.
- `onError(err)` — called on network / validation errors. `err.code`
  carries the API detail string when present.
- `onClose({reason})` — called when the user dismisses the modal
  *before* a verdict is delivered (`reason: "user_closed"`). Not fired
  after `onComplete`.

### Hosted widget link

```
https://idantitem.com/w?pk=pk_live_xxx&country=HTI&ruleset=ecommerce_signup&primary=%230b6b3a&bg=%23ffffff
```

Query params: `pk` (required), `ruleset`, `country` (ISO-3), `doc`
(`passport|national_id|permis`), `lang` (`fr|ht|en`), `primary`
(`#rrggbb`), `bg` / `background` (`#rrggbb`), `dev=1`. Color params are
URL-encoded — `#` becomes `%23`. They override the JS defaults for the
duration of the link; the server-stored project colors still win.
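
Building the link with a standard URL encoder takes care of the `#` to `%23` escaping. In this sketch `hosted_widget_link` is a hypothetical helper; the parameter names are the ones listed above.

```python
from urllib.parse import urlencode


def hosted_widget_link(pk: str, *, base: str = "https://idantitem.com/w", **params) -> str:
    """Assemble a hosted widget URL; None-valued params are skipped."""
    query = {"pk": pk}
    query.update({k: v for k, v in params.items() if v is not None})
    # urlencode percent-escapes reserved characters such as '#'.
    return f"{base}?{urlencode(query)}"
```

For example, `hosted_widget_link("pk_live_xxx", country="HTI", primary="#0b6b3a")` yields a link whose `primary` value is already escaped.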

### Webhook payload

When a project has `webhook_url` set, `finalize` delivers a signed POST
with the following body (compact JSON, sorted keys; signature carried
in `X-Idantitem-Signature: sha256=…` and `X-Idantitem-Timestamp` to
guard against replay):

```json
{
  "event": "verification.completed",
  "reference": "partner-correlation-id",
  "project_id": 42,
  "project_name": "Acme",
  "session_id": 1234,
  "timestamp": "2026-04-15T14:11:09+00:00",
  "verdict": "APPROVED",
  "status": "approved",
  "confidence_percent": 87,
  "ruleset": "bank_onboarding",
  "masked": {
    "surname_suffix": "OE",
    "given_names_suffix": "OHN",
    "document_last4": "1234",
    "country": "HTI",
    "sex": "M",
    "birth_year": 1990,
    "verdict": "APPROVED",
    "confidence_percent": 87
  },
  "identity": {
    "surname": "DOE",
    "given_names": "JOHN MARIE",
    "birth_date": "1990-01-01",
    "document_number": "AB1234567",
    "country": "HTI",
    "sex": "M",
    "expiry_date": "2031-12-24",
    "identity_source": "deterministic"
  },
  "signals": { "...": "see finalize response" },
  "reasons": ["mrz_ok", "face_match_ok", "..."]
}
```

The full `identity` block ships in the webhook regardless of the
browser's `expose_identity` flag — partner servers always need the PII
to resolve the customer record. The browser path is gated separately so
casual integrators can't accidentally leak identity to the buyer.
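
A receiver should check the signature and the timestamp before trusting the body. The sketch below is built on assumptions this reference does not confirm: that the signature is a hex HMAC-SHA256 keyed with the project signing secret, that the signed message is `<timestamp>.<raw body>`, and that the timestamp header is Unix seconds. Check the actual scheme before relying on it; `verify_webhook` is a hypothetical name.

```python
import hashlib
import hmac
import time


def verify_webhook(raw_body: bytes, signature_header: str, timestamp_header: str,
                   secret: str, *, tolerance_s: int = 300, now=None) -> bool:
    """Return True when the signature matches and the timestamp is fresh."""
    try:
        sent_at = int(timestamp_header)  # ASSUMPTION: Unix seconds
    except (TypeError, ValueError):
        return False
    current = time.time() if now is None else now
    if abs(current - sent_at) > tolerance_s:
        return False  # too old or too far in the future: possible replay
    scheme, _, received = signature_header.partition("=")
    if scheme != "sha256":
        return False
    # ASSUMPTION: the signed message is "<timestamp>.<raw body>".
    signed = timestamp_header.encode() + b"." + raw_body
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), received.encode())
```

Verify against the raw request bytes, not a re-serialised JSON object, since re-serialisation can change key order or whitespace.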

### Partner backend recipe (storing the case file)

End-to-end flow when the partner wants to keep a full KYC dossier in
its own database (typical for regulated merchants):

1. Browser side: open the widget with `exposeIdentity: true`, after
   collecting explicit consent from the end-user (the partner is the
   GDPR/RGPD data controller for this storage; idantitem is the processor).
2. Browser side: on `onComplete`, post `{verdict, session_id}` to the
   partner backend.
3. Backend: call `GET /api/v1/widget/verifications/{session_id}` with
   `Authorization: Bearer sk_…`. Read `verdict`, `identity`, `signals`,
   `reasons`. If the verdict is acceptable, persist the `identity`
   fields to the partner DB.
4. Backend: for each URL in `media` (typically `document_front_url`,
   `document_back_url`, `selfie_url`), fetch with a plain HTTP GET (no
   Bearer needed — the URL itself is the credential). Write the JPEG
   bytes to the partner's own object storage (S3 / GCS / similar).
5. Backend: do NOT keep the idantitem URLs in the DB. They expire in
   1h. The partner is responsible for hosting and serving the images
   to its admin UI from now on.

Re-running step 3 mints fresh signed URLs for the same session, so
operators can always retrieve the bytes again before the merchant
flips `expose_identity` off or before the project's retention window
purges the row.
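
Steps 3 and 4 can be sketched as follows. This assumes `media` is a JSON object mapping a name to a signed URL, which the reference implies but does not spell out. `fetch_case_file` and `http_get` are hypothetical names, and writing the bytes to object storage is left to the caller.

```python
import json
from urllib.request import Request, urlopen


def http_get(url: str, headers=None) -> bytes:
    """Plain HTTP GET returning the response body."""
    with urlopen(Request(url, headers=headers or {}), timeout=30) as resp:
        return resp.read()


def fetch_case_file(session_id, secret_key: str, *,
                    api_base: str = "https://idantitem.com", get=http_get) -> dict:
    """Fetch the verification, then download every media URL right away."""
    body = get(
        f"{api_base}/api/v1/widget/verifications/{session_id}",
        {"Authorization": f"Bearer {secret_key}"},
    )
    verification = json.loads(body)
    # ASSUMPTION: `media` maps a name (e.g. "selfie_url") to a signed URL.
    media = verification.get("media") or {}
    # The signed URL is the credential, so no Bearer header is sent here.
    images = {name: get(url, None) for name, url in media.items()}
    return {
        "verdict": verification["verdict"],
        "identity": verification.get("identity"),
        "images": images,  # JPEG bytes to persist; never store the URLs
    }
```

The `get` parameter is injectable so the flow can be exercised in tests without network access.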

### Liveness gate (server-side)

The orchestrator never trusts the client claim blindly. A verification
is marked `liveness_passed=true` only when:
- the client claim is positive AND
- `blink_count >= 1` AND
- `duration_ms >= 1500`

A claim missing the `blink_count` / `duration_ms` fields (legacy widgets)
is rejected, i.e. recorded as `liveness_passed=false`; there is no silent
downgrade path.
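
The gate amounts to a conjunction of the three conditions, with missing fields treated as a failure. A minimal sketch, with the hypothetical name `liveness_gate`:

```python
from typing import Optional


def liveness_gate(claim: Optional[bool], blink_count: Optional[int],
                  duration_ms: Optional[int]) -> bool:
    """Server-side liveness check: every condition must hold."""
    if claim is None or blink_count is None or duration_ms is None:
        return False  # missing fields never pass
    return bool(claim) and blink_count >= 1 and duration_ms >= 1500
```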

## WooCommerce integration

Single-file PHP plugin at `integrations/woocommerce/idantitem-verify/`,
distributed as a release ZIP at `/downloads/idantitem-verify-woocommerce.zip`.

Modes (chosen in WP admin):
- `always` — every checkout
- `threshold` — cart total ≥ configured amount
- `product_flagged` — checkbox per product

Settings:
- `api_base` — e.g. `https://idantitem.com`
- `public_key` (`pk_…`) — used by the browser widget
- `secret_key` (`sk_…`) — used by the WooCommerce server to call
  `GET /api/v1/widget/verifications/{id}` and confirm the verdict
- `ruleset`
- `threshold_amount` (HTG)
- `preselect_country` (ISO-3)

Hooks:
- `woocommerce_review_order_before_submit` — render the verify button
- `woocommerce_checkout_process` — block `Place order` until verified
- `woocommerce_checkout_update_order_meta` — persist `verification_id`
  as `_idantitem_verification_id` order meta
- `woocommerce_admin_order_data_after_billing_address` — display the id
  on the admin order page

The merchant site never receives the buyer's selfie or ID image — only
the verdict and the verification id.

## Shopify integration

Checkout UI extension at `integrations/shopify/idantitem-shopify-app/`.
Targets `purchase.checkout.block.render` so it works on every Shopify
plan (Plus not required).

Block settings:
- `api_base`, `public_key`, `ruleset`, `threshold_amount`

Webhook handler example at
`integrations/shopify/idantitem-shopify-app/webhooks/orders-create.example.mjs`:

- HMAC-verifies the Shopify signature with `SHOPIFY_WEBHOOK_SECRET`
- Reads `idantitem_verification_id` from `note_attributes`
- Calls `GET /api/v1/widget/verifications/{id}` with `Bearer sk_…`
- Marks order as approved / held / cancellable based on verdict
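
The first two steps of the handler can be sketched in Python (the shipped example is an `.mjs` file, so this is an illustration only). Shopify sends a base64-encoded HMAC-SHA256 of the raw request body in the `X-Shopify-Hmac-Sha256` header; the function names below are hypothetical.

```python
import base64
import hashlib
import hmac
from typing import Optional


def shopify_hmac_valid(raw_body: bytes, hmac_header: str, webhook_secret: str) -> bool:
    """Compare Shopify's base64 HMAC-SHA256 header with our own digest."""
    digest = hmac.new(webhook_secret.encode(), raw_body, hashlib.sha256).digest()
    return hmac.compare_digest(base64.b64encode(digest), hmac_header.encode())


def verification_id_from_order(order: dict) -> Optional[str]:
    """Read idantitem_verification_id from the order's note_attributes."""
    for attr in order.get("note_attributes", []):
        if attr.get("name") == "idantitem_verification_id":
            return attr.get("value")
    return None
```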

## Hosted widget URL

Merchants who don't want to embed JS share a link of the form
`/w?pk=pk_live_xxx&…`. See the **Hosted widget link** subsection of the
Widget chapter above for the full query parameter list (including the
`primary` / `bg` brand-color overrides).

## Dashboard playground

`/app#playground` runs a real verification end-to-end against one of the
operator's projects, useful for QA and demoing the product. Backed by:

- `GET /api/v1/playground/widget-key` — returns a publishable key the
  page can drop into the live widget (no manual key juggling).
- `POST /api/v1/playground/run` — multipart upload (`document_front`,
  optional `document_back`, `selfie`) plus form fields (`ruleset`,
  `document_type`, `country?`, `reference?`); runs the orchestration
  pipeline and returns the masked verdict alongside the full identity
  so the operator can compare expected vs. extracted fields.
- `POST /api/v1/playground/test-webhook` — `{url, payload}`; forwards
  the supplied JSON to the user-configured URL with
  `X-Idantitem-Source: playground-test` header. Useful to debug a
  partner webhook receiver without exposing the browser to CORS errors.
  The request is **not** HMAC-signed — production webhooks are still
  signed via the regular finalize → `webhook_url` path.

All three endpoints require a valid dashboard session cookie.

## Trust badge

`GET /badge.svg?style=light|dark` returns a 180×48 SVG with cache
headers `public, max-age=86400, immutable`. Embed:

```html
<a href="https://idantitem.com">
  <img src="https://idantitem.com/badge.svg" alt="Vérifié par idantitem">
</a>
```

## Rules engine

Each project runs against a YAML ruleset. Stock rulesets:

- `bank_onboarding` — strictest, hard rules on MRZ + face + liveness
- `mobile_money` — moderate, accepts MANUAL_REVIEW more easily
- `ecommerce_signup` — lenient, optimised for low-friction conversion

Signals consumed:
- `mrz_or_qr_structural` (0–1)
- `visual_ocr_consistency` (0–1)
- `document_not_expired` (bool)
- `face_match_similarity` (cosine 0–1, normalised between 0.40 / 0.70)
- `liveness_passed` (bool — server-validated, see above)
- `document_liveness` (0–1)
- `no_reuse_detected` (bool)
- `mrz_checksum_invalid_count` (int)
- `document_issuing_country` (ISO-3)
- `sanctions_top_score` (0–1)
- `sanctions_verdict` (`CLEAR | ENRICH_INVESTIGATION | MANUAL_REVIEW | HARD_BLOCK`)
- `country_risk_score` (0–1, FATF / OFAC-aligned — Haiti = 0.75)

Hard rules (any failure → REJECTED):
- `mrz_checksum_invalid_count_max`
- `face_similarity_min`
- `liveness_passed`
- `document_issuing_country` (whitelist)
- `document_not_expired`
- `sanctions_verdict_not` (e.g. exclude `HARD_BLOCK`)
- `sanctions_top_score_max`
- `country_risk_score_max`

Final score = sum of weighted signals (positive contributions) minus
`country_risk` penalty (when configured). Verdict thresholds (`approve`,
`manual_review`) live in the same YAML.
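
The scoring step can be sketched as below. The weights, the penalty factor and the threshold values are hypothetical, and reading "normalised between 0.40 / 0.70" as a linear clamp is an assumption; the real values live in each YAML ruleset.

```python
def normalise_face_similarity(cosine: float, low: float = 0.40, high: float = 0.70) -> float:
    """ASSUMPTION: linear rescale of cosine similarity, clamped to [0, 1]."""
    return min(1.0, max(0.0, (cosine - low) / (high - low)))


def final_score(signals: dict, weights: dict, country_risk_penalty: float = 0.0) -> float:
    """Weighted sum of signals minus an optional country-risk penalty."""
    positive = sum(w * float(signals.get(name, 0.0)) for name, w in weights.items())
    penalty = country_risk_penalty * float(signals.get("country_risk_score", 0.0))
    return positive - penalty


def verdict_for(score: float, approve: float, manual_review: float) -> str:
    """Apply the ruleset's approve / manual_review thresholds."""
    if score >= approve:
        return "APPROVED"
    if score >= manual_review:
        return "MANUAL_REVIEW"
    return "REJECTED"
```

Hard rules are evaluated separately: any hard-rule failure yields `REJECTED` regardless of the score.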

## Dashboard endpoints (for an in-merchant UI)

- `GET /api/v1/accounts/projects` — list projects for the logged-in user
- `GET /api/v1/accounts/projects/{id}` — single project
- `PATCH /api/v1/accounts/projects/{id}/webhook` — update webhook URL +
  origin allowlist + optionally rotate signing secret
- `PATCH /api/v1/accounts/projects/{id}/screening` — toggle AML auto-run
- `PATCH /api/v1/accounts/projects/{id}/branding` — set or clear the
  widget brand colors. Body: `{widget_primary_color, widget_background_color}`,
  each value a `#rrggbb` hex string or `null` to fall back to the
  default palette. The widget reads these via `GET /widget/branding`.
- `GET /api/v1/accounts/projects/{id}/api-keys` — list keys
- `POST /api/v1/accounts/projects/{id}/api-keys` — create
- `DELETE /api/v1/accounts/projects/{id}/api-keys/{key_id}` — revoke
- `GET /api/v1/accounts/projects/{id}/usage?window_days=30` — chart data
- `GET /api/v1/accounts/projects/{id}/verifications?limit=50` — recent
  widget sessions for the dashboard table

## Data minimisation

- The widget does not retain raw images server-side beyond the
  verification itself: they're held in memory during the orchestration
  pipeline, base64-cached on the session row, then purged immediately
  when finalize completes.
- `MaskedVerification` row keeps only suffixes (`surname_suffix`,
  `given_names_suffix`, `document_last4`), birth year, country, sex,
  verdict, confidence — used for the merchant dashboard.
- `expose_identity=true` only affects what is returned *outside* the
  webhook: the browser-visible finalize response and the partner
  cross-check endpoint. The orchestrator always has the identity in memory.

## Audit trail

`audit.append(actor, action, payload)` writes to `audit_log` with
SHA-256 chaining (`prev_hash` → `self_hash`). Enables tamper detection.
Actions covered: account creation / login / API key issuance / webhook
delivery / screening decisions / orchestration verdicts.
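
Hash chaining of this kind can be sketched as follows. The serialisation (compact JSON with sorted keys), the all-zero genesis hash and the in-memory list are assumptions made for illustration; only the `prev_hash` / `self_hash` field names come from the description above.

```python
import hashlib
import json

GENESIS = "0" * 64  # ASSUMPTION: placeholder prev_hash for the first entry


def entry_hash(prev_hash: str, actor: str, action: str, payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry and its predecessor."""
    body = json.dumps(
        {"prev_hash": prev_hash, "actor": actor, "action": action, "payload": payload},
        sort_keys=True, separators=(",", ":"),
    )
    return hashlib.sha256(body.encode()).hexdigest()


def append(log: list, actor: str, action: str, payload: dict) -> dict:
    """Append an entry whose hash commits to the previous entry's hash."""
    prev = log[-1]["self_hash"] if log else GENESIS
    entry = {"actor": actor, "action": action, "payload": payload,
             "prev_hash": prev, "self_hash": entry_hash(prev, actor, action, payload)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash; any edited or removed entry breaks the chain."""
    prev = GENESIS
    for e in log:
        expected = entry_hash(prev, e["actor"], e["action"], e["payload"])
        if e["prev_hash"] != prev or e["self_hash"] != expected:
            return False
        prev = e["self_hash"]
    return True
```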

## Distribution

The backend, the widget, the WooCommerce plugin and the Shopify
extension are commercial products. Installable artefacts (widget JS,
WooCommerce ZIP) are served by the API host so integrators always pull
the latest version.

The Shopify and WooCommerce sources are delivered to Partner customers
under a commercial agreement — they are not on a public registry.

## Pricing

- **Individual** accounts get 500 eKYC verifications / month for free;
  no card required to sign up.
- **Business / Enterprise** plans cover higher volumes, SLA, dedicated
  support and on-premise deployment options. Contact via
  <https://idantitem.com>.

## Attribution

The platform code is closed-source commercial software. Where matches
are surfaced from the OpenSanctions feed (CC-BY 4.0), a "Powered by
OpenSanctions" attribution is shown in the dashboard. Integrators who
re-display screening results must preserve that attribution.
