Context fields with highest impact
Reason code glossary
Security model
All user-submitted text, webpage text, OCR text, filenames, chat messages, and browser-observed text are treated as hostile evidence. The browser collector never enters credentials and never submits forms. It blocks private, local, and metadata targets, rejects public hostnames that resolve to non-public addresses, validates fallback redirects before following them, and routes Playwright subrequests through the same guard with service workers disabled. It produces structured evidence rather than a final decision. File attachments are never executed by the browser collector; instead, Cuvark records safe static metadata signals for executables, scripts, macro-enabled documents, archives, PDFs, and scriptable images.
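To illustrate the "public hostname that resolves to a non-public address" check, here is a minimal Python sketch. It is not Cuvark's actual guard: the function names and the injectable resolver parameter are assumptions made for this example, and a real guard would also need to pin the resolved address for the connection itself.

```python
import ipaddress
import socket
from typing import Callable, Iterable, List


def is_public_ip(address: str) -> bool:
    """Return True only for globally routable addresses."""
    ip = ipaddress.ip_address(address)
    # is_global is False for private, loopback, and link-local ranges.
    # Link-local covers the common 169.254.169.254 metadata address.
    return ip.is_global


def system_resolver(hostname: str) -> List[str]:
    """Resolve a hostname to its addresses using the OS resolver."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def is_safe_target(
    hostname: str,
    resolver: Callable[[str], Iterable[str]] = system_resolver,
) -> bool:
    """Reject a hostname if it fails to resolve or any address is non-public."""
    try:
        addresses = list(resolver(hostname))
    except OSError:
        return False  # unresolvable hosts are treated as unsafe
    return bool(addresses) and all(is_public_ip(a) for a in addresses)
```

Requiring every resolved address to be public, rather than just one, is the conservative choice here: a hostname that mixes public and private records is rejected outright.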
Hosted model APIs
When OPENAI_API_KEY is configured and CUVARK_MODEL_PROVIDER is set to openai, Cuvark uses the OpenAI Responses API for a structured final judge pass.
The deterministic scorer still runs first, and hard malicious signals are still handled conservatively.
Image attachments with an HTTPS image URL, data URL, or base64 metadata can also use OpenAI
vision extraction before scoring. Hosted calls have bounded server-side timeouts and fall back
to deterministic scoring when the provider is slow or unavailable.
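The timeout-and-fallback behavior can be sketched as follows. This is an illustrative pattern only, not Cuvark's code: judge_with_fallback, the hosted_judge callable, and the judge_source and fallback_reason fields are hypothetical names invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, Dict


def judge_with_fallback(
    deterministic_verdict: Dict,
    hosted_judge: Callable[[Dict], Dict],
    timeout_s: float = 5.0,
) -> Dict:
    """Run a hosted judge with a bounded wait; keep the deterministic verdict on failure."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(hosted_judge, deterministic_verdict)
    try:
        hosted = future.result(timeout=timeout_s)
        return {**hosted, "judge_source": "hosted"}
    except FutureTimeout:
        reason = "timeout"
    except Exception:
        reason = "provider_error"
    finally:
        # Do not block on a slow call. The worker thread is not killed, so a
        # real client should also set its own request timeout.
        pool.shutdown(wait=False)
    return {
        **deterministic_verdict,
        "judge_source": "deterministic_fallback",
        "fallback_reason": reason,
    }
```

Because the deterministic verdict is computed before the hosted call starts, the fallback path never has to recompute anything.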
Default policy
In Settings, you can maintain local threshold defaults along with allowlist and blocklist domains. Request policies are merged with the local default policy for scoring, so customer-specific context can add stricter controls without changing the public payload shape.
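One way to picture a merge that can only tighten controls is sketched below. The field names (thresholds, allowlist_domains, blocklist_domains) and the rules (lower threshold wins, blocklists are unioned, blocklist beats allowlist) are assumptions for illustration; the actual merge rules are not specified here.

```python
from typing import Dict


def merge_policies(default: Dict, request: Dict) -> Dict:
    """Merge a request policy into the local default, keeping the stricter value."""
    thresholds = dict(default.get("thresholds", {}))
    for name, value in request.get("thresholds", {}).items():
        # Assumption: a lower threshold triggers sooner, so it is stricter.
        thresholds[name] = min(value, thresholds.get(name, value))

    blocklist = set(default.get("blocklist_domains", []))
    blocklist |= set(request.get("blocklist_domains", []))

    allowlist = set(default.get("allowlist_domains", []))
    allowlist |= set(request.get("allowlist_domains", []))
    # Assumption: a blocklisted domain is never allowed, whichever side listed it.
    allowlist -= blocklist

    return {
        "thresholds": thresholds,
        "allowlist_domains": sorted(allowlist),
        "blocklist_domains": sorted(blocklist),
    }
```

The merged result has the same shape as either input, which mirrors the point above that per-request policies do not change the payload shape.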
Cache behavior
Cuvark stores URL cache entries using a normalized URL path key. Fresh exact URL-path cache
hits set processing.cache_status to hit or partial,
contribute cached safe/bad signals, and avoid redundant deterministic/browser-agent work for
repeated URL-only cases. Cache evidence is not reused across different paths on the same domain,
and every request is still stored as a case so auditability and feedback history remain complete.
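As a rough sketch of what a normalized URL path key could look like, the function below lowercases the host and drops the scheme, query string, fragment, and trailing slash. The exact normalization Cuvark applies is not documented here, so treat url_cache_key and its rules as assumptions.

```python
from urllib.parse import urlsplit


def url_cache_key(url: str) -> str:
    """Build a cache key from the host and path only (hypothetical normalization)."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Treat an empty path as "/", and ignore a trailing slash on longer paths.
    path = parts.path or "/"
    if len(path) > 1:
        path = path.rstrip("/") or "/"
    return f"{host}{path}"
```

Under this scheme, https://example.com/login and https://example.com/signup produce different keys, which matches the rule that cache evidence is not reused across paths on the same domain.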
Threat intelligence
Local allowlists, blocklists, and cached verdicts are always checked first. When CUVARK_WEB_RISK_API_KEY is configured, Cuvark also checks public HTTP(S) URLs with
Google Web Risk Lookup before browser signal collection or Browser Use escalation. Matches
become threat_feed_hit and known_malicious_url signals with provider, threat type, and cache-expiry
evidence. Provider timeouts or errors are recorded as evidence only, and scoring continues.
Public URLs also get bounded DNS/TLS evidence unless CUVARK_URL_NETWORK_ENABLED is set to false, including A/AAAA record
counts, private-address resolution, TLS issuer, and certificate validity windows. Browser guard
DNS lookups can be tuned separately with CUVARK_BROWSER_DNS_TIMEOUT_MS.
Score artifacts
When S3-compatible bucket variables are configured, Cuvark also writes a redacted JSON
artifact for each persisted score at tenants/{tenant_id}/cases/{case_id}/score-result.json. The
verdict records the object key in evidence.artifact_refs.score_result_json. If the
bucket write fails, scoring continues and the verdict records evidence.artifact_error.
Deleting a case also deletes recorded artifact refs on a best-effort basis.
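The key layout and the non-fatal write failure described above can be sketched like this. The upload callable stands in for whatever S3-compatible client is in use, and record_artifact is a hypothetical helper, not a Cuvark function.

```python
from typing import Callable, Dict


def score_artifact_key(tenant_id: str, case_id: str) -> str:
    """Object key for the redacted score artifact, following the documented layout."""
    return f"tenants/{tenant_id}/cases/{case_id}/score-result.json"


def record_artifact(
    verdict: Dict,
    tenant_id: str,
    case_id: str,
    upload: Callable[[str, Dict], None],
) -> Dict:
    """Try to write the artifact; record either the object key or the error."""
    key = score_artifact_key(tenant_id, case_id)
    evidence = dict(verdict.get("evidence", {}))
    try:
        # Assumption: the verdict passed in has already been redacted.
        upload(key, verdict)
        refs = dict(evidence.get("artifact_refs", {}))
        refs["score_result_json"] = key
        evidence["artifact_refs"] = refs
    except Exception as exc:
        # A failed bucket write must not fail scoring.
        evidence["artifact_error"] = str(exc)
    return {**verdict, "evidence": evidence}
```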
Outcome metrics
The Evals dashboard combines synthetic cases with tenant-scoped outcome metrics from feedback
labels. Cuvark uses the latest decisive label per case, maps confirmed scam labels and
false-negative outcomes to scam truth, maps confirmed legit labels and false-positive outcomes to
legit truth, then reports precision, recall, false-positive rate, false-negative rate, human
disagreement, labeled-case latency, and cost units. insufficient_evidence labels are
kept for review history but ignored for precision and recall.
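The label-to-truth mapping and the four rate metrics can be worked through with a small sketch. Apart from insufficient_evidence, the label strings used here (confirmed_scam, false_negative, confirmed_legit, false_positive) are assumed spellings, and the input is assumed to already hold only the latest decisive label per case, paired with whether Cuvark predicted scam.

```python
from typing import Dict, Iterable, Optional, Tuple

SCAM_TRUTH = {"confirmed_scam", "false_negative"}
LEGIT_TRUTH = {"confirmed_legit", "false_positive"}


def outcome_metrics(cases: Iterable[Tuple[bool, str]]) -> Dict[str, Optional[float]]:
    """Compute rates from (predicted_scam, label) pairs; other labels are skipped."""
    tp = fp = fn = tn = 0
    for predicted_scam, label in cases:
        if label in SCAM_TRUTH:
            truth_scam = True
        elif label in LEGIT_TRUTH:
            truth_scam = False
        else:
            continue  # e.g. insufficient_evidence: kept in history, not scored
        if predicted_scam and truth_scam:
            tp += 1
        elif predicted_scam:
            fp += 1
        elif truth_scam:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> Optional[float]:
        return num / den if den else None

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "false_positive_rate": ratio(fp, fp + tn),
        "false_negative_rate": ratio(fn, fn + tp),
    }
```

Returning None instead of 0 when a denominator is empty keeps "no labeled data yet" distinguishable from a genuinely zero rate.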
Privacy and retention
Stored-data redaction is enabled by default. Before Cuvark persists case input, evidence, and
model output, it redacts email addresses, phone numbers, long numeric tokens, and URL query
parameters. Direct POST /v1/score requests still score the original payload in
memory before storing the redacted copy.
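A simplified sketch of this kind of redaction is shown below. The regular expressions and placeholder strings are assumptions chosen for illustration and are looser than production patterns would need to be; they are not Cuvark's actual rules.

```python
import re

# Order matters: strip URL queries first, and redact long digit runs before
# phone numbers so a long token is not mistaken for a phone number.
_URL_QUERY = re.compile(r"(https?://[^\s?#]+)\?[^\s#]*")
_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
_LONG_NUMBER = re.compile(r"\b\d{9,}\b")
_PHONE = re.compile(r"\+?\d[\d ().-]{7,}\d")


def redact(text: str) -> str:
    """Return a copy of text with common identifiers replaced by placeholders."""
    text = _URL_QUERY.sub(r"\1?[REDACTED_QUERY]", text)
    text = _EMAIL.sub("[REDACTED_EMAIL]", text)
    text = _LONG_NUMBER.sub("[REDACTED_NUMBER]", text)
    text = _PHONE.sub("[REDACTED_PHONE]", text)
    return text
```

Since redact returns a new string, a caller can score the original payload in memory and persist only the redacted copy, in line with the direct scoring behavior described above.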
For the advanced case flow, POST /v1/cases and POST /v1/cases/{case_id}/evidence also store redacted records. Use direct
scoring when full raw context is needed for the highest-fidelity one-shot verdict, or tune
redaction from Settings for controlled tenants.
Use DELETE /v1/cases/{case_id} to remove a case and linked evidence, signals,
model output, actions, feedback, and webhook deliveries for the authenticated organization.
Settings also includes retention purging for cases older than the saved retention window. Labeled
cases are preserved by default unless explicitly included.
Local auth
Development mode allows API requests without a key. To require bearer-token auth, set CUVARK_REQUIRE_API_KEY=true and provide the key in CUVARK_API_KEY.
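With key enforcement enabled, a client would typically send the key in a standard Authorization: Bearer header. The sketch below builds such a request for DELETE /v1/cases/{case_id} using only the Python standard library; the base URL and the helper name are hypothetical, and the exact header format is an assumption based on common bearer-token conventions.

```python
import urllib.request
from urllib.parse import quote


def build_delete_case_request(
    base_url: str, case_id: str, api_key: str
) -> urllib.request.Request:
    """Build (but do not send) an authenticated DELETE request for one case."""
    # Percent-encode the case ID so it cannot alter the request path.
    url = f"{base_url.rstrip('/')}/v1/cases/{quote(case_id, safe='')}"
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={"Authorization": f"Bearer {api_key}"},
    )
```

The returned request can be sent with urllib.request.urlopen(request). In development mode without key enforcement, the Authorization header can be omitted.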
Production deployment
The current deployment target is the Hetzner origin under deploy/hetzner/, with
Cloudflare intended in front for DNS, proxying, WAF, caching, and rate-limit shielding once the
DNS record is configured. Railway config is still available under railway/ as an
alternate managed target. Postgres is used when DATABASE_URL is present; Redis queues
are used when REDIS_URL is present; object storage is used when S3-compatible bucket
variables are present.
Clerk organizations
When PUBLIC_CLERK_PUBLISHABLE_KEY and CLERK_SECRET_KEY are configured,
Cuvark enables Clerk sign-in and scopes API-created cases to the active organization.