Transparency

Not authorship evidence. A specificity and trust gate for agents.

Binary AI detection is brittle. VeracityAPI is built for the more useful workflow question: should this text or image be allowed, revised, reviewed, or rejected before an agent uses it?


What model is doing the scoring?

v0.1 uses a structured LLM scoring pass via Anthropic Haiku, temperature 0, with schema-constrained tool output and deterministic post-processing. It is not a fine-tuned classifier yet.

What does it measure?

Observable signals: specificity gaps, generic phrasing, unsupported or weakly sourced claims, low-value/slop patterns, and synthetic-looking texture. The useful target is workflow risk, not ground-truth authorship.

What changed after early audit?

Public positioning now describes the product as content trust/specificity scoring, not AI-content detection. Responses include additive derived fields: content_trust_score, specificity_risk, provenance_weakness, and synthetic_texture_risk.

What should agents do with it?

Use it as a quality gate. Good output: allow. Generic but fixable output: revise. High-risk source/citation/training input: human_review or reject depending on intended_use.

Traffic-light action matrix

Green = allow
Yellow = revise
Orange = human_review
Red = reject

risk level	score threshold	workflow route
low	text < 0.40 / audio < 0.30	allow for normal workflows
medium	0.40 ≤ text < 0.70 / 0.30 ≤ audio < 0.55	revise for publish, human_review for cite/train
high	text ≥ 0.70 / audio ≥ 0.55	human_review or reject depending on intended_use

risk_level = max(synthetic_risk, slop_risk)
recommended_action = policy(risk_level, intended_use)
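
The two lines above can be read as a small routing function. What follows is a minimal Python sketch under stated assumptions, not VeracityAPI's actual implementation. The thresholds are copied from the matrix. The function names, the set of intended_use values ("publish", "cite", "train"), and the way the high-risk row is split between human_review and reject are illustrative guesses.

```python
# Cut-offs copied from the action matrix:
# (low/medium boundary, medium/high boundary) per content type.
THRESHOLDS = {"text": (0.40, 0.70), "audio": (0.30, 0.55)}

# Assumed value set; the source names publish, cite, and train only in passing.
INTENDED_USES = {"publish", "cite", "train"}


def risk_level(content_type: str, synthetic_risk: float, slop_risk: float = 0.0) -> str:
    """Bucket the larger of the two risk scores into low / medium / high."""
    score = max(synthetic_risk, slop_risk)
    low_cut, high_cut = THRESHOLDS[content_type]
    if score < low_cut:
        return "low"
    if score < high_cut:
        return "medium"
    return "high"


def policy(level: str, intended_use: str) -> str:
    """Map a risk level and an intended use to a workflow route."""
    if intended_use not in INTENDED_USES:
        raise ValueError(f"unknown intended_use: {intended_use!r}")
    if level == "low":
        return "allow"
    if level == "medium":
        return "revise" if intended_use == "publish" else "human_review"
    # Assumption: the source only says "human_review or reject depending on
    # intended_use"; this split (reject for cite/train) is hypothetical.
    return "human_review" if intended_use == "publish" else "reject"


level = risk_level("text", synthetic_risk=0.20, slop_risk=0.50)
action = policy(level, "publish")
```

Taking the maximum means a single high signal is enough to escalate the route, which suits a gate that should fail closed.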

Image and audio scoring v0.1

POST /v1/analyze accepts {type:"image",content:"https://..."}, calls a vision LLM with a constrained visible-artifact rubric, and returns synthetic_image_risk, a synthetic_risk alias, evidence, fixes, trust score, risk level, and recommended action. VeracityAPI stores no image bytes and logs only a URL hash plus hostname. C2PA/EXIF/provenance verification is on the roadmap and is not claimed in v0.1.
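
As an illustration of the request shape described above, here is a minimal Python sketch that builds (but does not send) a POST to /v1/analyze using only the standard library. The base URL, the bearer-token Authorization header, and the helper name build_analyze_request are assumptions made for the example. Only the path and the type/content body fields come from the text.

```python
import json
import urllib.request


def build_analyze_request(base_url: str, api_key: str,
                          media_type: str, content_url: str) -> urllib.request.Request:
    """Build a POST /v1/analyze request for a media URL (hypothetical helper)."""
    body = json.dumps({"type": media_type, "content": content_url}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/analyze",
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: the auth scheme is not stated in the source.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Placeholder host and key; substitute the real values from your account.
req = build_analyze_request(
    "https://api.example.invalid", "YOUR_API_KEY",
    "image", "https://example.com/photo.jpg",
)
# Sending is left to the caller, e.g. urllib.request.urlopen(req, timeout=30).
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any spend occurs.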

POST /v1/analyze accepts {type:"audio",content:"https://..."} plus an optional caller-supplied transcript. VeracityAPI sends capped audio bytes to Gemini for strict synthetic-audio workflow triage and transcription, then returns the Gemini-generated transcript along with synthetic_audio_risk, workflow_risk, evidence, fixes, trust score, risk level, and recommended action. VeracityAPI stores no audio bytes, base64, or full audio URLs. The result is not proof of AI generation, voice cloning, or speaker identity, and it is not a forensic determination.
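
The storage posture described for media URLs (no bytes, no full URLs, only a URL hash plus hostname in logs) could look roughly like the sketch below. This is not VeracityAPI's code: the choice of SHA-256, the field names, and the function name redacted_log_record are assumptions for illustration.

```python
import hashlib
from urllib.parse import urlsplit


def redacted_log_record(media_url: str) -> dict:
    """Return a log-safe record: a one-way hash of the URL plus its hostname."""
    parts = urlsplit(media_url)
    return {
        # Assumption: SHA-256; the source does not name the hash function.
        "url_sha256": hashlib.sha256(media_url.encode("utf-8")).hexdigest(),
        "hostname": parts.hostname or "",
    }


record = redacted_log_record("https://cdn.example.com/clips/a1.mp3?token=secret")
```

The hash lets repeated submissions of the same URL be correlated, while the path and any query-string tokens never reach the log.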

Known limitations

Does not prove whether text was written by AI or a human.
Good AI-assisted writing with concrete details may pass.
Weak human writing may be flagged; this is intentional when the text is generic or unsupported.
English-calibrated first; non-English scoring is experimental until evals exist.
Latency is LLM-bound; use /v1/analyze-batch for pipeline batches and /v1/balance for preflight spend checks.
Audio scoring is intentionally strict for triage and can produce false positives on compressed, edited, or unusually clean human recordings.

Near-term roadmap

Publish a labeled calibration set with false-positive slices.
Add configurable risk tolerance: lenient, standard, strict.
Add a fast heuristic prefilter before the full evidence pass.
Add async batch/webhooks after synchronous batch usage is proven.
