Trust & diligence center

Every question a CTO, client, or investor should ask before trusting an AI-native appraisal firm — answered concretely, with links to where each claim is provable in the product.

Accuracy report → | See the production line

Median error: 4.3%
Within 10%: 88.3%
90% CI coverage: 92.5%
LLM cost / appraisal: ~$1

Accuracy & reliability

Is it actually accurate, and do you know when it's wrong?

How accurate is the valuation? [proven]

Backtested on held-out ground-truth sales (leave-one-out): 4.3% median error, 88.3% within 10% of realized price — in the “strong AVM” band. Full accuracy report →
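As a rough illustration of how the two headline figures can be derived from backtest rows, here is a minimal sketch. The `BacktestRow` shape and the `summarize` function are hypothetical names, and the sketch assumes "error" means absolute percentage error against the realized price; the actual harness may define these differently.

```typescript
// Hypothetical record shape: one held-out sale paired with the model's estimate.
interface BacktestRow {
  predicted: number; // model value estimate
  realized: number; // ground-truth sale price
}

// Absolute percentage error of one estimate against the realized price.
function absPctError(row: BacktestRow): number {
  return Math.abs(row.predicted - row.realized) / row.realized;
}

// Median of a non-empty list of numbers.
function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of empty set");
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]!
    : (sorted[mid - 1]! + sorted[mid]!) / 2;
}

// Summarize a backtest: median error and the share of estimates within 10%.
function summarize(rows: BacktestRow[]): { medianError: number; within10: number } {
  const errors = rows.map(absPctError);
  return {
    medianError: median(errors),
    within10: errors.filter((e) => e <= 0.1).length / errors.length,
  };
}
```

In a leave-one-out setup, each row's `predicted` would be produced with that sale removed from the comp set, so the estimate never sees its own answer.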

Is the confidence interval meaningful? [calibrated]

Yes — and we measure it. The 90% CI contains the realized price 92.5% of the time (target ≈ 90%). Most providers never disclose calibration; we publish the curve.
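Empirical coverage is simple to check: count how often the realized price falls inside the stated interval. The sketch below assumes a hypothetical `IntervalRow` shape and treats the interval bounds as inclusive.

```typescript
// Hypothetical row: a stated 90% interval and the price the property actually sold for.
interface IntervalRow {
  low: number;
  high: number;
  realized: number;
}

// Fraction of rows whose realized price lies inside the stated interval.
function coverage(rows: IntervalRow[]): number {
  if (rows.length === 0) throw new Error("no rows to evaluate");
  const hits = rows.filter((r) => r.realized >= r.low && r.realized <= r.high).length;
  return hits / rows.length;
}
```

A well-calibrated 90% interval should return a value near 0.90 on held-out sales; a result far above that suggests intervals that are wider than necessary, and one far below suggests overconfidence.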

What happens when the model is uncertain?

Uncertainty widens the interval and lowers confidence; objective “hard-case” triggers (comp dispersion, thin sets, approach divergence) route the file to an adversarial review and then to the MAI. The system is built to surface doubt, not bury it.
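The three objective triggers named above could be expressed as a pure function along these lines. All names and thresholds here are illustrative assumptions, as are the choice of coefficient of variation as the dispersion measure and the sales-versus-income comparison as the divergence check; the production gate may use different statistics.

```typescript
type Trigger = "thin_set" | "comp_dispersion" | "approach_divergence";

// Hypothetical inputs to the gate.
interface GateInput {
  compUnitPrices: number[]; // adjusted comp prices per unit of area
  salesCompValue: number; // value indicated by the sales comparison approach
  incomeValue: number; // value indicated by the income approach
}

// Hypothetical thresholds; real values would be tuned against backtests.
interface GateThresholds {
  minComps: number;
  maxDispersion: number; // max coefficient of variation across comps
  maxDivergence: number; // max relative gap between approaches
}

// Population standard deviation divided by the mean.
function coefficientOfVariation(xs: number[]): number {
  const mean = xs.reduce((s, x) => s + x, 0) / xs.length;
  const variance = xs.reduce((s, x) => s + (x - mean) ** 2, 0) / xs.length;
  return Math.sqrt(variance) / mean;
}

// Returns every trigger that fires; a non-empty result routes the file to review.
function hardCaseTriggers(input: GateInput, t: GateThresholds): Trigger[] {
  const triggers: Trigger[] = [];
  const comps = input.compUnitPrices;
  if (comps.length < t.minComps) triggers.push("thin_set");
  if (comps.length >= 2 && coefficientOfVariation(comps) > t.maxDispersion) {
    triggers.push("comp_dispersion");
  }
  const midpoint = (input.salesCompValue + input.incomeValue) / 2;
  const divergence = Math.abs(input.salesCompValue - input.incomeValue) / midpoint;
  if (divergence > t.maxDivergence) triggers.push("approach_divergence");
  return triggers;
}
```

Keeping the gate deterministic means the decision to escalate does not itself depend on a model's judgment.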

Can it hallucinate a number? [no]

No. The LLM chooses inputs with justification; all arithmetic is deterministic. An always-on Verifier recomputes every figure from source records and checks grounding before any human sees it — a number that can't be traced to a record can't be emitted.
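To make the recompute-and-ground idea concrete, here is a minimal sketch for a single figure, using the standard direct capitalization relation (value = NOI / cap rate). The types, the `verifyDirectCap` name, and the tolerance are assumptions for illustration, not the product's actual Verifier.

```typescript
// Hypothetical: an input the agent chose, with the record it was taken from.
interface SourcedInput {
  value: number;
  sourceRecordId: string | null; // null means the input has no traceable source
}

interface DirectCapClaim {
  noi: SourcedInput; // net operating income
  capRate: SourcedInput; // capitalization rate as a decimal, e.g. 0.065
  reportedValue: number; // the figure the pipeline wants to emit
}

type VerifyResult = { ok: true } | { ok: false; reason: string };

// Reject ungrounded inputs, then recompute the figure and compare it to the report.
function verifyDirectCap(claim: DirectCapClaim, toleranceUsd = 0.5): VerifyResult {
  const inputs: Array<[string, SourcedInput]> = [
    ["noi", claim.noi],
    ["capRate", claim.capRate],
  ];
  for (const [name, input] of inputs) {
    if (!input.sourceRecordId) {
      return { ok: false, reason: `${name} is not grounded in a source record` };
    }
  }
  if (claim.capRate.value <= 0) {
    return { ok: false, reason: "cap rate must be positive" };
  }
  const recomputed = claim.noi.value / claim.capRate.value;
  if (Math.abs(recomputed - claim.reportedValue) > toleranceUsd) {
    return { ok: false, reason: `reported ${claim.reportedValue}, recomputed ${recomputed}` };
  }
  return { ok: true };
}
```

The point of the design is that the check never consults the model: it only needs the source records and the arithmetic.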

Is there bias?

A small positive bias (+2.6%) exists in the backtest, meaning estimates skew slightly high; we disclose it openly, and the MAI review is tuned to catch systematic over-valuation on individual files.

Defensibility & compliance

Will it hold up to a regulator, an auditor, or in litigation?

Is the output USPAP-compliant?

The full appraisal is developed and reported per USPAP, signed by an employed MAI who performs the inspection and takes responsibility. The indicative product is clearly labeled a screening estimate, not an appraisal.

Who is legally responsible for the value? [MAI]

A credentialed MAI appraiser of record signs every full appraisal and bears USPAP responsibility — Strata Valuation is a licensed appraisal firm, not a software vendor disclaiming the number. An independent Standard-3 reviewer concurs.

What do you hand a regulator?

A complete, immutable audit trail: every input, source, agent decision, human override (distinct from agent decisions), timestamp, and model version. See the workflow →

FIRREA / federally-related transactions?

The report carries the FIRREA de-minimis determination, appraiser-independence and competency statements, and intended-use/intended-user disclosures.

How are comps verified?

Each sale is confirmed as to conditions of sale; non-arm's-length (related-party, REO, bulk) transactions are flagged and excluded or adjusted, with the verification source retained in the work file.

The human-in-the-loop

Where does judgment enter, and is it real or a rubber stamp?

Is the MAI a rubber stamp? [first-class]

No. The pipeline pauses at needs_review; the MAI reviews every material decision and can approve, override (new value + rationale), or flag — and a revision re-runs through the Verifier. Sign-off only happens on explicit approval.
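The review step described above can be modeled as a small state transition. This is a sketch under assumptions: the `approved` and `flagged` status names, the `FileState` shape, and the `verify` callback are hypothetical; only `needs_review` and the three actions come from the description.

```typescript
type Status = "needs_review" | "approved" | "flagged";

type ReviewAction =
  | { kind: "approve" }
  | { kind: "override"; newValue: number; rationale: string }
  | { kind: "flag"; note: string };

interface FileState {
  status: Status;
  value: number;
  signed: boolean;
}

// Apply one MAI action. `verify` stands in for re-running the Verifier on a revision.
function applyReview(
  state: FileState,
  action: ReviewAction,
  verify: (value: number) => boolean,
): FileState {
  if (state.status !== "needs_review") {
    throw new Error("file is not awaiting review");
  }
  switch (action.kind) {
    case "approve":
      // Sign-off happens only here, on explicit approval.
      return { ...state, status: "approved", signed: true };
    case "override":
      if (action.rationale.trim() === "") throw new Error("override requires a rationale");
      if (!verify(action.newValue)) throw new Error("revised value failed verification");
      // The revised value stays in needs_review until it is explicitly approved.
      return { ...state, value: action.newValue };
    case "flag":
      return { ...state, status: "flagged" };
  }
}
```

Because an override never sets `signed`, a revised value cannot reach sign-off without a separate approval.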

Do overrides change the agent's record?

Never. The agent's original decision is preserved; the human override is a new, linked record. Both are kept — that's what makes the trail defensible.
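One way to keep both records is an append-only trail in which an override points back at the decision it supersedes. The record shapes and function names below are hypothetical, and the sketch assumes records are appended in chronological order.

```typescript
interface AgentDecision {
  id: string;
  kind: "agent";
  field: string;
  value: number;
}

interface HumanOverride {
  id: string;
  kind: "override";
  overrides: string; // id of the agent decision this record supersedes
  field: string;
  value: number;
  rationale: string;
}

type TrailRecord = AgentDecision | HumanOverride;

// Append an override as a new record; the original decision is never mutated.
function recordOverride(
  trail: readonly TrailRecord[],
  targetId: string,
  value: number,
  rationale: string,
  newId: string,
): TrailRecord[] {
  const target = trail.find((r) => r.id === targetId && r.kind === "agent");
  if (!target) throw new Error(`no agent decision with id ${targetId}`);
  const override: HumanOverride = {
    id: newId,
    kind: "override",
    overrides: targetId,
    field: target.field,
    value,
    rationale,
  };
  return [...trail, override];
}

// The value currently in force for a field: the latest record wins.
function effectiveValue(trail: readonly TrailRecord[], field: string): number | undefined {
  let value: number | undefined;
  for (const r of trail) {
    if (r.field === field) value = r.value;
  }
  return value;
}
```

A reviewer or auditor can then read both what the agent decided and what the human changed, with the rationale attached to the change.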

Does the firm get better over time? [moat]

Every MAI comment and correction is a labeled training example. Disagreements and “should-have-done” corrections are the highest-value signal — the agents learn the firm's judgment. That compounding reasoning dataset is the moat.

Security & data

Is our data safe, and where does it go?

Where does client data live?

Engagement documents, property data, and valuations are stored in the firm's Postgres with per-engagement isolation. The demo runs on synthetic data; production deployment supports a dedicated/VPC database and encryption at rest and in transit.

Does our data train a third-party model? [no]

No. Client data is not used to train external foundation models. The reasoning dataset that improves Strata Valuation's own agents is the firm's, governed by the engagement terms.

Do you have an audit log?

Yes — append-only audit events on every valuation and an engagement event timeline on every order, both queryable.

What's the access model?

Client and operations are separate, role-gated experiences sharing one substrate. Clients see status + deliverables; agent reasoning, overrides, and feedback are internal-only.

Unit economics & ops

Does this actually make money and scale?

What does one appraisal cost to produce? [~$1]

LLM cost is ~$0.30–1.20 per full appraisal (Opus for judgment, Haiku for structuring), hard-capped per run with a typed budget error. Against a $1K–15K fee, agent cost is a rounding error; the MAI's time is the real input, and the agents do the 95% that doesn't require a license.
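A per-run hard cap with a typed error might look like the following sketch. `RunBudget` and `BudgetExceededError` are hypothetical names, and how per-call cost is measured (for example, from token usage) is left outside the example.

```typescript
// Typed error so callers can distinguish a budget stop from any other failure.
class BudgetExceededError extends Error {
  attemptedUsd: number;
  capUsd: number;

  constructor(attemptedUsd: number, capUsd: number) {
    super(`LLM budget exceeded: $${attemptedUsd.toFixed(2)} > cap $${capUsd.toFixed(2)}`);
    this.name = "BudgetExceededError";
    this.attemptedUsd = attemptedUsd;
    this.capUsd = capUsd;
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, BudgetExceededError.prototype);
  }
}

// Tracks spend for one appraisal run and refuses any charge that would pass the cap.
class RunBudget {
  private spentUsd = 0;
  private capUsd: number;

  constructor(capUsd: number) {
    this.capUsd = capUsd;
  }

  charge(costUsd: number): void {
    const next = this.spentUsd + costUsd;
    if (next > this.capUsd) throw new BudgetExceededError(next, this.capUsd);
    this.spentUsd = next;
  }

  get spent(): number {
    return this.spentUsd;
  }
}
```

Checking before recording the charge means a rejected call leaves the running total unchanged, so the run can fall back or stop cleanly.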

Why is this faster than a traditional firm? [10×]

Agents parallelize the analysis that takes a traditional appraiser days. Strata Valuation delivers in ~3 business days vs. the industry's 4–6 weeks — the MAI reviews and signs rather than building from scratch.

How does it scale?

The constraint is licensed MAI review time, not analysis. Because agents do the development and the verifier de-risks it, one MAI can review and sign many more files than they could author — that's the operating leverage.

What's the moat vs. an incumbent bolting on AI?

The structured reasoning dataset. “CoStar has transactions; Strata Valuation has reasoning chains.” Every decision and every MAI correction compounds into proprietary training data an incumbent can't replicate by buying a model.

Architecture & reliability

How is it built, and what breaks?

What's the stack?

Next.js + TypeScript, Postgres + Drizzle, the Anthropic SDK for agents, Zod-validated boundaries everywhere. CRE math is deterministic and unit-tested in src/domain; agents choose inputs, never do arithmetic.

What if the LLM is down or slow? [resilient]

The engine runs in a deterministic fallback mode with no LLM at all — real CRE math + heuristic comp scoring produce a complete, schema-valid result. Every step is labeled llm or deterministic.
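A step that falls back to a deterministic path, and labels which path produced the result, can be sketched as below. The `runStep` helper and the timeout policy are assumptions; only the `llm` and `deterministic` labels come from the description.

```typescript
type StepMode = "llm" | "deterministic";

interface StepResult<T> {
  mode: StepMode; // which path actually produced the output
  output: T;
}

// Try the LLM path; on error or timeout, run the deterministic fallback instead.
async function runStep<T>(
  llm: () => Promise<T>,
  fallback: () => T,
  timeoutMs: number,
): Promise<StepResult<T>> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("LLM timeout")), timeoutMs);
  });
  try {
    const output = await Promise.race([llm(), timeout]);
    return { mode: "llm", output };
  } catch {
    return { mode: "deterministic", output: fallback() };
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

Because both paths return the same output type, downstream schema validation and verification do not need to know which one ran; the `mode` label exists for the audit trail.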

How do you trust LLM output?

Every agent output is validated against a Zod schema; a failed validation gets one retry and then raises a typed error. The deterministic Verifier then independently recomputes the numbers. The model is never the source of truth for a figure.
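The validate, retry once, then fail pattern can be sketched without depending on any particular library. The `SafeParser` interface below mirrors the result shape of Zod's `safeParse`, so a Zod schema should be usable in its place; `callValidated` and `AgentOutputError` are hypothetical names.

```typescript
// Structural stand-in for a schema: matches the shape returned by Zod's safeParse.
interface SafeParser<T> {
  safeParse(
    input: unknown,
  ): { success: true; data: T } | { success: false; error: unknown };
}

// Typed error raised when no attempt produced schema-valid output.
class AgentOutputError extends Error {
  attempts: number;

  constructor(attempts: number) {
    super(`agent output failed schema validation after ${attempts} attempts`);
    this.name = "AgentOutputError";
    this.attempts = attempts;
    Object.setPrototypeOf(this, AgentOutputError.prototype);
  }
}

// Call the agent, validate, and allow one retry (two attempts total) by default.
async function callValidated<T>(
  schema: SafeParser<T>,
  callAgent: (attempt: number) => Promise<unknown>,
  maxAttempts = 2,
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await callAgent(attempt);
    const parsed = schema.safeParse(raw);
    if (parsed.success) return parsed.data;
  }
  throw new AgentOutputError(maxAttempts);
}
```

Passing the attempt number to `callAgent` leaves room for the retry prompt to include the validation failure, although this sketch does not do so.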

Is it tested?

Deterministic unit tests for the CRE math, the verifier, the hard-case gate, and the accuracy metrics; a reproducible backtest harness; typed end-to-end engagement flows.

Demo runs on synthetic/curated market data; the methodology, calibration approach, and audit trail are production-grade. Accuracy figures regenerate via pnpm backtest. Data sources & production-readiness →