Every question a CTO, client, or investor should ask before trusting an AI-native appraisal firm — answered concretely, with links to where each claim is provable in the product.
Is it actually accurate, and do you know when it's wrong?
Backtested on held-out ground-truth sales (leave-one-out): 4.3% median error, 88.3% within 10% of realized price — in the “strong AVM” band. Full accuracy report →
Yes — and we measure it. The 90% CI contains the realized price 92.5% of the time (target ≈ 90%). Most providers never disclose calibration; we publish the curve.
Uncertainty widens the interval and lowers confidence; objective “hard-case” triggers (comp dispersion, thin sets, approach divergence) route the file to an adversarial review and then to the MAI. The system is built to surface doubt, not bury it.
No. The LLM chooses inputs with justification; all arithmetic is deterministic. An always-on Verifier recomputes every figure from source records and checks grounding before any human sees it — a number that can't be traced to a record can't be emitted.
A small +2.6% positive bias exists in the backtest; we disclose it openly and the MAI review is tuned to catch systematic over-valuation on individual files.
Will it hold up to a regulator, an auditor, or in litigation?
The full appraisal is developed and reported per USPAP, signed by an employed MAI who performs the inspection and takes responsibility. The indicative product is clearly labeled a screening estimate, not an appraisal.
A credentialed MAI appraiser of record signs every full appraisal and bears USPAP responsibility — Strata Valuation is a licensed appraisal firm, not a software vendor disclaiming the number. An independent Standard-3 reviewer concurs.
A complete, immutable audit trail: every input, source, agent decision, human override (distinct from agent decisions), timestamp, and model version. See the workflow →
The report carries the FIRREA de-minimis determination, appraiser-independence and competency statements, and intended-use/intended-user disclosures.
Each sale is confirmed as to conditions of sale; non-arm's-length (related-party, REO, bulk) transactions are flagged and excluded or adjusted, with the verification source retained in the work file.
Where does judgment enter, and is it real or a rubber stamp?
No. The pipeline pauses at needs_review; the MAI reviews every material decision and can approve, override (new value + rationale), or flag — and a revision re-runs through the Verifier. Sign-off only happens on explicit approval.
Never. The agent's original decision is preserved; the human override is a new, linked record. Both are kept — that's what makes the trail defensible.
Every MAI comment and correction is a labeled training example. Disagreements and “should-have-done” corrections are the highest-value signal — the agents learn the firm's judgment. That compounding reasoning dataset is the moat.
Is our data safe, and where does it go?
Engagement documents, property data, and valuations are stored in the firm's Postgres with per-engagement isolation. The demo runs on synthetic data; production deployment supports a dedicated/VPC database and encryption at rest and in transit.
No. Client data is not used to train external foundation models. The reasoning dataset that improves Strata Valuation's own agents is the firm's, governed by the engagement terms.
Yes — append-only audit events on every valuation and an engagement event timeline on every order, both queryable.
Client and operations are separate, role-gated experiences sharing one substrate. Clients see status + deliverables; agent reasoning, overrides, and feedback are internal-only.
Does this actually make money and scale?
LLM cost is ~$0.30–1.20 per full appraisal (opus for judgment, haiku for structuring), hard-capped per run with a typed budget error. Against a $1–15K fee, agent cost is a rounding error; the MAI's time is the real input — and the agents do the 95% that doesn't require a license.
Agents parallelize the analysis that takes a traditional appraiser days. Strata Valuation delivers in ~3 business days vs. the industry's 4–6 weeks — the MAI reviews and signs rather than building from scratch.
The constraint is licensed MAI review time, not analysis. Because agents do the development and the verifier de-risks it, one MAI can review and sign many more files than they could author — that's the operating leverage.
The structured reasoning dataset. “CoStar has transactions; Strata Valuation has reasoning chains.” Every decision and every MAI correction compounds into proprietary training data an incumbent can't replicate by buying a model.
How is it built, and what breaks?
Next.js + TypeScript, Postgres + Drizzle, the Anthropic SDK for agents, Zod-validated boundaries everywhere. CRE math is deterministic and unit-tested in src/domain; agents choose inputs, never do arithmetic.
The engine runs in a deterministic fallback mode with no LLM at all — real CRE math + heuristic comp scoring produce a complete, schema-valid result. Every step is labeled llm or deterministic.
Every agent output is validated against a Zod schema with one retry, then a typed error. Then the deterministic Verifier independently recomputes the numbers. The model is never the source of truth for a figure.
Deterministic unit tests for the CRE math, the verifier, the hard-case gate, and the accuracy metrics; a reproducible backtest harness; typed end-to-end engagement flows.
Demo runs on synthetic/curated market data; the methodology, calibration approach, and audit trail are production-grade. Accuracy figures regenerate via pnpm backtest. Data sources & production-readiness →