The Fair Housing analyses published by AVM vendors such as Veros and Clear Capital represent important early efforts to evaluate potential disparate impact in automated valuation models. These studies contribute useful perspective to an evolving area of the industry, but they are inherently constrained by scope, methodology, and—most importantly—objectivity. Their findings are self-assessments rather than independent evaluations: each vendor analyzes only its own model, using its own data and assumptions, and typically concludes that little to no bias exists, which limits their usefulness for broader risk management and supervisory purposes.
Regulated institutions, however, must operate under much more rigorous expectations. The new Interagency AVM Quality Control Standards require lenders to demonstrate that AVMs used in credit decisions are independently validated and fairly applied. This standard cannot be meaningfully satisfied by vendor-authored whitepapers alone.
AVMetrics’ methodology is designed specifically to meet these supervisory needs. Rather than focusing on individual model performance within internally defined samples, AVMetrics conducts standardized, national-level testing across 700,000 to 1 million transactions each quarter. This approach ensures that fairness conclusions reflect real-world market diversity and enables consistent evaluation across models, markets, and time.
AVMetrics independently tests eight different dimensions in which AVMs could potentially disadvantage protected classes, including coverage rates (hit rate), accuracy, precision, and other core performance measures. To support statistically meaningful comparisons, AVMetrics has invested in neighborhood-level demographic data, enabling analysis across comparison neighborhoods. This level of aggregation avoids the masking effects of county-level analysis while preserving larger sample sizes than census-tract granularity would allow.
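To make these dimensions concrete, the sketch below illustrates, in simplified form and not as a representation of AVMetrics’ production pipeline, how hit rate and valuation-error dispersion might be compared across two hypothetical neighborhood groups; all data, column names, and group labels are invented for the example.

```python
import pandas as pd

# Hypothetical AVM results joined to neighborhood-level demographic groupings.
# "avm_value" is None where the model returned no estimate (a coverage miss).
df = pd.DataFrame({
    "neighborhood_group": ["A", "A", "A", "B", "B", "B"],
    "avm_value":  [310_000, None, 420_000, 295_000, 150_000, None],
    "sale_price": [300_000, 250_000, 400_000, 310_000, 160_000, 180_000],
})

# Coverage (hit rate): share of transactions for which the AVM returned a value.
hit_rate = df.groupby("neighborhood_group")["avm_value"].apply(lambda s: s.notna().mean())

# Accuracy and precision proxies: central tendency and dispersion of percentage error
# on the transactions the model did value.
scored = df.dropna(subset=["avm_value"]).assign(
    pct_error=lambda d: (d["avm_value"] - d["sale_price"]) / d["sale_price"]
)
error_summary = scored.groupby("neighborhood_group")["pct_error"].agg(["median", "std"])

print(hit_rate)
print(error_summary)
```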
Further, AVMetrics applies Standardized Mean Difference (SMD)—the same effect-size metric commonly used in fair-lending analytics—providing a clear measure of whether disparities are material, not simply detectable. In contrast, many model-specific analyses typically use raw accuracy differences or simple correlations, which offer no interpretive scale for examiners assessing practical significance. AVMetrics’ approach produces metrics that are grounded in established methodology, interpretable, and defensible.
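For readers less familiar with the metric, a minimal sketch of the SMD calculation (the pooled-standard-deviation form, essentially Cohen's d) appears below; the error distributions are simulated for illustration and do not come from AVMetrics data.

```python
import numpy as np

def standardized_mean_difference(group_a, group_b):
    """Difference in group means scaled by the pooled standard deviation (Cohen's d form)."""
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    pooled_var = (
        (len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)
    ) / (len(a) + len(b) - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

# Simulated AVM percentage errors for two comparison groups (illustrative only).
rng = np.random.default_rng(42)
errors_group_a = rng.normal(loc=0.020, scale=0.08, size=500)
errors_group_b = rng.normal(loc=0.035, scale=0.09, size=500)

smd = standardized_mean_difference(errors_group_a, errors_group_b)
print(f"SMD = {smd:.3f}")
# Cohen's conventional benchmark treats |SMD| below roughly 0.2 as a small effect,
# which is what gives the metric an interpretive scale for materiality.
```

Because SMD is expressed in pooled-standard-deviation units, the same materiality thresholds can be applied consistently across models and markets, which is precisely the interpretive scale that raw accuracy differences lack.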
As the next generation of AVMs incorporates increasingly complex machine learning and generative AI techniques, vendor-driven testing becomes even less transparent. AVMetrics’ methodology is intentionally model-agnostic: we can evaluate the fairness and performance of traditional hedonic models, gradient-boosted decision tree (GBDT) systems, deep learning models, or hybrid AI architectures with equal rigor. As models become more opaque, a neutral, independent evaluator becomes increasingly essential.
In contrast to analyses intended to provide general assurance around individual models, AVMetrics delivers regulatory-grade evidence. By identifying how model risk and policy risk can interact to generate disproportionate impacts—an expectation embedded in the new regulatory framework—our testing equips lenders with the actionable intelligence needed to inform, calibrate, and justify their risk-policy decisions.
As regulatory expectations around AVM fairness continue to mature, institutions must move beyond model-specific assurances toward independent, repeatable, and scalable evaluation frameworks. AVMetrics’ fair housing methodology is purpose-built to meet these expectations, providing lenders with nationally consistent, statistically rigorous, and model-agnostic evidence of AVM performance and potential disparate impact. By aligning testing design with supervisory standards and real-world production environments, AVMetrics enables institutions not only to identify and manage fair-lending risk, but also to demonstrate compliance with confidence in increasingly complex valuation ecosystems.
