After running our legacy and PTM™ testing processes in parallel for several years, AVMetrics has formally completed its transition to Predictive Testing Methodology (PTM™) as our sole testing platform. The legacy testing process has been retired.
This transition was neither abrupt nor made in isolation. It marks the completion of a multi-year effort involving both model builders and AVM users.
Model vendors participated actively in PTM™ testing, recognizing the importance of delivering results that better reflect how their models perform for end users: lenders, servicers, and investors. Over the past several years, AVMetrics worked in close collaboration with AVM vendors and institutional clients to run both approaches in parallel, giving our clients, the vendors, and our own team the opportunity to validate results side by side. That validation is now conclusive, and continuing to maintain the legacy system no longer serves the needs of the industry.
Why the legacy method is being retired
Traditional AVM testing compares model estimates to sale prices after those prices—and the listing data that precedes them—are already available to the models. As we’ve documented extensively, many AVMs incorporate MLS listing prices into their estimates, a phenomenon known as listing price anchoring, which inflates apparent accuracy in ways that don’t reflect real-world performance. When a model knows the listing price before producing its estimate, the resulting test measures something closer to consistency with known information than true predictive accuracy.
The anchoring effect isn’t theoretical—it’s quantified. Zillow reports a median error of approximately 1.7% for on-market homes but 7.2% for off-market homes. Redfin reports nearly identical figures: roughly 2.0% on-market and 7.6% off-market. That gap—a three- to four-fold difference in accuracy—represents the influence of listing price anchoring on AVM performance. Independent analysis by the AEI Housing Center corroborates this: their 2024 evaluation of five AVM providers found that “springiness”—how much a model’s estimate jumps when listing or sale prices become available—showed the widest variance in scores of any criterion tested, with some providers receiving failing grades. Since the vast majority of AVM use cases involve unlisted properties—refinances, home equity lending, portfolio assessments, loss mitigation—testing that includes listing data paints a misleading picture of how models will actually perform when it counts.
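To make the gap concrete, here is a minimal sketch of how an on-market versus off-market error split can be computed from a table of AVM estimates and subsequent sale prices. The column names (estimate, sale_price, was_listed) are hypothetical illustrations, not AVMetrics' actual schema, and the metric shown is a simple median absolute percentage error of the kind the published figures above appear to reflect.

```python
import pandas as pd

def median_abs_pct_error(df: pd.DataFrame) -> float:
    """Median absolute percentage error of AVM estimates vs. sale prices."""
    pct_error = (df["estimate"] - df["sale_price"]).abs() / df["sale_price"]
    return float(pct_error.median()) * 100.0

def anchoring_gap(df: pd.DataFrame) -> dict:
    """Split by whether the property was listed when the estimate was
    produced; the off/on ratio approximates the anchoring effect."""
    on_market = median_abs_pct_error(df[df["was_listed"]])
    off_market = median_abs_pct_error(df[~df["was_listed"]])
    return {"on_market": on_market,
            "off_market": off_market,
            "gap": off_market / on_market}
```

Plugging the published figures into that ratio (7.2 / 1.7 ≈ 4.2 for Zillow, 7.6 / 2.0 = 3.8 for Redfin) reproduces the three- to four-fold gap described above.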
PTM™ solves this by using AVM estimates produced before properties are listed. Our Model Repository Database (MRD™) stores monthly valuations for every residential property in America from every participating AVM. By matching these pre-listing estimates against eventual arm’s-length sale prices, we isolate each model’s genuine predictive capability on a level playing field.
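Conceptually, the matching step reduces to a point-in-time join: for each arm's-length sale, look up the latest estimate each model produced before the property was listed. The sketch below illustrates the idea in Python under assumed, hypothetical table and column names (estimates with one row per property, model, and valuation month; sales carrying the eventual sale price and the first listing date); it is not the actual MRD™ schema or matching logic.

```python
import pandas as pd

def match_pre_listing(estimates: pd.DataFrame, sales: pd.DataFrame) -> pd.DataFrame:
    """For each arm's-length sale, keep each model's latest estimate dated
    before the property was listed, then score it against the sale price."""
    merged = estimates.merge(sales, on="property_id")
    # Only estimates produced before the listing, so the model
    # could not have seen the listing price.
    pre_listing = merged[merged["as_of"] < merged["list_date"]]
    latest = (pre_listing.sort_values("as_of")
                         .groupby(["property_id", "model"], as_index=False)
                         .last())
    latest["pct_error"] = ((latest["estimate"] - latest["sale_price"])
                           / latest["sale_price"])
    return latest
```

Because the estimate predates the listing, any agreement with the eventual sale price reflects genuine prediction rather than anchoring to known information.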
Aligned with regulatory expectations
This transition also reflects longstanding supervisory expectations for how AVMs should be tested, validated, and governed. While regulatory frameworks have evolved toward a more principles-based approach, the core elements of sound model oversight remain consistent: independent validation, outcomes-based testing, and a clear understanding of model limitations. Three authorities continue to frame these expectations:
Interagency Appraisal and Evaluation Guidelines (2010), Appendix B. The federal banking agencies direct that:
- To ensure unbiased test results, institutions should compare AVM results to actual sales data in a specified trade area or market prior to the information being available to the model. If more than one AVM is used, each should be validated.
- Institutions should evaluate the underlying data used in the model(s), including data sources and types, frequency of updates, quality control performed on the data, and the sources of data in states where public real estate sales data are not disclosed.
- Institutions should not rely solely on validation representations provided by an AVM vendor.
Model Risk Management Expectations. Current supervisory guidance reinforces the importance of:
- Independent validation and effective challenge
- Outcomes analysis comparing model outputs to real-world results
- Ongoing monitoring aligned to model use and materiality
AVM Quality Control Standards (2025). The QC Standards establish that AVMs must be accurate, reliable, and nondiscriminatory, with institutions responsible for testing, monitoring, and controlling for bias and performance risk. While the QC Standards do not prescribe a specific methodology, they place responsibility on institutions to demonstrate that their approach is reasonable and well-supported. PTM™ operationalizes these principles at scale: independent, pre-information testing against actual arm's-length sales, conducted continuously across every participating model.
What this means for the industry
For lenders and AVM users: AVMetrics’ independent AVM testing now reflects how models actually perform in the scenarios that matter most—refinances, HELOCs, portfolio assessments, and other situations where no listing price exists. The results you receive are a more realistic measure of the accuracy you can expect in production.
For AVM vendors: This transition reflects a shared direction. Vendors have participated in and supported PTM™ testing, recognizing the importance of delivering performance transparency that aligns with client expectations. Vendors continue to receive anonymized comparative analysis showing where their models stand relative to the field.
For the industry: This marks a move toward a more consistent, independent, and forward-looking standard for AVM validation—one that supports better model selection, stronger governance, and more credible cascade optimization and compliance decisions.
What hasn’t changed
Our role.
AVMetrics continues to provide independent, transparent AVM testing and validation aligned with the expectations set forth in the Interagency Guidelines and the AVM Quality Control Standards. This independence remains a key distinction: many AVM performance reports available in the market are produced by model providers evaluating their own models. While those efforts offer useful monitoring insights, they cannot match the objectivity, comparability, or cross-model scope of third-party evaluation.
AVMetrics' role continues to be providing a consistent, independent view across all participating models, supporting a more complete and defensible understanding of performance. We rigorously evaluate model performance and deliver comparable, objective results across providers in support of due diligence, model validation, and ongoing monitoring. We continue to provide Model Preference Tables™, cascade optimization, geographic performance rankings, and all of the analytical products our clients rely on.
Just as importantly, we continue to provide clear guidance on appropriate use:
- When an AVM is fit for purpose
- When one model is more appropriate than another
- And when an AVM should not be relied upon at all
The scope and rigor of our testing have only increased. Our focus remains the same: enabling clients to make informed, defensible decisions about AVM use through independent validation and consistent, evidence-based analysis.

