House price indices: Making sense of the data overload

Christopher Joye, December 8, 2020

Why do we care about house prices? Where do we get the data from? How do we measure changes in house prices when all homes are different, and sold only every seven or eight years? Why do we sometimes see conflicting house price index results? And who does Australia’s most knowledgeable and respected economic authority, the Reserve Bank of Australia, rely on?

If you are going to invest in the housing market, it is crucial to get a better understanding of the information that purports to describe it. The only way you can quantify historical returns (capital growth rates and yields) and risks (i.e., the volatility of those returns) is by using house price indices. As I explained last week, we will also shortly see new financial markets emerging that will offer investment products tied to these indices.

Today I will try to provide a comprehensive “primer” on this subject of house price data and the indices that are developed from it. I hope this will be a resource that you can regularly return to when questions inevitably arise.

1. Why do we care about house prices?

Exclusive of our human capital, or income-generating potential, the largest component of Australian household wealth is bricks and mortar. That is, the assets we tend to take for granted nevertheless provide one of our most basic human needs to survive and contribute productively to society: well-located shelter.

Our best guess is that the total value of privately owned residential real estate is about $3.5 trillion. Understanding how house prices are changing over time is, therefore, crucial to getting a read on variations in household wealth, which in turn influence savings and investment decisions.

Accurate benchmarks of house price movements are also vital if one is to shed light on how much it costs to buy a home in various regions across the country, and the question of how housing affordability changes over time.

2. Why is it hard to measure house prices?

In contrast to, say, shares listed on the ASX, where every individual share is identical in legal form and “market prices” for shares in large companies are observed regularly during the trading day, houses are distinct in two fundamental respects: first, we only see sales transactions for individual homes on average every seven to eight years; and, secondly, each house tends to be different to the next. That is, it ordinarily has a unique location and physical characteristics, such as land size, aspect, number of bedrooms/bathrooms, build quality, number of car spaces, presence or otherwise of a pool, air-conditioning, tennis court, and so on.

Over the past 50 years the academic research literature has devoted a great deal of effort to developing increasingly precise house price measurement techniques that overcome these two fundamental problems: namely, the “illiquidity” (i.e., infrequent trading) and “heterogeneity” (i.e., every home is different) of housing. In this context, the literature has made tremendous strides, and Australian researchers have been at the vanguard of these efforts.

Arguably the culmination of this work is the “hedonic house price index” method, which uses a statistical technique known as “regression analysis” to assess the relationship between the prices of homes and their individual attributes (e.g., location, land size, number of beds/baths, etc.), which are the so-called “explanatory variables”.

The rationale underlying “hedonic theory” is that the value of a composite good, such as a house, is the sum of its individual characteristics, much like the price of a computer can be worked out by adding up the value of the screen, hard-drive, RAM, processor, keyboard, etc. By deconstructing a house according to its attributes one can control for the physical differences across all homes. Moving beyond this description the technical literature becomes rather complex, and is categorically not the subject of today’s column. (For completeness's sake there are, in fact, three alternative hedonic methods: the “pooled time dummy”; “adjacent period”; and “imputation” techniques. RP Data-Rismark produces all three of them.)
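To make the hedonic idea concrete, here is a minimal sketch of the “pooled time dummy” variant in Python. It is an illustration under stated assumptions, not RP Data-Rismark's actual model: the attribute set (bedrooms and land size only), the log-linear functional form, the function names and the sample data are all hypothetical. The log of each sale price is regressed on the home's attributes plus one dummy variable per period, and the index is read off the period coefficients.

```python
import math

def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def time_dummy_hedonic_index(sales, n_periods):
    """sales: (period, bedrooms, land_sqm, price) tuples. Returns index levels, period 0 = 100."""
    rows, y = [], []
    for period, bedrooms, land_sqm, price in sales:
        # One dummy per period after the base period
        dummies = [1.0 if period == t else 0.0 for t in range(1, n_periods)]
        rows.append([1.0, float(bedrooms), math.log(land_sqm)] + dummies)
        y.append(math.log(price))
    beta = ols(rows, y)
    time_effects = [0.0] + beta[3:]  # base period effect is zero by construction
    return [100.0 * math.exp(effect) for effect in time_effects]

def hypothetical_price(period, bedrooms, land_sqm):
    """Made-up pricing rule: attribute value times 5% growth per period."""
    quality = math.exp(11.0 + 0.15 * bedrooms + 0.4 * math.log(land_sqm))
    return quality * 1.05 ** period

# Hypothetical sales: the homes sold in period 1 are bigger than those sold in period 0
homes = [(0, 2, 400), (0, 3, 600), (0, 4, 500), (1, 2, 500), (1, 3, 400), (1, 5, 700)]
demo_sales = [(t, b, s, hypothetical_price(t, b, s)) for t, b, s in homes]
index = time_dummy_hedonic_index(demo_sales, n_periods=2)
print([round(level, 2) for level in index])
```

Because the sample prices are generated from the same log-linear rule the regression assumes, the fitted index should recover the 5% underlying growth even though the mix of homes sold changes between the two periods. Real data are noisier, and a production model would use many more attributes and observations.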

As I discuss below, the single biggest impediment to publishing hedonic indices has been the demanding data requirements: you need accurate information on the essential characteristics of most homes, which, historically, few house price index providers have ever had. This is why hedonic indices are relatively rare, and today only produced widely in Australia and the United Kingdom. It is also why most countries have historically opted for much simpler and less exacting approaches, such as the “median price” and “repeat-sales” index methods.

Australian academic economist Professor Robert Hill, who is based at the University of Graz, is one of the world’s leading experts on how to best construct house price indices. And subject to getting access to the right data, Hill’s preferred index method is the hedonic approach.

In Australia, the property information company RP Data collects detailed housing “attribute” data on almost all new sales. This made it possible to launch hedonic house price indices for the first time in 2006.

3. What different house price indicators are out there?

Domestically, there are three major index providers – RP Data and Rismark (two separate companies that joined together to produce housing analytics), the Australian Bureau of Statistics, and Australian Property Monitors – that are most frequently quoted by industry and the media. A fourth, Residex, also publishes house price data and is less regularly referenced. Occasionally, one also sees information supplied by the different real estate institutes (particularly in Victoria).

The RBA, which is Australia’s most well-regarded, knowledgeable and experienced economic analyst, relies on the RP Data-Rismark, APM and ABS data, in that order of preference (this can also be seen from its quarterly Statement on Monetary Policy, which prioritises RP Data-Rismark results). Of these three sources, only RP Data-Rismark supply index data publicly on a monthly basis. This is because the hedonic index that RP Data-Rismark publishes has significantly lower “noise” and “revision bias” attributable to its monthly estimates in comparison to the stratified median measures used by APM and the ABS.

Recent RBA disclosures in its board minutes indicate that the RBA follows these preliminary monthly movements quite closely. The RBA also uses RP Data-Rismark’s monthly hedonic index results for the purposes of its monthly Chart Pack (see below).

The RBA likes having access to three independent house price indices, which it can use to sanity-test anomalous outcomes. That is to say, it does not rely exclusively on any one dataset. This is, quite understandably, the principle the RBA applies to all of its analysis. Having said that, the ABS and APM indices often agree with one another given the similarities in their methods, and there is evidence to suggest that when they conflict with the RP Data-Rismark findings, as in the first quarter of 2009, the RBA places greater weight on the latter. This is because the hedonic approach is better equipped to overcome the extreme “compositional biases” that sometimes afflict the ABS and APM proxies (such as with the unusual surge in first-time buyers at the start of 2009). Generally, though, there are reasonably strong commonalities between these three alternatives over the medium term.

Working out which index suppliers the RBA relies on is also important for the economics community that is paid to watch the central bank. In this respect, Westpac chief economist Bill Evans has commented: “RP Data-Rismark are the RBA’s ‘preferred data analysts' for house prices.”

4. What are the differences between these providers?

There are three key points of departure among the house price indices:

1) The data they collect;

2) The data they actually use; and

3) The accuracy/complexity of the index methodology they rely on.

All four index providers referenced above collect data from the Valuer Generals or Land Titles offices in each state and territory. In Australia, we have the important advantage that government agencies record data on pretty much all sales executed across the country. This is a function of our stamp duty system. These agencies then make the data available to a limited number of licensed contractors, such as RP Data, APM and the ABS. While some states report the data with up to a three-month lag, the timeliness of the information is always improving, with most agencies transmitting data within one to two months of the exchange (note, not settlement) of contracts. (The RBA is to be credited as a critical influence in motivating these improvements.)

This means that most Australian house price indices benefit from the “population” of all sales transactions – i.e., there is little to no “sample selectivity bias” wherein the index only employs a small subset of the overall population of information. US and UK house price indices suffer from exactly this problem. For example, the Case-Shiller index ignores about 40% of the US market, while the widely quoted Halifax measure in the UK captures less than 20% of all sales.

It is worthwhile summarising the contrasting techniques used by the publicly available information providers:

1) The Real Estate Institute indices are based on simple “median prices”, which are crude and quite unreliable (the RBA and the Treasury have explicitly recommended against using this approach). A median price index ranks all sales from high to low and plucks out the middle or 50th percentile observation. The REI indices are normally reported quarterly.

2) The ABS reports a “stratified median price method” that is broadly based on the methodology developed by two RBA economists, Richards and Prasad. The main author, Dr Anthony Richards, is head of economic analysis within the RBA and regarded as one of Australia’s most expert housing authorities.

The RBA has made the measurement of house prices a particular focus ever since former governor Ian Macfarlane correctly argued in 2004 that “housing…is an extremely important asset class for most people, yet … [i]t really is probably the weakest link in all the price data in the country, so I think it is something that I would like to see resources put into”.

The launch of RP Data-Rismark's hedonic indices in 2006 was a response to the RBA’s “call to arms”.

Although the ABS numbers derive from a median price index, the stratification technique they use helps mitigate some of the severe “compositional biases” associated with simple medians such as those reported by REIs (I discuss these biases in more detail below).

The RBA believes that in a perfect world one would use more sophisticated “regression-based” methods, such as the hedonic indices produced by RP Data-Rismark. As noted above, however, hedonic indices are complex to compute, and have intensive data requirements on the unique attributes of every individual property included in the index. In the absence of the necessary data and the considerable statistical expertise needed to estimate hedonic measures, the stratified median price benchmark appears to be the RBA’s second-best preference (see also here for a summary of the ABS’s index method). It is noteworthy that the Reserve Bank of New Zealand has recently decided to follow Richards and Prasad’s stratified median recommendations.

Finally, the ABS reports on a quarterly basis and typically after APM and RP Data-Rismark’s numbers have come out, which creates the impression of rolling waves of housing information that ordinarily, but not always, coincide in a directional sense.

3) The Fairfax-owned APM also publishes a stratified median price method that is based even more closely on the RBA’s stratification technique than its ABS cousin. APM reports quarterly.

4) Residex uses a “repeat sales” approach that is understood to be similar to the Case-Shiller technique published by S&P in the US (Residex does not disclose its method, in contrast to all the other suppliers).

A repeat-sales index only examines purchases and sales of the same properties over time. It therefore has the strength of measuring buy-and-hold returns, but suffers from the deficiency that it excludes all sales transactions that do not have a previous purchase price (e.g., new home sales). The repeat-sale index can also be biased towards homes that turn over more rapidly (e.g., distressed sales). Finally, the repeat-sales proxy can be artificially inflated by renovations or capital improvements to the property, which are hard for this measure to control for. (There has been some work done at Yale on developing repeat-sales proxies that account for non-linear changes in return, which are thought to be triggered by renovations to the home.)

Although Residex reports monthly, it does so shortly after the end of the subject month and must, one can infer, be rather limited in the amount of data the company can actually include in its index (i.e., have a small sample size). The RBA does not appear to focus on Residex’s results, which the RBA dropped from its Statement on Monetary Policy several years ago.

5) RP Data-Rismark produces all of the above methodologies and our preferred benchmark, the hedonic index. In total, RP Data-Rismark privately computes up to 15 alternative index measures, including several median and stratified median price indices, four repeat-sales constructs, and a number of hedonic benchmarks. All of these are available to the public on request.
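The repeat-sales idea described under item 4 can be sketched in a few lines of Python. This is a deliberately simplified matched-pairs calculation with hypothetical function names and made-up data. It is not the regression-based estimator behind Case-Shiller, and Residex's undisclosed method may differ again. Each home that sold at least twice contributes a pair of prices, and the average annual growth rate is taken across all pairs.

```python
import math
from collections import defaultdict

def repeat_sales_pairs(sales):
    """sales: (property_id, year, price) tuples. Returns (years_held, price_ratio) pairs."""
    by_property = defaultdict(list)
    for property_id, year, price in sales:
        by_property[property_id].append((year, price))
    pairs = []
    for history in by_property.values():
        history.sort()
        # Consecutive sales of the same home form one repeat-sales pair
        for (y0, p0), (y1, p1) in zip(history, history[1:]):
            if y1 > y0:
                pairs.append((y1 - y0, p1 / p0))
    return pairs

def average_annual_growth(pairs):
    """Geometric average growth per year, weighting each pair by its holding period."""
    total_log_change = sum(math.log(ratio) for _, ratio in pairs)
    total_years = sum(years for years, _ in pairs)
    return math.exp(total_log_change / total_years) - 1.0

# Hypothetical sales: homes A and B each grow 5% a year; home C sold only once
demo_sales = [
    ("A", 2000, 300_000), ("A", 2008, 300_000 * 1.05 ** 8),
    ("B", 2003, 500_000), ("B", 2010, 500_000 * 1.05 ** 7),
    ("C", 2005, 400_000),
]
pairs = repeat_sales_pairs(demo_sales)
growth = average_annual_growth(pairs)
print(len(pairs), round(growth, 4))
```

Home C drops out entirely because it has no earlier purchase price, which is the exclusion problem noted above. Nothing in the calculation can tell whether home A was renovated between its two sales.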

Over and above the contrasting methodologies, there are some material differences in the data used by these organisations:

The ABS only examines detached houses in capital cities and therefore excludes all “attached” forms of accommodation such as apartments, terraces and semis (which account for about one-quarter of the housing stock);

APM and RP Data-Rismark include all capital city data pertaining to all property types (i.e., detached and attached housing);

To the best of my knowledge, the only index provider that publishes an “all dwellings” proxy (i.e., an index that covers all property types) in addition to separate house and unit benchmarks is RP Data-Rismark; and

APM and RP Data-Rismark are the only organisations to regularly publish regional, or non-capital city, house price benchmarks, which we developed especially for the RBA.

5. Why are there different “median prices”? Which one is right?

There can only be one true median price, as it is a strict mathematical definition. The median is actually very simple: it is the middle or 50th percentile observation. The median of a sample of home sales is the middle sales transaction if you lined up all those sales from low to high.

The median prices reported by RP Data-Rismark are based on close to 100% of all home sales executed across Australia and are believed to be absolutely accurate. However, these medians may differ from other data providers if they use smaller samples of sales, which they sometimes do, or if they are not actually calculating a true median.

For example, the medians reported by APM are not actually the true 50th percentile (or middle) transaction of all sales, but rather the median deriving from APM’s stratified index. The APM index divides all suburbs into 10 baskets (or deciles) ranked by their median price (from high to low). The median price sourced from this index is then presumably the median associated with the fifth decile. This can, of course, vary from the median of all sales in the absence of any stratification. (For the interested reader, the “index” that is produced by APM is an average of the growth rates in the median prices associated with each of the 10 baskets of suburbs referred to above.)
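The stratification described above can be sketched as follows. This is a rough illustration rather than APM's or the ABS's actual code: the function names, the suburb names and the prices (in $'000) are hypothetical, and the example uses two baskets instead of ten to keep the data small. Suburbs are ranked by their median price in a base period and split into baskets. The index growth is the average of the growth rates in each basket's median price.

```python
from statistics import mean, median

def assign_strata(base_sales, n_strata=10):
    """Rank suburbs by base-period median price and split them into n_strata baskets."""
    ranked = sorted(base_sales, key=lambda suburb: median(base_sales[suburb]))
    return {suburb: i * n_strata // len(ranked) for i, suburb in enumerate(ranked)}

def stratum_medians(sales, strata):
    """Pool the sales of all suburbs in each basket and take the basket's median price."""
    pooled = {}
    for suburb, prices in sales.items():
        if suburb in strata:
            pooled.setdefault(strata[suburb], []).extend(prices)
    return {stratum: median(prices) for stratum, prices in pooled.items()}

def stratified_growth(prev_sales, curr_sales, strata):
    """Average of the growth rates in each basket's median price."""
    prev = stratum_medians(prev_sales, strata)
    curr = stratum_medians(curr_sales, strata)
    shared = sorted(set(prev) & set(curr))
    return mean(curr[s] / prev[s] - 1.0 for s in shared)

# Hypothetical sales by suburb, prices in $'000
prev_sales = {
    "Suburb A": [200, 210, 220],
    "Suburb B": [250, 260, 270],
    "Suburb C": [800, 820, 840],
    "Suburb D": [900, 950, 1000],
}
# Every price rises 2%, and the cheaper suburbs trade twice as often as before
curr_sales = {
    suburb: [p * 1.02 for p in prices] * (2 if suburb in ("Suburb A", "Suburb B") else 1)
    for suburb, prices in prev_sales.items()
}
strata = assign_strata(prev_sales, n_strata=2)
growth = stratified_growth(prev_sales, curr_sales, strata)
print(round(growth, 4))
```

Because each basket is measured separately, a change in how many sales come from cheap versus expensive suburbs does not move the result. Shifts in the mix of homes sold within a basket can still do so.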

Another major point of distinction among median prices is what they relate to, which is not always obvious. For example:

The ABS disseminates medians that cover detached houses in capital cities only;

APM reports medians relating to all property types in capital cities dissected according to houses and units;

RP Data-Rismark publishes medians that cover all property types in all regions (i.e., not just capital cities). This is important since about 40% of all homes are not located in the capitals.

A final prospective difference is the time period during which the median price is measured (e.g., monthly, quarterly or annually).

RP Data-Rismark prefers to compute medians based on the previous three months’ worth of sales transactions. We do this because the medians can be very volatile on a month-to-month basis. This is why when we measure house price changes over time, RP Data-Rismark does not use a simple median price index.
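As a small illustration of the three-month window, the sketch below pools each month's sales with those of the two preceding months before taking the median. The function name and the figures (in $'000) are hypothetical, and the exact windowing rules used in practice are an assumption here.

```python
from statistics import median

def rolling_three_month_median(monthly_sales):
    """monthly_sales: one list of sale prices per month, in date order.

    Returns one median per month from the third month onward, each based on
    that month's sales pooled with the two preceding months.
    """
    medians = []
    for i in range(2, len(monthly_sales)):
        window = monthly_sales[i - 2] + monthly_sales[i - 1] + monthly_sales[i]
        medians.append(median(window))
    return medians

# Hypothetical sale prices in $'000 for four consecutive months
monthly_sales = [[400, 500], [450], [600, 700], [300]]
smoothed = rolling_three_month_median(monthly_sales)
print(smoothed)
```

Pooling three months gives each median more observations, so a thin month with a handful of unusual sales has less influence on the reported figure.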

As is well known, median price indices can be adversely affected by changes in the composition of buyers in the market, among other biases (such as capital improvements and variations in the type and quality of homes built over time).

The problems associated with median prices were illustrated in the first quarter of 2009, when APM and the ABS reported that house prices were falling – by a record margin in the case of the ABS – when in fact they were rising. The medians were being dragged down by a surge in first-time buyers purchasing cheap homes in the early months of 2009. RP Data-Rismark’s hedonic index, in contrast, reported strong growth during this period.

Following the first quarter of 2009, RP Data-Rismark’s index reported relatively stable quarterly growth. In comparison, the median price indices reported sometimes wild changes in value, which was evident again in the fourth quarter. These estimates were likely being artificially boosted by the fading of first timers and the return of upgraders purchasing more expensive homes, which automatically biases the medians upwards.
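The compositional effect is easy to reproduce with made-up numbers. In the sketch below, which uses entirely hypothetical prices and a deliberately exaggerated shift in the mix of sales, every home is worth 2% more in the second quarter, yet the simple median falls because cheaper homes account for a larger share of transactions.

```python
from statistics import median

def simple_median_change(prev_prices, curr_prices):
    """Proportional change in the simple median between two periods."""
    return median(curr_prices) / median(prev_prices) - 1.0

# Hypothetical market with a cheap segment and an expensive segment
cheap_homes = [300_000, 320_000, 340_000]
dear_homes = [700_000, 750_000, 800_000]

# First quarter: a balanced mix of sales
prev_quarter = cheap_homes + dear_homes
# Second quarter: all values up 2%, but cheap homes trade three times as often
curr_quarter = [price * 1.02 for price in cheap_homes * 3 + dear_homes]

change = simple_median_change(prev_quarter, curr_quarter)
print(round(change, 4))
```

The reported change is negative even though no individual home has lost value, which is the kind of distortion a surge in first-time buyers can produce.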

6. What's the value of looking at median prices when researching properties?

Medians are not very useful for measuring house price growth rates because the median is affected by a range of biases, including:

Different buyer types who happen to be dominating the market (first timers vs. upgraders);

Changes in the types of homes built over time (if we build bigger or smaller homes over time the median may rise or fall, suggesting house prices have appreciated or declined, when in fact they may not have);

Renovations (if homes are renovated this can push the median up when capital growth rates have actually been unchanged); and

The liquidity of different geographies (if more west Sydney homes trade than east Sydney homes, the median may fall when house prices could have been rising).

However, the median is useful if you want to simply know what the middle sales observation in, say, Melbourne, was over the past, say, quarter. This gives you a quick and easy-to-understand guide for the price of the homes being purchased in the market. That is why RP Data-Rismark continues to report the simple medians alongside our hedonic index. Median prices are also useful when seeking to address research questions that are targeted at identifying a “representative” price at any particular point in time.

7. How should property investors treat these data?

If investors want to work out bona fide capital growth rates, they should not use unadjusted median price data. They should try and rely on more sophisticated index methods that overcome the simple median price biases, such as those that have been reviewed above. My preferred approach is the RP Data-Rismark hedonic index. If that is not available, go with APM’s stratified median index.

8. What's the risk of comparing data from different index providers?

Since all the index techniques are different they cannot be compared directly. Having said that, they are all trying to quantify broadly the same thing: changes in the value of residential real estate over time. Accordingly, it is useful to be aware of how the different benchmarks behave, and to seek to understand what factors might be driving observed divergences. This is, I believe, exactly what the RBA does.

Christopher Joye is a leading financial economist and works with Rismark International. Rismark and RP Data provide house price analytics products, and solutions that enable investors to go long and/or short the housing market. The above article is not investment advice. You can follow Christopher on Twitter at @cjoye or read his blog.
