Real Estate Data Lake

Fri 17 May 2019

Mass data…

The real estate sector is gradually going digital, and recent constructions now incorporate equipment and connectivity so that both use and maintenance can be optimised. However, as the renovation of the stock is naturally slow, the majority of existing assets are not “connected”.

Nevertheless, a great deal of data is available and constitutes an important tool for improving performance:

  • location (accessibility/transport, environment, etc.);
  • land (nature, constructibility, utilities/roads, diagnostics, etc.);
  • building (type, surface area, condition, energy consumption, etc.);
  • occupation (users, vacancy, uses, etc.);
  • financial performance (costs, profitability, etc.);
  • transactions (price, time to find a tenant, accompanying measures, etc.);
  • form of holding (acquisition/leasing, structuring, etc.).

Overall, far less use is made of such data than in other sectors (insurance, banking, retail, etc.).

…but still with little structure

While access to information is increasingly facilitated by the spread of databases (both open data and data provided by professional bodies), gathering and structuring your own data remains a more complicated matter.

In part this is because players in the sector, whatever their role (including users), have built up their databases gradually, as and when needs arose. These databases are seldom shared and leave room for improvement: data quality is often uneven in terms of completeness and update frequency, and collection is not always exhaustive. Further, each player has its own definition of key data, and definitions sometimes differ even within the same enterprise.

Progress has been made recently: some associations have established data formats or precise definitions of ratios and cost breakdowns so as to provide relevant benchmarks for their members.

The emergence of data science: fiction or reality?

In the medium term, some aspects of real estate should be widely automated (asset valuation, visual due diligence and data rooms, etc.), which will require a more consistent approach to data. More reliable and better shared data across the sector may well enable the development of artificial intelligence and predictive analysis. It could even, like the social credit system used in China, make it possible to score a developer, provider or tenant on past performance, with a corresponding adjustment of fees, insurance premiums or rent.

Obviously, it is still hard to predict the future of the sector and new technologies will certainly emerge, but it is undeniable that players with good databases will be one step ahead and enjoy a decisive competitive advantage.


  • Itemising all the data available, and identifying possible uses, particularly in the transactions sector and the measurement of performance
  • Considering the value (or otherwise) of new data, weighing the expected gains against the cost and complexity of gathering it
  • Standardising the scope and format of useful data and then ensuring their reliability (completeness, consistency, updating, etc.)
  • Optimising assets through benchmarking and predictive analyses, using the data as a tool for change
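
As an illustration of the reliability checks mentioned above (completeness and updating), here is a minimal Python sketch. It is not a reference implementation: the field names (asset_id, surface_m2, energy_kwh_m2, vacancy_rate, updated_on), the one-year freshness threshold and the sample records are all assumptions made up for the example.

```python
from datetime import date

# Hypothetical field names, chosen for illustration only.
REQUIRED_FIELDS = ("asset_id", "surface_m2", "energy_kwh_m2", "vacancy_rate")


def completeness(records, fields=REQUIRED_FIELDS):
    """Return, per field, the share of records holding a non-empty value."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for r in records if r.get(field) not in (None, "")) / total
        for field in fields
    }


def stale_records(records, as_of, max_age_days=365):
    """Return the IDs of records never updated or updated too long ago."""
    stale = []
    for r in records:
        updated = r.get("updated_on")
        if updated is None or (as_of - updated).days > max_age_days:
            stale.append(r.get("asset_id"))
    return stale


# Invented sample data: three assets with gaps and uneven update dates.
assets = [
    {"asset_id": "A1", "surface_m2": 1200, "energy_kwh_m2": 180,
     "vacancy_rate": 0.05, "updated_on": date(2019, 3, 1)},
    {"asset_id": "A2", "surface_m2": 800, "energy_kwh_m2": None,
     "vacancy_rate": 0.10, "updated_on": date(2017, 6, 15)},
    {"asset_id": "A3", "surface_m2": None, "energy_kwh_m2": None,
     "vacancy_rate": 0.0, "updated_on": None},
]

report = completeness(assets)
outdated = stale_records(assets, as_of=date(2019, 5, 17))

for field, share in report.items():
    print(f"{field}: {share:.0%} complete")
print("Records to refresh:", outdated)
```

A report of this kind gives a simple starting point for deciding which fields are reliable enough to feed benchmarks and which need a collection effort first. A vacancy rate of zero is counted as a filled-in value here, since only missing or empty entries are treated as gaps.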