Big data is big news. Over recent years, the housing sector has fallen head over heels for data, citing its potential to help transform both organisations and communities.
If we are to believe the hype, then big data is the answer to a wide range of problems, from welfare reform to fuel poverty.
The birth of the internet of things has seen everything from self-learning thermostats to smartphones connecting tenants with their homes in new ways, while in the background massive datasets are generated for housing providers.
For some housing providers, their relationship with data is a new and exciting one, filled with endless possibilities. For others, the honeymoon period is over, as the realisation of the work needed to use this data in a meaningful way becomes apparent.
Many business leaders have offered their views on big data over the years, and most arrive at the same conclusion: large datasets on their own won't change anything.
Big data needs to be mined. That’s why machine learning algorithms are so important in helping to go from large datasets to meaningful results.
They can spot patterns and recognise behaviours in huge collections of data in a matter of minutes, which otherwise could take years to analyse. Done well, this analysis can give you clear outcomes and suggest meaningful actions. Done poorly, it results in confusion or false confidence about what to do next.
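The article doesn't describe any specific algorithm, but as a toy illustration of the kind of pattern-spotting involved, even a crude statistical screen can surface unusual readings in a dataset far faster than manual review. The function, threshold and meter readings below are all hypothetical:

```python
import statistics

def flag_outliers(readings, z_threshold=2.5):
    """Return indices of readings more than z_threshold standard
    deviations from the mean -- a crude first pass at spotting
    unusual behaviour in a dataset."""
    mean = statistics.fmean(readings)
    stdev = statistics.stdev(readings)
    if stdev == 0:
        return []
    return [i for i, r in enumerate(readings)
            if abs(r - mean) / stdev > z_threshold]

# Hypothetical daily heat-meter readings in kWh; the spike at index 5
# stands out against otherwise steady usage.
readings = [12.1, 11.8, 12.4, 11.9, 12.0, 55.0, 12.2, 11.7, 12.3, 12.0]
print(flag_outliers(readings))  # -> [5]
```

Real systems would use far more robust methods than a z-score, but the principle is the same: the algorithm does in minutes what a person scanning spreadsheets could not do in years.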
For housing providers, data comes in all shapes and sizes; for example, on building stock, heating, rent and sensitive personal information.
At Guru Systems, our technology constantly collects data on the efficiency of on-site energy systems, such as district heat networks. That means we can pass on accurate information to the housing providers running the networks, allowing them to charge the correct tariff, while residents can see their energy use and costs in real time.
Over the last year, we’ve been taking our mining of data to the next level by developing algorithms that not only measure how effectively heat networks are working, but also identify any potential inefficiencies, which can carry a huge financial burden for housing providers, and suggest the most cost-effective improvements.
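The piece doesn't spell out what these algorithms look for, but one well-known inefficiency signal on district heat networks is a low temperature difference (delta-T) between the flow and return pipes at a dwelling: water circulating without giving up its heat drives up pumping and generation costs. A minimal sketch of that kind of check, with made-up readings and an illustrative threshold:

```python
def low_delta_t_units(readings, min_delta_t=20.0):
    """Flag heat interface units whose average flow/return temperature
    difference falls below min_delta_t (degrees C).

    `readings` maps a unit id to a list of (flow, return) temperature
    pairs.  A persistently low delta-T means water is circulating
    without giving up its heat, which wastes pumping and generation
    energy.
    """
    flagged = {}
    for unit, pairs in readings.items():
        avg_dt = sum(flow - ret for flow, ret in pairs) / len(pairs)
        if avg_dt < min_delta_t:
            flagged[unit] = round(avg_dt, 1)
    return flagged

# Made-up sample: flat_2 barely extracts any heat from the circuit.
sample = {
    "flat_1": [(70.0, 40.0), (68.0, 38.0)],
    "flat_2": [(70.0, 62.0), (69.0, 61.0)],
}
print(low_delta_t_units(sample))  # -> {'flat_2': 8.0}
```

This is not Guru Systems' actual method, just an example of how continuously collected network data can be turned into a concrete, actionable flag rather than left as raw numbers.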
The ultimate outcome will be to minimise the risk for housing providers taking on the role of heat providers, bringing district heat into the 21st century through the gathering and analysis of data.
Set clear objectives
I have long been interested in the potential of machine learning. But to arrive at the best results, we must first be absolutely clear about what we want to achieve from our data. Housing providers must ask themselves: what outcome am I looking for? How will this analysis improve my organisation, a community or residents' lives?
Often housing providers will set complex objectives and try to get too much out of the data they have. The analysis should result in increased human understanding, not create impenetrable mystery.
I recently spoke at an event looking at smart thermostats in housing, and many housing providers were keen to explore the additional potential of the data they might gather, such as spotting over-occupancy through excess energy usage.
Our advice is always that data should be used to solve real-world problems and improve people's quality of life. If the algorithms used to analyse data become too complex and look for too many patterns, then you run a greater risk of delivering misleading and ultimately meaningless results.
We believe that data should be generated and used in the same domain, meaning it should be used for the benefit of those using the system that generates the dataset. For us, that means that data from district heat networks should be used to improve the way we heat homes on those networks.
The common mistakes
Big data is an unruly beast. It can become impossible to control its quality once it has been chopped up, reused and merged with other datasets; the result is impure data.
Too often, the temptation is to try and use data from one domain to influence our actions in another. That is how we end up with bizarre or unexpected Google or Facebook adverts on our web pages. In that instance, the consequences of using data in this way are relatively minor, but when it comes to dealing with tenants or making crucial business decisions, there is far more at stake.
In many instances, when we think of big data we think about the suggestions we get on Google after a purchase, watching a TV programme or posting on social media. In advertising, big data means that companies can sell you things; in social housing, the objectives are and should be different.
If we improve the lives of tenants or save housing providers money through the data collected, whether it is rent or heating related, then that’s a success.
In many ways, we need to move away from big data and its connotations. Thankfully, big data has a younger cousin that’s more human, more pleasant and much more promising: medium data, if you like.
By using big data first-hand, in manageable portions to achieve pre-stated outcomes – medium data – we can ensure that it's both usable and, crucially, deletable if required.
Medium data allows us to take raw clean data, analyse it within its own domain, and learn from it.
Casey Cole is managing director of Guru Systems.