Exploring the predictability of occupancy using sensor data

Predicting the future use of your workspaces and meeting rooms helps you manage your building. It also helps with “lowering the curve” of peak occupancy at crowded times. Yet predicting the future is difficult, so in this blog we take our first steps towards predicting occupancy by exploring the structure of the sensor occupancy data from Measuremen’s headquarters in Amsterdam.

By plotting the basics of our data (figure 1), we can see evident weekly rhythms. During the weekends the office is empty, while occupancy peaks during the working week. Note that the occupancy on the y-axis is so low because the sensors measure every desk, every second, all day long. The resulting net occupancy is very low, as it is at every company we have measured: think of all the hours a desk is empty at night, and all the moments you step away from it.
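To make this dilution effect concrete, here is a small worked example. The numbers are hypothetical (a desk used six hours a day on five workdays) and are not taken from the sensor data; the function name is our own choice for illustration.

```python
# Hypothetical desk: occupied 6 hours a day on 5 workdays, empty otherwise.
HOURS_PER_WEEK = 7 * 24  # the sensors measure around the clock: 168 hours

def net_occupancy(occupied_hours, total_hours=HOURS_PER_WEEK):
    """Share of all measured time during which the desk was occupied."""
    return occupied_hours / total_hours

weekly_rate = net_occupancy(occupied_hours=6 * 5)
print(f"{weekly_rate:.1%}")  # prints 17.9%
```

Even a desk that is used every single workday ends up below 20% once nights and weekends are counted.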

To get better insight into the weekly patterns, we laid the weeks on top of each other to see whether these peaks are tied to specific days. We excluded Saturdays and Sundays and made two graphs. The first graph (figure 2) is a bar chart of the average occupancy per day, and it clearly shows that Tuesdays and Wednesdays are the busiest days. To get a better sense of the variation and peaks in the data, we also plotted the occupancy of every week of the year on a radar plot (figure 3). It shows that Wednesdays, Thursdays, and Fridays have some peaks as well.
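The averaging behind a bar chart like this could be sketched as follows. This is a minimal illustration, not the pipeline we used: it assumes the sensor data has already been reduced to one occupancy rate per calendar day, and the function name and sample values are made up.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]

def mean_by_weekday(daily):
    """Average daily occupancy per weekday, skipping Saturdays and Sundays.

    `daily` maps a calendar date to that day's occupancy rate (0.0 to 1.0).
    """
    buckets = defaultdict(list)
    for day, rate in daily.items():
        if day.weekday() < 5:  # Monday is 0, Friday is 4; 5 and 6 are the weekend
            buckets[WEEKDAYS[day.weekday()]].append(rate)
    # Keep Monday-to-Friday order and skip weekdays without any data
    return {name: mean(buckets[name]) for name in WEEKDAYS if name in buckets}

# Made-up sample: two Mondays, two Tuesdays and one Saturday
sample = {
    date(2024, 1, 1): 0.10,  # Monday
    date(2024, 1, 2): 0.14,  # Tuesday
    date(2024, 1, 6): 0.00,  # Saturday, ignored
    date(2024, 1, 8): 0.12,  # Monday
    date(2024, 1, 9): 0.18,  # Tuesday
}
averages = mean_by_weekday(sample)
print(averages)
```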

Variation in occupancy across the weeks

Looking at the variation of occupancy tells us how stable the occupancy is across days and weeks. The standard deviation is an interesting value for predicting occupancy because it captures how far the occupancy typically deviates from its mean. The higher the standard deviation, the more the occupancy varies on a particular day, and the harder it is to predict future occupancy for that day. In our sample (figure 4), Wednesday shows quite some variation.
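As a rough sketch of how such a per-weekday spread could be computed, assume each week is stored as a list of five daily occupancy rates, Monday to Friday. That layout, the function name, and the sample values are assumptions for illustration. The sketch uses the sample standard deviation (`statistics.stdev`); whether the sample or population variant fits best depends on your data.

```python
from statistics import mean, stdev

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]

def spread_by_weekday(weeks):
    """Return {weekday: (mean, sample standard deviation)} across weeks.

    `weeks` is a list of weeks; each week holds five rates, Monday to Friday.
    """
    if len(weeks) < 2:
        raise ValueError("need at least two weeks to compute a standard deviation")
    if any(len(week) != len(WEEKDAYS) for week in weeks):
        raise ValueError("every week must contain exactly five values")
    columns = zip(*weeks)  # transpose: one tuple of values per weekday
    return {name: (mean(col), stdev(col)) for name, col in zip(WEEKDAYS, columns)}

# Made-up sample of three weeks; Wednesday is deliberately the least stable day
weeks = [
    [0.10, 0.15, 0.20, 0.12, 0.08],
    [0.12, 0.16, 0.10, 0.13, 0.07],
    [0.11, 0.14, 0.30, 0.11, 0.09],
]
spread = spread_by_weekday(weeks)
for name, (avg, sd) in spread.items():
    print(f"{name}: mean={avg:.3f} sd={sd:.3f}")
```

A weekday with a high standard deviation relative to its mean is one where the average alone is a poor guide to any single day.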

From aggregated variation to temporal variation

The problem with just taking averages and standard deviations is that you stack all the measured days of occupancy into one statistic. To predict the future, we want to take the progress of time into account, since previous variations might not say much about the future. To do this, we work with auto-correlations, which measure how strongly the occupancy data relates to earlier values of itself. Simply put, the auto-correlation asks: if the occupancy today is 50%, will it be 50% tomorrow as well? The data from our headquarters showed that today and tomorrow are quite correlated, but that this relation does not last longer than a day. In other words, today’s occupancy is likely similar to tomorrow’s, but it has hardly any relationship with the occupancy two or more days ahead. Across the weeks we found a similar effect: this week’s Monday occupancy predicts next week’s Monday occupancy, but beyond that it gets cloudy.
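For readers who want to try this on their own data, the idea can be sketched with the common sample auto-correlation estimator: the covariance between the series and a copy of itself shifted by a lag, divided by the overall variance. This is a minimal sketch, not necessarily the exact calculation behind our results. It assumes one occupancy value per workday with weekends already removed, so that a lag of 1 means "the next workday" and a lag of 5 means "the same weekday one week later". The function name and the synthetic series are illustrative only.

```python
def autocorrelation(series, lag):
    """Sample auto-correlation of `series` at the given lag (0 <= lag < len)."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must be between 0 and len(series) - 1")
    avg = sum(series) / n
    # Denominator: total squared deviation from the mean
    total = sum((x - avg) ** 2 for x in series)
    if total == 0:
        raise ValueError("auto-correlation is undefined for a constant series")
    # Numerator: deviations multiplied by the deviations `lag` steps later
    shifted = sum((series[t] - avg) * (series[t + lag] - avg) for t in range(n - lag))
    return shifted / total

# Synthetic series: the same Monday-to-Friday pattern repeated for four weeks
occupancy = [1, 2, 3, 2, 1] * 4

next_day = autocorrelation(occupancy, lag=1)   # today versus the next workday
next_week = autocorrelation(occupancy, lag=5)  # same weekday, one week later
print(next_day, next_week)
```

Values close to 1 mean the earlier value is a good predictor at that lag, while values near 0 mean it tells you little. Because this estimator divides by the variance of the full series, it shrinks towards 0 at longer lags even for a perfectly repeating pattern.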

Predicting further in the future with data

The issue, or rather the interesting fact, with dynamic data is that it is often multi-scale: there are different rhythms on different timescales. Our data shows clear rhythms on a weekly basis, but it is likely to show rhythms on a yearly basis as well. To visualize this, we plotted a year of occupancy data in the picture below (figure 5). The y-axis shows the time of day, the x-axis shows each week, and the colours represent occupancy: the greener the cell, the higher the occupancy; the redder, the lower. Several interesting events stand out directly, such as our Christmas party in week 51, a data outage in week 14, and some late-night stays, or sensor defects, in week 19 and week 23. More relevantly, we can see seasonal rhythms, such as earlier arrival times in summer and later arrivals in winter. Also, the first months of the year seem to be rather busy (darker green), which matches our own experience of this being a busy period within Measuremen. Such seasonal patterns are likely to repeat and could be accounted for when making predictions.

Figure 5. Year of occupancy data
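The grid behind a picture like this can be built by averaging readings per cell. The sketch below is an assumption-laden illustration rather than our actual processing. It assumes readings arrive as (timestamp, occupancy) pairs from a single calendar year, groups them by hour of day and ISO week number, and leaves the colouring to a plotting tool of your choice. The function name and sample readings are made up.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def week_hour_grid(readings):
    """Map (hour of day, ISO week number) to the mean occupancy in that cell.

    `readings` is an iterable of (datetime, occupancy rate) pairs.
    Cells without any readings, such as during a data outage, are simply absent.
    """
    cells = defaultdict(list)
    for timestamp, rate in readings:
        week = timestamp.isocalendar()[1]  # ISO week number, 1 to 53
        cells[(timestamp.hour, week)].append(rate)
    return {cell: mean(rates) for cell, rates in cells.items()}

# Made-up readings: two in week 1 at 09:00, one in week 2 at 09:00
readings = [
    (datetime(2024, 1, 1, 9, 0), 0.20),
    (datetime(2024, 1, 2, 9, 30), 0.40),
    (datetime(2024, 1, 8, 9, 15), 0.10),
]
grid = week_hour_grid(readings)
print(grid)
```

Note that ISO week numbers can straddle the turn of the year, so data spanning several years would need the year added to the cell key.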

Occupancy and data rhythms

Occupancy has many rhythms on different scales, from hours to days, to weeks, to years. There is variation in all of them, which makes it difficult to predict the future exactly, although one can expect certain rhythms to persist. Such rhythms are well-known phenomena, but they differ between organisations. Cultural aspects such as regulations, shift work, or remote work policies can change the rhythms heavily from one organisation to the next. Therefore, when it comes to making predictions, it is important to use your own historical data as a source. Next to organisation-specific historical data, data from other sources, such as benchmark data, stock markets, or weather data, could be used to understand the behaviour of your own organisation, as well as to predict its future.