VO2max is a crucial measure of overall fitness and an important predictor of heart disease and mortality risk. However, tests that measure VO2max directly require expensive lab equipment and are mostly limited to elite athletes.

Now scientists at the University of Cambridge have developed a method to accurately measure overall fitness on wearable devices. It uses machine learning to predict VO2max during daily activities without needing contextual information such as GPS readings. The method is more robust than current consumer smartwatches and fitness monitors.

The newly developed model is more transparent and provides accurate predictions based only on heart rate and accelerometer data. Because the model can also track changes over time, it is valuable for measuring fitness levels across the population and for determining how changing lifestyles affect them.

Co-author Dr. Soren Brage from Cambridge’s MRC Epidemiology Unit said: “VO2max is not the only measure of fitness, but it is an important one for endurance and is a strong predictor of diabetes, heart disease and other mortality risks. However, since most VO2max tests are performed on people who are reasonably fit, it’s difficult to get readings from those who are not as fit and may be at risk for cardiovascular disease.”

Co-lead author Dr. Dimitris Spathis from Cambridge’s Department of Computer Science and Technology said: “We wanted to know if it was possible to accurately predict VO2max using data from a wearable device so that an exercise test would not be necessary. Our central question was whether wearable devices can measure fitness in the wild. Most wearables provide metrics such as heart rate, steps or sleep time, which are proxies for health, but are not directly linked to health outcomes.”

The study was a collaboration between the two departments: the team from the Department of Computer Science and Technology contributed expertise in machine learning and artificial intelligence for mobile and wearable data, while the team from the MRC Epidemiology Unit contributed expertise in public health and cardiorespiratory fitness, along with data from the Fenland Study, a long-running public health survey in the East of England.

Study participants wore the wearable devices continuously for six days. The sensors collected 60 readings every second, producing a huge amount of data to process.

Spathis said, “We had to design a pipeline of algorithms and suitable models that could compress this huge amount of data and use it to make an accurate prediction. The free-living nature of the data makes this prediction challenging, as we are trying to predict a high-level outcome (fitness) with noisy low-level data (wearable sensors).”
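To give a sense of the scale involved, the short Python sketch below works out how many readings a single sensor channel produces at 60 readings per second over six days, and shows one simple way such a stream could be compressed into summary statistics. The article does not describe the researchers' actual pipeline, so the fixed one-second windows, the choice of mean and standard deviation as features, and the function names `readings_per_channel` and `summarize_windows` are all illustrative assumptions rather than the published method.

```python
from statistics import mean, pstdev

SAMPLE_RATE_HZ = 60              # from the article: 60 readings per second
DAYS_WORN = 6                    # from the article: six days of continuous wear
SECONDS_PER_DAY = 24 * 60 * 60


def readings_per_channel(rate_hz=SAMPLE_RATE_HZ, days=DAYS_WORN):
    """Number of raw readings one sensor channel produces over the wear period."""
    return rate_hz * SECONDS_PER_DAY * days


def summarize_windows(signal, window_size):
    """Compress a signal into (mean, standard deviation) pairs, one per window.

    Assumption for illustration only: fixed, non-overlapping windows, with any
    trailing partial window dropped. This is not the study's actual pipeline.
    """
    summaries = []
    for start in range(0, len(signal) - window_size + 1, window_size):
        window = signal[start:start + window_size]
        summaries.append((mean(window), pstdev(window)))
    return summaries


if __name__ == "__main__":
    # 60 * 86,400 * 6 readings per channel
    print(readings_per_channel())  # 31104000

    # Ten seconds of a made-up signal, summarized into one pair per second.
    toy_signal = [float(i % 60) for i in range(10 * SAMPLE_RATE_HZ)]
    features = summarize_windows(toy_signal, window_size=SAMPLE_RATE_HZ)
    print(len(toy_signal), "readings ->", len(features), "summary pairs")
```

Even this crude summary reduces every 60 raw readings to two numbers. A real pipeline would need to preserve far more of the signal's structure, which is part of what makes the prediction task difficult.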

To analyze and extract useful information from the unprocessed sensor data and construct predictions of VO2max from it, scientists used an AI model known as a deep neural network. In addition to making predictions, the trained models can be used to identify subpopulations that particularly require fitness-related intervention.

Baseline data from 11,059 Fenland Study participants were compared with follow-up data collected seven years later from a subgroup of 2,675 of the original participants. To validate the algorithm, a third group of 181 UK Biobank Validation Study participants undertook lab-based VO2max testing. The machine learning model showed excellent agreement with the measured VO2max scores both at baseline (82% agreement) and at follow-up (72% agreement).

Co-lead author Dr. Ignacio Perez-Pozuelo said: “This study is a perfect demonstration of how we can leverage expertise in epidemiology, public health, machine learning and signal processing.”

The researchers say their results show how wearables can accurately measure fitness, but that transparency needs to improve if measurements from commercially available wearables are to be trusted.

Brage said, “It is true in principle that many fitness monitors and smartwatches provide a measurement of VO2max, but it is very difficult to judge the validity of those claims. The models are usually not published and the algorithms can change frequently, making it difficult for people to determine whether their fitness has actually improved or is simply being estimated by a different algorithm.”

Spathis said, “Everything on your smartwatch related to health and fitness is an estimate. We’re transparent about our modeling and we’ve done it at scale. We can get better results using noisy data and traditional biomarkers. Moreover, all our algorithms and models are open source and anyone can use them.”

Senior author Professor Cecilia Mascolo from the Department of Computer Science and Technology said: “We have shown that you don’t need an expensive test in a lab to get a true measure of your fitness – the wearables we use every day can be just as powerful if they have the right algorithm. Cardio fitness is such an important health indicator, but until now we didn’t have the means to measure it on a large scale. These findings may have important implications for public health policy, allowing us to move beyond weaker health proxies such as Body Mass Index (BMI).”

Journal reference:

  1. Dimitris Spathis et al. Longitudinal cardio-respiratory fitness prediction through wearables in free-living environments. npj Digital Medicine. DOI: 10.1038/s41746-022-00719-1