AI MELD

Developing a Multidisciplinary Ecosystem to study Lifecourse Determinants of Complex Mid-life Multimorbidity using Artificial Intelligence

Health & Wellbeing, Artificial Intelligence

Project Vision

Like many countries, we face challenges from the growing number of people living with multiple long-term health conditions such as diabetes, heart disease or dementia. Throughout people’s lives, many things influence the chances of developing such conditions: things about people themselves, such as age and ethnicity; things that happen, like infections or accidents; and behaviours like smoking and diet. Perhaps even more important, though hard to research, are broader issues throughout life such as the environment people grew up in, their education, work and income. The uncomfortable truth is that people from more disadvantaged backgrounds are more likely to develop multiple conditions at an earlier age. There is also evidence that the order in which conditions develop varies considerably and influences what then happens to people. Understanding these broader issues, and how they affect that order, is therefore vital to inform when and how we should intervene.

To achieve this, we need to study exceptionally large numbers of people over their whole lifetime, but such datasets do not exist. Very large health datasets collected from NHS GPs are helpful but haven’t been running long enough to track people from birth to later life. They include lots of information on long-term conditions but not much about broader issues. We have access to one such dataset of about 700,000 people, which we can use to identify health conditions.

We also have access to data from the ‘1970 birth cohort’, a research study of about 17,000 people born in the same week of 1970 who have been followed throughout their lives (currently 50 years old) and have provided detailed information about many broader issues every few years.

The aim of this research is to safely and ethically establish the necessary environment, systems and methods to allow artificial intelligence techniques to ‘connect’ birth cohort data with large GP datasets. This will allow us to connect information on the broader, lifecourse issues with the GP information on long-term conditions. Then we can:

  1. Identify the kinds of people who develop combinations of burdensome long-term conditions by the time they are middle-aged.

  2. Understand the order of developing long-term conditions through life and which ones develop first.

  3. Work out how broader issues affect that order and the resulting combination of conditions.

Project Objectives

To establish the necessary environment, principles, systems, methods and team in which to use artificial intelligence (AI) techniques to 'connect' longitudinal cohort data with routine NHS data, in order to identify lifecourse causes of early-onset complex multiple long-term condition multimorbidity (MLTC-M) and optimal timepoints for public health interventions within the future Research Collaboration.

  • Acquire and house data from the Care and Health Information Exchange Analytics database (CHIA – routine GP data) and the 1970 British Cohort Study (BCS70) in analysis-ready format (WP1&2)
  • Test optimal clustering algorithms in CHIA that identify early-onset complex MLTC-M (WP1)
  • Identify three burdensome early-onset MLTC-M clusters in CHIA as exemplars (WP3&4)
  • Identify people in BCS70 with the three exemplar clusters and test their relationship with selected early life determinants using a Directed Acyclic Graph-based approach (WP3)
  • Explore if sentinel conditions and long-term condition accrual sequence can be identified and characterised in CHIA (WP3)
  • Explore whether the nature and determinants of sentinel conditions can be characterised in BCS70 (WP3)
  • Develop AI transfer learning methods that allow extrapolation of inferences from BCS70 to CHIA and vice versa
  • Build capacity for engagement and co-production of the Research Collaboration structure, objectives and outputs
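As a rough illustration of what "sentinel conditions" and "accrual sequence" could mean in practice, the sketch below orders each person's conditions by date of first diagnosis and counts which condition appears first among people with two or more conditions. It is a minimal sketch only: the record layout, the toy data, and the function names `accrual_sequences` and `sentinel_counts` are assumptions made for illustration and do not describe the CHIA or BCS70 data or the project's actual method.

```python
from collections import Counter
from datetime import date

# Hypothetical toy records: (patient_id, condition, date of first diagnosis).
# Each condition is assumed to appear at most once per patient.
RECORDS = [
    ("p1", "hypertension", date(2005, 3, 1)),
    ("p1", "diabetes", date(2010, 6, 15)),
    ("p1", "depression", date(2001, 1, 20)),
    ("p2", "depression", date(1999, 5, 2)),
    ("p2", "diabetes", date(2012, 9, 9)),
    ("p3", "asthma", date(1985, 4, 4)),
]


def accrual_sequences(records):
    """Return each patient's conditions ordered by date of first diagnosis."""
    by_patient = {}
    for patient_id, condition, diagnosed in records:
        by_patient.setdefault(patient_id, []).append((diagnosed, condition))
    return {
        patient_id: [condition for _, condition in sorted(events)]
        for patient_id, events in by_patient.items()
    }


def sentinel_counts(sequences, min_conditions=2):
    """Count the first-diagnosed condition among patients with several conditions."""
    return Counter(
        sequence[0]
        for sequence in sequences.values()
        if len(sequence) >= min_conditions
    )


sequences = accrual_sequences(RECORDS)
sentinels = sentinel_counts(sequences)
```

In this toy data, patient `p3` has only one condition and is therefore excluded from the sentinel count. A real analysis would also need to handle tied dates, missing dates and differences in recording practice.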

IT Innovation's Role

  • Data infrastructure, management processes and datasets ready for analysis of MLTCs in accordance with strategy.

    • Operate information governance framework to manage data assets and associated legal and ethical risks, obtaining necessary approvals for storage, processing and linking.
    • Establish data acquisition and curation pipelines necessary to support initial research questions, and data roadmap for the Research Collaboration.
    • Assess the limits and readiness of data sets (routine, birth cohort) using data readiness and data quality methodologies.
  • Causal inference modelling and AI techniques to identify MLTC-M clusters, their determinants and sequence.
    • Cluster analysis within CHIA to identify those with three burdensome early-onset MLTC-M cluster exemplars.
    • Explore lifecourse cluster analysis in BCS70 using multivariate time-series clustering.
    • Explore AI transfer learning methods to allow extrapolation of inferences between the two datasets.
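To make the idea of cluster analysis on condition data more concrete, here is a minimal sketch that groups people by which conditions they have, using a simple k-modes style procedure on binary indicators with Hamming distance. The source does not say which clustering algorithms the project uses, so k-modes is swapped in purely as an example; the condition list, the toy patients, the choice of starting modes and the function names are all hypothetical.

```python
# Hypothetical condition list and toy patients, for illustration only.
CONDITIONS = ["diabetes", "hypertension", "depression", "anxiety"]

PATIENTS = {
    "p1": {"diabetes", "hypertension"},
    "p2": {"diabetes", "hypertension"},
    "p3": {"depression", "anxiety"},
    "p4": {"depression", "anxiety", "hypertension"},
    "p5": {"diabetes"},
}


def to_vector(conditions):
    """Encode a set of conditions as a tuple of 0/1 indicators."""
    return tuple(int(name in conditions) for name in CONDITIONS)


def hamming(a, b):
    """Number of positions at which two indicator vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(vectors, initial_modes, max_iter=20):
    """Assign vectors to the nearest mode, then update each mode by majority vote."""
    modes = list(initial_modes)
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(len(modes)), key=lambda k: hamming(vector, modes[k]))
            for vector in vectors
        ]
        new_modes = []
        for k, old_mode in enumerate(modes):
            members = [v for v, label in zip(vectors, labels) if label == k]
            if not members:
                new_modes.append(old_mode)  # keep the mode of an empty cluster
                continue
            # A condition is in the mode if more than half the members have it.
            new_modes.append(
                tuple(int(2 * sum(column) > len(members)) for column in zip(*members))
            )
        if new_modes == modes:
            break
        modes = new_modes
    return labels, modes


patient_ids = sorted(PATIENTS)
vectors = [to_vector(PATIENTS[pid]) for pid in patient_ids]
# Assumption: start from the profiles of p1 and p3 so the run is deterministic.
labels, modes = k_modes(vectors, [vectors[0], vectors[2]])
clusters = dict(zip(patient_ids, labels))
```

On this toy data the procedure separates a cardiometabolic-style group (`p1`, `p2`, `p5`) from a mental-health-style group (`p3`, `p4`). Results on real data would depend heavily on the number of clusters, the starting modes and the distance measure, which is why the objectives describe testing several algorithms.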

Project Funding

AI MELD has received NIHR research funding.

Related Projects

  • NHS Wessex Trusted Research Environment (Secure Society, Health & Wellbeing)
  • DARE UK Privacy Risk Assessment Methodology (Secure Society, Health & Wellbeing)