Pointers for analytics of healthcare data

A start

Recently I've been asked for pointers on analysing complex healthcare data. This is a difficult issue. Healthcare analytics / health informatics / medical informatics / etc. range over a wide area, driven by a wide variety of interests and outcomes, overlapping hugely in some areas and not at all in others. The area is rapidly evolving but there's not a lot of formal, standardised learning. It's in around the same place bioinformatics was 20 years ago. So this is an attempt to sift the wheat from the chaff, and list some useful pointers for those wanting to know more.

This is expected to be a living, evolving document.

What is this about?

  • Secondary use of healthcare data, for identifying patient populations, stratification, pharmaceuticals development, scientific research etc.
  • Real World Evidence and observational studies

It is explicitly not about:

  • Primary use of healthcare data, for direct improvements of patient care
  • Healthcare IT
  • Hospital software and IT systems

Skills & people

Skill-lists for scientific & technical disciplines often end up as sprawling wish-lists (cf. the NIH list of critical skills for bioinformatics that includes several subjects at degree levels of mastery ...). So the following is given lightly and with the full intention that most is learnt on the job:

  • Analytics: scripting (e.g. R or Python), visualisation, data handling
  • Background & domain knowledge: comfort with biomedical and healthcare terms
  • Database access, some SQL
  • Some statistics
  • Awareness of standards and terminologies (e.g. CDISC, ontologies, etc.)

People who are usefully good at health informatics often have, or have had, job titles like:

  • health informaticians
  • bioinformaticians
  • biomedical data scientists
  • clinicians, pharmacologists etc. who have got into programming

Books & papers

There's some interesting reading out there but you often have to pick out the relevant nuggets amidst material that is intended for hospital staff or administrators.

Short courses

ClassCentral has a list of online classes in bioinformatics and healthcare.

Longer courses

There are a lot of "health informatics" courses out there, but some are more about making people familiar with the technology and landscape, or talking about the IT plumbing. Some possibly relevant ones in the UK include:

  • https://www.mastersportal.com/search/#q=ci-30|di-283|lv-master
  • https://www.prospects.ac.uk/jobs-and-work-experience/job-sectors/healthcare/how-to-get-started-in-health-informatics
  • UEdinburgh https://www.ed.ac.uk/bayes/about-us/our-work/education/workforce-development/courses/health-data-science
  • ULeeds MSc Precision Medicine: Genomics & Analytics
  • London School of Hygiene and Tropical Medicine MSc Health Data Science
  • UCL AI Enabled Healthcare
  • U Manchester Health Data Science, Clinical Bioinformatics, Health Informatics
  • UCL courses: https://www.ucl.ac.uk/health-informatics/node/787/health-informatics-mscpgdippgcert
  • King's College London Applied Statistical Modelling & Health Informatics PgCert
  • Bournemouth Digital Health and Artificial Intelligence MSc
  • University of West London: Health Informatics
  • City & Guilds Health Informatics
  • Imperial College Cancer Informatics (MRes), Data Science (Biomedical Research MRes), Health Data Analytics and Machine Learning (MSc)


Much is made of Real World Data, Real World Evidence and Real World Analytics. I'll tell you a secret: these are effectively all the same thing, despite the protests of experts.

AMIA (the American Medical Informatics Association) covers all the various flavours of healthcare and medical informatics, including our use cases, and hosts some excellent courses.

OHDSI (the Observational Health Data Sciences and Informatics program) is leading the standardisation of healthcare data into a common data model called OMOP. This looks set to become the dominant model for federated analysis of healthcare data. Read the Book of OHDSI for more.
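
To give a flavour of what a common model buys you, here is a minimal sketch, not an official OHDSI example. It builds a toy in-memory database with two tables shaped like the OMOP person and condition_occurrence tables (only a handful of their columns), then counts distinct patients with a given condition, split by gender. The rows are made up, the concept IDs should be treated as illustrative placeholders to check against the OHDSI vocabularies, and the function name cohort_by_gender is my own invention.

```python
import sqlite3

# Toy in-memory database. The table and column names follow the OMOP
# common data model, but only a small subset of columns is included.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    gender_concept_id INTEGER,
    year_of_birth INTEGER
);
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    condition_concept_id INTEGER,
    condition_start_date TEXT
);
""")

# Made-up rows, purely for illustration.
conn.executemany("INSERT INTO person VALUES (?, ?, ?)", [
    (1, 8507, 1950),
    (2, 8532, 1962),
    (3, 8532, 1985),
])
conn.executemany("INSERT INTO condition_occurrence VALUES (?, ?, ?, ?)", [
    (10, 1, 201826, "2015-03-01"),
    (11, 2, 201826, "2018-07-15"),
    (12, 2, 201826, "2019-01-10"),  # same patient recorded twice
    (13, 3, 320128, "2020-02-02"),  # a different condition
])

# Placeholder concept ID for the condition of interest (an assumption;
# look up real IDs in the OHDSI vocabularies before using them).
TARGET_CONCEPT = 201826


def cohort_by_gender(conn, concept_id):
    """Count distinct patients with the condition, per gender concept."""
    rows = conn.execute(
        """
        SELECT p.gender_concept_id, COUNT(DISTINCT p.person_id)
        FROM person AS p
        JOIN condition_occurrence AS c ON c.person_id = p.person_id
        WHERE c.condition_concept_id = ?
        GROUP BY p.gender_concept_id
        ORDER BY p.gender_concept_id
        """,
        (concept_id,),
    ).fetchall()
    return dict(rows)


counts = cohort_by_gender(conn, TARGET_CONCEPT)
print(counts)
```

Because every site that adopts the model uses the same table and column names, a query like this can in principle be sent unchanged to each site and only the aggregate counts returned, which is the idea behind federated analysis.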
