Monthly Archives: June 2017

Machine Learning Takes on Heart Disease Risk

Machine learning is a process in which computers analyse past data in the hope of predicting outcomes from future data. In the modern world of machine learning these outcomes can be almost anything, from what type of book or CD you are likely to buy to the results of sports events such as horse racing. Various machine learning methods have been devised, each with different strengths that make it more applicable to particular types of problem. So a Neural Network may perform better than, say, a Random Forest algorithm on one kind of problem, but less well on another. The risk factor calculators that you plug your data into on the web are based on simpler models, which assume a linear relationship between the factors, e.g. LDL, blood pressure etc. Machine learning algorithms can dig deeper, so to speak: amongst other things, they can figure out which factors are most important and weight them accordingly.

Such an approach was taken in a 10-year project tracking people across 48 factors. Four different algorithms were applied to the data. 75% of the data was used to find the relationships within it, usually called ‘training’ the model in machine learning parlance. The remaining 25% was used to test how well the model could predict cardiovascular events such as heart attacks. The results were pretty good, outperforming conventional risk assessors. Taking an average of how the four methods ranked the different factors, we can see that LDL is well behind HDL, triglycerides and HbA1c as a risk factor. Below is an ordered table of those averages, showing that age (not surprisingly) was the most impactful feature. Note that some are negatively significant, e.g. women are significantly less likely to have an event. ‘Missing’ means that the patient had no value recorded for that item. Here is a link to the report:

http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0174944#pone-0174944-t003

Age*
Ethnicity: South Asian
Female*
SES: 2nd Townsend quintile
Smoking*
Ethnicity: Black/Afro-Caribbean
SES: 3rd Townsend quintile
SES: 4th Townsend quintile
HDL cholesterol*
Oral corticosteroid prescribed
HbA1c missing
Total cholesterol*
COPD
Systolic blood pressure*
Ethnicity: Other/Mixed
SES: 5th Townsend quintile (most deprived)
Atrial fibrillation
Triglycerides
Family history of CHD < 60 years
SES: Unknown
HbA1c
AST/ALT ratio missing
Ethnicity: Chinese/East Asian
BMI missing
Ethnicity: Unknown
Serum creatinine
Immunosuppressant prescribed
gamma GT
Diabetes
Chronic kidney disease
BMI
Anti-psychotic drug prescribed
Severe mental illness
Rheumatoid arthritis
Blood pressure treatment*
Hypertension
LDL cholesterol
gamma GT missing
CRP
AST/ALT ratio
Serum creatinine missing
FEV1 missing
Serum fibrinogen
CRP missing
FEV1
Serum fibrinogen missing
LDL cholesterol missing
Triglycerides missing
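
For readers who like to see the mechanics, here is a rough sketch in Python of the two steps described above: splitting the data 75/25 into training and test portions, and averaging how several models ranked the factors. It is only an illustration under my own assumptions. The function names and the toy rankings are made up, and the report may well have combined its rankings differently.

```python
import random
from statistics import mean


def split_train_test(records, train_fraction=0.75, seed=0):
    """Shuffle the records and split them into training and test portions."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def average_rank(rankings):
    """Order factors by their mean position (1 = most important) across models."""
    factors = rankings[0]
    avg = {f: mean(r.index(f) + 1 for r in rankings) for f in factors}
    return sorted(avg, key=avg.get)


# Hypothetical rankings from four models (illustrative only, not the study's results)
toy_rankings = [
    ["Age", "Female", "Smoking", "HDL"],
    ["Age", "Smoking", "Female", "HDL"],
    ["Female", "Age", "Smoking", "HDL"],
    ["Age", "Female", "HDL", "Smoking"],
]

# Stand-in for patient records: 100 dummy IDs
train, test = split_train_test(range(100))
print(len(train), len(test))
print(average_rank(toy_rankings))
```

The important point is that the test portion is never seen during training, so how well the model predicts on it is a fair measure of how it would do on new patients.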
