Publication

Stroke prediction in patients with asymptomatic carotid stenosis using electronic health records

Silvey, Jackson
Abstract

Introduction: Medical management of cardiovascular disease has improved since the studies that established the best management of asymptomatic carotid artery stenosis were conducted. Because the one-year stroke risk for these patients is now similar to the periprocedural stroke risk of treatment, we reviewed 10 years of patient data to determine whether machine learning analysis of Electronic Medical Record (EMR) clinical data could help separate the patients with the best outcomes under medical management from those who benefit most from operative intervention.


Methods: We conducted a chart review of the electronic medical records of 872 patients who had diagnosis, laboratory test, and medication data available in the EMR and who underwent carotid endarterectomy between 2009 and 2022. Based on the chart review, our final cohort included 408 patients who underwent carotid intervention due to stroke and 464 patients who underwent intervention for stenosis exceeding 70%. We used normalized ICD codes, laboratory tests, and medication groups to build the machine learning models. The patient data were split 70% for training and 30% for testing. We employed five machine learning models – Logistic Regression, Support Vector Machine, XGBoost, Random Forest, and Multilayer Perceptron – and used 10-fold cross-validation on the training dataset to optimize the model parameters. To interpret the factors associated with stroke as the indication for intervention, as opposed to asymptomatic stenosis, we used SHapley Additive exPlanations (SHAP).
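
The 70/30 split and 10-fold cross-validation described in the Methods can be sketched as follows. This is only an illustration of the partitioning scheme, not the authors' code: the function names, the random seed, and the use of a plain (unstratified) shuffle are assumptions, and the abstract does not state whether the split was stratified by outcome.

```python
import random

def split_indices(n, train_frac=0.7, seed=0):
    """Shuffle patient indices 0..n-1 and split them into train and test lists.

    The seed and the unstratified shuffle are assumptions for illustration.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(n * train_frac)
    return idx[:cut], idx[cut:]

def k_fold(indices, k=10):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f_i, f in enumerate(folds) if f_i != i for j in f]
        yield train, val

# 872 patients, as reported in the Methods
train_idx, test_idx = split_indices(872)
splits = list(k_fold(train_idx, k=10))
```

Each of the 10 folds serves once as the validation set while the other nine are used for fitting, so hyperparameters are chosen without touching the held-out 30% test set.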


Results: The comparison of the five machine learning models (Figure 1) shows that Random Forest achieved the best performance, with a sensitivity of 0.317, a specificity of 0.906, and an Area Under the Receiver Operating Characteristic curve (AUROC) of 0.709. The SHAP visualization (Figure 2) shows the top risk factors, which include medications, diagnoses, and laboratory tests. The analysis of the study cohort, including the top factors, is shown in Table 1. Data were summarized as frequency and proportion or as mean and standard deviation. The t-test or chi-squared test was used to assess differences in variables between the two groups. Risk factors for symptomatic carotid disease included elevated glucose, kidney disease, hyperlipidemia, and smoking status, as evidenced by prescribed nicotine medication in the category of psychotherapeutic and neurological agents. Protective factors included cardiovascular agents and beta blockers. Some of the risk factors might be specific to this study cohort; the generalizability of these factors to larger populations needs further investigation.
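
For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, and AUROC are computed from labels and predicted scores. The labels, scores, and 0.5 threshold are hypothetical toy values chosen for illustration; they are not study data, and the function names are our own.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = stroke indication)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auroc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy example (not study data)
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.3, 0.2, 0.35, 0.1, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = auroc(y_true, scores)
```

Read this way, a sensitivity of 0.317 means that roughly one in three patients whose indication was stroke were flagged by the model, while a specificity of 0.906 means that about nine in ten asymptomatic patients were correctly classified.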


Conclusion: This study demonstrates that an in-depth, machine learning based analysis of the medical record can help distinguish patients with carotid disease who are somewhat more likely to present with a stroke from those who remain asymptomatic prior to presentation. In addition, it demonstrates the relationship between disease processes that increase the risk of carotid embolic stroke and the treatment of these risk factors, which lessens that risk. Together with additional analysis of plaque characteristics on duplex ultrasound and computed tomography, this approach may help physicians limit carotid intervention to the patients who would benefit the most.

Date
2024-04-16