1Department of Internal Medicine, One Brooklyn Health, Brooklyn, NY, USA
2Department of Medicine, Kasturba Medical College, Mangalore, India
3Department of Medicine, Malla Reddy Medical College for Women, Hyderabad, India
4Department of Medicine, Universidad de Ciencias Medicas, San José, Costa Rica
5Department of Emergency Medicine, Green City Hospital, Saharanpur, India
6Department of Radiodiagnosis, Subharti Medical College, Meerut, India
7Department of Medicine, Kettering General Hospital, Kettering, UK
8Department of Medicine, American University of Antigua, Antigua and Barbuda
9Department of Emergency, Prathima Hospital, Hyderabad, India
10Department of Family Medicine, Exceptional Medical Ambulance and Healthcare Services, Dubai, United Arab Emirates
11Department of Medicine, Jinnah Postgraduate Medical Center, Karachi, Pakistan
© 2026 The Korean Society of Critical Care Medicine
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
CONFLICT OF INTEREST
No potential conflict of interest relevant to this article was reported.
FUNDING
None.
ACKNOWLEDGMENTS
None.
AUTHOR CONTRIBUTIONS
Conceptualization: OD, RMS, PB, MGC, AUH, SH, MI, AIC, SMY, HAWB, HFS. Methodology: OD, RMS, PB, MGC, AUH, SH, MI, AIC, SMY, HAWB, HFS. Validation: OD, RMS, PB, MGC, AUH, SH, MI, AIC, SMY, HAWB, HFS. Investigation: OD, RMS, PB, MGC, AUH, SH, MI, AIC, SMY, HAWB, HFS. Data curation: OD, RMS, PB, MGC, AUH, SH, MI, AIC, SMY, HAWB, HFS. Visualization: OD, RMS, PB, MGC, AUH, SH, MI, AIC, SMY, HAWB, HFS. Supervision: HFS. Project administration: OD, RMS, PB, MGC, AUH, SH, MI, AIC, SMY, HAWB, HFS. Writing–original draft: OD, RMS, PB, MGC, AUH, SH, MI, AIC, SMY, HAWB, HFS. Writing–review & editing: OD, RMS, PB, MGC, AUH, SH, MI, AIC, SMY, HAWB, HFS. All authors read and agreed to the published version of the manuscript.
| Study | AI/ML tool used | Disease/symptom | Data used | AUC (95% CI) | Outcome |
|---|---|---|---|---|---|
| Butler et al. (2023) [18] | AI-based ECG models | 10-year heart failure risk | ECG data, patient history and examination findings, BMI and vital signs | ECG-chars: 0.73 (0.70–0.77); ARIC HF-risk calculator: 0.76 (0.72–0.80); FH-HF risk calculator: 0.74 (0.70–0.78); CPH: 0.78 (0.75–0.80); ECG-AI: 0.77 (0.74–0.79); ECG-AI-cox: 0.84 (0.81–0.87) | Accuracy comparable to current HF risk calculators. Improved long-term risk prediction |
| Cho et al. (2020) [14] | DLA with VAE | MI | 12-lead and 6-lead ECG | Internal validation: 0.88; external validation: 0.854 | ML algorithms improved MI detection. Six-lead DLA with VAE outperformed non-VAE |
| Doudesis et al. (2023) [17] | Collaboration for the Diagnosis and Evaluation of ACS score using ML | Acute MI | Continuous troponin levels, patient history, clinical factors | 0.953 (0.947–0.958) | Improved MI probability detection; identified more low-risk patients with minimal 1-year cardiac death risk |
| Herman et al. (2023) [16] | AI model | Occluded MI | ECG data | 0.938 (0.924–0.951) | Enhanced ACS triage with high accuracy |
| Lin et al. (2024) [15] | AI-ECG | ST-elevation myocardial infarction | ECG data | - | Reduced door-to-balloon and ECG-to-balloon times. Decreased in-hospital and 6-month mortality |
| Sveric et al. (2024) [19] | AI-driven automated workflows | HF and left ventricular ejection fraction estimation | Echocardiography images | Auto-ECHO: 0.93; MBS-ECHO: 0.92 | Reduced measurement variability (<5%). Improved reliability over traditional methods |
AI: artificial intelligence; ML: machine learning; AUC: area under the curve; ECG: electrocardiogram; BMI: body mass index; ARIC: atherosclerosis risk in communities; HF: heart failure; FH: familial hypercholesterolemia; CPH: Cox proportional hazards; DLA: deep-learning algorithm; VAE: variational autoencoder; MI: myocardial infarction; ACS: acute coronary syndrome; ECHO: echocardiogram; MBS: medicare benefits schedule echocardiography.
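
As a reading aid for the AUC column above, AUC can be interpreted as the probability that a randomly chosen patient with the outcome receives a higher model score than a randomly chosen patient without it. The following is a minimal Python sketch of that definition, not the method used by any study in the table; the function name `auc_from_scores` and the example scores and labels are hypothetical and chosen only for illustration.

```python
def auc_from_scores(scores, labels):
    """Pairwise AUC: fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical risk scores and outcomes (1 = event occurred);
# these values are illustrative and not taken from any cited study.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
example_labels = [1, 1, 0, 1, 0, 0]
print(round(auc_from_scores(example_scores, example_labels), 3))
```

Under this definition, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of patients with and without the outcome.
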
| Study | AI/ML tool used | Disease/symptom | Data used | Accuracy measure | Outcome |
|---|---|---|---|---|---|
| Ippolito et al. (2023) [20] | Multi-tasking Mask R-CNN | Pneumonia | CXR | Accuracy: 93.8% | The AI tool differentiated COVID-19 pneumonia, typical bacterial pneumonia, and healthy subjects with high accuracy. |
| Chagas et al. (2024) [21] | Predictive models using Boruta algorithm | AIDS-associated PCP | LDH, O2 saturation, CRP, RR (>24 bpm), dry cough, HIV viral load, CD4 cell count and CXRs | AUC: >0.8 | Predictive models differentiated AIDS-associated PCP. |
| Salvatore et al. (2021) [22] | ResNet-50 architecture, trained and cross-validated | COVID-19 pneumonia | CXRs | AUC: 0.98 | AI tools showed potential in correctly differentiating COVID-19 from community-acquired pneumonia. |
| Carlile et al. (2020) [23] | CNN | COVID-19 pneumonia | CXRs with heat maps | - | More than 80% of physicians found that the tool helped optimize workflow and clinical decisions. |
| Kwon et al. (2020) [24] | DenseNet-121 architecture pretrained on ImageNet | COVID-19 pneumonia | Chest radiographs and clinical variables | AUC: 0.88 for intubation; 0.82 for mortality | The model improved prediction of mortality and intubation requirement. |
| Lin et al. (2024) [25] | Novel Cox regression model integrating blood culture prediction index and CURB-65 | Pneumonia | CBC, differential leukocyte counts, BUN, age, sex, RR, blood pressure, GCS | AUC: 0.713 for Cox regression model | Integrated models had better predictive performance for low-risk patients. |
| Somani et al. (2021) [26] | Fusion model (ML model) | Pulmonary embolism | Clinical data and ECGs | AUC: 0.84 for fusion model | Fusion models performed better than the other models in improving screening of acute PE. |
| Choi et al. (2024) [27] | Smartphone application ECG BUDDY, QCG-RVDys | Pulmonary embolism | ECGs | AUC: 0.895 | Right ventricular dysfunction was identified more rapidly on ECG than with markers such as troponin I and ProBNP. |
| Müller-Peltzer et al. (2021) [28] | Computer-aided algorithm integrated into Siemens Healthineers software, VB20A | Pulmonary embolism | CT pulmonary angiography | Sensitivity: 47% at lobar and 50% at subsegmental level | Computer-aided algorithms produce more false positives; they can assist radiologists but need radiological examination for confirmation. |
| Langius-Wiffen et al. (2023) [29] | AI algorithm | Pulmonary embolism | CT pulmonary angiography | Sensitivity: 96.8%; Specificity: 99.9% | The AI algorithm detected pulmonary embolism more accurately than the radiologist. |
| Savage et al. (2024) [30] | AI triage system, the BriefCase for iPE from the vendor Aidoc | Pulmonary embolism | Contrast-enhanced CT scans | Sensitivity: 96.2%; Specificity: 99.9% | AI assistance increased sensitivity for detecting pulmonary embolism. However, no significant reduction in time was observed. |
| Huang et al. (2020) [31] | 77-layer 3D deep learning convolutional neural network (PENet) | Pulmonary embolism | CT scan | AUC: 0.84 for internal validation; 0.85 for external validation | The AI tool successfully automated the diagnosis of pulmonary embolism. |
| Villacorta et al. (2022) [32] | Generalized learning logistic regression using elastic net | Pulmonary embolism | O2 saturation, history of deep venous thrombosis or pulmonary embolism, immobilization or surgery and D-dimer | AUC: 0.89 | D-dimer maintained a low false-negative rate. |
| Hillis et al. (2022) [33] | AI model | Pneumothorax | CXR | AUC: 0.97 for pneumothorax; 0.98 for tension pneumothorax | The AI model detected pneumothorax and tension pneumothorax accurately. |
| Han et al. (2025) [34] | Convolutional recurrent neural network and sequence modelling (recurrent neural network) | ARDS | ECG, HR, RR, SpO2, non-invasive systolic and diastolic blood pressures, body temperature, AVPU score, age, sex, and arterial blood gas analysis | AUC: 0.84 for internal validation; 0.73 for external validation | Positive prediction scores and clinical outcomes were established with the AI model. |
| Rashid et al. (2022) [35] | Multiple AI models | ARDS | Variable | AUC: 0.8–1 | AI models reduced cost and improved outcomes among ARDS patients. |
| Viderman et al. (2024) [36] | Feed-forward neural networks, CNN and ANN | ARDS | Variable | - | AI models were shown to perform well in predicting respiratory failure. |
| Liong-Rung et al. (2022) [37] | GoogleNet Inception V4 architecture | Dyspnea | CXRs | Accuracy: 83.2% | High negative predictive value for excluding pulmonary edema and high positive predictive value for diagnosing pneumonia. |
AI: artificial intelligence; ML: machine learning; CNN: convolutional neural network; CXR: chest X-ray; COVID-19: coronavirus disease 2019; AIDS: acquired immunodeficiency syndrome; PCP: pneumocystis jirovecii pneumonia; LDH: lactate dehydrogenase; CRP: C-reactive protein; RR: respiratory rate; HIV: human immunodeficiency virus; CD: cluster of differentiation; AUC: area under the curve; CBC: complete blood count; BUN: blood urea nitrogen; GCS: Glasgow coma scale; ECG: electrocardiogram; PE: pulmonary embolism; ProBNP: pro-B-type natriuretic peptide; CT: computed tomography; iPE: incidental pulmonary embolism; 3D: three-dimensional; ARDS: acute respiratory distress syndrome; HR: heart rate; AVPU: alert, verbal, pain, unresponsive; ANN: artificial neural networks.
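
The accuracy measures reported in the table above (sensitivity, specificity, accuracy, and the predictive values mentioned in the outcome column) all derive from the same 2×2 confusion matrix. The sketch below shows the standard formulas in Python; the function name `diagnostic_metrics` and the counts passed to it are hypothetical and do not come from any of the cited studies.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy measures from confusion-matrix counts.

    tp/fn: diseased cases flagged / missed by the tool
    tn/fp: healthy cases cleared / wrongly flagged by the tool
    """
    return {
        "sensitivity": tp / (tp + fn),            # true-positive rate
        "specificity": tn / (tn + fp),            # true-negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
    }


# Hypothetical counts for illustration only: 100 diseased, 1,000 healthy.
metrics = diagnostic_metrics(tp=90, fp=50, tn=950, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts the disease is uncommon, so the positive predictive value is much lower than the sensitivity and specificity. This is one reason why sensitivity and specificity alone do not fully describe how a tool would perform in a given emergency population.
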
| Study | AI/ML model | Disease | Data used | Accuracy measure | Outcome |
|---|---|---|---|---|---|
| Westwood et al. (2024) [45] | Viz LVO, Rapid LVO, Brainomix, Avicenna | Stroke | Head CTA | Sensitivity: 95.4%; Specificity: 79.4% | AI software enhanced accuracy in stroke detection. |
| Elijovich et al. (2022) [40] | Viz AI | Stroke | Head CTA | - | The AI software reduced the time to notify the management team, and patients received earlier thrombectomies. |
| Gunda et al. (2022) [41] | e-Stroke Suite | Stroke | Non-contrast head CT and CTA | - | The AI system aided the decision-making process. It reduced the time to, and improved the rate of, reperfusion treatment. |
| Shlobin et al. (2022) [39] | CNN, RF and SVM | Stroke | CTA | Sensitivity: 63.6%; Specificity: 85.8% | AI tools accurately triaged and diagnosed patients with large vessel occlusion. |
| Paz et al. (2021) [38] | Rapid LVO | Stroke | Head CT scan | - | The AI tool had low sensitivity and moderate specificity. |
| Amukotuwa et al. (2019) [44] | Avicenna CINA | Stroke | Head CT angiography | Sensitivity: 95%; Specificity: 79% | Avicenna CINA proved to be one of the most accurate tools in stroke detection. |
| Schmitt et al. (2022) [42] | Brainomix e-CTA | Stroke | Non-contrast-enhanced head CT | Sensitivity: 0.91; Specificity: 0.89 | The AI-based algorithm reliably determined whether acute intracranial hemorrhage was present and accurately calculated intraparenchymal hemorrhage volumes. |
| Wouters et al. (2018) [46] | Automated multivariate imaging model | Stroke | Diffusion-weighted and perfusion-weighted imaging | AUC: 0.80 | The model accurately identified patients with large vessel occlusion within a 6-hour window. |

| Study | AI/ML/DL tool used | Disease/symptom | Data used | Accuracy measure | Outcome |
|---|---|---|---|---|---|
| McInnis et al. (2023) [47] | AI-EEG | Epilepsy | EEG | AUC: 0.86 | Higher diagnostic accuracy than clinical experts |
| Tveit et al. (2023) [48] | SCORE-AI | Epilepsy | EEG | Accuracy: 88.3% | Reduces diagnostic errors and improves interrater agreement |
| Keikhosrokiani et al. (2024) [49] | AI-augmented digital care pathway for epilepsy (DCPE) | Epilepsy | Scalp EEG | Accuracy: 97.65% for CNN | Increased the speed of seizure detection and diagnosis |
| Jahan et al. (2023) [51] | Cloud computing | Epilepsy | EEG | - | Assessment of epileptiform abnormalities and localization of epileptogenic zones |
| Seyam et al. (2022) [52] | Aidoc | Hemorrhage | Head CT | Accuracy: 93% | High diagnostic and predictive value for ICH |
| Matsoukas et al. (2022) [53] | Logistic regression, support vector machine, random forest, gradient boosting, deep CNN | Hemorrhage | NCCT and MRI | Accuracy: 93.46% for ICH | Detected ICH and microbleeds with high accuracy |
| Heit et al. (2021) [54] | RAPID-ICH | Hemorrhage | NCCT scans | Sensitivity: 0.95; Specificity: 0.95 | High accuracy in detecting and quantifying the volume of ICH |
| Wang et al. (2021) [55] | Grad-CAM | Hemorrhage | CT scans | - | Improved accuracy and transparency in ICH detection |
| Wismüller et al. (2020) [56] | AI-PROBE | Hemorrhage | CT scans | Accuracy: 96.4% | TAT reduction for cases of ICH |
| Savage et al. (2024) [57] | Aidoc (commercial AI triage system) | Hemorrhage | NCCT | Accuracy: 99.2% | Diagnostic accuracy and reported TAT with AI assistance were on par with radiologists |
| Agarwal et al. (2023) [58] | Convolutional neural network, Aidoc, Avicenna.ai | Hemorrhage | CT and MRI | Sensitivity: 0.90; Specificity: 0.90 | Most abnormality-detection AI studies were not adequately validated in representative clinical cohorts |
| Davis et al. (2022) [59] | Aidoc | Hemorrhage | CT scans | - | Reported significant reductions in RTAT along with improved LOS |
| Voter et al. (2021) [60] | AI decision support system (Aidoc) | Hemorrhage | CT scans | Sensitivity: 92.3%; Specificity: 97.7% | Diagnostic performance for ICH was weaker in cases with a prior history of neurosurgery |
| Ginat et al. (2021) [61] | Aidoc | Hemorrhage | CT | - | Shortened the time needed to review scans and prioritize severe cases |
| Hsu et al. (2023) [62] | Random forest and logistic regression | ALOC due to hyperglycemia | Electronic medical records | AUC: 0.79 for all-cause mortality | Improved the triage process by forecasting patient outcomes |
| Obeid et al. (2019) [63] | Word2vec | Altered mental status | Patient notes | Accuracy: 94.5% | Automated the identification of altered mental status |
| El-Rashidy et al. (2023) [64] | MIMIC III | ALOC | EEG data and vital signs | Mean absolute error: 0.269 | Enhanced the prediction and intuitive explanation of consciousness levels |
AI: artificial intelligence; ML: machine learning; DL: deep learning; EEG: electroencephalogram; AUC: area under the curve; CNN: convolutional neural network; CT: computed tomography; ICH: intracranial hemorrhage; NCCT: non-contrast computed tomography; MRI: magnetic resonance imaging; TAT: turnaround time; LOS: length of stay; RTAT: report turnaround time; ALOC: altered level of consciousness.
| Study | AI/ML tool | Disease | Data used | Accuracy measure | Outcome |
|---|---|---|---|---|---|
| Zhou et al. (2022) [65] | 47 ML models (ANN, LR, SVM, XGBoost, and RF) | AP | Clinical, laboratory and imaging parameters | - | ML models showed potential in diagnosing AP and predicting mortality and recurrence. |
| Pan et al. (2024) [66] | MSAnet (CNN) | AP and AP&PDAC | Non-contrast and enhanced abdominal CT | AUC: 0.99 | AI tools enhanced diagnostic accuracy. |
| Kui et al. (2022) [67] | EASY-APP (XGBoost) | AP | RR, body temperature, abdominal muscle reflex, sex, age, and blood glucose levels | Accuracy: 89.1% | The AI tool predicted the risk of severe AP within a few hours of hospital admission with high accuracy. |
| Podda et al. (2024) [68] | ChatGPT | Acute biliary pancreatitis | Clinical and laboratory values | - | The AI tool helped optimize the management of patients in ED and ICU. |
| İnce et al. (2023) [69] | Gradient boost ML algorithm | AP | Demographic, clinical and laboratory parameters | Accuracy: 98.25% for ICU need; 92.77% for survival | The ML tool proficiently predicted severity, need for ICU care, and survival among patients. |
| Park et al. (2020) [70] | CNN | AA | Abdominal CT | Accuracy: more than 90% | The ML tool diagnosed AA with high accuracy among patients presenting to the ED with severe abdominal pain. |
| Issaiy et al. (2023) [71] | ANN and LR | AA | Clinical, laboratory and imaging parameters | AUC: 0.985 | AI tools diagnosed AA and predicted post-surgical risk of sepsis and ICU requirement efficiently. |
| Akbulut et al. (2023) [72] | CatBoost model | Perforated and non-perforated AA | CBC, bilirubin, CRP, age and other laboratory values | Accuracy: 92% | The AI model distinguished perforated and non-perforated AA with remarkable accuracy. |
| Roshanaei et al. (2024) [73] | GNB model, RF model, GBA and SVM | AA | Patient characteristics, laboratory parameters and cause of pain | Accuracy: 95.03% for GNB; 92.55% for RF; 94.41% for GBA; 91.93% for SVM | AI tools showed promising results in accurately diagnosing AA in the emergency setting. The Gaussian naïve Bayes model performed best. |
| Ghareeb et al. (2024) [74] | AI platform | AA | Patient characteristics | AUC: 0.97 | The AI platform proved to be an effective diagnostic tool for detecting AA. |
| Shung et al. (2019) [76] | 30 ML models | GIB | Clinical parameters | AUC: 0.93 for ANN; 0.81 for other ML models | ML models showed superiority over clinical tools in predicting mortality in upper GIB. ANN proved to be the most effective model. |
| Saraiva et al. (2021) [77] | CNN | GIB | Capsule endoscopy images | Accuracy: 99% | The model proved effective in identifying small bowel lesions and predicting risk of bleeding. |
| Mohan et al. (2021) [78] | CNN-based CAD | GIB or gastrointestinal hemorrhage | WCE | Accuracy: 95.4% | CNN models showed high accuracy, NPV and PPV in diagnosing GIB or hemorrhage by proficiently interpreting WCE images. |
AI: artificial intelligence; ML: machine learning; ANN: artificial neural network; LR: logistic regression; SVM: support vector machine; RF: random forest; AP: acute pancreatitis; CNN: convolutional neural network; AP&PDAC: acute pancreatitis with pancreatic ductal adenocarcinoma; CT: computed tomography; AUC: area under the curve; RR: respiratory rate; ED: emergency department; ICU: intensive care unit; AA: acute appendicitis; CBC: complete blood count; CRP: C-reactive protein; GNB: Gaussian naïve Bayes; GBA: gradient boost algorithm; GIB: gastrointestinal bleeding; CAD: computer-aided diagnosis; WCE: wireless capsule endoscopy; NPV: negative predictive value; PPV: positive predictive value.
| Study | AI/ML tool used | Disease | Data used | Accuracy measure | Outcome |
|---|---|---|---|---|---|
| Delahanty et al. (2019) [80] | Risk of Sepsis score | Sepsis | Rhee clinical surveillance criteria | AUC: 0.93–0.97 | ML model proved to be more effective than conventional clinical screening tools. |
| Kaji et al. (2019) [82] | Long short-term memory recurrent neural network | Sepsis | Clinical parameters | AUC: 0.87 for sepsis; 0.83 for antibiotic administration | The model effectively predicted daily sepsis occurrence among patients admitted to the ICU. |
| Barton et al. (2019) [79] | Supervised gradient-boosted tree model | Sepsis | Vital signs | AUC: 0.88 for sepsis onset | The model accurately predicted sepsis 48 hours before onset. |
| Mao et al. (2018) [81] | InSight | Severe sepsis and septic shock | Vital signs | Accuracy: 0.92 for sepsis; 0.87 for severe sepsis | ML model predicted severe sepsis and septic shock 4 hours before onset with high accuracy. |
| Henry et al. (2022) [84] | TREWS ML-based warning system | Sepsis | Clinical parameters | - | The alert system model decreased time to initiate antibiotic treatment and improved patient outcomes. |
| Yuan et al. (2020) [86] | Diagnostic algorithm (XGBoost) | Sepsis | Clinical data | Accuracy: 82% | The AI algorithm can improve patient outcomes by diagnosing sepsis among ICU patients more accurately than SOFA. |
| Yan et al. (2022) [87] | ML models | Sepsis | Clinical data notes from providers, demographics, vital signs, laboratory data and medications | - | The models proved effective in early identification of sepsis. |
| Yue et al. (2022) [88] | ANN, XGBoost, LR and SVM | AKI in sepsis | Various clinical and laboratory parameters | Accuracy: 0.82 for XGBoost | The ML models proved to be proficient in predicting AKI among patients with sepsis. |
| Zhang et al. (2023) [89] | Multiple ML models | Mortality in sepsis | Various clinical and laboratory parameters | Sensitivity: 0.71; Specificity: 0.68 | The ML models showed superiority in predicting mortality as compared to conventional clinical scoring tools. ML highlighted lactate as an effective indicator that clinical tools usually miss. |
| She et al. (2023) [90] | SVM and RF | Sepsis | Non-targeted liquid chromatography-high-resolution mass spectrometry metabolomics | - | ML model delineated the association between various metabolites and sepsis that can potentially aid in diagnosis and management. |
| Study | AI/ML tool used | Disease/symptom | Data used | AUC (95%CI) | Outcome |
|---|---|---|---|---|---|
| Butler et al. (2023) [18] | AI-based ECG models | 10-Year heart failure risk | ECG data, patient history and examination findings, BMI and vital signs | ECG-chars: 0.73 (0.70–0.77) | Comparable accuracy than current HF risk calculators. Improved long-term risk prediction |
| ARIC HF-risk calculator: 0.76 (0.72–0.80) | |||||
| FH-HF risk calculator: 0.74 (0.70–0.78) | |||||
| CPH: 0.78 (0.75–0.80) | |||||
| ECG-AI: 0.77 (0.74–0.79) | |||||
| ECG-AI-cox: 0.84 (0.81–0.87) | |||||
| Cho et al. (2020) [14] | DLA with VAE | MI | 12-Lead and 6-lead ECG | Internal validation: 0.88 | ML algorithms improved MI detection. Six-lead DLA with VAE outperformed non-VAE |
| External validation: 0.854 | |||||
| Doudesis et al. (2023) [17] | Collaboration for the Diagnosis and Evaluation of ACS score using ML | Acute MI | Continuous troponin levels, patient history, clinical factors. | 0.953 (0.947–0.958) | Improved MI probability detection; identified more low-risk patients with minimal 1-year cardiac death risk |
| Herman et al. (2023) [16] | AI model | Occluded MI | ECG data | 0.938 (0.924–0.951) | Enhanced ACS triage with high accuracy |
| Lin et al. (2024) [15] | AI-ECG | ST-elevation myocardial infarction | ECG data | - | Reduced door-to-balloon and ECG-to-balloon times. Decrease in-hospital and 6-month mortality |
| Sveric et al. (2024) [19] | AI-driven automated workflows | HF and left ventricular ejection fraction estimation | Echocardiography images | Auto-ECHO: 0.93 | Reduced measurement variability (<5%). Improved reliability over traditional methods |
| MBS-ECHO: 0.92 |
| Study | AI/ML tool used | Disease/symptom | Data used | Accuracy measure | Outcome |
|---|---|---|---|---|---|
| Ippolito et al. (2023) [20] | Multi-tasking mask R CNN | Pneumonia | CXR | Accuracy: 93.8% | Remarkable accuracy of AI tool found in differentiating between COVID-19 pneumonia, typical bacterial pneumonia, and healthy subjects |
| Chagas et al. (2024) [21] | Predictive models using Boruta algorithm | AIDS- associated PCP | LDH, O2 saturation | AUC: >0.8 | Predictive models differentiated AIDS PCP |
| CRP, RR (>24 bpm) dry cough, HIV viral load, CD4 cell count and CXRs | |||||
| Salvatore et al. (2021) [22] | ResNet- 50 architecture and trained and cross validated | COVID-19 pneumonia | CXRs | AUC: 0.98 | AI tools showed potential in correctly differentiating COVID-19 from community acquired pneumonia. |
| Carlile et al. (2020) [23] | CNN | COVID-19 pneumonia | CXRS with heat maps | - | More than 80% physicians found the tool aided in optimizing the workflow and clinical decisions. |
| Kwon et al. (2020) [24] | DenseNet-121 architecture pretrained on ImageNet | COVID-19 pneumonia | Chest radiographs and clinical variables | AUC: | Model improved prediction regarding mortality and intubation requirement. |
| 0.88 for intubation | |||||
| 0.82 for mortality | |||||
| Lin et al. (2024) [25] | Blood culture prediction index and CURB-65 integrated novel | Pneumonia | CBC, differential leukocyte counts, BUN, age, sex, RR, blood pressure, GCS | AUC: 0.713 for cox regression model | Integrated models had better predictive performance for low risk patients. |
| COX regression model | |||||
| Somani et al. (2021) [26] | Fusion model- ML model | Pulmonary embolism | Clinical data and ECGs | AUC: 0.84 AUC for fusion model | Fusion models performed better than the other models in improving screening of acute PE. |
| Choi et al. (2024) [27] | Smartphone application ECG BUDDY | Pulmonary embolism | ECGs | AUC: 0.895 | Right ventricular dysfunction identified rapidly on ECG as compared to markers like Troponin I and ProBNP. |
| QCG-RVDys | |||||
| Müller-Peltzer et al. (2021) [28] | Computer aided algorithm integrated into Siemens Healthineers software, VB20A | Pulmonary embolism | CT pulmonary angiography | Sensitivity: 47% at lobar and 50% at subsegmental level | Computer-aided algorithms lead to higher false positives can be used to assist radiologists but needs radiological examination for confirmation. |
| Langius-Wiffen et al. (2023) [29] | AI algorithm | Pulmonary embolism | CT pulmonary angiography | Sensitivity: 96.8% | Al algorithm showed higher accuracy in detecting pulmonary embolism versus radiologist. |
| Specificity: 99.9% | |||||
| Savage et al. (2024) [30] | AI triage system, the BriefCase for iPE from the vendor Aidoc | Pulmonary embolism | Contrast enhanced CT scans | Sensitivity: 96.2 | AI assistance showed increased sensitivity for detecting pulmonary embolism. However, no significant change observed in decreasing time. |
| Specificity: 99.9 | |||||
| Huang et al. (2020) [31] | Deep learning 77- layer 3D convolutional neural network (PENet) | Pulmonary embolism | CT scan | AUC: | Successful automated application for diagnosis of pulmonary embolism with the AI tool. |
| 0.84 for internal validation | |||||
| 0.85 for external validation | |||||
| Villacorta et al. (2022) [32] | Generalized learning logistic regression using elastic net | Pulmonary embolism | O2 saturation, history of deep venous thrombosis or pulmonary embolism, immobilization or surgery and D dimer | AUC: 0.89 | D dimer maintained a low false negative rate |
| Hillis et al. (2022) [33] | AI model | Pneumothorax | CXR | AUC: | Al model detected pneumothorax and tension pneumothorax accurately. |
| 0.97 for pneumothorax | |||||
| 0.98 for tension pneumothorax | |||||
| Han et al. (2025) [34] | Convolutional Recurrent Neural Network and sequence modelling (recurrent neural network) | ARDS | ECG, HR, RR, SpO2, and non-invasive systolic and diastolic blood pressures, body temperature, AVPU score, age, sex, arterial blood gas analysis | AUC: | Positive prediction scores and clinical outcomes were established with the AI model. |
| 0.84 for internal validation | |||||
| 0.73 for external validation | |||||
| Rashid et al. (2022) [35] | Multiple AI models | ARDS | Variable | AUC: 0.8–1 | AI models reduced cost and improved outcomes amongst ARDS patients. |
| Viderman et al. (2024) [36] | Feed forward neural networks, CNN and ANN | ARDS | Variable | - | AI models shown to perform well in predicting respiratory failure. |
| Liong-Rung et al. (2022) [37] | GoogleNet Inception V4 architecture | Dyspnea | CXRs | Accuracy: 83.2% | High negative predictive value of excluding pulmonary edema & high positive predictive value in diagnosing pneumonia. |
| Study | AI/ML model | Disease | Data used | Accuracy measure | Outcome |
|---|---|---|---|---|---|
| Westwood et al. (2024) [45] | Viz LVO, Rapid LVO, Brainomix, Avicenna | Stroke | Head CTA | Sensitivity: 95.4% | The study showed that AI software enhanced accuracy in stroke detection. |
| Specificity: 79.4% | |||||
| Elijovich et al. (2022) [40] | Viz AI | Stroke | Head CTA | - | The AI software reduced time to notify management team and patients received earlier thrombectomies. |
| Gunda et al. (2022) [41] | e-Stroke Suite | Stroke | Non-contrast Head CT and CTA | - | The AI system helped in decision making process. Reduced time and improved rate of receiving reperfusion treatment strategies. |
| Shlobin et al. (2022) [39] | CNN, RF and SVM | Stroke | CTA | Sensitivity: 63.6% | AI tools accurately triaged and diagnosed patients with large vessel occlusion. |
| Specificity: 85.8% | |||||
| Paz et al. (2021) [38] | Rapid LVO | Stroke | Head CT scan | - | The study showed that AI tool had low sensitivity and moderate specificity. |
| Amukotuwa et al. (2019) [44] | Avicenna CINA | Stroke | Head CT angiography | Sensitivity: 95% | Avicenna CINA proved to be one of the most accurate tools in stroke detection. |
| Specificity: 79% | |||||
| Schmitt et al. (2022) [42] | Brainomix e-CTA | Stroke | Non-contrast enhanced head CT | Sensitivity: 0.91 | The AI-based algorithm dependably determined if acute intra-cranial hemorrhage were present amongst participants and calculated intra-parenchymal hemorrhage volumes accurately |
| Specificity: 0.89 | |||||
| Wouters et al. (2018) [46] | Automated multi-variate imaging model | Stroke | Diffusion weighted and perfusion weighted imaging | AUC: 0.80 | The model accurately identified patients of large vessel occlusion within 6-hour window. |
| Study | AI/ML/DL tool used | Disease/symptom | Data used | Accuracy measure | Outcome |
|---|---|---|---|---|---|
| McInnis et al. (2023) [47] | AI-EEG | Epilepsy | EEG | AUC: 0.86 | Higher diagnostic accuracy than clinical experts |
| Tveit et al. (2023) [48] | SCORE-AI | Epilepsy | EEG | Accuracy: 88.3% | Reduces diagnostic errors and improves interrater agreement |
| Keikhosrokiani et al. (2024) [49] | AI augmented pathway for digital care pathway for epilepsy (DCPE) | Epilepsy | Scalp EEG | Accuracy: 97.65% for CNN | Increase the speed of seizure detection and diagnosis |
| Jahan et al. (2023) [51] | Cloud computing | Epilepsy | EEG | - | Assessment of epileptiform abnormalities and localization of epileptogenic zones |
| Seyam et al. (2022) [52] | AI-doc | Hemorrhage | Head CT | Accuracy: 93% | High diagnostic and predictive value for ICH |
| Matsoukas et al. (2022) [53] | Logistic regression, support machine vector, random forest, gradient boosting, deep CNN | Hemorrhage | NCCT and MRI | Accuracy: 93.46% for ICH | Detected ICH and microbleeds with high accuracy |
| Heit et al. (2021) [54] | RAPID-ICH | Hemorrhage | NCCT scans | Sensitivity: 0.95 | High accuracy in detecting and quantifying the volume of ICH |
| Specificity: 0.95 | |||||
| Wang et al. (2021) [55] | GRAD-CAM | Hemorrhage | CT scans | - | Improve the accuracy and transparency in ICH detection |
| Wismüller et al. (2020) [56] | AI-PROBE | Hemorrhage | CT scans | Accuracy: 96.4% | TAT reduction for cases of ICH |
| Savage et al. (2024) [57] | AI-doc (commercial AI triage system) | Hemorrhage | NCCT | Accuracy: 99.2% | Diagnostic accuracy and reported TAT with AI assistance was on par with radiologists |
| Agarwal et al. (2023) [58] | Convolutional neural network, AI-doc, Avicenna.ai | Hemorrhage | CT and MRI | Sensitivity: 0.90 | Reflects that most abnormality detection AI studies were not adequately validated in representative clinical cohorts |
| Specificity: 0.90 | |||||
| Davis et al. (2022) [59] | AI-doc | Hemorrhage | CT scans | - | Reported significant reductions in RTAT along with improved LOS |
| Voter et al. (2021) [60] | (AI) decision support systems, Aidoc | Hemorrhage | CT scans | Sensitivity: 92.3% | Diagnostic performance in cases with a prior history of neurosurgery ICH is weaker |
| Specificity: 97.7% | |||||
| Ginat et al. (2021) [61] | AI-doc | Hemorrhage | CT | - | Shortened the time needed to scan reviews and prioritize severe cases |
| Hsu et al. (2023) [62] | Random forest and logistic regression | ALOC due to hyperglycemia | Electronic medical records | AUC: 0.79 for all-cause mortality | Improve the triage process by forecasting patient outcomes |
| Obeid et al. (2019) [63] | Word2vec | Altered mental status | Patient notes | Accuracy: 94.5% | To automate the identification of altered mental status |
| El-Rashidy et al. (2023) [64] | MIMIC III | ALOC | EEG data and vital signs | Mean absolute error: 0.269 | Enhance the predictive and intuitive explanation of consciousness levels |

| Study | AI/ML tool | Disease | Data used | Accuracy measure | Outcome |
|---|---|---|---|---|---|
| Zhou et al. (2022) [65] | 47 ML models (ANN, LR, SVM, XGBoost, and RF) | AP | Clinical, laboratory and imaging parameters | - | ML models showed potential in diagnosing AP and predicting mortality and recurrence. |
| Pan et al. (2024) [66] | MSAnet (CNN) | AP and AP&PDAC | Non-contrast and enhanced abdominal CT | AUC: 0.99 | AI tools enhanced diagnostic accuracy. |
| Kui et al. (2022) [67] | EASY-APP (XGBoost) | AP | RR, body temperature, abdominal muscle reflex, sex, age, and blood glucose levels | Accuracy: 89.1% | The AI tool predicted the risk of severe AP within a few hours of hospital admission with high accuracy. |
| Podda et al. (2024) [68] | ChatGPT | Acute biliary pancreatitis | Clinical and laboratory values | - | The AI tool helped optimize the management of patients in ED and ICU. |
| İnce et al. (2023) [69] | Gradient boost ML algorithm | AP | Demographic, clinical and laboratory parameters | Accuracy: 98.25% for ICU need; 92.77% for survival | The ML tool proficiently predicted disease severity, need for ICU care, and survival among patients. |
| Park et al. (2020) [70] | CNN | AA | Abdominal CT | Accuracy: >90% | The ML tool diagnosed AA with high accuracy among patients presenting to the ED with severe abdominal pain. |
| Issaiy et al. (2023) [71] | ANN and LR | AA | Clinical, laboratory and imaging parameters | AUC: 0.985 | AI tools diagnosed AA and predicted post-surgical risk of sepsis and ICU requirement efficiently. |
| Akbulut et al. (2023) [72] | CatBoost model | Perforated and non-perforated AA | CBC, bilirubin, CRP, age and other laboratory values | Accuracy: 92% | The AI model distinguished perforated and non-perforated AA with remarkable accuracy. |
| Roshanaei et al. (2024) [73] | GNB model, RF model, GBA and SVM | AA | Patient characteristics, laboratory parameters and cause of pain | Accuracy: 95.03% for GNB; 92.55% for RF; 94.41% for GBA; 91.93% for SVM | AI tools showed promising results in accurately diagnosing AA in the emergency setting. The Gaussian naïve Bayes model performed best. |
| Ghareeb et al. (2024) [74] | AI platform | AA | Patient characteristics | AUC: 0.97 | The AI platform proved to be an effective diagnostic tool for detecting AA. |
| Shung et al. (2019) [76] | 30 ML models | GIB | Clinical parameters | AUC: 0.93 for ANN; 0.81 for other ML models | ML models showed superiority over clinical tools in predicting mortality in upper GIB. ANN proved to be the most effective model. |
| Saraiva et al. (2021) [77] | CNN | GIB | Capsule endoscopy images | Accuracy: 99% | The model proved effective in identifying small bowel lesions and predicting the risk of bleeding. |
| Mohan et al. (2021) [78] | CNN-based CAD | GIB or gastrointestinal hemorrhage | WCE | Accuracy: 95.4% | CNN models showed high accuracy, NPV and PPV in diagnosing GIB or hemorrhage by proficiently interpreting WCE images. |

| Study | AI/ML tool used | Disease | Data used | Accuracy measure | Outcome |
|---|---|---|---|---|---|
| Delahanty et al. (2019) [80] | Risk of Sepsis score | Sepsis | Rhee clinical surveillance criteria | AUC: 0.93–0.97 | ML model proved to be more effective than conventional clinical screening tools. |
| Kaji et al. (2019) [82] | Long short-term memory recurrent neural network | Sepsis | Clinical parameters | AUC: 0.87 for sepsis; 0.83 for antibiotic administration | The model effectively predicted daily sepsis occurrence among patients admitted to the ICU. |
| Barton et al. (2019) [79] | Supervised gradient-boosted tree model | Sepsis | Vital signs | AUC: 0.88 for sepsis onset | The model accurately predicted sepsis 48 hours before onset. |
| Mao et al. (2018) [81] | InSight | Severe sepsis and septic shock | Vital signs | Accuracy: 0.92 for sepsis; 0.87 for severe sepsis | ML model predicted severe sepsis and septic shock 4 hours before onset with high accuracy. |
| Henry et al. (2022) [84] | TREWS ML-based warning system | Sepsis | Clinical parameters | - | The alert system decreased the time to initiation of antibiotic treatment and improved patient outcomes. |
| Yuan et al. (2020) [86] | Diagnostic algorithm (XGBoost) | Sepsis | Clinical data | Accuracy: 82% | The AI algorithm can improve patient outcomes by diagnosing sepsis among ICU patients more accurately than SOFA. |
| Yan et al. (2022) [87] | ML models | Sepsis | Clinical data notes from providers, demographic, vital signs, laboratory data and medications | - | The models proved effective in early identification of sepsis. |
| Yue et al. (2022) [88] | ANN, XGBoost, LR and SVM | AKI in sepsis | Various clinical and laboratory parameters | Accuracy: 0.82 for XGBoost | The ML models proved to be proficient in predicting AKI among patients with sepsis. |
| Zhang et al. (2023) [89] | Multiple ML models | Mortality in sepsis | Various clinical and laboratory parameters | Sensitivity: 0.71; Specificity: 0.68 | The ML models showed superiority in predicting mortality as compared to conventional clinical scoring tools. ML highlighted lactate as an effective indicator that is usually missed by clinical tools. |
| She et al. (2023) [90] | SVM and RF | Sepsis | Non-targeted liquid chromatography-high-resolution mass spectrometry metabolomics | - | ML model delineated the association between various metabolites and sepsis that can potentially aid in diagnosis and management. |
AI: artificial intelligence; ML: machine learning; AUC: area under the curve; ECG: electrocardiogram; BMI: body mass index; ARIC: Atherosclerosis Risk in Communities; HF: heart failure; FH: familial hypercholesterolemia; CPH: Cox proportional hazards; DLA: deep-learning algorithm; VAE: variational autoencoder; MI: myocardial infarction; ACS: acute coronary syndrome; ECHO: echocardiogram; MBS: Medicare Benefits Schedule echocardiography.
AI: artificial intelligence; ML: machine learning; CNN: convolutional neural network; CXR: chest X-ray; COVID-19: coronavirus disease 2019; AIDS: acquired immunodeficiency syndrome; PCP: Pneumocystis jirovecii pneumonia; LDH: lactate dehydrogenase; CRP: C-reactive protein; RR: respiratory rate; HIV: human immunodeficiency virus; CD: cluster of differentiation; AUC: area under the curve; CBC: complete blood count; BUN: blood urea nitrogen; GCS: Glasgow coma scale; ECG: electrocardiogram; PE: pulmonary embolism; ProBNP: pro-B-type natriuretic peptide; CT: computed tomography; iPE: incidental pulmonary embolism; 3D: three-dimensional; ARDS: acute respiratory distress syndrome; HR: heart rate; AVPU: alert, verbal, pain, unresponsive; ANN: artificial neural network.
AI: artificial intelligence; ML: machine learning; LVO: large vessel occlusion; CTA: computed tomography angiography; CT: computed tomography; CNN: convolutional neural network; RF: random forest; SVM: support vector machine; AUC: area under the curve.
AI: artificial intelligence; ML: machine learning; DL: deep learning; EEG: electroencephalogram; AUC: area under the curve; CNN: convolutional neural network; CT: computed tomography; ICH: intracranial hemorrhage; NCCT: non-contrast computed tomography; MRI: magnetic resonance imaging; TAT: turnaround time; LOS: length of stay; RTAT: report turnaround time; ALOC: altered level of consciousness.
AI: artificial intelligence; ML: machine learning; ANN: artificial neural network; LR: logistic regression; SVM: support vector machine; RF: random forest; AP: acute pancreatitis; CNN: convolutional neural network; AP&PDAC: acute pancreatitis with pancreatic ductal adenocarcinoma; CT: computed tomography; AUC: area under the curve; RR: respiratory rate; ED: emergency department; ICU: intensive care unit; AA: acute appendicitis; CBC: complete blood count; CRP: C-reactive protein; GNB: Gaussian naïve Bayes; GBA: gradient boost algorithm; GIB: gastrointestinal bleeding; CAD: computer-aided diagnosis; WCE: wireless capsule endoscopy; NPV: negative predictive value; PPV: positive predictive value.
AI: artificial intelligence; ML: machine learning; AUC: area under the curve; ICU: intensive care unit; SOFA: Sequential Organ Failure Assessment; ANN: artificial neural network; LR: logistic regression; SVM: support vector machine; AKI: acute kidney injury; RF: random forest.