Scott Cohen wins the 2025 pre-doctoral research poster competition

Person poses in front of the EPI vertical banner that reads "understanding global emergence and the spread of infectious disease." Wide-shot image.
Scott Cohen is a graduate student in the Department of Epidemiology at the UF College of Public Health and Health Professions. His research uses machine learning to help predict hospital outbreaks of certain bacteria. (Photo credit: Brianne Lehan)

In the world of artificial intelligence, Scott Cohen, a graduate student at the University of Florida’s College of Public Health and Health Professions and the UF College of Medicine, has jumped at the opportunity to put nearly 10 years of underutilized data to work for hospitals.

Cohen, in the Department of Epidemiology, combined machine learning with eight years of electronic health records to win first place in the pre-doctoral research poster competition at the Emerging Pathogens Institute Research Day 2025. 

While hospital-acquired infections remain a persistent challenge in healthcare, Cohen and others in the lab are working to outsmart a particularly stubborn culprit: Pseudomonas aeruginosa. They hope to catch outbreaks of this multidrug-resistant pathogen earlier and faster than traditional methods allow. 

The team sifted through pre-existing electronic health records from 2016 to 2024, covering over 19,000 hospitalizations and more than 11,000 patients with culture-positive P. aeruginosa. About a third of these cases were hospital-onset, meaning the positive culture was collected more than two days after admission. These are precisely the kind of infections hospitals aim to prevent.

When describing the opportunity to conduct this study, Cohen said, “By combining these large, complex existing data [pools] with these new methods out there, I was able to demonstrate that, yes, you could potentially identify related strains of hospital-onset infections but also identify potential reasons why these infections are occurring.” 

Standard surveillance often takes time and lacks consistency, delaying crucial opportunities for intervention. Cohen and the team combined space-time permutation analysis (WHONET-SaTScan), a method that flags unusual concentrations of cases in a particular place and time window, with machine learning to detect clusters of infections that might signal an outbreak. They didn’t stop there; using whole-genome sequencing (WGS), they tested whether these clusters were genetically related. Although none of the 12 detected clusters overlapped with WGS-confirmed outbreaks, the process offered valuable insight into emerging resistance patterns.
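For readers curious about the mechanics, the following is a minimal Python sketch of the expected-count idea behind a space-time permutation scan. It is not WHONET-SaTScan and not the team's code; the ward names, day indices, and case list are hypothetical. The sketch assumes that, absent an outbreak, where a case occurs is independent of when it occurs, so the expected count for a ward on a given day is (ward total × day total) / grand total. A candidate window is then scored by how far its observed count exceeds that expectation.

```python
import math
from collections import Counter

def expected_cases(cases, wards, days):
    """Expected case count in a space-time window (wards x days),
    assuming ward and day of occurrence are independent."""
    total = len(cases)
    by_ward = Counter(w for w, _ in cases)
    by_day = Counter(d for _, d in cases)
    return sum(by_ward[w] * by_day[d] for w in wards for d in days) / total

def log_likelihood_ratio(observed, expected, total):
    """Poisson-style generalized log-likelihood ratio for one window.
    Returns 0.0 when the window holds no excess of cases."""
    if observed <= expected:
        return 0.0
    inside = observed * math.log(observed / expected)
    outside_n = total - observed
    outside = outside_n * math.log(outside_n / (total - expected)) if outside_n else 0.0
    return inside + outside

# Hypothetical line list: (ward, day index) for each resistant isolate.
cases = [("ICU", 1), ("ICU", 2), ("ICU", 2), ("ICU", 3),
         ("Burn", 2), ("Burn", 9), ("Surgery", 5), ("Surgery", 9)]

# Candidate window: the ICU during days 1 through 3.
window_wards, window_days = {"ICU"}, {1, 2, 3}
observed = sum(1 for w, d in cases if w in window_wards and d in window_days)
expected = expected_cases(cases, window_wards, window_days)
score = log_likelihood_ratio(observed, expected, len(cases))
```

In this toy data the ICU window holds 4 observed cases against 2.5 expected. A full scan would evaluate many candidate windows and judge significance by repeatedly shuffling case dates across locations; both steps are omitted here.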

Man poses in front of poster in arena.
Cohen presents his poster during the poster session at EPI Research Day 2025. (Photo sourced by Scott Cohen)

One standout cluster that caught the team’s attention comprised 17 patients with P. aeruginosa resistant to cephalosporins, a common class of antibiotics. The researchers ran a case-control study to determine why the patients were grouped together in time and space, comparing this group with over 200 other hospitalized patients who had P. aeruginosa but weren’t part of the cluster.

Using machine learning, specifically an elastic net model, they identified 32 potential risk factors linked to the cluster. Two stood out. Patients who had undergone an open excision procedure five to nine days earlier had about 28 times the odds of being in the cluster, and those being prepared for a skin graft two days earlier had about 45 times the odds.

Their findings suggest that by combining statistical tools with machine learning, hospitals could detect and prevent outbreaks of dangerous pathogens faster than ever before. 

For his next chapter, Cohen noted, “I’ll be heading back to complete my medical school training and then pursue a career in infectious disease and hospital-acquired infections to further study this. The data and methods are always changing so rapidly.” A field such as this is brimming with promise as we enter a new era of infectious disease research. 


Resistance-driven outbreak detection and etiologic investigation for hospital-onset Pseudomonas aeruginosa using machine learning on electronic health records

Collaborators

  • Scott Cohen – Department of Epidemiology, Emerging Pathogens Institute, College of Public Health and Health Professions, University of Florida
  • Massimiliano Tagliamonte – Department of Pathology, Immunology, and Laboratory Medicine, Interdisciplinary Center for Biotechnology Research, Emerging Pathogens Institute, College of Medicine, University of Florida
  • Nicole Iovine – Division of Infectious Diseases, Department of Medicine, College of Medicine, University of Florida
  • Kwangcheol Casey Jeong – Department of Animal Sciences, Emerging Pathogens Institute, College of Agricultural and Life Sciences, University of Florida
  • Marco Salemi – Department of Pathology, Immunology, and Laboratory Medicine, Emerging Pathogens Institute, College of Medicine, University of Florida
  • Mattia Prosperi – Department of Epidemiology, Emerging Pathogens Institute, College of Public Health and Health Professions, University of Florida
  • J. Glenn Morris, Jr. – Emerging Pathogens Institute, University of Florida

Introduction

Hospital outbreak identification often lacks standardization and relies on time-intensive surveillance methods, which may delay intervention against multidrug-resistant hospital-acquired pathogens. To address this challenge, we demonstrate how space-time permutation analysis can be combined with machine learning to statistically identify hospital-onset Pseudomonas aeruginosa outbreaks and potential outbreak-specific etiologies contributing to their spread.

Methods

We retrospectively analyzed hospital-onset P. aeruginosa isolates collected between 2016 and 2024 from electronic health records (EHR) at a single tertiary care center. Hospital-onset infections were defined as culture-positive samples obtained more than two days after admission. Antibiotic susceptibility profiles were standardized by imputing for intrinsic resistance and classifying isolates as multidrug-resistant (MDR) or extensively drug-resistant (XDR). Space-time permutation analysis (WHONET-SaTScan) identified clusters of resistance profiles, with cluster confirmation performed against whole-genome sequencing (WGS) data using SNP differences. For each cluster, we conducted case-control studies comparing affected hospital-onset cases with non-case P. aeruginosa patients hospitalized during the same period. Feature selection for potential temporally scaled hospital-based risk factors was performed using elastic net regularization (R package glmnet) with 10-fold cross-validation.
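The hospital-onset definition above can be sketched as a simple date rule. This is an illustrative Python example, not the authors' code. The function name and constant are hypothetical, and it assumes the threshold is applied to calendar days rather than elapsed hours, which the abstract does not specify.

```python
from datetime import date

HOSPITAL_ONSET_THRESHOLD_DAYS = 2  # "more than two days after admission"

def classify_onset(admit_date: date, culture_date: date) -> str:
    """Label a culture-positive sample as hospital- or community-onset.
    Assumption: the rule is applied to calendar days, not elapsed hours."""
    days_since_admission = (culture_date - admit_date).days
    if days_since_admission > HOSPITAL_ONSET_THRESHOLD_DAYS:
        return "hospital-onset"
    return "community-onset"

# Hypothetical admissions: a culture on day 2 versus day 3 after admission.
on_threshold = classify_onset(date(2020, 3, 1), date(2020, 3, 3))
past_threshold = classify_onset(date(2020, 3, 1), date(2020, 3, 4))
```

Under this reading, a culture taken exactly two days after admission counts as community-onset, and one taken on the third day counts as hospital-onset.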

Results

Between 2016 and 2024, 19,055 hospitalizations (11,112 patients) yielded culture-positive P. aeruginosa isolates, with 12,498 (65.6%) community-onset and 6,557 (34.4%) hospital-onset cases. Hospital-onset isolates were 1.9 (95% CI: 1.8–2.1) times more likely to be MDR and 2.2 (95% CI: 1.9–2.6) times more likely to be XDR compared to community-onset cases. Among the hospital-onset isolates, 615 unique resistance profiles were identified, with 53% fully susceptible to major antipseudomonal classes. Resistance to antipseudomonal cephalosporins was most common (30.2%; n=1,981), with 27% resistant to cefepime and 27.6% to ceftazidime. Space-time permutation analysis detected 12 unique clusters, though none overlapped with clusters identified via WGS. One cluster defined by cephalosporin resistance (n=17) was further examined. The final elastic net model identified 32 features associated with cluster membership compared to controls (n=213), including an open excision procedure 5–9 days prior (OR: 28.1, p<0.001) and preparation for skin graft 2 days prior (OR: 45.2, p<0.001).
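As a reminder of what an odds ratio measures in a case-control comparison like the one above, here is a small Python sketch that computes an unadjusted odds ratio and a Wald 95% confidence interval from a 2x2 table. It is a generic illustration, not the authors' model: the reported values come from their own analysis, and every count below is invented and unrelated to the study data.

```python
import math

def odds_ratio_wald(exposed_cases, unexposed_cases,
                    exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.
    All four cells must be non-zero for the formula to apply."""
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)
    # Standard error of log(OR) is the root of the summed reciprocal cells.
    se_log_or = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                          + 1 / exposed_controls + 1 / unexposed_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical table: 6 of 10 cases exposed, 10 of 100 controls exposed.
odds_ratio, ci_lower, ci_upper = odds_ratio_wald(6, 4, 10, 90)
```

With these invented counts the odds ratio is 13.5, and the wide interval shows how imprecise such estimates become when the case group is small.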

Discussion

We demonstrate the use of EHRs to statistically detect related hospital-onset P. aeruginosa infections and potential etiologies associated with these clusters. Several factors with strong associations and biological plausibility were identified, though resistance profile-based models did not capture relatedness among clusters confirmed by WGS. Future research will aim to establish causal links and extend this framework to other hospital-acquired pathogens.