Integrating traditional omics and machine learning approaches to identify microbial biomarkers and therapeutic targets in pediatric inflammatory bowel disease
Background: Pediatric inflammatory bowel disease (IBD), especially Crohn's disease, significantly affects gut health and quality of life. Although gut microbiome research has advanced, identifying reliable biomarkers remains difficult because of microbial complexity.

Methods: We combined RNA-seq-based microbial profiling with machine learning (ML) to find robust biomarkers in pediatric IBD. Microbial taxa in ileal biopsies from Crohn's disease and non-IBD patients were profiled at the phylum, genus, and species levels using Kraken2. We performed abundance-based analyses and applied four ML models (Logistic Regression, Random Forest, Support Vector Machine, and XGBoost) to detect discriminative taxa. An independent cohort of 36 pediatric stool samples, profiled by 16S rRNA sequencing, was used to validate the top ML results.

Results: Traditional abundance-based methods showed compositional shifts but identified few consistently significant taxa. The ML models discriminated better between groups, with XGBoost outperforming the others and pinpointing Orthotospovirus and Vescimonas as key genera. These findings were confirmed in the validation cohort, where only one genus identified by the traditional analysis, Actinomyces, remained significant.

Discussion: Integrating conventional omics with AI-driven analytics improves the reproducibility and clinical relevance of microbial biomarker discovery, opening new possibilities for targeted therapies and precision medicine in pediatric IBD.
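As an illustration of the profiling step described in the Methods, the following minimal sketch shows how genus-level read counts could be pulled from a Kraken2 report and converted to relative abundances. It assumes the standard six-column, tab-separated Kraken2 report layout (percentage, clade reads, direct reads, rank code, taxonomy ID, name). The function names, the toy report, its read counts, and its taxonomy IDs are all hypothetical and are not taken from the study; the authors' actual pipeline is not described here.

```python
# Sketch only: extracts per-rank counts from a Kraken2-style report.
# Assumed columns (tab-separated): percent, clade_reads, direct_reads,
# rank_code, taxid, name. Rank codes include P (phylum), G (genus), S (species).

def rank_counts(report_text, rank_code="G"):
    """Return {taxon_name: clade_reads} for rows matching rank_code."""
    counts = {}
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed rows
        if fields[3] == rank_code:
            # Names are indented by taxonomic depth, so strip whitespace.
            counts[fields[5].strip()] = int(fields[1])
    return counts


def relative_abundance(counts):
    """Scale counts so they sum to 1 within one sample."""
    total = sum(counts.values())
    if total == 0:
        return {taxon: 0.0 for taxon in counts}
    return {taxon: n / total for taxon, n in counts.items()}


# Toy report: made-up numbers and placeholder taxids, for illustration only.
toy_report = "\n".join([
    "\t".join(["100.00", "100", "0", "R", "1", "root"]),
    "\t".join([" 60.00", "60", "30", "G", "11", "      Actinomyces"]),
    "\t".join([" 30.00", "30", "30", "S", "12", "        Actinomyces sp."]),
    "\t".join([" 40.00", "40", "40", "G", "13", "      Vescimonas"]),
])

genus_counts = rank_counts(toy_report, rank_code="G")
genus_abundance = relative_abundance(genus_counts)
print(genus_counts)
print(genus_abundance)
```

Calling `rank_counts` with `rank_code="P"` or `rank_code="S"` would give the phylum- or species-level table in the same way. Per-sample abundance tables of this kind are a typical input both for abundance-based comparisons and for classifiers such as those named in the Methods.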
