Article

A comparative patient-level prediction study in OMOP CDM: applicative potential and insights from synthetic data

N. Ahmadi, Q. V. Nguyen, M. Sedlmayr, and M. Wolfien.
Scientific reports, (27 January 2024)
DOI: 10.1038/s41598-024-52723-y

Abstract

The emergence of collaborations, which standardize and combine multiple clinical databases across different regions, provides a rich source of data, which is fundamental for clinical prediction models, such as patient-level predictions. With the aid of such large data pools, researchers are able to develop clinical prediction models for improved disease classification, risk assessment, and beyond. To fully utilize this potential, Machine Learning (ML) methods are commonly required to process these large amounts of data on disease-specific patient cohorts. As a consequence, the Observational Health Data Sciences and Informatics (OHDSI) collaborative develops a framework to facilitate the application of ML models for these standardized patient datasets by using the Observational Medical Outcomes Partnership (OMOP) common data model (CDM). In this study, we compare the feasibility of current web-based OHDSI approaches, namely ATLAS and "Patient-level Prediction" (PLP), against a native solution (R based) to conduct such ML-based patient-level prediction analyses in OMOP. This will enable potential users to select the most suitable approach for their investigation. Each of the applied ML solutions was individually utilized to solve the same patient-level prediction task. Both approaches went through an exemplary benchmarking analysis to assess the weaknesses and strengths of the PLP R-Package. In this work, the performance of this package was subsequently compared with that of the commonly used native R-package called Machine Learning in R 3 (mlr3), and its sub-packages. The approaches were evaluated on performance, execution time, and ease of model implementation. The results show that the PLP package has shorter execution times, which indicates great scalability, as well as intuitive code implementation, and numerous possibilities for visualization. However, limitations in comparison to native packages were identified in the implementation of specific ML classifiers (e.g., Lasso), which may result in a decreased performance for real-world prediction problems. The findings here contribute to the overall effort of developing ML-based prediction models on a clinical scale and provide a snapshot for future studies that explicitly aim to develop patient-level prediction models in OMOP CDM.

BibTeX key: e7be8793a22c4d5685dd87e3252aeaa8
Entry type: article
Year: 2024
Month: jan
Day: 27
Journal: Scientific reports
Number: 1
Publisher: Nature Publishing Group
Volume: 14
language: English
issn: 2045-2322
DOI: 10.1038/s41598-024-52723-y

Cite this publication

%0 Journal Article
%1 e7be8793a22c4d5685dd87e3252aeaa8
%A Ahmadi, Najia
%A Nguyen, Quang Vu
%A Sedlmayr, Martin
%A Wolfien, Markus
%D 2024
%I Nature Publishing Group
%J Scientific reports
%K Databases, Electronic FIS_scads Factual, Health Humans, Informatics, Learning, Medical Records topic_lifescience yaff machine learning
%N 1
%R 10.1038/s41598-024-52723-y
%T A comparative patient-level prediction study in OMOP CDM: applicative potential and insights from synthetic data
%V 14
%X The emergence of collaborations, which standardize and combine multiple clinical databases across different regions, provide a wealthy source of data, which is fundamental for clinical prediction models, such as patient-level predictions. With the aid of such large data pools, researchers are able to develop clinical prediction models for improved disease classification, risk assessment, and beyond. To fully utilize this potential, Machine Learning (ML) methods are commonly required to process these large amounts of data on disease-specific patient cohorts. As a consequence, the Observational Health Data Sciences and Informatics (OHDSI) collaborative develops a framework to facilitate the application of ML models for these standardized patient datasets by using the Observational Medical Outcomes Partnership (OMOP) common data model (CDM). In this study, we compare the feasibility of current web-based OHDSI approaches, namely ATLAS and "Patient-level Prediction" (PLP), against a native solution (R based) to conduct such ML-based patient-level prediction analyses in OMOP. This will enable potential users to select the most suitable approach for their investigation. Each of the applied ML solutions was individually utilized to solve the same patient-level prediction task. Both approaches went through an exemplary benchmarking analysis to assess the weaknesses and strengths of the PLP R-Package. In this work, the performance of this package was subsequently compared versus the commonly used native R-package called Machine Learning in R 3 (mlr3), and its sub-packages. The approaches were evaluated on performance, execution time, and ease of model implementation. The results show that the PLP package has shorter execution times, which indicates great scalability, as well as intuitive code implementation, and numerous possibilities for visualization. However, limitations in comparison to native packages were depicted in the implementation of specific ML classifiers (e.g., Lasso), which may result in a decreased performance for real-world prediction problems. The findings here contribute to the overall effort of developing ML-based prediction models on a clinical scale and provide a snapshot for future studies that explicitly aim to develop patient-level prediction models in OMOP CDM.

@article{e7be8793a22c4d5685dd87e3252aeaa8,
  abstract = {The emergence of collaborations, which standardize and combine multiple clinical databases across different regions, provide a wealthy source of data, which is fundamental for clinical prediction models, such as patient-level predictions. With the aid of such large data pools, researchers are able to develop clinical prediction models for improved disease classification, risk assessment, and beyond. To fully utilize this potential, Machine Learning (ML) methods are commonly required to process these large amounts of data on disease-specific patient cohorts. As a consequence, the Observational Health Data Sciences and Informatics (OHDSI) collaborative develops a framework to facilitate the application of ML models for these standardized patient datasets by using the Observational Medical Outcomes Partnership (OMOP) common data model (CDM). In this study, we compare the feasibility of current web-based OHDSI approaches, namely ATLAS and {"}Patient-level Prediction{"} (PLP), against a native solution (R based) to conduct such ML-based patient-level prediction analyses in OMOP. This will enable potential users to select the most suitable approach for their investigation. Each of the applied ML solutions was individually utilized to solve the same patient-level prediction task. Both approaches went through an exemplary benchmarking analysis to assess the weaknesses and strengths of the PLP R-Package. In this work, the performance of this package was subsequently compared versus the commonly used native R-package called Machine Learning in R 3 (mlr3), and its sub-packages. The approaches were evaluated on performance, execution time, and ease of model implementation. The results show that the PLP package has shorter execution times, which indicates great scalability, as well as intuitive code implementation, and numerous possibilities for visualization. However, limitations in comparison to native packages were depicted in the implementation of specific ML classifiers (e.g., Lasso), which may result in a decreased performance for real-world prediction problems. The findings here contribute to the overall effort of developing ML-based prediction models on a clinical scale and provide a snapshot for future studies that explicitly aim to develop patient-level prediction models in OMOP CDM.},
  added-at = {2024-11-28T16:27:18.000+0100},
  author = {Ahmadi, Najia and Nguyen, {Quang Vu} and Sedlmayr, Martin and Wolfien, Markus},
  biburl = {https://puma.scadsai.uni-leipzig.de/bibtex/2e48490a9e00034976de25694314ed7d8/scadsfct},
  day = 27,
  doi = {10.1038/s41598-024-52723-y},
  interhash = {552b52a075418c2ef9ec7d2284cd4edf},
  intrahash = {e48490a9e00034976de25694314ed7d8},
  issn = {2045-2322},
  journal = {Scientific reports},
  keywords = {Databases, Electronic FIS_scads Factual, Health Humans, Informatics, Learning, Medical Records topic_lifescience yaff machine learning},
  language = {English},
  month = jan,
  number = 1,
  publisher = {Nature Publishing Group},
  timestamp = {2025-07-29T12:31:01.000+0200},
  title = {A comparative patient-level prediction study in OMOP CDM: applicative potential and insights from synthetic data},
  volume = 14,
  year = 2024
}
