Releases · davifmdhack/adeno_predict

GitHub Page Authors

Davi Ferreira MD., MSc.

Fernanda Veloso MD., PhD. candidate

Introduction

This repository (Adeno Predict) applies machine learning algorithms to predict the consistency of pituitary macroadenomas from demographic data and brain MRI parameters.
The objective of this application is to improve the prediction of non-soft consistency, and consequently to improve surgical planning and ultimately reduce post-surgical complications.

Using a database of 70 patients from Hospital de Clínicas of the State University of Campinas (HC-UNICAMP), our group opted for the following classification algorithms: Decision Tree (DT), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and an ensemble of the two best models (DT and SVM).

In this repository, the code is organized according to the following steps: an example dataset (dataset__example.csv in the dataset folder), imputation of missing values (imputation folder), the tuning flow using a Leave-One-Out strategy (workflow_algorithms folder), and metrics and bootstrap (metrics folder).

Dataset format

Because data collection was carried out in a single research center, it was not necessary to build a server or to implement a cluster or distributed processing. The database was built in the same format as the available file dataset_example.csv and is read from a single local access path in the dataset folder. The features used are described in features_detail.md.

Imputation of missing values

Missing values (six for ADC and eleven for consistency) were imputed following van Buuren's criteria. KNN was used as the deterministic method, and multiple imputation by chained equations (MICE) with linear regression was used as the stochastic method. More information is available in the imputation folder.
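The two imputation styles described above can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data, not the code in the imputation folder. The array shapes, the number of neighbors, and the number of imputed datasets are assumptions. Note one swap: scikit-learn's IterativeImputer can only draw stochastic imputations (sample_posterior=True) with an estimator that returns a predictive standard deviation, so the sketch uses BayesianRidge (a Bayesian linear regression) in place of plain LinearRegression.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer, KNNImputer
from sklearn.linear_model import BayesianRidge

# Synthetic stand-in for the real features (hypothetical data, 20 rows x 3 columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
X[:, 2] = 0.8 * X[:, 0] + rng.normal(scale=0.1, size=20)  # column 2 depends on column 0

# Knock out a few values in column 2 to simulate missing measurements.
X_missing = X.copy()
X_missing[[1, 5, 9], 2] = np.nan
observed = ~np.isnan(X_missing)

# Deterministic imputation: each gap is filled from the 3 nearest complete neighbors.
knn_imputed = KNNImputer(n_neighbors=3).fit_transform(X_missing)

# Stochastic chained-equations imputation: each seed yields one imputed dataset.
mice_draws = []
for seed in range(3):
    imputer = IterativeImputer(
        estimator=BayesianRidge(),
        sample_posterior=True,  # draw from the predictive distribution
        max_iter=10,
        random_state=seed,
    )
    mice_draws.append(imputer.fit_transform(X_missing))
```

Each element of mice_draws is one completed dataset; downstream analyses would normally be run on every draw and then pooled.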

Workflow algorithms

We applied a scikit-learn pipeline to the pre-processed dataset for each algorithm, taking into account particularities such as the standardization of numerical variables. A Leave-One-Out strategy (leave_one_out.md) was used together with 10-fold cross-validation to search for the best_model of each algorithm over the parameters listed in algorithms_parameters.md.
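One possible reading of this workflow is a nested scheme: an inner 10-fold grid search tunes the pipeline, and an outer Leave-One-Out loop produces one out-of-sample prediction per patient. The sketch below illustrates that reading on synthetic data. The data, the SVC parameter grid, and the accuracy scoring are assumptions; the actual settings are those documented in leave_one_out.md and algorithms_parameters.md.

```python
import numpy as np
from sklearn.model_selection import (
    GridSearchCV,
    LeaveOneOut,
    StratifiedKFold,
    cross_val_predict,
)
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, balanced two-class data (hypothetical stand-in for the real dataset).
rng = np.random.default_rng(0)
y = np.array([0] * 20 + [1] * 20)
X = rng.normal(size=(40, 3))
X[y == 1] += 1.5  # shift class 1 so the classes are separable

# Standardization lives inside the pipeline, so it is refit on every training fold.
pipe = Pipeline([("scale", StandardScaler()), ("clf", SVC())])

# Hypothetical grid; the "clf__" prefix routes the parameter to the SVC step.
param_grid = {"clf__C": [0.1, 1.0]}
inner_cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=inner_cv, scoring="accuracy")

# Outer Leave-One-Out loop: each sample is predicted by a model tuned without it.
loo_pred = cross_val_predict(search, X, y, cv=LeaveOneOut())
loo_accuracy = float((loo_pred == y).mean())

# Final model: tune once more on all data and keep the refitted pipeline.
best_model = search.fit(X, y).best_estimator_
```

Keeping the scaler inside the pipeline prevents the held-out sample from influencing the standardization statistics.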

Metrics and bootstrap implementation

We used the following metrics, considering the nature of the problem and its imbalanced data: (1) Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve, (2) Average Precision (AP) of the precision-recall curve, (3) Sensitivity (or Recall), (4) Specificity, (5) F1 score, and (6) Matthews Correlation Coefficient (MCC). The formulas and bootstrap techniques are described in the metrics folder. After the best threshold was found, bootstrap (n = 1000) was used to estimate 95% confidence intervals (CI) (bootstrap_code.md).
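As an illustration of the threshold-based metrics and the bootstrap step, the sketch below computes sensitivity, specificity, F1, and MCC from a confusion matrix and then estimates a percentile bootstrap interval. It is not the code in bootstrap_code.md. The labels, scores, the 0.5 threshold, and the percentile method are assumptions, and the function names are hypothetical.

```python
import math
import random


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def threshold_metrics(y_true, scores, threshold):
    """Binarize scores at the threshold and compute confusion-matrix metrics."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),
    }


def bootstrap_ci(y_true, scores, threshold, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for one metric at a fixed threshold."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        stats.append(
            threshold_metrics(
                [y_true[i] for i in idx], [scores[i] for i in idx], threshold
            )[metric]
        )
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical labels (1 = non-soft) and predicted scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7, 0.35, 0.05]

point = threshold_metrics(y_true, scores, threshold=0.5)
ci_low, ci_high = bootstrap_ci(y_true, scores, threshold=0.5, metric="sensitivity")
```

With these example values the confusion matrix is TP = 3, TN = 5, FP = 1, FN = 1, so sensitivity is 0.75 and specificity is 5/6. The threshold is held fixed during resampling, matching the order described above (threshold first, bootstrap second).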

Clone repository and application for domestic dataset

We have developed a step-by-step guide, available in clone_repository > repository_clone.md, so that researchers can apply our trained model if they have the necessary information.

At the moment, this application is limited to databases that have all the required values. In the future, we will implement a method for imputing missing data.

References

van Buuren, S. (2018). Flexible Imputation of Missing Data (2nd ed.). Chapman and Hall/CRC, Boca Raton, Florida. https://doi.org/10.1201/9780429492259
Masís, S. (2021). Interpretable Machine Learning with Python: Learn to Build Interpretable High-performance Models with Hands-on Real-world Examples (1st ed.). Packt Publishing, Birmingham.
Garbin, C., & Marques, O. (2022). Assessing Methods and Tools to Improve Reporting, Increase Transparency, and Reduce Failures in Machine Learning Applications in Health Care. Radiology: Artificial Intelligence, 4(2). https://doi.org/10.1148/ryai.210127
Rouzrokh, P., Khosravi, B., Faghani, S., Moassefi, M., Vera Garcia, D. V., Singh, Y., Zhang, K., Conte, G. M., & Erickson, B. J. (2022). Mitigating Bias in Radiology Machine Learning: 1. Data Handling. Radiology: Artificial Intelligence, 4(5). https://doi.org/10.1148/ryai.210290
Faghani, S., Khosravi, B., Zhang, K., Moassefi, M., Jagtap, J. M., Nugen, F., Vahdati, S., Kuanar, S. P., Rassoulinejad-Mousavi, S. M., Singh, Y., Vera Garcia, D. V., Rouzrokh, P., & Erickson, B. J. (2022). Mitigating Bias in Radiology Machine Learning: 3. Performance Metrics. Radiology: Artificial Intelligence, 4(5). https://doi.org/10.1148/ryai.220061
Murphy, K. P. (2023). Probabilistic Machine Learning: Advanced Topics. The MIT Press, Cambridge, Massachusetts.

Institutional

School of Medical Sciences State University of Campinas - FCM/UNICAMP

Unicamp Clinical Hospital - HC/UNICAMP

Institutional Partnership

Aeronautics Institute of Technology - ITA

Full Changelog: https://github.com/davifmdhack/adeno_predict/commits/v.1.0

Improvements - Adeno Predict model