Unsupervised clustering analyses tend to be well suited for discovering hidden patterns in data, for example to propose a novel molecular taxonomy of cancers 14 or define subtypes of a psychiatric disorder 15.

Researchers should start any ML project with clear project goals and an analysis of the advantages that AI, ML or conventional statistical techniques deliver in the specific clinical use case. ML techniques should be avoided when dealing with very small, but readily available, convenience clinical datasets.

Clinician–researchers should aim to procure and utilize large, harmonized multicenter or international datasets with high-resolution data, if feasible.

A guideline on the choice of statistical approach, whether ML or traditional statistical techniques, would aid clinical researchers and highlight proper choices. The type of ML technique used should be chosen taking into account the type, size and dimensionality of the available dataset. Researchers should commit to developing interpretable and transparent ML algorithms that can be subjected to checks and balances.

Datasets should be inspected for sources of bias, and the necessary steps should be taken to address those biases. Publications using ML algorithms should be accompanied by disclaimers about their decision-making process, and their conclusions should be carefully formulated. Protocols should be published and peer reviewed whenever possible, and the choice of model should be stated and substantiated.

All model performance parameters should be disclosed and, ideally, the dataset and analysis script should be made public. Whenever appropriate, (predefined) sensitivity analyses using traditional statistical models should be presented alongside ML models.

Even a very high AUC is no guarantee of robustness. An AUC of 0.99 is possible with an overall event rate of <1%, in which case all negative cases may be predicted correctly while the few positive events are missed.
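The point about high AUCs and rare events can be made concrete with a small sketch. The cohort below is entirely hypothetical (1,000 patients, 5 events, hand-picked risk scores), and the 0.5 decision threshold is an assumption chosen for illustration, not a value taken from any study. The AUC is computed directly from its definition, as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, threshold=0.5):
    """Classify as positive when score >= threshold, then summarize."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical cohort: 5 events among 1,000 patients (event rate 0.5%).
labels = [1] * 5 + [0] * 995
# Hypothetical predicted risks: positives mostly rank above negatives,
# but every score stays below the assumed 0.5 decision threshold.
scores = [0.40] * 4 + [0.20] + [0.30] * 40 + [0.01] * 955

event_rate = sum(labels) / len(labels)
model_auc = auc(scores, labels)
sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)

print(f"event rate:  {event_rate:.3f}")
print(f"AUC:         {model_auc:.3f}")
print(f"sensitivity: {sens:.3f}")
print(f"specificity: {spec:.3f}")
```

In this constructed example the AUC is about 0.99 and specificity is 1.0, yet sensitivity is 0: every negative case is classified correctly and every positive event is missed. The AUC only describes how well the scores rank cases, so threshold-dependent measures such as sensitivity should be reported alongside it, in line with the recommendation to disclose all model performance parameters.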
At the beginning of the COVID-19 pandemic, before the widespread adoption of reliable point-of-care assays to detect SARS-CoV-2, one highly active area of research involved the development of ML algorithms to estimate the probability of infection. These algorithms based their predictions on various data elements captured in electronic health records, such as chest radiographs.

Despite promising initial validation results, the success of numerous artificial neural networks trained on chest X-rays was largely not replicated when the models were applied in different hospital settings, in part because the models failed to learn the true underlying pathology of COVID-19. Instead, they exploited shortcuts or spurious associations that reflected biologically meaningless variations in image acquisition, such as laterality markers, patient positioning or differences in radiographic projection 6. These ML algorithms were not explainable and, while appearing to be at the cutting edge, were inferior to traditional diagnostic techniques such as RT-PCR, which negated their usefulness. More than 200 prediction models were developed for COVID-19, some using ML, and virtually all suffer from poor reporting and a high risk of bias 7.

The term ‘overuse’ refers to the unnecessary adoption of AI or advanced ML techniques where alternative, reliable or superior methodologies already exist. In such cases, the use of AI and ML techniques is not necessarily inappropriate or unsound, but the justification for such research is unclear or artificial: for example, a novel technique may be proposed that delivers no meaningful new answers.

Many clinical studies have employed ML techniques to achieve respectable or impressive performance, as shown by area under the curve (AUC) values between 0.80 and 0.90, or even >0.90 (Box 1). When a traditional regression technique is applied and compared against ML algorithms, the more sophisticated ML models often offer only marginal accuracy gains, presenting a questionable trade-off between model complexity and accuracy 1, 2, 8, 9, 10, 11, 12. A high AUC is not necessarily a mark of quality, as the ML model might be overfit (Fig.