An Integrated Method for Disease Risk Prediction Using Next-Generation Sequencing Data
Abstract
Disease screening has become more widely adopted as health awareness has grown. Testing has traditionally relied on wet-lab procedures; however, recent advances in bioinformatics have made genetic testing and disease risk detection possible through computational analysis of data. This study presents a preprocessing methodology for next-generation sequencing (NGS) data that integrates advanced computational tools. By exploiting the advantages of NGS technology, the methodology yields the high-quality data needed for model training. Machine learning algorithms and neural networks are then applied to predict disease risk accurately and to identify significant genetic variants. The proposed methods outperform those reported in previous research. Through this analysis, we identified a subset of the most significant variants: 8 single-nucleotide polymorphisms (SNPs) linked to obesity and 61 SNPs associated with type 2 diabetes. These findings contribute to an understanding of the genetic factors underlying these complex diseases.