Seleksi Fitur dengan Menggunakan Metode Information Gain pada Algoritma Logistic Regression pada Penyakit Diabetes
Feature Selection Using Information Gain Method in Logistic Regression Algorithm for Diabetes Disease

Date
2024
Author
Simamora, Edward Bob
Advisor(s)
Harumy, T Henny Febriana
Abstract
Diabetes is a major global health problem that affects individual quality of life and
places a significant economic burden on healthcare systems. To improve early diagnosis
and the understanding of the factors that influence the disease, data analysis
techniques have become increasingly important. One such approach is logistic
regression, which estimates the probability of diabetes from a set of independent
variables.
This study explores Information Gain-based feature selection as a way to improve the
performance of logistic regression in identifying risk factors for diabetes. The
Information Gain method is used to evaluate the relevance of each variable to the
target class, i.e., the presence or absence of diabetes. In the
experiments, a dataset of clinical attributes such as age, body mass index (BMI),
blood pressure, and several biochemical parameters is used. The
experimental results indicate that Information Gain feature selection can improve
the performance of logistic regression models in predicting the presence of
diabetes. By removing irrelevant attributes, and thus reducing dimensionality, the
resulting model tends to achieve higher accuracy and to identify the more significant
risk factors. This highlights the potential of Information Gain-based feature
selection to make predictive analysis of diabetes more efficient and effective.
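
The abstract does not give the implementation, so the following is only a minimal sketch of how Information Gain ranking might look, assuming the features have already been discretized into bins. Information Gain is computed here as the entropy of the class labels minus the weighted entropy of the labels within each feature value. The feature names, the toy data, and the 0.1 selection threshold are all hypothetical; they are not taken from the thesis and say nothing about its findings.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their weighted entropy within each feature value."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = {name: information_gain(values, labels) for name, values in columns.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Invented toy data for illustration only (NOT the thesis dataset).
# Labels: 1 = diabetes present, 0 = absent. Features are pre-binned.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "bmi_bin": ["high", "high", "high", "high", "low", "low", "low", "low"],
    "age_bin": ["old", "old", "old", "young", "old", "young", "young", "young"],
    "blood_pressure_bin": ["a", "b", "a", "b", "a", "b", "a", "b"],
}

ranking = rank_features(columns, labels)

# Keep features whose gain exceeds an arbitrary, hypothetical threshold.
selected = [name for name, gain in ranking if gain > 0.1]

for name, gain in ranking:
    print(f"{name}: {gain:.4f}")
print("selected:", selected)
```

In a workflow like the one the abstract describes, only the columns in `selected` would then be passed to the logistic regression model. Continuous attributes such as age or BMI would need to be binned first, and the choice of bins and threshold affects the ranking.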
Collections
- Undergraduate Theses