Analisis Akurasi Algoritma Pohon Keputusan dan K-Nearest Neighbor (k-NN)

Huliman

dc.contributor.advisor	Mawengkang, Herman
dc.contributor.advisor	Nababan, Erna Budhiarti
dc.contributor.author	Huliman
dc.date.accessioned	2021-09-16T06:46:04Z
dc.date.available	2021-09-16T06:46:04Z
dc.date.issued	2013
dc.identifier.uri	http://repositori.usu.ac.id/handle/123456789/43496
dc.description.abstract	The development of modern database technology has made large storage capacity available, which forms the background for data mining applications. One of the main functions of data mining is classification, which is used to predict a class and generate information from historical data. Many classification algorithms can be used to process input into the desired output, so it is important to observe and measure the performance of each one. The purpose of this research is to analyze and compare the performance of the decision tree (C4.5) and k-Nearest Neighbor (k-NN) algorithms in terms of accuracy. The data sets are taken from the UCI data sets, namely BreastCancer, Car, Diabetes, Ionosphere, and Iris. Both algorithms are evaluated with 10-fold cross validation. The evaluation result for each algorithm is a confusion matrix used to measure precision, recall, F-measure, and success rate. The comparative analysis showed that the decision tree algorithm is more accurate than the k-NN algorithm, by a margin varying from 2.28% to 2.5%, across the 5 research data sets.	en_US
dc.description.abstract	Perkembangan teknologi basis data modern telah memungkinkan ruang penyimpanan yang besar dan hal ini menjadi latar belakang dikembangkannya konsep data mining. Salah satu fungsi utama data mining adalah fungsi klasifikasi yang digunakan untuk memprediksi kelas dan menghasilkan informasi berdasarkan data historis. Pada fungsi klasifikasi, terdapat banyak algoritma yang dapat digunakan untuk mengolah input menjadi output yang diinginkan, sehingga harus diperhatikan aspek performance dari masing-masing algoritma tersebut. Tujuan penelitian ini adalah untuk menganalisis dan membandingkan performance algoritma klasifikasi pohon keputusan (C4.5) dan k-Nearest Neighbor (k-NN) dari sudut pandang akurasi. Data sets penelitian berasal dari UCI data sets, yaitu BreastCancer, Car, Diabetes, Ionosphere, dan Iris. Adapun metode evaluasi yang digunakan pada kedua macam algoritma adalah 10-fold cross validation. Hasil evaluasi berupa confusion matrix untuk penilaian precision, recall, F-measure, dan success rate. Hasil analisis perbandingan akurasi menunjukkan bahwa nilai keakuratan algoritma pohon keputusan lebih baik dengan variasi 2.28% - 2.5% dibandingkan algoritma k-NN pada implementasi terhadap 5 data sets penelitian.	en_US
dc.language.iso	id	en_US
dc.publisher	Universitas Sumatera Utara	en_US
dc.subject	Klasifikasi	en_US
dc.subject	Pohon Keputusan	en_US
dc.subject	k-NN	en_US
dc.subject	10-fold Cross Validation	en_US
dc.subject	Confusion Matrix	en_US
dc.subject	Akurasi	en_US
dc.title	Analisis Akurasi Algoritma Pohon Keputusan dan K-Nearest Neighbor (k-NN)	en_US
dc.type	Thesis	en_US
dc.identifier.nim	NIM117038025
dc.description.pages	125 Halaman	en_US
dc.description.type	Tesis Magister	en_US

Files in this item

Name: 117038025.pdf
Size: 3.397Mb
Format: PDF
Description: Fulltext

This item appears in the following Collection(s)

Master Theses [627]
Tesis Magister