Optimasi Kinerja K-Means Clustering Menggunakan Pembobotan Symmetrical Uncertainty dalam Klasterisasi Data
Date
2022
Author
Ginting, Suranta Bill Fatric
Advisor(s)
Sawaluddin
Zarlis, Muhammad
Abstract
Previous research has shown that K-Means Clustering has several weaknesses,
one of which lies in the distance model used to determine the similarity
between data points: it treats every data attribute equally, so attributes that
are less relevant and contribute little to the variation in the data can still
have a significant impact on the clustering results. This can reduce the
performance of K-Means Clustering. Attribute weighting is one way to obtain the
correlation of each data attribute with the variation in the data. The higher
the weight of an attribute, the greater its correlation with the variation in
the data; an attribute with a low weight therefore contributes little to that
variation, yet can still significantly affect the performance and results of
clustering. In this study, the method used to calculate the attribute weights
is Symmetrical Uncertainty. To test the proposed method, this research uses
datasets from the UCI Machine Learning Repository: Iris, with 150 records, and
Wine Quality, with 178 records. The clustering performance of the proposed
method is evaluated using the Davies-Bouldin Index (DBI). The test results show
that the proposed method produces a significantly smaller Davies-Bouldin Index
(DBI) value.
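The abstract does not spell out how the weights are computed or applied, so the following is only a minimal sketch of the general idea, not the thesis's implementation. It uses the standard definition SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)) for discrete values. The function names, the choice of the second variable (for example a class label), the need to discretize continuous attributes first, and the weighted squared Euclidean distance are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value between 0 and 1."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant, so there is nothing to share
    h_xy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy   # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


def weighted_sq_distance(a, b, weights):
    """Squared Euclidean distance with one weight per attribute (assumed form)."""
    return sum(w * (p - q) ** 2 for w, p, q in zip(weights, a, b))


# Toy data (hypothetical): the first attribute determines the label,
# the second is independent of it.
labels = ["a", "a", "b", "b"]
attr_relevant = [0, 0, 1, 1]
attr_irrelevant = [0, 1, 0, 1]
weights = [
    symmetrical_uncertainty(attr_relevant, labels),
    symmetrical_uncertainty(attr_irrelevant, labels),
]
```

With weights like these, an attribute that carries no information about the second variable gets a weight of zero and drops out of the distance, which is the effect the abstract attributes to weighting.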
Collections
- Master Theses [621]