Sentiment Analysis of Online Travel Agents Using Naïve Bayes and K-Nearest Neighbor
Abstract
Social media gives decision makers a broad source of insight. This includes online travel agent companies, where customers' interest in using a chosen agent grows with customer satisfaction. As one of the most important points in distribution, an online travel agent provides a reliable and effective platform for purchasing trips and sharing information about travel experiences. It is therefore important to know how consumers decide which online travel agent to choose. One method they use is reading reviews. Facebook is one social media platform that provides numerous reviews through its comment sections. This research has two purposes: to compare algorithms and to reveal the effect of uppercase letters and punctuation marks. The accuracy of Naïve Bayes and K-Nearest Neighbor is compared on the datasets. The data are collected from user comments on Facebook about the three biggest online travel agents in Indonesia. The comments are classified into three categories: positive, negative, and neutral. The results show that K-Nearest Neighbor has slightly higher accuracy than Naïve Bayes. Moreover, lowercase text without punctuation achieves better accuracy for both algorithms.
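The abstract's second research question, the effect of uppercase letters and punctuation, amounts to preparing several variants of each comment before classification. The following is a minimal sketch of that preprocessing step only, not the authors' pipeline: the function name `normalize`, its flags, and the sample comment are hypothetical, and only ASCII punctuation is removed.

```python
import string


def normalize(comment: str, lowercase: bool = True, strip_punctuation: bool = True) -> str:
    """Return one preprocessing variant of a comment (hypothetical helper)."""
    if lowercase:
        comment = comment.lower()
    if strip_punctuation:
        # Remove ASCII punctuation only; other symbols are left untouched.
        comment = comment.translate(str.maketrans("", "", string.punctuation))
    # Collapse whitespace runs left behind by removed characters.
    return " ".join(comment.split())


# Illustrative comment, not taken from the study's dataset.
sample = "Great service, FAST refund!!!"

# Build all four combinations of the two preprocessing switches.
variants = {
    (lower, strip): normalize(sample, lower, strip)
    for lower in (False, True)
    for strip in (False, True)
}

for (lower, strip), text in variants.items():
    print(f"lowercase={lower}, strip_punctuation={strip}: {text}")
```

Each variant would then be fed to the same classifier, so that any difference in accuracy can be attributed to the preprocessing choice rather than to the algorithm.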
Copyright (c) 2022 Eka Wahyu Sholeha, Selviana Yunita, Rifqi Hammad, Veny Cahya Hardita, Kaharuddin Kaharuddin
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.