A Text Mining Implementation Based on Twitter Data to Analyse Information Regarding Corona Virus in Indonesia

Enda Esyudha Pratama, Rizqia Lestika Atmi

Abstract


The coronavirus (COVID-19) outbreak spread to almost all countries in early 2020, including Indonesia. Since then, information about it has circulated in the community from many sources, one of which is social media, where a variety of terms related to the coronavirus have also appeared. This study analyzes the coronavirus-related terms that emerge on social media. The data were text data collected from Twitter over the past month. The method used is text mining, which extracts important information from a collection of texts. The results show that several terms tend to appear frequently on social media, namely “PSBB” (large-scale social restrictions), “new normal”, “karantina” (quarantine), and “juru bicara Dr. Reisa” (spokesperson Dr. Reisa).
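The abstract does not describe the processing pipeline, so the following is only a minimal sketch of how frequent terms might be extracted from tweet text, assuming simple tokenization and n-gram counting. The sample tweets, the stopword list, and the function names `tokenize` and `top_terms` are hypothetical illustrations and are not taken from the study.

```python
import re
from collections import Counter

# Hypothetical sample tweets for illustration; not data from the study.
tweets = [
    "PSBB diperpanjang, tetap di rumah",
    "Persiapan new normal setelah PSBB",
    "Karantina mandiri selama PSBB",
]

# Tiny illustrative stopword list (an assumption, not the authors' list).
STOPWORDS = {"di", "dan", "yang", "setelah", "selama", "tetap"}


def tokenize(text):
    """Lowercase the text, drop URLs and @mentions, and keep non-stopword tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return [t for t in re.findall(r"[a-z]+", text) if t not in STOPWORDS]


def top_terms(docs, n=5, ngram=1):
    """Count word n-grams across all documents and return the n most frequent."""
    counts = Counter()
    for doc in docs:
        tokens = tokenize(doc)
        # Build sliding windows of length `ngram` over the token list.
        grams = zip(*(tokens[i:] for i in range(ngram)))
        counts.update(" ".join(g) for g in grams)
    return counts.most_common(n)


if __name__ == "__main__":
    print(top_terms(tweets, n=3))           # most frequent single words
    print(top_terms(tweets, n=3, ngram=2))  # most frequent two-word phrases
```

Counting bigrams as well as single words is what lets a multi-word term such as “new normal” surface as one unit rather than as two unrelated tokens.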





DOI: https://doi.org/10.17509/jcs.v1i1.25502