Application of LSI for Topic Modelling of Information Systems Undergraduate Theses at Several Public Universities in Surabaya

Authors

  • Oryza Sativa Nufi, UIN Sunan Ampel Surabaya
  • Fitriya Mawadah Warahmah, UIN Sunan Ampel Surabaya
  • Ailsa Aurellia, UIN Sunan Ampel Surabaya
  • Mohammad Khusnu Milad, UIN Sunan Ampel Surabaya

Keywords:

Information Systems, Latent Semantic Indexing, Topic Modelling

Abstract

This study applies Latent Semantic Indexing (LSI) to topic modelling of undergraduate thesis titles written by Information Systems students at four public universities in Surabaya. The aim is to identify the dominant research themes and academic trends that have emerged in recent years. A text mining approach is employed: hundreds of thesis titles are collected from each university, pre-processed through tokenization, stopword removal, and stemming, and then weighted using TF-IDF. LSI is then used to extract latent topics by reducing the dimensionality of the term-document matrix through singular value decomposition (SVD). The results indicate several dominant topics, including information system development, decision support systems, and data mining. These topics reflect recurring areas of interest in the Information Systems curriculum across the universities studied. The study concludes that LSI is effective at identifying hidden semantic patterns and grouping thesis topics into meaningful clusters. These findings may support curriculum development and academic planning by highlighting students' thematic focus and institutional research directions.
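
The pipeline described in the abstract (tokenization, stopword removal, TF-IDF weighting, then SVD on the term-document matrix) can be illustrated with a small sketch. This is not the authors' implementation: the titles, the stopword list, and the function names tokenize, tfidf_matrix, top_topics, and top_terms are all hypothetical, stemming is omitted for brevity, and the SVD step is approximated with plain power iteration so that the example needs only the Python standard library.

```python
import math
from collections import Counter

# Hypothetical toy titles; the study's actual data set is not reproduced here.
TITLES = [
    "decision support system for scholarship selection",
    "decision support system for supplier selection",
    "data mining for sales prediction",
    "data mining for customer clustering",
]
STOPWORDS = {"for", "the", "of", "a", "an"}  # assumed minimal stopword list


def tokenize(title):
    """Lowercase, split on whitespace, drop stopwords (stemming omitted)."""
    return [t for t in title.lower().split() if t not in STOPWORDS]


def tfidf_matrix(docs):
    """Return (vocab, rows): one row per document, one column per term."""
    tokens = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokens for t in doc})
    n_docs = len(docs)
    df = Counter(t for doc in tokens for t in set(doc))
    # One common idf variant, assumed here: log(N / df) + 1
    idf = {t: math.log(n_docs / df[t]) + 1.0 for t in vocab}
    rows = []
    for doc in tokens:
        tf = Counter(doc)
        rows.append([tf[t] * idf[t] for t in vocab])
    return vocab, rows


def top_topics(matrix, k, iters=200):
    """Approximate the top-k right singular vectors (term-space topics)
    by power iteration on A^T A, deflating against earlier topics."""
    n_docs, n_terms = len(matrix), len(matrix[0])
    topics = []
    for i in range(k):
        v = [1.0 + 0.01 * ((j + i) % 7) for j in range(n_terms)]  # fixed start
        for _ in range(iters):
            # w = A^T (A v)
            av = [sum(row[j] * v[j] for j in range(n_terms)) for row in matrix]
            w = [sum(matrix[d][j] * av[d] for d in range(n_docs))
                 for j in range(n_terms)]
            # Remove the components along topics already found
            for u in topics:
                dot = sum(a * b for a, b in zip(w, u))
                w = [a - dot * b for a, b in zip(w, u)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        topics.append(v)
    return topics


def top_terms(vocab, topic, n):
    """Return the n terms with the largest absolute weight in a topic."""
    ranked = sorted(zip(vocab, topic), key=lambda p: abs(p[1]), reverse=True)
    return [term for term, _ in ranked[:n]]


vocab, matrix = tfidf_matrix(TITLES)
for idx, topic in enumerate(top_topics(matrix, k=2), start=1):
    print(f"topic {idx}:", top_terms(vocab, topic, 4))
```

On real data, a library routine for truncated SVD would normally replace the hand-written power iteration, and an Indonesian stemmer and stopword list would be needed for Indonesian-language titles. The number of topics k is a modelling choice that the abstract does not specify.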

Published

2025-05-15

Section

Articles