Abstract
The field of social network research has undergone significant transformations over the last 25 years, particularly with the emergence of online social networks and with the incorporation of studies from many other fields of knowledge that adopt the social network approach in their analyses. This paper offers an overview of the evolution of research topics in this field between 1997 and 2021, based on topic modeling. The methodology draws on the Scopus database, uses one-year time windows, and applies the Mallet software. Seven topics are obtained, and their evolution over time is described. The paper concludes that topics related to social media and online social networks have been studied with particular intensity in recent years.
Authors:
- They must send the publication authorization letter to Investigación Bibliotecológica: archivonomía, bibliotecología e información.
- They can share the submission with the scientific community in the following ways:
  - As teaching support material.
  - As the basis for lectures at academic conferences.
  - Self-archiving in academic repositories.
  - Dissemination in academic networks.
  - Posting to authors’ blogs and personal websites.
These allowances shall remain in effect as long as the conditions of use of the journal’s contents are duly observed pursuant to the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 license that it holds. DOI links for downloading the full text of published papers are provided for the last three uses.
Self-archiving policy
For self-archiving, authors must comply with the following:
a) Acknowledge the copyright held by the journal Investigación Bibliotecológica: archivonomía, bibliotecología e información.
b) Establish a link to the original version of the paper on the journal page, using, for example, the DOI.
c) Disseminate the final version published in the journal.
Licensing of contents
The journal Investigación Bibliotecológica: archivonomía, bibliotecología e información allows access to and use of its contents pursuant to the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 license.
Investigación Bibliotecológica: archivonomía, bibliotecología e información by Universidad Nacional Autónoma de México is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Based on a work at http://rev-ib.unam.mx/ib.
This means that contents can only be read and shared as long as the authorship of the work is acknowledged and cited. The work shall not be exploited for commercial ends, nor shall it be modified.
Limitation of liability
The journal is not liable for academic fraud or plagiarism committed by authors, nor for the intellectual criteria they employ. Similarly, the journal shall not be liable for the services offered through third party hyperlinks contained in papers submitted by authors.
In support of this position, the journal provides the Author’s Duties notice at the following link: Responsibilities of authors.
The director or editor of the journal shall notify authors if the contents of the journal’s official website are migrated to a different IP address or domain.