Recommender Systems and Natural Language Processing: A Structured Review and Emerging Trends Supported by Artificial Intelligence Tools

Keywords

Recommender Systems
Structured Review
Artificial Intelligence
Natural Language Processing

How to Cite

Denise Fukumi, Ribeiro, P. F. R., Pires, J. de L., Hubner, K. V., Reis, M. H. A. dos, Bastos, P. A., & Rigo, R. (2026). Recommender Systems and Natural Language Processing: A Structured Review and Emerging Trends Supported by Artificial Intelligence Tools. Investigación Bibliotecológica: archivonomía, bibliotecología e información, 40(106), 79–108. https://doi.org/10.22201/iibi.24488321xe.2026.106.59106

Abstract

This article presents a structured literature review on recommender systems that use natural language processing (NLP), covering publications between 2020 and 2025. The final corpus included 240 fully analyzed articles after screening and deduplication (214 from 2020-2024 and 26 from 2025). Digital tools such as Zotero, Rayyan, SciSpace, NotebookLM, and Biblioshiny supported data curation and organization. The results highlight the predominance of deep learning techniques, with emphasis on models such as BERT, Word2Vec, and GPT, as well as the growing use of large language models (LLMs) and knowledge graphs. However, no records were found in the Revista Brasileira de Informática na Educação (RBIE) or the digital library of the Sociedade Brasileira de Computação (SBC-OpenLib), showing that, compared to global domains, scientific production in specific national niches is still incipient. This gap highlights relevant opportunities for transposing advanced natural language processing techniques to specific domains that are still underexplored at a national level, such as the educational context, in addition to fostering the training of researchers in the use of artificial intelligence-assisted systematic review methodologies.

Authors:

  • They must send the publication authorization letter to Investigación Bibliotecológica: archivonomía, bibliotecología e información.
  • They can share the submission with the scientific community in the following ways:
    • As teaching support material
    • As the basis for lectures at academic conferences
    • Self-archiving in academic repositories
    • Dissemination in academic networks
    • Posting to authors' blogs and personal websites

These allowances shall remain in effect as long as the conditions of use of the journal's contents are duly observed pursuant to the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 license that it holds. DOI links for downloading the full text of published papers are provided for the last three uses.

Self-archiving policy

For self-archiving, authors must comply with the following:

a) Acknowledge the copyright held by the journal Investigación Bibliotecológica: archivonomía, bibliotecología e información.

b) Establish a link to the original version of the paper on the journal page, using, for example, the DOI.

c) Disseminate the final version published in the journal.

Licensing of contents

The journal Investigación Bibliotecológica: archivonomía, bibliotecología e información allows access to and use of its contents pursuant to the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 license.

Investigación Bibliotecológica: archivonomía, bibliotecología e información by Universidad Nacional Autónoma de México is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Based on a work at http://rev-ib.unam.mx/ib.

This means that contents can only be read and shared as long as the authorship of the work is acknowledged and cited. The work shall not be exploited for commercial ends, nor shall it be modified.

Limitation of liability

The journal is not liable for academic fraud or plagiarism committed by authors, nor for the intellectual criteria they employ. Similarly, the journal shall not be liable for the services offered through third-party hyperlinks contained in papers submitted by authors.

In support of this position, the journal provides the Author’s Duties notice at the following link: Responsibilities of authors.

The director or editor of the journal shall notify authors in the event that the contents of the journal's official website are migrated to a different IP address or domain.
