From Audiobook Narration to the Verbal and Visual Textuality of the Audiotext: An Alternative for Knowledge Acquisition

Keywords

Audiotext
Audiobook
Tertiary Orality
Text-to-Speech Conversion
Synthetic Voices
Deepfake

How to Cite

Barragán-Perea, E. A., & Tarango, J. (2024). From Audiobook Narration to the Verbal and Visual Textuality of the Audiotext: An Alternative for Knowledge Acquisition. Investigación Bibliotecológica: archivonomía, bibliotecología e información, 38(99), 13–33. https://doi.org/10.22201/iibi.24488321xe.2024.99.58856

Abstract

Traditionally, access to information through reading refers to perceiving and understanding writing through sight or touch; however, reading via listening has been established as a form of tertiary orality that allows writing, image, and voice to be combined. It constitutes a powerful alternative for knowledge acquisition among new generations, who sometimes prefer to listen to books rather than read them. For this reason, a documentary investigation of the scientific literature on the subject was carried out, through a descriptive study, to explore the use of the audiotext as an alternative way to acquire knowledge. To this end, the concepts analyzed were audiotext, audiobook, tertiary orality, text-to-speech conversion, synthetic voices, and voice deepfake. The impact of information and communication technologies has allowed audiotexts to become a powerful tool for the vindication of the spoken word and a complementary device for teaching-learning processes.



Authors:

  • They must send the publication authorization letter to Investigación Bibliotecológica: archivonomía, bibliotecología e información.
  • They may share the submission with the scientific community in the following ways:
    • As teaching support material
    • As the basis for lectures at academic conferences
    • Through self-archiving in academic repositories
    • Through dissemination in academic networks
    • By posting to the authors' blogs and personal websites

These allowances shall remain in effect as long as the conditions of use of the journal's contents are duly observed pursuant to the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 license that it holds. DOI links for downloading the full text of published papers are provided for the last three uses.

Self-archiving policy

For self-archiving, authors must comply with the following:

a) Acknowledge the copyright held by the journal Investigación Bibliotecológica: archivonomía, bibliotecología e información.

b) Establish a link to the original version of the paper on the journal page, using, for example, the DOI.

c) Disseminate the final version published in the journal.

Licensing of contents

The journal Investigación Bibliotecológica: archivonomía, bibliotecología e información allows access to and use of its contents pursuant to the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 license.

Creative Commons License


Investigación Bibliotecológica: archivonomía, bibliotecología e información by Universidad Nacional Autónoma de México is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Based on the work at http://rev-ib.unam.mx/ib.


This means that the contents may only be read and shared as long as the authorship of the work is acknowledged and cited. The work shall not be exploited for commercial ends, nor shall it be modified.

Limitation of liability

The journal is not liable for academic fraud or plagiarism committed by authors, nor for the intellectual criteria they employ. Similarly, the journal shall not be liable for the services offered through third party hyperlinks contained in papers submitted by authors.

In support of this position, the journal provides the Author’s Duties notice at the following link: Responsibilities of authors.

The director or editor of the journal shall notify authors in the event it migrates the contents of the journal’s official website to a distinct IP or domain.

