Abstract
The aim of this study is to analyze how text mining techniques applied to the textual documents of a Brazilian police investigation can promote knowledge discovery. The research collected documents from the police investigation and submitted them to a text mining process. The study used case folding, tokenization, custom stopwords, bag of words and TF-IDF to extract results as n-grams, which were presented as word clouds. K-means was then used to cluster the sets of trigrams and to identify the most representative terms of each cluster. Applying text mining to these documents was intended to extract non-trivial knowledge: text mining, or knowledge discovery in textual databases, aims to uncover patterns that go unnoticed when large volumes of documents are analyzed manually. The results supported knowledge discovery by identifying entities and connections, as well as thematic categories of the investigation.
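The steps named in the abstract can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the stopword list, the sample sentences, the seed choice for k-means, and all function names are hypothetical assumptions made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; the study's custom list is not given in the abstract.
STOPWORDS = {"the", "of", "a", "to", "in", "and", "was", "by"}

def preprocess(text):
    """Case-fold, tokenize on runs of letters, and drop stopwords."""
    tokens = re.findall(r"[^\W\d_]+", text.casefold())
    return [t for t in tokens if t not in STOPWORDS]

def ngrams(tokens, n=3):
    """Return contiguous n-grams (trigrams by default) as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def tfidf(docs_terms):
    """Weight each term by tf = count / doc length and idf = log(N / df)."""
    n_docs = len(docs_terms)
    df = Counter(term for terms in docs_terms for term in set(terms))
    weights = []
    for terms in docs_terms:
        counts = Counter(terms)  # bag-of-words counts for this document
        total = len(terms) or 1
        weights.append({t: (c / total) * math.log(n_docs / df[t])
                        for t, c in counts.items()})
    return weights

def kmeans(vectors, seeds, iterations=10):
    """Plain Lloyd's k-means; `seeds` are indices of the initial centroids."""
    centroids = [list(vectors[i]) for i in seeds]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its nearest centroid (squared Euclidean distance).
        labels = [min(range(len(centroids)),
                      key=lambda j: sum((a - b) ** 2
                                        for a, b in zip(v, centroids[j])))
                  for v in vectors]
        # Move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Invented sample sentences, for illustration only.
docs = [
    "The suspect transferred money to the company account",
    "Suspect transferred money to a foreign account",
    "The witness saw the vehicle near the warehouse",
    "Witness saw the vehicle leaving the warehouse",
]
trigrams = [ngrams(preprocess(d)) for d in docs]
weights = tfidf(trigrams)
vocab = sorted({t for w in weights for t in w})
vectors = [[w.get(t, 0.0) for t in vocab] for w in weights]
labels, centroids = kmeans(vectors, seeds=[0, 2])
print(labels)
```

In this toy run the two sentences about money transfers fall into one cluster and the two about the vehicle into the other. The highest-weighted coordinates of each centroid could then be read as candidate representative trigrams for that cluster; a real analysis would more likely rely on an established library and a less naive seeding strategy.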
Authors:
- They must send the publication authorization letter to Investigación Bibliotecológica: archivonomía, bibliotecología e información.
- They can share the submission with the scientific community in the following ways:
  - As teaching support material
  - As the basis for lectures at academic conferences
  - Self-archiving in academic repositories
  - Dissemination in academic networks
  - Posting to authors' blogs and personal websites
These allowances shall remain in effect as long as the conditions of use of the journal's contents are duly observed pursuant to the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 license that it holds. DOI links for downloading the full text of published papers are provided for the last three uses.
Self-archiving policy
For self-archiving, authors must comply with the following:
a) Acknowledge the copyright held by the journal Investigación Bibliotecológica: archivonomía, bibliotecología e información.
b) Establish a link to the original version of the paper on the journal page, using, for example, the DOI.
c) Disseminate the final version published in the journal.
Licensing of contents
The journal Investigación Bibliotecológica: archivonomía, bibliotecología e información allows access to and use of its contents pursuant to the Creative Commons license Attribution-NonCommercial-NoDerivatives 4.0.

Investigación Bibliotecológica: archivonomía, bibliotecología e información by Universidad Nacional Autónoma de México is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Based on a work at http://rev-ib.unam.mx/ib.
This means that contents can be read and shared only as long as the authorship of the work is acknowledged and cited. The work shall not be exploited for commercial ends, nor shall it be modified.
Limitation of liability
The journal is not liable for academic fraud or plagiarism committed by authors, nor for the intellectual criteria they employ. Similarly, the journal shall not be liable for the services offered through third-party hyperlinks contained in papers submitted by authors.
In support of this position, the journal provides the Author’s Duties notice at the following link: Responsibilities of authors.
The director or editor of the journal shall notify authors in the event that the contents of the journal's official website are migrated to a different IP address or domain.

