Automatic Distinction of Fernando Pessoa's Heteronyms

Date
2015
Authors
João Pedro Teixeira
Marco Linhares Couto
Abstract
Text Mining has opened a vast array of possibilities for automatic information retrieval from large collections of text documents. A variety of themes and document types can be easily analyzed. More complex features, such as those used in Forensic Linguistics, can yield a deeper understanding of the documents, making it possible to perform difficult tasks such as author identification. In this work we explore the capabilities of simpler Text Mining approaches to author identification in unstructured documents, in particular their ability to distinguish poetic works by two of Fernando Pessoa's heteronyms: Álvaro de Campos and Ricardo Reis. Several processing options were tested and accuracies of 97% were reached, which encourages further development.