Automatic Distinction of Fernando Pessoa's Heteronyms
Date
  2015
Authors
  João Pedro Teixeira
  Marco Linhares Couto
Abstract
  Text Mining has opened a vast array of possibilities for automatic information retrieval from large collections of text documents. A wide variety of themes and document types can be analyzed with ease. More complex features, such as those used in Forensic Linguistics, can yield a deeper understanding of the documents and make difficult tasks such as author identification possible. In this work we explore the capabilities of simpler Text Mining approaches for author identification in unstructured documents, in particular their ability to distinguish the poetic works of two of Fernando Pessoa's heteronyms: Álvaro de Campos and Ricardo Reis. Several processing options were tested and accuracies of 97% were reached, which encourages further development.
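
  As an illustration only, one simple Text Mining approach to two-author identification is a bag-of-words Naive Bayes classifier. The sketch below is not the authors' method: the abstract does not say which features or classifiers were used. The function names, the labels `author_a` and `author_b`, and the toy sentences are all hypothetical placeholders, not verses by either heteronym.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def train(docs):
    """Collect per-label word counts from (text, label) pairs."""
    word_counts = {}
    doc_counts = Counter()
    vocab = set()
    for text, label in docs:
        tokens = tokenize(text)
        word_counts.setdefault(label, Counter()).update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the highest Naive Bayes log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(doc_counts):
        counts = word_counts[label]
        total = sum(counts.values())
        # Log prior: share of training documents with this label.
        score = math.log(doc_counts[label] / total_docs)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are skipped
                # Laplace (add-one) smoothed word likelihood.
                score += math.log((counts[tok] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus: invented sentences, NOT actual poems.
TOY_DOCS = [
    ("engines roar and the factory wheels turn all night", "author_a"),
    ("steel engines and electric lights in the loud city", "author_a"),
    ("calm river flows and the gods watch the quiet roses", "author_b"),
    ("quiet gods, brief roses, and the calm of the river", "author_b"),
]

model = train(TOY_DOCS)
print(predict(model, "the engines of the city"))
print(predict(model, "roses by the calm river"))
```

  A real experiment would replace the toy corpus with labeled poems and report accuracy on held-out texts. Preprocessing choices such as stop-word removal or stemming are the kind of "processing options" that could be compared in such a setup.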