Classifying heart sounds using SAX motifs, random forests and text mining techniques

Date
2014
Authors
Elsa Ferreira Gomes
Alípio Jorge
Paulo Jorge Azevedo
Abstract
In this paper we describe an approach to classifying heart sounds (classes Normal, Murmur and Extra-systole) that is based on the discretization of sound signals using the SAX (Symbolic Aggregate Approximation) representation. The ability to classify heart sounds automatically, or at least to support human decisions in this task, is socially relevant because it can extend the reach of medical care through simple mobile devices or digital stethoscopes. In our approach, sounds are first pre-processed using signal processing techniques (decimate, low-pass filter, normalize, Shannon envelope). Then the pre-processed signals are transformed into sequences of discrete SAX symbols. These sequences are subject to a process of motif discovery. Frequent sequences of symbols (motifs) are adopted as features. Each sound is then characterized by the frequent motifs that occur in it and their respective frequencies. This is similar to the term frequency (TF) model used in text mining. In this paper we compare the TF model with the application of TFIDF (Term Frequency - Inverse Document Frequency) weighting and the use of bi-grams (frequent sequences of two motifs). Results show the ability of the motif-based TF approach to separate classes and the relative value of the TFIDF and bi-gram variants. Separating the Extra-systole class is very difficult, and much better results are obtained for separating the Murmur class. Empirical validation is conducted using real data collected in noisy environments. We have also assessed the cost-reduction potential of the proposed methods by considering a fixed cost model and using a cost-sensitive meta-algorithm. Copyright 2014 ACM.
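
The pipeline outlined in the abstract (SAX discretization, frequent motifs as features, TF and TFIDF weighting) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It rests on several assumptions: the signal pre-processing steps are skipped, fixed-length symbol n-grams that occur in a minimum number of sounds stand in for the motif discovery step, IDF is taken as log(N / document frequency), and every function name and parameter below is hypothetical.

```python
import math
from collections import Counter
from statistics import NormalDist, mean, pstdev


def sax(signal, n_segments, alphabet="abcd"):
    """Convert a numeric signal into a SAX word (assumes len(signal) >= n_segments)."""
    mu, sigma = mean(signal), pstdev(signal)
    # z-normalize so that Gaussian breakpoints are meaningful
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in signal]
    # Piecewise Aggregate Approximation: mean of each segment
    seg_len = len(z) / n_segments
    paa = [mean(z[int(i * seg_len):int((i + 1) * seg_len)])
           for i in range(n_segments)]
    # Breakpoints split the standard normal into equiprobable regions
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


def ngrams(word, n):
    """All contiguous symbol subsequences of length n."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def frequent_motifs(words, n, min_docs):
    """Stand-in for motif discovery: n-grams present in at least min_docs words."""
    doc_freq = Counter()
    for w in words:
        doc_freq.update(set(ngrams(w, n)))
    return sorted(m for m, c in doc_freq.items() if c >= min_docs)


def tf_vectors(words, motifs, n):
    """Term-frequency features: occurrences of each motif in each word."""
    vectors = []
    for w in words:
        counts = Counter(ngrams(w, n))
        vectors.append([counts[m] for m in motifs])
    return vectors


def tfidf_vectors(tf):
    """Reweight TF rows by log(N / document frequency) (assumed IDF form)."""
    n_docs = len(tf)
    n_terms = len(tf[0]) if tf else 0
    df = [sum(1 for row in tf if row[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / d) if d else 0.0 for d in df]
    return [[c * w for c, w in zip(row, idf)] for row in tf]
```

Each row of the resulting TF or TFIDF matrix describes one sound and could be passed to a classifier such as a random forest. A motif that occurs in every sound receives an IDF weight of zero under this formula, which is one reason the TF and TFIDF variants can behave differently.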