Rule Induction for Sentence Reduction

Date
2013
Authors
João Cordeiro
Dias, G
Pavel Brazdil
Abstract
Sentence Reduction has recently received considerable attention from the Automatic Text Summarization research community. Sentence Reduction consists of eliminating sentence components such as words, part-of-speech tag sequences, or chunks without substantially degrading the information contained in the sentence or its grammatical correctness. In this paper, we present an unsupervised, scalable methodology for learning sentence reduction rules. Paraphrases are first discovered within a collection of automatically crawled Web news stories and then textually aligned in order to extract candidate interchangeable text fragments, in particular reduction cases. As only positive examples exist, Inductive Logic Programming (ILP) provides an interesting learning paradigm for the extraction of sentence reduction rules. Reduction cases are therefore transformed into first-order logic clauses to supply a massive set of suitable learning instances, and an ILP learning environment is defined within the context of the Aleph framework. Experiments show good results in terms of the elimination of irrelevant content, syntactic correctness, and reduction rate in a real-world environment, in contrast to other methodologies proposed so far.
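The step from aligned paraphrases to logic clauses can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the alignment algorithm or the clause format, so this example assumes a generic token-level alignment (Python's standard `difflib.SequenceMatcher`) and a hypothetical `reduct/4` predicate that records the dropped segment with one token of context on each side. The function names `reduction_cases` and `to_clause` are likewise illustrative.

```python
from difflib import SequenceMatcher


def reduction_cases(long_sent, short_sent):
    """Return (left_context, dropped_tokens, right_context) triples.

    A case is a run of tokens that occurs in the longer sentence but has
    no counterpart in the shorter paraphrase. difflib is a stand-in for
    the paper's alignment method, which the abstract does not describe.
    """
    a, b = long_sent.split(), short_sent.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    cases = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "delete":  # tokens of `a` with nothing aligned in `b`
            left = a[i1 - 1] if i1 > 0 else None
            right = a[i2] if i2 < len(a) else None
            cases.append((left, a[i1:i2], right))
    return cases


def to_clause(case_id, case):
    """Render one case as a Prolog-style fact (hypothetical predicate)."""
    left, dropped, right = case

    def atom(tok):
        # Quote tokens as Prolog atoms; use `nil` for a missing context.
        return "nil" if tok is None else "'" + tok.replace("'", "\\'") + "'"

    segment = "[" + ", ".join(atom(t) for t in dropped) + "]"
    return f"reduct({case_id}, {atom(left)}, {segment}, {atom(right)})."


if __name__ == "__main__":
    long_sent = "the president , who arrived yesterday , met the delegation"
    short_sent = "the president met the delegation"
    for n, case in enumerate(reduction_cases(long_sent, short_sent), start=1):
        print(to_clause(n, case))
    # prints: reduct(1, 'president', [',', 'who', 'arrived', 'yesterday', ','], 'met').
```

Facts of this kind would serve only as positive examples. An ILP system such as Aleph would additionally need mode declarations and background knowledge (for instance part-of-speech or chunk information), which this sketch leaves out.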