POPSTAR at RepLab 2013: Name ambiguity resolution on Twitter

Date
2013
Authors
Saleiro, P
Rei, L
Pasquali, A
Carlos Manuel Soares
Teixeira, J
Pinto, F
Mohammad Nozari
Catarina Félix Oliveira
Strecht, P
Abstract
Filtering tweets relevant to a given entity is an important task for online reputation management systems, because it underpins a reliable analysis of opinions and trends regarding that entity. In this paper we describe our participation in the Filtering Task of RepLab 2013. The goal of the competition is to classify a tweet as relevant or not relevant to a given entity. To address this task, we studied a large set of features that can be generated to describe the relationship between an entity and a tweet. We explored different learning algorithms as well as different types of features: text, keyword similarity scores between entity metadata and tweets, the Freebase entity graph, and Wikipedia. The test set of the competition comprises more than 90,000 tweets about 61 entities from four distinct categories: automotive, banking, universities, and music. Results show that our approach achieves a Reliability of 0.72 and a Sensitivity of 0.45 on the test set, corresponding to an F-measure of 0.48 and an Accuracy of 0.908.
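To illustrate one of the feature types named in the abstract, the following is a minimal sketch of a keyword similarity score between an entity's metadata and a tweet. The abstract does not specify which similarity measure the authors used; Jaccard overlap over lowercase tokens is an assumption made here for illustration only, and the function names and example strings are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_similarity(tweet, entity_metadata):
    """Jaccard overlap between tweet tokens and entity metadata tokens.

    Assumption: Jaccard is a stand-in here; the abstract does not say
    which similarity score was actually used.
    """
    tweet_tokens = tokenize(tweet)
    meta_tokens = tokenize(entity_metadata)
    if not tweet_tokens or not meta_tokens:
        return 0.0
    shared = tweet_tokens & meta_tokens
    combined = tweet_tokens | meta_tokens
    return len(shared) / len(combined)


# Hypothetical example data for an entity in the automotive category.
metadata = "Jaguar Cars luxury vehicle automotive brand"
relevant = "Test drove the new Jaguar luxury sedan today"
unrelated = "Saw a jaguar at the zoo this morning"

relevant_score = keyword_similarity(relevant, metadata)
unrelated_score = keyword_similarity(unrelated, metadata)
```

A score like this would be one column in the feature vector passed to a learning algorithm, alongside the text, Freebase, and Wikipedia features. In this toy case the tweet about the car shares more tokens with the metadata than the tweet about the animal, so it scores higher.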