DOTS: Drift Oriented Tool System
Date
  2015
Authors
  Cósta, J
  Silva, C
  Mário João Antunes
  Ribeiro, B
Abstract
  Drift is a given in most machine learning applications. The idea that models must accommodate change, and thus be dynamic, is ubiquitous. Current challenges include temporal data streams, drift, and non-stationary scenarios, often involving text data, whether in social networks or in business systems. There are multiple types of drift pattern: concepts may appear and disappear suddenly, recurrently, gradually, or incrementally. Researchers strive to propose and test algorithms and techniques that deal with drift in text classification, but adequate benchmarks for such dynamic environments are difficult to find. In this paper we present DOTS, Drift Oriented Tool System, a framework for defining and generating text-based datasets in which drift characteristics can be thoroughly defined, implemented, and tested. The usefulness of DOTS is demonstrated through a Twitter stream case study, in which DOTS is used to define datasets and to test the effectiveness of different document representations. Results show the potential of DOTS in machine learning research. © Springer International Publishing Switzerland 2015.
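The drift patterns named in the abstract can be pictured as schedules that give, for each time step, the probability that a new concept is active in the stream. The following is a minimal sketch under stated assumptions, not the DOTS interface: the names `drift_schedule` and `sample_stream` and the parameters `start`, `width`, and `period` are hypothetical and chosen only for illustration. Incremental drift, which is usually described as passing through intermediate concepts, is not modelled here.

```python
import random


def drift_schedule(pattern, steps, start=0, width=1, period=1):
    """Return one probability per time step that the new concept is active.

    Hypothetical helper for illustration; not part of DOTS.
    """
    probs = []
    for t in range(steps):
        if pattern == "sudden":
            # The concept appears abruptly at `start` and stays.
            p = 1.0 if t >= start else 0.0
        elif pattern == "gradual":
            # The chance of seeing the new concept rises linearly
            # from 0 to 1 over `width` steps, beginning at `start`.
            p = min(1.0, max(0.0, (t - start) / width))
        elif pattern == "recurrent":
            # The concept is on for `period` steps, off for `period`
            # steps, and so on, beginning at `start`.
            on = t >= start and ((t - start) // period) % 2 == 0
            p = 1.0 if on else 0.0
        else:
            raise ValueError(f"unknown drift pattern: {pattern}")
        probs.append(p)
    return probs


def sample_stream(probs, seed=0):
    """Draw a label per time step: 'new' with probability p, else 'old'."""
    rng = random.Random(seed)
    return ["new" if rng.random() < p else "old" for p in probs]


sudden = drift_schedule("sudden", steps=6, start=3)
gradual = drift_schedule("gradual", steps=6, start=2, width=4)
recurrent = drift_schedule("recurrent", steps=8, start=0, period=2)
labels = sample_stream(sudden)
```

In a text setting, each label would select which concept (for example, which topic or hashtag) the document at that time step is drawn from, so a classifier can be evaluated against a drift pattern that is known in advance.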