A study of machine learning methods for detecting user interest during web sessions

Date
2014
Authors
Alípio Jorge
Anand, SS
José Paulo Leal
Dias, H
Abstract
Automated, real-time detection of user interest during a web session is appealing and useful for a number of web intelligence applications. Low-level interaction events associated with manifestations of user interest form the basis of user interest models. However, such data sets present a number of challenges from a machine learning perspective, including the level of noise in the data and class imbalance (given that the majority of content will not be of interest to a user). In this paper we evaluate a large number of machine learning techniques aimed at learning from class-imbalanced data, using two data sets collected from a real user study. We use AUC, recall, precision, and model complexity to compare the relative merits of these techniques, and conclude that useful models with AUC above 0.8 can be obtained using a mix of sampling and cost-based methods. Ensemble models can provide further accuracy but make deployment more complex. Copyright 2014 ACM.
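The abstract does not say which sampling or cost-based methods were used, so the following is only a minimal, standard-library Python sketch of the general ideas it names: random undersampling of the majority class, a cost-derived decision threshold, and the AUC, precision, and recall metrics. All function names and the toy scores are hypothetical illustrations, not taken from the paper.

```python
import random


def undersample(rows, labels, ratio=1.0, seed=0):
    """Randomly keep at most ratio * (#positives) majority-class (label 0) rows.

    One simple sampling approach to class imbalance; an assumption for
    illustration, not necessarily the method used in the paper.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep_neg = rng.sample(neg, min(len(neg), int(ratio * len(pos))))
    keep = sorted(pos + keep_neg)
    return [rows[i] for i in keep], [labels[i] for i in keep]


def cost_sensitive_threshold(cost_fp, cost_fn):
    """Threshold on a calibrated probability that minimises expected cost.

    Predict "interesting" when p * cost_fn > (1 - p) * cost_fp, which is
    p > cost_fp / (cost_fp + cost_fn). Correct decisions are assumed free.
    """
    return cost_fp / (cost_fp + cost_fn)


def auc(scores, labels):
    """AUC: chance that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in the data size, fine for a sketch.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall(scores, labels, threshold):
    """Precision and recall when predicting positive for score >= threshold."""
    pred = [s >= threshold for s in scores]
    tp = sum(p and y == 1 for p, y in zip(pred, labels))
    fp = sum(p and y == 0 for p, y in zip(pred, labels))
    fn = sum((not p) and y == 1 for p, y in zip(pred, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (made up): 2 "interesting" items among 5.
scores = [0.9, 0.8, 0.3, 0.2, 0.1]
labels = [1, 0, 1, 0, 0]

# Treat a missed interesting item as 4x as costly as a false alarm.
threshold = cost_sensitive_threshold(cost_fp=1, cost_fn=4)
print(auc(scores, labels))
print(precision_recall(scores, labels, 0.5))
print(precision_recall(scores, labels, threshold))
```

Lowering the threshold through the cost ratio trades precision for recall on the rare "interesting" class, while AUC is unaffected because it does not depend on any single threshold.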