HCAC: semi-supervised hierarchical clustering using confidence-based active learning

Date
2012
Authors
Alípio Jorge
Bruno Magalhães Nogueira
Solange Rezende
Abstract
Despite its importance, hierarchical clustering has been little explored in the semi-supervised setting. In this paper, we address the problem of semi-supervised hierarchical clustering by using an active learning solution with cluster-level constraints. This active learning approach is based on a new concept of merge confidence in agglomerative clustering: when there is low confidence in a cluster merge, the user is queried and provides a cluster-level constraint. The proposed method is compared with an unsupervised algorithm (average-link) and two state-of-the-art semi-supervised algorithms (pairwise constraints and Constrained Complete-Link). Results show that our algorithm tends to outperform the two semi-supervised algorithms and can achieve a significant improvement over the unsupervised algorithm. Our approach is particularly useful when the number of clusters is high, which is the case in many real problems.
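The query loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define merge confidence, so the sketch assumes it is the gap between the average-link distances of the best and second-best candidate merges. The names `hcac_sketch`, `oracle`, `threshold`, `pool_size` and `max_queries` are hypothetical, and the simulated user below simply prefers a merge whose members share a ground-truth label.

```python
from itertools import combinations
import math


def average_link(a, b, points):
    """Average pairwise Euclidean distance between two clusters of point indices."""
    total = sum(math.dist(points[i], points[j]) for i in a for j in b)
    return total / (len(a) * len(b))


def hcac_sketch(points, threshold, oracle, max_queries=5, pool_size=3):
    """Agglomerative clustering that asks `oracle` to pick low-confidence merges.

    Returns the list of merges (pairs of frozensets) and the number of queries used.
    """
    clusters = [frozenset([i]) for i in range(len(points))]
    merges, queries = [], 0
    while len(clusters) > 1:
        # Rank every candidate merge by average-link distance (closest first).
        ranked = sorted(
            combinations(clusters, 2),
            key=lambda pair: average_link(pair[0], pair[1], points),
        )
        best = ranked[0]
        if len(ranked) > 1 and queries < max_queries:
            d_best = average_link(ranked[0][0], ranked[0][1], points)
            d_next = average_link(ranked[1][0], ranked[1][1], points)
            confidence = d_next - d_best  # ASSUMED definition of merge confidence
            if confidence < threshold:
                # Low confidence: the user chooses among the closest candidates.
                best = oracle(ranked[:pool_size])
                queries += 1
        a, b = best
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append((a, b))
    return merges, queries


# Toy data: points 1 and 2 are closest but belong to different classes.
points = [(0.0, 0.0), (1.0, 0.0), (1.9, 0.0), (3.0, 0.0)]
labels = [0, 0, 1, 1]


def label_oracle(candidates):
    """Simulated user: pick the first candidate merge that is label-pure."""
    for a, b in candidates:
        if len({labels[i] for i in a | b}) == 1:
            return (a, b)
    return candidates[0]


merges, queries = hcac_sketch(points, threshold=0.5, oracle=label_oracle)
```

On this toy data the first two merges are low-confidence, so the simulated user is queried twice and steers the hierarchy toward label-pure clusters. Setting `max_queries=0` disables querying, which reduces the sketch to plain average-link clustering.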