Statistics for Feature extraction for the author name disambiguation problem in a bibliographic database