Statistics for Clustering and classifying text documents: a revisit to tagging integration methods