Term frequency dynamics in collaborative articles
dc.contributor.author | Sérgio Nunes | en |
dc.contributor.author | Cristina Ribeiro | en |
dc.contributor.author | Gabriel David | en |
dc.date.accessioned | 2017-11-16T13:05:18Z | |
dc.date.available | 2017-11-16T13:05:18Z | |
dc.date.issued | 2010 | en |
dc.description.abstract | Documents on the World Wide Web are dynamic entities. Mainstream information retrieval systems and techniques are primarily focused on the latest version of a document, generally ignoring its evolution over time. In this work, we study the term frequency dynamics in web documents over their lifespan. We use Wikipedia as a document collection because it is a broad and public resource and, more importantly, because it provides access to the complete revision history of each document. We investigate the progression of similarity values over two projection variables, namely revision order and revision date. Based on this investigation, we find that term frequency in encyclopedic documents (i.e., comprehensive and focused on a single topic) exhibits a rapid and steady progression towards the document’s current version. The content in early versions quickly becomes very similar to the present version of the document. | en |
dc.identifier.uri | http://repositorio.inesctec.pt/handle/123456789/1999 | |
dc.identifier.uri | http://dx.doi.org/10.1145/1860559.1860620 | en |
dc.language | eng | en |
dc.relation | 5448 | en |
dc.relation | 212 | en |
dc.relation | 215 | en |
dc.rights | info:eu-repo/semantics/openAccess | en |
dc.subject | Document Dynamics | |
dc.subject | Term Frequency | |
dc.subject | Wikipedia | |
dc.title | Term frequency dynamics in collaborative articles | en |
dc.type | conferenceObject | en |
dc.type | Publication | en |