Term frequency dynamics in collaborative articles

dc.contributor.author Sérgio Nunes en
dc.contributor.author Cristina Ribeiro en
dc.contributor.author Gabriel David en
dc.date.accessioned 2017-11-16T13:05:18Z
dc.date.available 2017-11-16T13:05:18Z
dc.date.issued 2010 en
dc.description.abstract Documents on the World Wide Web are dynamic entities. Mainstream information retrieval systems and techniques are primarily focused on the latest version of a document, generally ignoring its evolution over time. In this work, we study the term frequency dynamics in web documents over their lifespan. We use the Wikipedia as a document collection because it is a broad and public resource and, more important, because it provides access to the complete revision history of each document. We investigate the progression of similarity values over two projection variables, namely revision order and revision date. Based on this investigation we find that term frequency in encyclopedic documents – i.e. comprehensive and focused on a single topic – exhibits a rapid and steady progression towards the document’s current version. The content in early versions quickly becomes very similar to the present version of the document. en
dc.identifier.uri http://repositorio.inesctec.pt/handle/123456789/1999
dc.identifier.uri http://dx.doi.org/10.1145/1860559.1860620 en
dc.language eng en
dc.relation 5448 en
dc.relation 212 en
dc.relation 215 en
dc.rights info:eu-repo/semantics/openAccess en
dc.subject Document Dynamics
dc.subject Term Frequency
dc.subject Wikipedia
dc.title Term frequency dynamics in collaborative articles en
dc.type conferenceObject en
dc.type Publication en