Scalable subgraph counting using MapReduce

dc.contributor.author Eddin,AN en
dc.contributor.author Pedro Manuel Ribeiro en
dc.date.accessioned 2018-01-18T14:57:32Z
dc.date.available 2018-01-18T14:57:32Z
dc.date.issued 2017 en
dc.description.abstract Networks are powerful in representing a wide variety of systems in many fields of study. Networks are composed of smaller substructures (subgraphs) that characterize them and give important information related to their topology and functionality. Therefore, discovering and counting these subgraph patterns is very important towards mining the features of networks. Algorithmically, subgraph counting in a network is a computationally hard problem and the needed execution time grows exponentially as the size of the subgraph or the network increases. The main goal of this paper is to contribute towards subgraph search, by providing an accessible and scalable parallel methodology for counting subgraphs. For that we present a dynamic iterative MapReduce strategy to parallelize algorithms that induce an unbalanced search tree, and apply it in the subgraph counting realm. At the core of our methods lies the g-trie, a state-of-the-art data structure that was created precisely for this task. Our strategy employs an adaptive time threshold and an efficient work-sharing mechanism to dynamically do load balancing between the workers. We evaluate our implementations using Spark on a large set of representative complex networks from different fields. The results obtained are very promising and we achieved a consistent and almost linear speedup up to 32 cores, with an average efficiency close to 80%. To the best of our knowledge this is the fastest and most scalable method for subgraph counting within the MapReduce programming model. Copyright 2017 ACM. en
dc.identifier.uri http://repositorio.inesctec.pt/handle/123456789/6960
dc.identifier.uri http://dx.doi.org/10.1145/3019612.3019744 en
dc.language eng en
dc.relation 5316 en
dc.rights info:eu-repo/semantics/openAccess en
dc.title Scalable subgraph counting using MapReduce en
dc.type conferenceObject en
dc.type Publication en
Files
Original bundle
Now showing 1 - 1 of 1
Thumbnail Image
Name:
P-00M-ZA3.pdf
Size:
394.2 KB
Format:
Adobe Portable Document Format
Description: