Large Scale Graph Representations for Subgraph Census

Date
2016
Authors
Pedro Reis Paredes
Pedro Manuel Ribeiro
Abstract
A Subgraph Census (determining the frequency of smaller subgraphs in a network) is an important computational task at the heart of several graph mining algorithms. Here we focus on g-tries, an efficient state-of-the-art data structure. The g-trie algorithm makes extensive use of the graph primitive that checks whether a given edge exists. Like most past approaches, the original implementation used adjacency matrices to make this operation as fast as possible. This representation is very expensive in memory usage, which limits its applicability. In this paper we study a number of alternative representations that scale linearly with the number of edges. We carry out an extensive empirical study of these alternatives in order to find an efficient hybrid approach that combines the best representations. We achieve a performance that is less than 50% slower than the adjacency matrix on average (almost 3 times more efficient than a naive binary search implementation), while being memory efficient and tunable for different memory restrictions. © Springer-Verlag Berlin Heidelberg 2016.
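The trade-off described in the abstract can be illustrated with a minimal sketch of the edge-existence primitive under three representations. This is not the authors' implementation: the class names, the `has_edge` method, and the `degree_threshold` parameter are all hypothetical, and the hybrid rule shown here (dense rows only for high-degree nodes) is an assumption chosen for illustration, since the abstract does not specify how the paper's hybrid combines representations.

```python
from bisect import bisect_left


class MatrixGraph:
    """Adjacency matrix: constant-time edge test, memory quadratic in the node count."""

    def __init__(self, n, edges):
        self.adj = [bytearray(n) for _ in range(n)]
        for u, v in edges:
            self.adj[u][v] = 1
            self.adj[v][u] = 1

    def has_edge(self, u, v):
        return self.adj[u][v] == 1


class SortedListGraph:
    """Sorted adjacency lists: memory linear in the edge count, binary-search edge test."""

    def __init__(self, n, edges):
        neighbours = [set() for _ in range(n)]
        for u, v in edges:
            neighbours[u].add(v)
            neighbours[v].add(u)
        self.adj = [sorted(s) for s in neighbours]

    def has_edge(self, u, v):
        row = self.adj[u]
        i = bisect_left(row, v)
        return i < len(row) and row[i] == v


class HybridGraph:
    """Hypothetical hybrid (an assumption, not the paper's scheme): nodes with
    degree >= degree_threshold get a dense row, all others use a sorted list."""

    def __init__(self, n, edges, degree_threshold):
        self.lists = SortedListGraph(n, edges).adj
        self.rows = {}
        for u, nbrs in enumerate(self.lists):
            if len(nbrs) >= degree_threshold:
                row = bytearray(n)
                for v in nbrs:
                    row[v] = 1
                self.rows[u] = row

    def has_edge(self, u, v):
        row = self.rows.get(u)
        if row is not None:
            return row[v] == 1  # dense row: direct lookup
        lst = self.lists[u]
        i = bisect_left(lst, v)  # sparse node: binary search
        return i < len(lst) and lst[i] == v
```

In this sketch, lowering `degree_threshold` spends more memory on dense rows in exchange for faster lookups, which is one simple way a representation could be made tunable for different memory restrictions.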