Couillard: Parallel programming via coarse-grained Data-flow Compilation

Date
2014
Authors
Marzulo, LAJ
Alves, TAO
Franca, FMG
Vítor Santos Costa
Abstract
Data-flow is a natural approach to parallelism. However, describing dependencies and control between fine-grained data-flow tasks can be complex and can introduce unwanted overhead. TALM (TALM is an Architecture and Language for Multi-threading) introduces a user-defined coarse-grained parallel data-flow model, in which programmers identify code blocks, called super-instructions, to be run in parallel and connect them in a data-flow graph. TALM has been implemented as a hybrid Von Neumann/data-flow execution system: the Trebuchet. We have observed that TALM's usefulness largely depends on how programmers specify and connect super-instructions. Thus, we present Couillard, a full compiler that generates, from an annotated C program, a data-flow graph and the C code corresponding to each super-instruction. We show that our toolchain allows one to benefit from data-flow execution and to explore sophisticated parallel programming techniques with little effort. To evaluate our system, we executed a set of real applications on a large multi-core machine. Comparison with popular parallel programming methods shows competitive speedups, while providing an easier parallel programming approach. More specifically, for an application that follows the wavefront method, running on large inputs, Trebuchet achieved a speedup of up to 4.7% over the novel flow-graph approach of Intel(R) TBB and of up to 44% over OpenMP.
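The coarse-grained model described in the abstract can be illustrated with a minimal sketch in C. This is not Couillard's annotation syntax, its generated code, or the Trebuchet runtime, none of which are shown in this record. All names here (graph_t, graph_add, graph_connect, graph_run, wavefront_demo) are hypothetical, and the firing loop is sequential for simplicity, whereas a real data-flow runtime would dispatch ready super-instructions to parallel workers. The sketch only shows the firing rule: a super-instruction runs once all of its inputs have been produced. The demo wires a 2x2 wavefront, where each cell depends on its upper and left neighbours.

```c
#include <assert.h>

#define MAX_NODES 16   /* illustrative limits, not from the paper */
#define MAX_EDGES 4

/* A super-instruction: a coarse-grained block of ordinary C code. */
typedef void (*super_fn)(int id, void *ctx);

typedef struct {
    super_fn fn;          /* code block to execute */
    int pending;          /* number of inputs not yet produced */
    int nsucc;            /* number of outgoing edges */
    int succ[MAX_EDGES];  /* ids of the consumers of this node's output */
} node_t;

typedef struct {
    node_t nodes[MAX_NODES];
    int n;
} graph_t;

/* Add a super-instruction to the graph and return its id. */
static int graph_add(graph_t *g, super_fn fn) {
    assert(g->n < MAX_NODES);
    node_t *nd = &g->nodes[g->n];
    nd->fn = fn;
    nd->pending = 0;
    nd->nsucc = 0;
    return g->n++;
}

/* Declare that node `to` consumes a value produced by node `from`. */
static void graph_connect(graph_t *g, int from, int to) {
    assert(g->nodes[from].nsucc < MAX_EDGES);
    g->nodes[from].succ[g->nodes[from].nsucc++] = to;
    g->nodes[to].pending++;
}

/* Fire every node whose inputs are all available. Sequential stand-in
 * for a parallel runtime. Consumes the pending counters, so a graph can
 * be run only once. Returns the number of nodes executed. */
static int graph_run(graph_t *g, void *ctx) {
    int ready[MAX_NODES];  /* each node is enqueued at most once */
    int head = 0, tail = 0, fired = 0;
    for (int i = 0; i < g->n; i++)
        if (g->nodes[i].pending == 0)
            ready[tail++] = i;
    while (head < tail) {
        int id = ready[head++];
        g->nodes[id].fn(id, ctx);
        fired++;
        for (int k = 0; k < g->nodes[id].nsucc; k++) {
            int s = g->nodes[id].succ[k];
            if (--g->nodes[s].pending == 0)
                ready[tail++] = s;
        }
    }
    return fired;
}

/* Records the order in which super-instructions fire. */
typedef struct {
    int order[MAX_NODES];
    int count;
} trace_t;

static void record(int id, void *ctx) {
    trace_t *t = ctx;
    t->order[t->count++] = id;
}

/* 2x2 wavefront: cell (i,j) depends on cells (i-1,j) and (i,j-1). */
static int wavefront_demo(trace_t *t) {
    graph_t g = { .n = 0 };
    int cell[2][2];
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            cell[i][j] = graph_add(&g, record);
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++) {
            if (i > 0) graph_connect(&g, cell[i - 1][j], cell[i][j]);
            if (j > 0) graph_connect(&g, cell[i][j - 1], cell[i][j]);
        }
    t->count = 0;
    return graph_run(&g, t);
}
```

Calling wavefront_demo with a trace_t fires the corner cell first, then the two cells on the anti-diagonal, then the last cell. The two anti-diagonal cells have no dependency on each other, which is the parallelism a wavefront application exposes to a data-flow runtime.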