Please use this identifier to cite or link to this item: http://repositorio.inesctec.pt/handle/123456789/5539
Full metadata record
DC Field | Value | Language
dc.contributor.author | Nuno Miguel Paulino | en
dc.contributor.author | João Canas Ferreira | en
dc.contributor.author | João Paiva Cardoso | en
dc.date.accessioned | 2018-01-05T16:05:14Z | -
dc.date.available | 2018-01-05T16:05:14Z | -
dc.date.issued | 2017 | en
dc.identifier.uri | http://repositorio.inesctec.pt/handle/123456789/5539 | -
dc.identifier.uri | http://dx.doi.org/10.1109/tvlsi.2016.2573640 | en
dc.description.abstract | Many embedded applications process large amounts of data using regular computational kernels, amenable to acceleration by specialized hardware coprocessors. To reduce the significant design effort, the dedicated hardware may be automatically generated, usually starting from the application's source or binary code. This paper presents a modulo-scheduled loop accelerator capable of executing multiple loops and a supporting toolchain. A generation/scheduling procedure, which fully relies on MicroBlaze instruction traces, produces accelerator instances, customized in terms of functional units and interconnections. The accelerators support integer and single-precision floating-point arithmetic, and exploit instruction-level parallelism, loop pipelining, and memory access parallelism via two read/write ports. A complete implementation of the proposed architecture is evaluated in a Virtex-7 device. Augmenting a MicroBlaze processor with a tailored accelerator achieves a geometric mean speedup, over software-only execution, of 6.61x for 13 floating-point kernels from the Livermore Loops set, and of 4.08x for 11 integer kernels from Texas Instruments' IMGLIB. The proposed customized accelerators are compared with ALU-based ones. The average specialized accelerator requires only 0.47x the number of field-programmable gate array slices of an accelerator with four ALUs. A geometric mean speedup of 1.78x over a four-issue very long instruction word (without floating-point support) was obtained for the integer kernels. | en
dc.language | eng | en
dc.relation | 5550 | en
dc.relation | 5802 | en
dc.relation | 473 | en
dc.rights | info:eu-repo/semantics/embargoedAccess | en
dc.title | Generation of Customized Accelerators for Loop Pipelining of Binary Instruction Traces | en
dc.type | article | en
dc.type | Publication | en
Appears in Collections: CSIG - Articles in International Journals
CTM - Articles in International Journals

Files in This Item:
File | Description | Size | Format
P-00M-AM7.pdf (Restricted Access) | | 1.18 MB | Adobe PDF

