[ANNOUNCE] YARDstick - custom processor development toolset

Dear friends,

I am very pleased and proud to announce YARDstick, a custom processor development toolset with an impressive list of features.

YARDstick is a novel design automation tool for custom processor development flows that focuses on the hard part: generating and evaluating application-specific hardware extensions. YARDstick is a powerful building block for ASIP development, since it integrates application analysis and ultra-fast algorithms for custom instruction generation and selection with user-defined compiler intermediate representations (IRs). As of September 2007, YARDstick also integrates retargetable compiler features for the targeted IRs/architectures. Notable features of YARDstick are the following:

- retargetable to user-defined IRs by machine description.

- can be targeted to low-level compiler IRs, assembly-level representations of virtual machines, or assembly code for existing processors.

- fully parameterized custom instruction generation and selection engine.

- lightning-fast code selector for multiple-input multiple-output (MIMO) patterns based on graph matching. The code selector scales very well with the instruction node count of basic block data-dependence graphs (successfully tested with custom instruction patterns of more than 30 nodes).

- virtual register assignment for virtual machine targets.

- an extensive set of backends including assembly code emitter, C backend, visualization backends for Graphviz and VCG (or aiSee), an XML format amenable to graph rewriting and others.
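To make the "parameterized generation" idea concrete, here is a minimal C sketch of the kind of feasibility test such an engine might apply to a candidate pattern: count the values crossing into and out of the pattern and reject forbidden nodes. It is only an illustration under my own assumptions, not YARDstick's actual algorithm. The names `Edge`, `CiLimits`, `count_boundary`, `ci_feasible` and the `MAX_NODES` bound are all hypothetical, and checks such as convexity are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 64  /* assumed upper bound on nodes per basic block */

/* Directed data-dependence edge: the value produced by src is used by dst. */
typedef struct { int src, dst; } Edge;

/* User-set limits (hypothetical field names): maximum number of inputs and
   outputs, plus a per-node flag for nodes that may not join a pattern. */
typedef struct {
    int  max_inputs;
    int  max_outputs;
    bool forbidden[MAX_NODES];
} CiLimits;

/* Count distinct producer nodes on one side of the pattern boundary.
   want_inputs = true : producers outside the pattern feeding a node inside.
   want_inputs = false: producers inside the pattern feeding a node outside. */
int count_boundary(const Edge *e, size_t n_edges,
                   const bool *in_pat, bool want_inputs)
{
    bool seen[MAX_NODES] = { false };
    int count = 0;
    for (size_t i = 0; i < n_edges; i++) {
        int s = e[i].src, d = e[i].dst;
        assert(s >= 0 && s < MAX_NODES && d >= 0 && d < MAX_NODES);
        bool crosses = want_inputs ? (!in_pat[s] && in_pat[d])
                                   : (in_pat[s] && !in_pat[d]);
        if (crosses && !seen[s]) {
            seen[s] = true;   /* count each producer only once */
            count++;
        }
    }
    return count;
}

/* A candidate is feasible if it contains no forbidden node and stays
   within the input/output limits. in_pat must have MAX_NODES entries. */
bool ci_feasible(const Edge *e, size_t n_edges, const bool *in_pat,
                 const CiLimits *lim)
{
    for (int v = 0; v < MAX_NODES; v++)
        if (in_pat[v] && lim->forbidden[v])
            return false;
    return count_boundary(e, n_edges, in_pat, true)  <= lim->max_inputs &&
           count_boundary(e, n_edges, in_pat, false) <= lim->max_outputs;
}
```

A generator would call `ci_feasible` on each candidate it grows, so tightening `max_inputs` or `max_outputs` prunes the search early.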

YARDstick comes along with a cross-platform GUI written in Tcl/Tk 8.5.

The ultimate goal of YARDstick is to liberate the designer's development infrastructure from compiler and simulator idiosyncrasies. With YARDstick, the ASIP designer is empowered with the freedom of specifying the target architecture of choice and adding new implementations of analyses and custom instruction generation/selection methods.

At the moment, YARDstick is being heavily used for developing a new processor architecture of mine with many never-before-seen features, mainly targeting FPGAs. A status update report on the processor architecture should be expected in late October 2007.

Depending on the target architecture, YARDstick typically obtains 2x to 15x speedups for benchmark applications (optimized ANSI C source code) fully automatically. Speedups are evaluated against a typical scalar RISC architecture.
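For readers wondering how such a speedup figure can be computed from profile data, here is a minimal C sketch: weight each basic block's cycle cost by its execution count, once for the baseline and once with custom instructions, and take the ratio. The cost model and the names `BlockProfile` and `estimate_speedup` are assumptions made for illustration; the estimation model actually used by YARDstick is not described in this post.

```c
#include <assert.h>
#include <stddef.h>

/* Per-basic-block profile record (hypothetical layout). */
typedef struct {
    unsigned long exec_count;     /* dynamic execution count from profiling */
    unsigned      cycles_base;    /* cycles per execution, baseline RISC */
    unsigned      cycles_custom;  /* cycles per execution, with custom instructions */
} BlockProfile;

/* Speedup = total baseline cycles / total cycles with custom instructions. */
double estimate_speedup(const BlockProfile *blocks, size_t n)
{
    unsigned long long base = 0, custom = 0;
    assert(blocks != NULL && n > 0);
    for (size_t i = 0; i < n; i++) {
        base   += (unsigned long long)blocks[i].exec_count * blocks[i].cycles_base;
        custom += (unsigned long long)blocks[i].exec_count * blocks[i].cycles_custom;
    }
    assert(custom > 0);  /* avoid dividing by zero on an empty profile */
    return (double)base / (double)custom;
}
```

Because the totals are weighted by execution count, a large gain in a rarely executed block barely moves the overall figure, which is why hot-block ranking matters.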

Detailed feature list:

  1. Analysis engines generating both static and dynamic statistics:
     - Data types
     - Operation-level statistics
     - Basic block statistics (ranking)
     - Performance estimations with/without custom instructions.
  2. Generation of CDFGs (Control-Data Flow Graphs).
  3. Backend engines:
     - ANSI C
     - dot (Graphviz)
     - VCG (GDL, aiSee)
     - XML (GGX for the AGG graph rewriting tool)
     - Retargetable assembly emitter for entire translation units (single files with multiple functions/procedures).
     - CDFG formats for various RTL synthesis tools.
  4. Custom instruction engines:
     - Fully parameterized MIMO custom instruction generation algorithm. Features:
       * Fast heuristic!
       * Configurable number of inputs
       * Configurable number of outputs
       * List of forbidden nodes
       * Node sorting strategies (3 different strategies!)
       * Transformation rule library for applying CFG transformation strategies
  5. Custom instruction selection:
     - Based on priority metrics (2 choices at the moment).
  6. Graph (and graph-subgraph) isomorphism features for eliminating redundant patterns. Multiple algorithms supported.
  7. Visualization of custom instructions, basic blocks, control-flow graphs and control-data flow graphs (basic block nodes expanded to their constituent instructions).
  8. Basic retargetable compiler features (alpha state):
     - Code selector for MIMO instructions (tested with large cases).
     - Virtual register assignment (allocation for a VM).
     - Hard register allocator in the works.
  9. Miscellaneous features:
     - single constant multiplication optimizer
     - elimination of false data-dependences in assembly-level CDFGs.
     - beautification options for visualization
     - interfacing (co-operation) with external tools such as peephole optimizers, profilers, code generators etc.
     - features related to the custom processor architecture (not to be disclosed yet)
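As an illustration of what a single constant multiplication optimizer does, the C sketch below replaces `x * c` with shifts, additions and subtractions by recoding the constant into non-adjacent (signed-digit) form. This is a textbook technique that I am assuming for the example; the post does not say which method YARDstick uses. The function name `mul_const_shift_add` and the `terms` output parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply x by the constant c (modulo 2^32) using only shifts, adds and
   subtracts. The constant is recoded on the fly into non-adjacent form,
   where each nonzero digit is +1 or -1. If terms is non-NULL, it receives
   the number of nonzero digits, i.e. the number of shifted terms combined. */
uint32_t mul_const_shift_add(uint32_t x, uint32_t c, int *terms)
{
    uint64_t k = c;      /* part of the constant still to be recoded */
    uint64_t term = x;   /* x shifted left by the current bit position */
    uint64_t acc = 0;    /* running result; unsigned wraparound is intended */
    int n = 0;

    while (k != 0) {
        if (k & 1u) {
            if ((k & 3u) == 3u) {  /* low bits ...11: use digit -1 and carry */
                acc -= term;
                k += 1;
            } else {               /* low bits ...01: use digit +1 */
                acc += term;
                k -= 1;
            }
            n++;
        }
        k >>= 1;
        term <<= 1;
    }
    if (terms != NULL)
        *terms = n;
    /* Sanity check: the decomposition must match the ordinary product. */
    assert((uint32_t)acc == (uint32_t)(x * c));
    return (uint32_t)acc;
}
```

For example, multiplying by 7 becomes `(x << 3) - x`, two terms instead of the three that plain binary expansion would need.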

Here is a list of application benchmarks that have been tested with YARDstick (compiler features not fully tested):
- ADPCM encoder and decoder (typically: 4x speedup)
- Video processing kernels: full-search block-matching motion estimation, logarithmic search motion estimation, motion compensation
- Image processing kernels: steganography (hide/uncover), edge detection, matrix multiplication
- Cryptographic kernels: crc32, rc5, raiden (7x speedup, 12x for unrolled version)

At the YARDstick homepage (...nkavv/yardstick/) you can find some additional material:

- 2-page brochure
- 2-page abstract for the DATE'07 University Booth
- a more extended presentation on YARDstick

The above material reflects the status as of April 2007.

Expected enhancements to YARDstick in the near future:
- linear-scan and integer-linear programming based register allocators
- bitwidth analysis
- CDFG->VHDL generation of custom instruction hardware
- algorithm implementation for CDFG pipelining

Interested parties are welcome to contact me for details on how to get access to a demo version of the YARDstick toolset.

Kind regards

Nikolaos Kavvadias
Computer Architecture Specialist - Compiler Developer
Ph.D. candidate
M.Sc. Electronics Engineering
B.Sc. Physics

You may contact me at: Nikolaos Kavvadias


Uncle Noah

The URL:

formatting link

This will link directly for all.

Uncle Noah

And here is an example of a custom instruction AUTO-GENERATED for an edge detection filter. Check out the VCG, Graphviz and C outputs from the corresponding backends.

VCG output:

graph: { title: "main_9"

x: 30
y: 30
height: 380
width: 560
xspace: 20
yspace: 30
display_edge_labels: yes
layoutalgorithm: minbackward
port_sharing: no
node.borderwidth: 3
node.color: white
node.textcolor: black
node.bordercolor: black
edge.color: black

node: { title:"0" shape: ellipse label:" ior" color:yellow }
node: { title:"1" shape: ellipse label:" sl" color:yellow }
node: { title:"2" shape: ellipse label:" abs" color:yellow }
node: { title:"3" shape: ellipse label:" sub" color:yellow }
node: { title:"4" shape: ellipse label:" sl" color:yellow }
node: { title:"5" shape: ellipse label:" abs" color:yellow }
node: { title:"6" shape: ellipse label:" sub" color:yellow }
node: { title:"7" shape: ellipse label:" ldc" color:yellow }
node: { title:"8" shape: rhomb label:" 1" color:magenta }
edge: {sourcename:"8" targetname:"7"}
node: { title:"9" shape: triangle label:" vr234.s32" color:cyan }
edge: {sourcename:"0" targetname:"9"}
node: { title:"10" shape: triangle label:" vr235.s32" color:cyan }
edge: {sourcename:"7" targetname:"10"}
node: { title:"11" shape: box label:" vr60.s32" color:green }
edge: {sourcename:"11" targetname:"1"}
node: { title:"12" shape: box label:" vr207.s32" color:green }
edge: {sourcename:"12" targetname:"3"}
node: { title:"13" shape: box label:" vr210.s32" color:green }
edge: {sourcename:"13" targetname:"3"}
edge: {sourcename:"11" targetname:"4"}
edge: {sourcename:"12" targetname:"6"}
node: { title:"14" shape: box label:" vr220.s32" color:green }
edge: {sourcename:"14" targetname:"6"}

edge: {sourcename:"3" targetname:"2" label:"vr228.s32" }
edge: {sourcename:"2" targetname:"1" label:"vr229.s32" }
edge: {sourcename:"1" targetname:"0" label:"vr230.s32" }
edge: {sourcename:"6" targetname:"5" label:"vr231.s32" }
edge: {sourcename:"5" targetname:"4" label:"vr232.s32" }
edge: {sourcename:"4" targetname:"0" label:"vr233.s32" }

}

Graphviz output:

digraph main_9 {
  node [fontname=Courier,fontsize=14,style=filled];
  0 [shape=ellipse,label="ior",fillcolor=yellow]
  1 [shape=ellipse,label="sl",fillcolor=yellow]
  2 [shape=ellipse,label="abs",fillcolor=yellow]
  3 [shape=ellipse,label="sub",fillcolor=yellow]
  4 [shape=ellipse,label="sl",fillcolor=yellow]
  5 [shape=ellipse,label="abs",fillcolor=yellow]
  6 [shape=ellipse,label="sub",fillcolor=yellow]
  7 [shape=ellipse,label="ldc",fillcolor=yellow]
  8 [shape=diamond,label="1",fillcolor=magenta]
  8 -> 7;
  9 [shape=triangle,label="vr234.s32",fillcolor=cyan]
  0 -> 9;
  10 [shape=triangle,label="vr235.s32",fillcolor=cyan]
  7 -> 10;
  11 [shape=invtriangle,label="vr60.s32",fillcolor=green]
  11 -> 1;
  12 [shape=invtriangle,label="vr207.s32",fillcolor=green]
  12 -> 3;
  13 [shape=invtriangle,label="vr210.s32",fillcolor=green]
  13 -> 3;
  11 -> 4;
  12 -> 6;
  14 [shape=invtriangle,label="vr220.s32",fillcolor=green]
  14 -> 6;

  3 -> 2 [label="vr228.s32"];
  2 -> 1 [label="vr229.s32"];
  1 -> 0 [label="vr230.s32"];
  6 -> 5 [label="vr231.s32"];
  5 -> 4 [label="vr232.s32"];
  4 -> 0 [label="vr233.s32"];
}

C output:

void ci_9( int *d0 ,int *d1 ,int s0 ,int s1 ,int s2 ,int s3 )
{
  int vr228_s32;
  int vr229_s32;
  int vr230_s32;
  int vr231_s32;
  int vr232_s32;
  int vr233_s32;
  *d1 = 1;
  vr231_s32 = s1-s3;
  vr232_s32 = ((vr231_s32 < 0) ? -vr231_s32 : vr231_s32);
  vr233_s32 = s0
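The C listing above breaks off after `vr233_s32 = s0`. As a rough guide only, the Graphviz graph suggests the body sketched below. It is a hypothetical reconstruction, not the tool's verbatim output. The function is named `ci_9_sketch` to keep that clear, and the operand order of the `sl` (shift left) nodes, value first and shift amount second, is my assumption. The mapping of `s0`..`s3` to `vr60`, `vr207`, `vr210`, `vr220` follows from the surviving line `vr231_s32 = s1-s3`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reconstruction of ci_9 from the data-flow graph.
   Inputs:  s0 = vr60, s1 = vr207, s2 = vr210, s3 = vr220.
   Outputs: *d0 = vr234 (result of ior), *d1 = vr235 (constant 1). */
void ci_9_sketch(int *d0, int *d1, int s0, int s1, int s2, int s3)
{
    int vr228_s32, vr229_s32, vr230_s32;
    int vr231_s32, vr232_s32, vr233_s32;

    assert(d0 != NULL && d1 != NULL);

    *d1 = 1;                                                /* ldc 1 */
    vr231_s32 = s1 - s3;                                    /* sub (node 6) */
    vr232_s32 = (vr231_s32 < 0) ? -vr231_s32 : vr231_s32;   /* abs (node 5) */
    /* sl (node 4): operand order is assumed. In C this shift is only
       defined for non-negative s0 and amounts below the width of int. */
    vr233_s32 = s0 << vr232_s32;
    vr228_s32 = s1 - s2;                                    /* sub (node 3) */
    vr229_s32 = (vr228_s32 < 0) ? -vr228_s32 : vr228_s32;   /* abs (node 2) */
    vr230_s32 = s0 << vr229_s32;                            /* sl (node 1) */
    *d0 = vr230_s32 | vr233_s32;                            /* ior (node 0) */
}
```

The two outputs and four register inputs match the `d0`, `d1`, `s0`..`s3` parameters of the generated signature; the constant 1 is folded into the body and not passed in.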

Uncle Noah
