
Static Single Assignment (with relevant examples)

Static Single Assignment was presented in 1988 by Barry K. Rosen, Mark N. Wegman, and F. Kenneth Zadeck.

In compiler design, Static Single Assignment (abbreviated SSA) is a means of structuring the IR (intermediate representation) such that every variable is assigned a value only once and every variable is defined before its use. The prime benefit of SSA is that it simplifies variable properties, which in turn simplifies and improves the results of compiler optimisation algorithms. Some algorithms improved by the application of SSA:

  • Constant Propagation – Translating calculations from runtime to compile time, e.g. the instruction v = 2*7+13 is treated as v = 27.
  • Value Range Propagation – Finding the possible range of values a calculation could result in.
  • Dead Code Elimination – Removing code which is unreachable or has no effect on the results whatsoever.
  • Strength Reduction – Replacing computationally expensive calculations with inexpensive ones.
  • Register Allocation – Optimising the use of registers for calculations.

Any code can be converted to SSA form by replacing the target variable of each assignment with a new variable and replacing each use of a variable with the version of that variable reaching that point. Versions are created by splitting the original variables existing in the IR and are represented by the original name with a subscript, so that every definition gets its own version.

Example #1:

Convert the following code segment to SSA form:
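(The original code segment appeared as an image; the following straight-line segment is a hypothetical reconstruction, consistent with the variables named below, shown alongside its SSA form.)

    Original code:          SSA form:

    x = y + z               x = y + z
    s = p + q               s = p + q
    x = s + y               x2 = s + y
    s = x + z               s2 = x2 + z
    s = s * p               s3 = s2 * p
    s = s - q               s4 = s3 - q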

Here x, y, z, s, p, q are the original variables, and x2, s2, s3, s4 are versions of x and s.

Example #2:
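(As above, the original segment appeared as an image; this is a hypothetical reconstruction consistent with the variables named below.)

    Original code:          SSA form:

    a = b + c               a = b + c
    q = d + e               q = d + e
    a = q + s               a2 = q + s
    q = a + b               q2 = a2 + b
    q = q * c               q3 = q2 * c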

Here a, b, c, d, e, q, s are the original variables, and a2, q2, q3 are versions of a and q.

Phi function and SSA codes

The three-address code may also contain goto statements, and thus a variable may receive its value along two different paths.

Consider the following example:-

Example #3:
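(The original three-address code appeared as an image; the following hypothetical segment with a branch illustrates the same situation.)

        x = a + b
        if a < b goto L1
        x = a - b
    L1: y = x * 2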

When we try to convert the above three-address code to SSA form, the output looks like this:

Attempt #3:
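(A hypothetical rendering of the naive conversion; the question mark marks the problem:)

        x1 = a + b
        if a < b goto L1
        x2 = a - b
    L1: y1 = x? * 2        -- should this use x1 or x2?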

We need to be able to decide which value y should take: x1 or x2. We thus introduce the notion of phi (φ) functions, which resolve the correct value of a variable when control arrives along two different computation paths due to branching.

Hence, the correct SSA code for the example will be:

Solution #3:
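(Again hypothetical, matching the segment above:)

        x1 = a + b
        if a < b goto L1
        x2 = a - b
    L1: x3 = φ(x1, x2)
        y1 = x3 * 2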

Thus, whenever three-address code has a branch and control may flow along two different paths, we need to use φ functions at the appropriate join points.


ENOSUCHBLOG
Programming, philosophy, pedaling.

Understanding static single assignment forms

Oct 23, 2020 · Tags: llvm, programming


With thanks to Niki Carroll, winny, and kurufu for their invaluable proofreading and advice.

By popular demand, I’m doing another LLVM post. This time, it’s single static assignment (or SSA) form, a common feature in the intermediate representations of optimizing compilers.

Like the last one, SSA is a topic in compiler and IR design that I mostly understand but could benefit from some self-guided education on. So here we are.

How to represent a program

At the highest level, a compiler’s job is singular: to turn some source language input into some machine language output . Internally, this breaks down into a sequence of clearly delineated 1 tasks:

  1. Lexing the source into a sequence of tokens
  2. Parsing the token stream into an abstract syntax tree, or AST 2
  3. Validating the AST (e.g., ensuring that all uses of identifiers are consistent with the source language’s scoping and definition rules) 3
  4. Translating the AST into machine code, with all of its complexities (instruction selection, register allocation, frame generation, &c)

In a single-pass compiler, (4) is monolithic: machine code is generated as the compiler walks the AST, with no revisiting of previously generated code. This is extremely fast (in terms of compiler performance) in exchange for a few significant limitations:

Optimization potential: because machine code is generated in a single pass, it can’t be revisited for optimizations. Single-pass compilers tend to generate extremely slow and conservative machine code.

By way of example: the System V ABI (used by Linux and macOS) defines a special 128-byte region beyond the current stack pointer (%rsp), known as the red zone, that can be used by leaf functions whose stack frames fit within it. This, in turn, saves a few stack management instructions in the function prologue and epilogue.

A single-pass compiler will struggle to take advantage of this ABI-supplied optimization: it needs to emit a stack slot for each automatic variable as they’re visited, and cannot revisit its function prologue for erasure if all variables fit within the red zone.

Language limitations: single-pass compilers struggle with common language design decisions, like allowing use of identifiers before their declaration or definition. For example, the following is valid C++:

class Rect {
 public:
  int area() { return width() * height(); }
  int width() { return 5; }
  int height() { return 5; }
};

C and C++ generally require pre-declaration and/or definition for identifiers, but member function bodies may reference the entire class scope. This will frustrate a single-pass compiler, which expects Rect::width and Rect::height to already exist in some symbol lookup table for call generation.

Consequently, (virtually) all modern compilers are multi-pass .

Pictured: Leeloo Dallas from The Fifth Element holding up her multi-pass.

Multi-pass compilers break the translation phase down even more:

  • The AST is lowered into an intermediate representation , or IR
  • Analyses (or passes) are performed on the IR, refining it according to some optimization profile (code size, performance, &c)
  • The IR is either translated to machine code or lowered to another IR, for further target specialization or optimization 4

So, we want an IR that’s easy to correctly transform and that’s amenable to optimization. Let’s talk about why IRs that have the static single assignment property fill that niche.

At its core, the SSA form of any source program introduces only one new constraint: all variables are assigned (i.e., stored to) exactly once.

By way of example: the following (not actually very helpful) function is not in a valid SSA form with respect to the flags variable:

int helpful_open(char *fname) {
  int flags = O_RDWR;
  if (!access(fname, F_OK)) {
    flags |= O_CREAT;
  }
  int fd = open(fname, flags, 0644);
  return fd;
}

Why? Because flags is stored to twice: once for initialization, and (potentially) again inside the conditional body.

As programmers, we could rewrite helpful_open to only ever store once to each automatic variable:

int helpful_open(char *fname) {
  if (!access(fname, F_OK)) {
    int flags = O_RDWR | O_CREAT;
    return open(fname, flags, 0644);
  } else {
    int flags = O_RDWR;
    return open(fname, flags, 0644);
  }
}

But this is clumsy and repetitive: we essentially need to duplicate every chain of uses that follow any variable that is stored to more than once. That’s not great for readability, maintainability, or code size.

So, we do what we always do: make the compiler do the hard work for us. Fortunately there exists a transformation from every valid program into an equivalent SSA form, conditioned on two simple rules.

Rule #1: Whenever we see a store to an already-stored variable, we replace it with a brand new “version” of that variable.

Using rule #1 and the example above, we can rewrite flags using _N suffixes to indicate versions:

int helpful_open(char *fname) {
  int flags_0 = O_RDWR;

  // Declared up here to avoid dealing with C scopes.
  int flags_1;

  if (!access(fname, F_OK)) {
    flags_1 = flags_0 | O_CREAT;
  }

  int fd = open(fname, flags_1, 0644);
  return fd;
}

But wait a second: we’ve made a mistake!

  • open(..., flags_1, ...) is incorrect: it unconditionally assigns O_CREAT , which wasn’t in the original function semantics.
  • open(..., flags_0, ...) is also incorrect: it never assigns O_CREAT , and thus is wrong for the same reason.

So, what do we do? We use rule 2!

Rule #2: Whenever we need to choose a variable based on control flow, we use the Phi function (φ) to introduce a new variable based on our choice.

Using our example once more:

int helpful_open(char *fname) {
  int flags_0 = O_RDWR;

  // Declared up here to avoid dealing with C scopes.
  int flags_1;

  if (!access(fname, F_OK)) {
    flags_1 = flags_0 | O_CREAT;
  }

  int flags_2 = φ(flags_0, flags_1);
  int fd = open(fname, flags_2, 0644);
  return fd;
}

Our quandary is resolved: open always takes flags_2, where flags_2 is a fresh SSA variable produced by applying φ to flags_0 and flags_1.

Observe, too, that φ is a symbolic function: compilers that use SSA forms internally do not emit real φ functions in generated code 5 . φ exists solely to reconcile rule #1 with the existence of control flow.

As such, it’s a little bit silly to talk about SSA forms with C examples (since C and other high-level languages are what we’re translating from in the first place). Let’s dive into how LLVM’s IR actually represents them.

SSA in LLVM

First of all, let’s see what happens when we run our very first helpful_open through clang with no optimizations:

define dso_local i32 @helpful_open(i8* %fname) #0 {
entry:
  %fname.addr = alloca i8*, align 8
  %flags = alloca i32, align 4
  %fd = alloca i32, align 4
  store i8* %fname, i8** %fname.addr, align 8
  store i32 2, i32* %flags, align 4
  %0 = load i8*, i8** %fname.addr, align 8
  %call = call i32 @access(i8* %0, i32 0) #4
  %tobool = icmp ne i32 %call, 0
  br i1 %tobool, label %if.end, label %if.then

if.then:                                          ; preds = %entry
  %1 = load i32, i32* %flags, align 4
  %or = or i32 %1, 64
  store i32 %or, i32* %flags, align 4
  br label %if.end

if.end:                                           ; preds = %if.then, %entry
  %2 = load i8*, i8** %fname.addr, align 8
  %3 = load i32, i32* %flags, align 4
  %call1 = call i32 (i8*, i32, ...) @open(i8* %2, i32 %3, i32 420)
  store i32 %call1, i32* %fd, align 4
  %4 = load i32, i32* %fd, align 4
  ret i32 %4
}

(View it on Godbolt .)

So, we call open with %3 , which comes from…a load from an i32* named %flags ? Where’s the φ?

This is something that consistently slips me up when reading LLVM’s IR: only values , not memory, are in SSA form. Because we’ve compiled with optimizations disabled, %flags is just a stack slot that we can store into as many times as we please, and that’s exactly what LLVM has elected to do above.

As such, LLVM’s SSA-based optimizations aren’t all that useful when passed IR that makes direct use of stack slots. We want to maximize our use of SSA variables, whenever possible, to make future optimization passes as effective as possible.

This is where mem2reg comes in:

This file (optimization pass) promotes memory references to be register references. It promotes alloca instructions which only have loads and stores as uses. An alloca is transformed by using dominator frontiers to place phi nodes, then traversing the function in depth-first order to rewrite loads and stores as appropriate. This is just the standard SSA construction algorithm to construct “pruned” SSA form.

(Parenthetical mine.)

mem2reg gets run at -O1 and higher, so let’s do exactly that:

define dso_local i32 @helpful_open(i8* nocapture readonly %fname) local_unnamed_addr #0 {
entry:
  %call = call i32 @access(i8* %fname, i32 0) #4
  %tobool.not = icmp eq i32 %call, 0
  %spec.select = select i1 %tobool.not, i32 66, i32 2
  %call1 = call i32 (i8*, i32, ...) @open(i8* %fname, i32 %spec.select, i32 420) #4, !dbg !22
  ret i32 %call1, !dbg !23
}

Foiled again! Our stack slots are gone thanks to mem2reg , but LLVM has actually optimized too far : it figured out that our flags value is wholly dependent on the return value of our access call and erased the conditional entirely.

Instead of a φ node, we got this select :

%spec.select = select i1 %tobool.not, i32 66, i32 2

which the LLVM Language Reference describes concisely:

The ‘select’ instruction is used to choose one value based on a condition, without IR-level branching.

So we need a better example. Let’s do something that LLVM can’t trivially optimize into a select (or sequence of select s), like adding an else if with a function that we’ve only provided the declaration for:

long filesize(char *);

int helpful_open(char *fname) {
  int flags = O_RDWR;
  if (!access(fname, F_OK)) {
    flags |= O_CREAT;
  } else if (filesize(fname) > 0) {
    flags |= O_TRUNC;
  }
  int fd = open(fname, flags, 0644);
  return fd;
}

define dso_local i32 @helpful_open(i8* %fname) local_unnamed_addr #0 {
entry:
  %call = call i32 @access(i8* %fname, i32 0) #5
  %tobool.not = icmp eq i32 %call, 0
  br i1 %tobool.not, label %if.end4, label %if.else

if.else:                                          ; preds = %entry
  %call1 = call i64 @filesize(i8* %fname) #5
  %cmp.not = icmp eq i64 %call1, 0
  %spec.select = select i1 %cmp.not, i32 2, i32 514
  br label %if.end4

if.end4:                                          ; preds = %if.else, %entry
  %flags.0 = phi i32 [ 66, %entry ], [ %spec.select, %if.else ]
  %call5 = call i32 (i8*, i32, ...) @open(i8* %fname, i32 %flags.0, i32 420) #5
  ret i32 %call5
}

That’s more like it! Here’s our magical φ:

%flags.0 = phi i32 [ 66, %entry ], [ %spec.select, %if.else ]

LLVM’s phi is slightly more complicated than the φ(flags_0, flags_1) that I made up before, but not by much: it takes a list of pairs (two, in this case), with each pair containing a possible value and that value’s originating basic block (which, by construction, is always a predecessor block in the context of the φ node).

The Language Reference backs us up:

The type of the incoming values is specified with the first type field. After this, the ‘phi’ instruction takes a list of pairs as arguments, with one pair for each predecessor basic block of the current block. Only values of first class type may be used as the value arguments to the PHI node. Only labels may be used as the label arguments. There must be no non-phi instructions between the start of a basic block and the PHI instructions: i.e. PHI instructions must be first in a basic block.

Observe, too, that LLVM is still being clever: one of our φ choices is a computed select ( %spec.select ), so LLVM still managed to partially erase the original control flow.

So that’s cool. But there’s a piece of control flow that we’ve conspicuously ignored.

What about loops?

int do_math(int count, int base) {
  for (int i = 0; i < count; i++) {
    base += base;
  }
  return base;
}

define dso_local i32 @do_math(i32 %count, i32 %base) local_unnamed_addr #0 {
entry:
  %cmp5 = icmp sgt i32 %count, 0
  br i1 %cmp5, label %for.body, label %for.cond.cleanup

for.cond.cleanup:                                 ; preds = %for.body, %entry
  %base.addr.0.lcssa = phi i32 [ %base, %entry ], [ %add, %for.body ]
  ret i32 %base.addr.0.lcssa

for.body:                                         ; preds = %entry, %for.body
  %i.07 = phi i32 [ %inc, %for.body ], [ 0, %entry ]
  %base.addr.06 = phi i32 [ %add, %for.body ], [ %base, %entry ]
  %add = shl nsw i32 %base.addr.06, 1
  %inc = add nuw nsw i32 %i.07, 1
  %exitcond.not = icmp eq i32 %inc, %count
  br i1 %exitcond.not, label %for.cond.cleanup, label %for.body, !llvm.loop !26
}

Not one, not two, but three φs! In order of appearance:

Because we supply the loop bounds via count , LLVM has no way to ensure that we actually enter the loop body. Consequently, our very first φ selects between the initial %base and %add . LLVM’s phi syntax helpfully tells us that %base comes from the entry block and %add from the loop, just as we expect. I have no idea why LLVM selected such a hideous name for the resulting value ( %base.addr.0.lcssa ).

Our index variable is initialized once and then updated with each for iteration, so it also needs a φ. Our selections here are %inc (which each body computes from %i.07 ) and the 0 literal (i.e., our initialization value).

Finally, the heart of our loop body: we need to get base , where base is either the initial base value ( %base ) or the value computed as part of the prior loop ( %add ). One last φ gets us there.

The rest of the IR is bookkeeping: we need separate SSA variables to compute the addition ( %add ), increment ( %inc ), and exit check ( %exitcond.not ) with each loop iteration.

So now we know what an SSA form is , and how LLVM represents them 6 . Why should we care?

As I briefly alluded to early in the post, it comes down to optimization potential: the SSA forms of programs are particularly suited to a number of effective optimizations.

Let’s go through a select few of them.

Dead code elimination

One of the simplest things that an optimizing compiler can do is remove code that cannot possibly be executed . This makes the resulting binary smaller (and usually faster, since more of it can fit in the instruction cache).

“Dead” code falls into several categories 7 , but a common one is assignments that cannot affect program behavior, like redundant initialization:

int main(void) {
  int x = 100;
  if (rand() % 2) {
    x = 200;
  } else if (rand() % 2) {
    x = 300;
  } else {
    x = 400;
  }
  return x;
}

Without an SSA form, an optimizing compiler would need to check whether any use of x reaches its original definition ( x = 100 ). Tedious. In SSA form, the impossibility of that is obvious:

int main(void) {
  int x_0 = 100;

  // Just ignore the scoping. Computers aren't real life.
  if (rand() % 2) {
    int x_1 = 200;
  } else if (rand() % 2) {
    int x_2 = 300;
  } else {
    int x_3 = 400;
  }

  return φ(x_1, x_2, x_3);
}

And sure enough, LLVM eliminates the initial assignment of 100 entirely:

define dso_local i32 @main() local_unnamed_addr #0 {
entry:
  %call = call i32 @rand() #3
  %0 = and i32 %call, 1
  %tobool.not = icmp eq i32 %0, 0
  br i1 %tobool.not, label %if.else, label %if.end6

if.else:                                          ; preds = %entry
  %call1 = call i32 @rand() #3
  %1 = and i32 %call1, 1
  %tobool3.not = icmp eq i32 %1, 0
  %. = select i1 %tobool3.not, i32 400, i32 300
  br label %if.end6

if.end6:                                          ; preds = %if.else, %entry
  %x.0 = phi i32 [ 200, %entry ], [ %., %if.else ]
  ret i32 %x.0
}

Constant propagation

Compilers can also optimize a program by substituting uses of a constant variable for the constant value itself. Let’s take a look at another blob of C:

int some_math(int x) {
  int y = 7;
  int z = 10;
  int a;

  if (rand() % 2) {
    a = y + z;
  } else if (rand() % 2) {
    a = y + z;
  } else {
    a = y - z;
  }

  return x + a;
}

As humans, we can see that y and z are trivially assigned and never modified 8 . For the compiler, however, this is a variant of the reaching definition problem from above: before it can replace y and z with 7 and 10 respectively, it needs to make sure that y and z are never assigned a different value.

Let’s do our SSA reduction:

int some_math(int x) {
  int y_0 = 7;
  int z_0 = 10;
  int a_0;

  if (rand() % 2) {
    int a_1 = y_0 + z_0;
  } else if (rand() % 2) {
    int a_2 = y_0 + z_0;
  } else {
    int a_3 = y_0 - z_0;
  }

  int a_4 = φ(a_1, a_2, a_3);
  return x + a_4;
}

This is virtually identical to our original form, but with one critical difference: the compiler can now see that every load of y and z is the original assignment. In other words, they’re all safe to replace!

int some_math(int x) {
  int y = 7;
  int z = 10;
  int a_0;

  if (rand() % 2) {
    int a_1 = 7 + 10;
  } else if (rand() % 2) {
    int a_2 = 7 + 10;
  } else {
    int a_3 = 7 - 10;
  }

  int a_4 = φ(a_1, a_2, a_3);
  return x + a_4;
}

So we’ve gotten rid of a few potential register operations, which is nice. But here’s the really critical part: we’ve set ourselves up for several other optimizations :

Now that we’ve propagated some of our constants, we can do some trivial constant folding : 7 + 10 becomes 17 , and so forth.

In SSA form, it’s trivial to observe that only x and a_{1..4} can affect the program’s behavior. So we can apply our dead code elimination from above and delete y and z entirely!

This is the real magic of an optimizing compiler: each individual optimization is simple and largely independent, but together they produce a virtuous cycle that can be repeated until gains diminish.

One potential virtuous cycle.

Register allocation

Register allocation (alternatively: register scheduling) is less of an optimization itself , and more of an unavoidable problem in compiler engineering: it’s fun to pretend to have access to an infinite number of addressable variables, but the compiler eventually insists that we boil our operations down to a small, fixed set of CPU registers .

The constraints and complexities of register allocation vary by architecture: x86 (prior to AMD64) is notoriously starved for registers 9 (only 8 full general purpose registers, of which 6 might be usable within a function’s scope 10 ), while RISC architectures typically employ larger numbers of registers to compensate for the lack of register-memory operations.

Just as above, reductions to SSA form have both indirect and direct advantages for the register allocator:

Indirectly: Eliminations of redundant loads and stores reduces the overall pressure on the register allocator, allowing it to avoid expensive spills (i.e., having to temporarily transfer a live register to main memory to accommodate another instruction).

Directly: Compilers have historically lowered φs into copies before register allocation, meaning that register allocators traditionally haven’t benefited from the SSA form itself 11 . There is, however, (semi-)recent research on direct application of SSA forms to both linear and coloring allocators 12 13 .

A concrete example: modern JavaScript engines use JITs to accelerate program evaluation. These JITs frequently use linear register allocators for their acceptable tradeoff between register selection speed (linear, as the name suggests) and acceptable register scheduling. Converting out of SSA form is a costly operation of its own, so linear allocation on the SSA representation itself is appealing in JITs and other contexts where compile time is part of execution time.

There are many things about SSA that I didn’t cover in this post: dominance frontiers , tradeoffs between “pruned” and less optimal SSA forms, and feedback mechanisms between the SSA form of a program and the compiler’s decision to cease optimizing, among others. Each of these could be its own blog post, and maybe will be in the future!

In the sense that each task is conceptually isolated and has well-defined inputs and outputs. Individual compilers have some flexibility with respect to whether they combine or further split the tasks.  ↩

The distinction between an AST and an intermediate representation is hazy: Rust converts their AST to HIR early in the compilation process, and languages can be designed to have ASTs that are amenable to analyses that would otherwise be best done on an IR.  ↩

This can be broken up into lexical validation (e.g. use of an undeclared identifier) and semantic validation (e.g. incorrect initialization of a type).  ↩

This is what LLVM does: LLVM IR is lowered to MIR (not to be confused with Rust’s MIR ), which is subsequently lowered to machine code.  ↩

Not because they can’t: the SSA form of a program can be executed by evaluating φ with concrete control flow.  ↩

We haven’t talked at all about minimal or pruned SSAs, and I don’t plan on doing so in this post. The TL;DR of them: naïve SSA form generation can lead to lots of unnecessary φ nodes, impeding analyses. LLVM (and GCC, and anything else that uses SSAs, probably) will attempt to translate any initial SSA form into one with a minimally viable number of φs. For LLVM, this is tied directly to the rest of mem2reg.  ↩

Including removing code that has undefined behavior in it, since “doesn’t run at all” is a valid consequence of invoking UB.  ↩

And are also function scoped, meaning that another translation unit can’t address them.  ↩

x86 makes up for this by not being a load-store architecture : many instructions can pay the price of a memory round-trip in exchange for saving a register.  ↩

Assuming that %esp and %ebp are being used by the compiler to manage the function’s frame.  ↩

LLVM, for example, lowers all φs as one of its very first preparations for register allocation. See this 2009 LLVM Developers’ Meeting talk .  ↩

Wimmer 2010a: “Linear Scan Register Allocation on SSA Form” ( PDF )  ↩

Hack 2005: “Towards Register Allocation for Programs in SSA-form” ( PDF )  ↩

Lab 7: Static Single-Assignment Form

In this lab, you will build a static single-assignment form (SSA) for your Tiger compiler to perform optimizations. SSA is a state-of-the-art IR also used in production compilers such as GCC or LLVM.

This lab consists of two parts: first, in part A, you will design and implement data structures defining the static single-assignment form. You will also implement translations from CFG to SSA as well as translations from SSA back to CFG. Second, in part B, you will implement several classic data-flow analyses and optimizations on SSA.

Getting Started

First check out the source we offered you for lab7:

These commands will first commit your changes to the lab6 branch of your local Git repository, and then create a local lab7 branch and check out the remote lab7 branch into the new local lab7 branch.

Do not forget to resolve any conflicts before committing to the local lab7 branch:

You should first import the new lab7 code into your favorite IDE, and make sure the code compiles. There are a bunch of new files that you should browse through:

Hand-in Procedure

When you have finished your lab, zip your code and submit it to the online teaching system.

Part A: Static Single-Assignment Form (SSA)

The static single-assignment form (SSA) is an important compiler intermediate representation, in which each variable is assigned (statically) at most once. SSA is advantageous over other compiler IRs in that its single-assignment property makes program analysis and optimization simpler and more efficient. As a result, SSA has become a de facto IR in modern optimizing compilers for imperative, OO, and even functional languages.

In this part of the lab, you will build a static single-assignment form for your Tiger compiler and conduct optimizations based on your SSA IR. To aid in your implementation, we have given some hints for most exercises, but keep in mind that these hints are not mandatory, and you are free (and encouraged) to propose your own ideas.

SSA Data Structure

To better reuse your existing code base, you will implement SSA by leveraging data structures already designed for the control-flow graph. This design decision makes the interfaces cleaner and more elegant, hence simplifying subsequent compilation phases.

Dominator Trees

You will be constructing the SSA form with the following 5 steps: 1) calculate dominators; 2) build dominator trees; 3) calculate dominance frontiers; 4) place φ-functions; and 5) rename variables.

In this part, you will finish the first two steps, that is, you will first calculate dominators then build a dominator tree for a given CFG.

Dominance Frontier

In this part of the lab, you will be implementing the third step to build an SSA, that is, you will calculate the dominance frontier for each node. The dominance frontier DF[n] of a node n is the set of all nodes d such that n dominates an immediate predecessor of d , but n does not strictly dominate d . Intuitively, the dominance frontier DF[n] for a node n is the set of nodes where n 's dominance stops.
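As a non-mandatory hint, the computation is quite compact once you have immediate dominators. A minimal Python sketch of the classic Cytron et al. approach, assuming a cfg mapping each node to its list of predecessors and an idom mapping each node to its immediate dominator (both representations are hypothetical placeholders, not part of the provided code base):

def dominance_frontiers(cfg, idom):
    """cfg: node -> list of predecessor nodes; idom: node -> immediate dominator.
    Assumes the entry node has no predecessors."""
    df = {n: set() for n in cfg}
    for n in cfg:
        preds = cfg[n]
        if len(preds) >= 2:  # only join points can appear in a dominance frontier
            for p in preds:
                runner = p
                # Walk up the dominator tree from each predecessor until we
                # reach n's immediate dominator; every node passed on the
                # way dominates a predecessor of n without strictly
                # dominating n, so n is in its frontier.
                while runner != idom[n]:
                    df[runner].add(n)
                    runner = idom[runner]
    return df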

φ-function Placement

In this part, you will implement the fourth step to build an SSA, that is, you will be implementing the φ-function placement. The φ-function placement algorithm places φ-functions at the top of corresponding blocks. Specifically, for a definition

    x = e

of a variable x in a basic block n, this algorithm will place a φ-function

    x = φ(x, x, ..., x)

(with one argument for each predecessor) at the top of each block d in n's dominance frontier, d ∈ DF[n].

Renaming the Variables

In this part of the lab, you will be implementing the fifth and last step to build an SSA, that is, you will rename variables to make each definition unique.

To this point, your Tiger compiler can convert all legal MiniJava programs to SSA forms. Do regression test your Tiger compiler and fix potential bugs.

Translation Out of SSA Forms

Modern machines do not support the φ-functions directly, hence φ-functions must be eliminated before execution. This task is accomplished by the translation out of SSA forms.

To this point, your Tiger compiler can convert any SSA form back to a corresponding CFG. Do not forget to regression test your Tiger compiler.

Part B: SSA-based Optimizations

SSA makes compiler optimizations not only easier to engineer but also more efficient to execute, largely due to its single-assignment property. This nice property makes it much easier to calculate the data-flow information required to perform optimizations.

In this part of the lab, you will re-implement, this time on SSA, several optimizations that we have done on the CFG: dead-code elimination, constant propagation, copy propagation, among others. You will also implement several novel optimizations that are particularly suitable for SSA: conditional constant propagation, global value numbering, among others.

Dead-code Elimination

In SSA, a statement x = e is dead if the variable x is not used by any other statement (and e does not have side effects). As a result, this statement x = e can be safely eliminated.

Constant Propagation

In SSA, given a statement of the form x = c in which c is a constant, we can propagate the constant c to any use of variable x .

Copy Propagation

In SSA, given a copy statement x = y, any use of x can be replaced by y.
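Taken together, here is a tiny hypothetical SSA fragment on which all three of the above rules fire:

    a1 = 4          -- constant: propagate 4 into the use of a1
    b1 = a1         -- copy: the use of b1 below can be replaced by a1
    c1 = b1 + 1     -- becomes c1 = 4 + 1, which folds to c1 = 5
    d1 = c1 * 2     -- d1 is never used, so this statement is dead
    print c1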

φ-function Elimination

In SSA, if a φ-function has the special form in which all of its arguments are the same:

    x = φ(c, c, ..., c)

where c is a constant, then this φ-function can be eliminated by rewriting it to a normal assignment to x:

    x = c

Note that, oftentimes, this special form of φ-function x = φ(c, c, ..., c) is generated by constant propagation, which substitutes constants for the φ-function's arguments.

Similarly, the φ-function:

    x = φ(y, y, ..., y)

can be eliminated by rewriting it to an assignment statement:

    x = y

where y is a variable.

Conditional Constant Propagation

Global value numbering, partial redundancy elimination.

Do not forget to test your Tiger compiler after finishing all these optimizations.

Part C: SSA Program Analysis

This completes the lab. Remember to hand in your solution to the online teaching system.

Lesson 5: Global Analysis & SSA

  • global analysis & optimization
  • static single assignment
  • SSA slides from Todd Mowry at CMU: another presentation of the pseudocode for various algorithms herein
  • Revisiting Out-of-SSA Translation for Correctness, Code Quality, and Efficiency by Boissinot et al., on more sophisticated ways to translate out of SSA form
  • tasks due October 7

Lots of definitions!

  • Reminders: Successors & predecessors. Paths in CFGs.
  • A dominates B iff all paths from the entry to B include A .
  • The dominator tree is a convenient data structure for storing the dominance relationships in an entire function. The recursive children of a given node in a tree are the nodes that that node dominates.
  • A strictly dominates B iff A dominates B and A ≠ B . (Dominance is reflexive, so "strict" dominance just takes that part away.)
  • A immediately dominates B iff A strictly dominates B but does not strictly dominate any other node that strictly dominates B. (In which case A is B's direct parent in the dominator tree.)
  • A dominance frontier is the set of nodes that are just "one edge away" from being dominated by a given node. Put differently, A 's dominance frontier contains B iff A does not strictly dominate B , but A does dominate some predecessor of B .
  • Post-dominance is the reverse of dominance. A post-dominates B iff all paths from B to the exit include A . (You can extend the strict version, the immediate version, trees, etc. to post-dominance.)

An algorithm for finding dominators:

The dom relation will, in the end, map each block to its set of dominators. We initialize it as the "complete" relation, i.e., mapping every block to the set of all blocks. The loop pares down the sets by iterating to convergence.
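Here is a minimal sketch of that loop in Python (the cfg representation, mapping each block to its list of predecessors, and the entry argument are assumptions for illustration):

def find_dominators(cfg, entry):
    """Compute dom[b]: the set of blocks that dominate block b."""
    blocks = set(cfg)
    dom = {b: set(blocks) for b in blocks}  # the "complete" relation
    dom[entry] = {entry}
    changed = True
    while changed:  # iterate to convergence
        changed = False
        for b in blocks - {entry}:
            preds = cfg[b]
            if not preds:
                continue  # unreachable from the entry; leave as-is
            # A block's dominators: itself, plus everything that
            # dominates all of its predecessors.
            new = {b} | set.intersection(*(dom[p] for p in preds))
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom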

The running time is O(n²) in the worst case. But there's a trick: if you iterate over the CFG in reverse post-order , and the CFG is well behaved (reducible), it runs in linear time—the outer loop runs a constant number of times.

Natural Loops

Some things about loops:

  • Natural loops are strongly connected components in the CFG with a single entry.
  • Natural loops are formed around backedges , which are edges from A to B where B dominates A .
  • A natural loop is the smallest set of vertices L including A and B such that, for every v in L , either all the predecessors of v are in L or v = B .
  • A language that only has for , while , if , break , continue , etc. can only generate reducible CFGs. You need goto or something to generate irreducible CFGs.

Loop-Invariant Code Motion (LICM)

And finally, loop-invariant code motion (LICM) is an optimization that works on natural loops. It moves code from inside a loop to before the loop, if the computation always does the same thing on every iteration of the loop.

A loop's preheader is its header's unique predecessor. LICM moves code to the preheader. But while natural loops need to have a unique header, the header does not necessarily have a unique predecessor. So it's often convenient to invent an empty preheader block that jumps directly to the header, and then move all the in-edges to the header to point there instead.

LICM needs two ingredients: identifying loop-invariant instructions in the loop body, and deciding when it's safe to move one from the body to the preheader.

To identify loop-invariant instructions, iterate to convergence: mark an instruction as loop-invariant iff, for each of its arguments, either all reaching definitions of that argument come from outside the loop, or there is exactly one reaching definition and it is itself already marked as loop-invariant.

(This determination requires that you already calculated reaching definitions! Presumably using data flow.)

It's safe to move a loop-invariant instruction to the preheader iff:

  • The definition dominates all of its uses, and
  • No other definitions of the same variable exist in the loop, and
  • The instruction dominates all loop exits.

The last criterion is somewhat tricky: it ensures that the computation would have been computed eventually anyway, so it's safe to just do it earlier. But it's not true of loops that may execute zero times, which, when you think about it, rules out most for loops! It's possible to relax this condition if:

  • The assigned-to variable is dead after the loop, and
  • The instruction can't have side effects, including exceptions—generally ruling out division because it might divide by zero. (A thing that you generally need to be careful of in such speculative optimizations that do computations that might not actually be necessary.)
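For instance, in this hypothetical C fragment the division is loop-invariant, but hoisting it would be speculative:

int sum_quotient(int count, int a, int b) {
    int q = 0;
    for (int i = 0; i < count; i++) {
        q = a / b;  /* loop-invariant, but the loop may run zero times:
                       hoisting the division to the preheader would make it
                       execute (and possibly trap when b == 0) in cases where
                       the original program never ran it, and q is live after
                       the loop, so its final value would change too */
    }
    return q;
}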

Static Single Assignment (SSA)

You have undoubtedly noticed by now that many of the annoying problems in implementing analyses & optimizations stem from variable name conflicts. Wouldn't it be nice if every assignment in a program used a unique variable name? Of course, people don't write programs that way, so we're out of luck. Right?

Wrong! Many compilers convert programs into static single assignment (SSA) form, which does exactly what it says: it ensures that, globally, every variable has exactly one static assignment location. (Of course, that statement might be executed multiple times, which is why it's not dynamic single assignment.) In Bril terms, we convert a program like this:
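(The lesson's original listing is not reproduced here; a small Bril program in the same spirit:)

@main {
  a: int = const 4;
  a: int = add a a;
  print a;
}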

Into a program like this, by renaming all the variables:
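(A matching reconstruction:)

@main {
  a.1: int = const 4;
  a.2: int = add a.1 a.1;
  print a.2;
}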

Of course, things will get a little more complicated when there is control flow. And because real machines are not SSA, using separate variables (i.e., memory locations and registers) for everything is bound to be inefficient. The idea in SSA is to convert general programs into SSA form, do all our optimization there, and then convert back to a standard mutating form before we generate backend code.

Just renaming assignments willy-nilly will quickly run into problems. Consider this program:
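(A hypothetical program in that spirit:)

@main(cond: bool) {
.entry:
  a: int = const 47;
  br cond .left .right;
.left:
  a: int = add a a;
  jmp .exit;
.right:
  a: int = mul a a;
  jmp .exit;
.exit:
  print a;
}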

If we start renaming all the occurrences of a , everything goes fine until we try to write that last print a . Which "version" of a should it use?

To match the expressiveness of unrestricted programs, SSA adds a new kind of instruction: a ϕ-node . ϕ-nodes are flow-sensitive copy instructions: they get a value from one of several variables, depending on which incoming CFG edge was most recently taken to get to them.

In Bril, a ϕ-node appears as a phi instruction:
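(An illustrative instance; in Bril's text format the argument variables come first, followed by their corresponding labels:)

a.4: int = phi a.2 a.3 .left .right;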

The phi instruction chooses between any number of variables, and it picks between them based on labels. If the program most recently executed a basic block with the given label, then the phi instruction takes its value from the corresponding variable.

You can write the above program in SSA like this:
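(Continuing the hypothetical program from above:)

@main(cond: bool) {
.entry:
  a.1: int = const 47;
  br cond .left .right;
.left:
  a.2: int = add a.1 a.1;
  jmp .exit;
.right:
  a.3: int = mul a.1 a.1;
  jmp .exit;
.exit:
  a.4: int = phi a.2 a.3 .left .right;
  print a.4;
}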

It can also be useful to see how ϕ-nodes crop up in loops.

(An aside: some recent SSA-form IRs, such as MLIR and Swift's IR , use an alternative to ϕ-nodes called basic block arguments . Instead of making ϕ-nodes look like weird instructions, these IRs bake the need for ϕ-like conditional copies into the structure of the CFG. Basic blocks have named parameters, and whenever you jump to a block, you must provide arguments for those parameters. With ϕ-nodes, a basic block enumerates all the possible sources for a given variable, one for each in-edge in the CFG; with basic block arguments, the sources are distributed to the "other end" of the CFG edge. Basic block arguments are a nice alternative for "SSA-native" IRs because they avoid messy problems that arise when needing to treat ϕ-nodes differently from every other kind of instruction.)

Bril in SSA

Bril has an SSA extension . It adds support for a phi instruction. Beyond that, SSA form is just a restriction on the normal expressiveness of Bril—if you solemnly promise never to assign statically to the same variable twice, you are writing "SSA Bril."

The reference interpreter has built-in support for phi , so you can execute your SSA-form Bril programs without fuss.

The SSA Philosophy

In addition to a language form, SSA is also a philosophy! It can fundamentally change the way you think about programs. In the SSA philosophy:

  • definitions == variables
  • instructions == values
  • arguments == data flow graph edges

In LLVM, for example, instructions do not refer to argument variables by name—an argument is a pointer to the defining instruction.

Converting to SSA

To convert to SSA, we want to insert ϕ-nodes whenever there are distinct paths containing distinct definitions of a variable. We don't need ϕ-nodes in places that are dominated by a definition of the variable. So what's a way to know when control reachable from a definition is not dominated by that definition? The dominance frontier!

We do it in two steps. First, insert ϕ-nodes:
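A minimal Python sketch of the ϕ-insertion worklist, assuming df (dominance frontiers) and defs (each variable mapped to the set of blocks that assign it) have already been computed; add_phi is a hypothetical helper that prepends a ϕ-node for v to a block:

def insert_phis(df, defs, add_phi):
    for v in defs:
        placed = set()          # blocks that already have a ϕ-node for v
        work = list(defs[v])
        while work:
            d = work.pop()
            for block in df[d]:
                if block not in placed:
                    add_phi(block, v)  # hypothetical helper
                    placed.add(block)
                    # The new ϕ-node is itself a definition of v, so this
                    # block may force ϕ-nodes in its own frontier too.
                    if block not in defs[v]:
                        defs[v].add(block)
                        work.append(block)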

Then, rename variables:
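And a sketch of renaming: a depth-first walk over the dominator tree, keeping a stack of fresh names per variable (the block and instruction representations here, such as instrs, succs, and phis, are hypothetical; it also assumes every use is dominated by a definition, which is the tricky part noted in the tasks below):

from collections import defaultdict

def rename_all(entry, dom_children):
    """dom_children: dominator-tree node -> list of children (possibly empty)."""
    stack = defaultdict(list)       # variable -> stack of fresh names
    counters = defaultdict(int)

    def fresh(v):
        counters[v] += 1
        name = f"{v}.{counters[v]}"
        stack[v].append(name)
        return name

    def rename(block):
        pushed = []
        for instr in block.instrs:
            if instr.op != "phi":   # ϕ arguments get filled in from predecessors
                instr.args = [stack[a][-1] for a in instr.args]
            if instr.dest is not None:
                pushed.append(instr.dest)
                instr.dest = fresh(instr.dest)
        for succ in block.succs:
            for phi in succ.phis:
                # Record which name reaches succ along the edge from this
                # block (set_arg is a hypothetical method).
                phi.set_arg(block, stack[phi.var][-1])
        for child in dom_children[block]:
            rename(child)
        for v in pushed:            # pop our names on the way back up the tree
            stack[v].pop()

    rename(entry)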

Converting from SSA

Eventually, we need to convert out of SSA form to generate efficient code for real machines that don't have phi -nodes and do have finite space for variable storage.

The basic algorithm is pretty straightforward. If you have a ϕ-node:
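(For instance, in Bril syntax, a hypothetical instance matching the x and y discussed below:)

v: int = phi x y .b1 .b2;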

Then there must be assignments to x and y (recursively) preceding this statement in the CFG. The paths from x to the phi -containing block and from y to the same block must "converge" at that block. So insert code into the phi -containing block's immediate predecessors along each of those two paths: one that does v = id x and one that does v = id y . Then you can delete the phi instruction.

This basic approach can introduce some redundant copying. (Take a look at the code it generates after you implement it!) Non-SSA copy propagation optimization can work well as a post-processing step. For a more extensive take on how to translate out of SSA efficiently, see “Revisiting Out-of-SSA Translation for Correctness, Code Quality, and Efficiency” by Boissinot et al.

  • Find dominators for a function.
  • Construct the dominance tree.
  • Compute the dominance frontier.
  • One thing to watch out for: a tricky part of the translation from the pseudocode to the real world is dealing with variables that are undefined along some paths.
  • You will want to make sure the output of your "to SSA" pass is actually in SSA form. There's a really simple is_ssa.py script that can check that for you.
  • You'll also want to make sure that programs do the same thing when converted to SSA form and back again. Fortunately, brili supports the phi instruction, so you can interpret your SSA-form programs if you want to check the midpoint of that round trip.
  • For bonus "points," implement global value numbering for SSA-form Bril code.

Single-Static Assignment Form and PHI

We’ll take a look at the same very simple max function, as in the previous section.
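(The function itself is not shown on this page; presumably it is something like:)

int max(int a, int b) {
  if (a > b) {
    return a;
  }
  return b;
}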

Translated to LLVM IR:
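(A plausible unoptimized translation, consistent with the description below; the exact register numbering is illustrative:)

define i32 @max(i32 %a, i32 %b) {
entry:
  %retval = alloca i32, align 4
  %0 = icmp sgt i32 %a, %b
  br i1 %0, label %btrue, label %bfalse

btrue:                                            ; preds = %entry
  store i32 %a, i32* %retval, align 4
  br label %end

bfalse:                                           ; preds = %entry
  store i32 %b, i32* %retval, align 4
  br label %end

end:                                              ; preds = %bfalse, %btrue
  %1 = load i32, i32* %retval, align 4
  ret i32 %1
}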

We can see that the function allocates space on the stack with alloca [2], where the bigger value is stored. In one branch %a is stored, while in the other branch %b is stored to the stack-allocated memory. However, we want to avoid memory load/store operations and use registers instead whenever possible. So we would like to write something like this:
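(Illustrative pseudo-IR: %retval is assigned in two different blocks, which is exactly what SSA forbids:)

define i32 @max(i32 %a, i32 %b) {
entry:
  %0 = icmp sgt i32 %a, %b
  br i1 %0, label %btrue, label %bfalse

btrue:
  %retval = add i32 %a, 0          ; "assign" %a to %retval
  br label %end

bfalse:
  %retval = add i32 %b, 0          ; second assignment to %retval: invalid!
  br label %end

end:
  ret i32 %retval
}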

This is not valid LLVM IR, because it violates the static single assignment form (SSA, [1]) of LLVM IR. SSA form requires that every variable is assigned exactly once. SSA form enables and simplifies a vast number of compiler optimizations, and is the de facto standard for intermediate representations in compilers of imperative programming languages.

Now how would one implement the above code in proper SSA-form LLVM IR? The answer is the magic phi instruction. The phi instruction is named after the φ function used in the theory of SSA. This function magically chooses the right value, depending on the control flow. In LLVM you have to manually specify the name of the value and the previous basic block.
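(This is the snippet from the full listing below:)

end:
  %retval = phi i32 [ %a, %btrue ], [ %b, %bfalse ]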

Here we instruct the phi instruction to choose %a if the previous basic block was %btrue . If the previous basic block was %bfalse , then %b will be used. The value is then assigned to a new variable %retval . Here you can see the full code listing:
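(Reconstructed around the phi line above:)

define i32 @max(i32 %a, i32 %b) {
entry:
  %0 = icmp sgt i32 %a, %b
  br i1 %0, label %btrue, label %bfalse

btrue:                                            ; preds = %entry
  br label %end

bfalse:                                           ; preds = %entry
  br label %end

end:                                              ; preds = %bfalse, %btrue
  %retval = phi i32 [ %a, %btrue ], [ %b, %bfalse ]
  ret i32 %retval
}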

PHI in the Back End

Let's have a look at how the @max function now maps to actual machine code, that is, what kind of assembly code is generated by the compiler back end. In this case we'll look at the code generated for x86 64-bit, compiled with different optimization levels. We'll start with a non-optimizing back end (llc -O0 -filetype=asm). We will get something like this assembly:
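(A rough sketch of the kind of output llc -O0 produces; the exact spills and label names vary by LLVM version:)

max:                                    # @max
        movl    %edi, -4(%rsp)          # spill a
        movl    %esi, -8(%rsp)          # spill b
        movl    -4(%rsp), %eax
        cmpl    -8(%rsp), %eax
        jle     .LBB0_2
        movl    -4(%rsp), %eax          # the bigger value is a
        movl    %eax, -12(%rsp)         # store it to the common stack slot
        jmp     .LBB0_3
.LBB0_2:
        movl    -8(%rsp), %eax          # the bigger value is b
        movl    %eax, -12(%rsp)
.LBB0_3:
        movl    -12(%rsp), %eax         # reload the result for return
        retq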

The parameters %a and %b are passed in %edi and %esi respectively. We can see that the compiler back end generated code that uses the stack to store the bigger value. So the code generated by the compiler back end is not what we had in mind when we wrote the LLVM IR. The reason for this is that the compiler back end needs to implement the phi instruction with real machine instructions. Usually that is done by assigning to one register or storing to one common stack memory location; a non-optimizing back end will typically use the stack. However, if we use a little more optimization in the back end (i.e., llc -O1), we can get a more optimized version:
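(Again a sketch; the shape matches the description below:)

max:                                    # @max
        cmpl    %esi, %edi
        jg      .LBB0_2                 # a > b: %edi already holds the result
        movl    %esi, %edi              # otherwise copy b into %edi
.LBB0_2:
        movl    %edi, %eax              # move the chosen value into %eax
        retq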

Here the phi function is implemented by using the %edi register. In one branch %edi already contains the desired value, so nothing happens. In the other branch %esi is copied to %edi . At the %end basic block, %edi contains the desired value from both branches. This is more like what we had in mind. We can see that optimization is something that needs to be applied through the whole compilation pipeline.

