编程语言代写 | COMS W4115: Programming Assignment 3 (Control Flow Analysis)

本次代写是CFG相关的一个Assignment

In the previous programming assignments for syntax analysis and sematic analysis, you worked with clang , the LLVM front-end for C, C++, and Objective-C.

The core of LLVM is the intermediate representation (IR), a low-level programming language similar to assembly, which abstracts away most details of the high-level programming language and also the target-machine-specific nuances. When compiling a programming language under LLVM, it will first convert the specific

programming language into IR and then perform analysis/optimization techniques against the IR. Finally, it will generate the target binary code (e . g ., x86, ARM, etc.).Optimizations are implemented as p a s s e s that traverse some portion of a program to either collect information or transform the program. For more details about L L V M P a s s, check out Writing an LLVM Pass.

We will work with the IR and the LLVM Pass Framework for this assignment. More specifically, we will learn how to create a control flow graph (CFG) from an LLVM IR, as well as how to perform lightweight analysis on the CFG. The analysis of control flow graphs is an essential part of the compiler for program optimization.

LLVM Version (IMPORTANT)LLVM Version (IMPORTANT)

This assignment requires that your llvm version is 12.0.0.This assignment requires that your llvm version is 12.0.0. To check your llvm version, please run

llvm-config –version

1. Convert the bubble.c C program (from the examples directory) to an IR by running the following commands:

export LLVM_HOME=”<the absolute path to llvm-project>”;

export PATH=”$LLVM_HOME/build/bin:$PATH”;

clang -O0 -emit-llvm -c bubble.c llvm-dis bubble.bc

You will now receive a bubble.bc file, which contains the IR in binary format. You will also see a bubble.ll file, which contains the IR in human-readable format. Go ahead and take a look at the contents of those files, and especially try to understand the structure of the bubble.ll file.

2. Generate the CFG from the bubble.bc file by running the following commands (again, examples is the directory we provided for you):

cd ./examples

opt -dot-cfg < bubble.bc dot -Tpdf .bubbleSort.dot -o bubbleSortDetailed.pdf

opt -dot-cfg-only < bubble.bc dot -Tpdf .bubbleSort.dot -o bubbleSort.pdf

Now, you should be able to view the CFG of the function bubbleSort() in bubble.c by looking at the generated PDFs. Do you notice the difference between the two PDFs (h i n t: look at the names of the PDFs)? Similarly, you can view the CFGs of other functions via the dot command (e . g ., dot -Tpdf .printArray.dot -oprintArray.pdf for the printArray function).

3. Create a directory named clang-hw3 in llvm-project/llvm/lib/Transforms for this assignment, and copy the files from the src directory to this new directory, as follows:

cp -r ./src “$LLVM_HOME/llvm/lib/Transforms/clang-hw3”

4. Append add_subdirectory(clang-hw3) to the $LLVM_HOME/llvm/lib/Transforms/CMakeLists.txt file.

5. Build clang-hw3 by running the following commands (you should do this every time you make changes):

cd “$LLVM_HOME/build”

6. The nodes or vertices of the CFG are called basic blocks. In the next two subsections, we will describe your tasks for this assignment; you may assume that for each function, exactly one basic block will have ret as its terminator instruction, whereas other basic blocks will have br as their terminator instruction.

You may also assume that br will have a t m o s t 2 s u c c e s s o r s.

Task 1: CFG Generation (50 points)Task 1: CFG Generation (50 points)

In this task, you will need to create a function pass and construct the CFG of a function by analyzing its basic blocks. We have provided the skeleton code for this task in src/hw3-cfg.cpp to help you get started.

Instead of generating a dot file yourself, please use the OFile class to output all of the edges you found. The results will automatically be saved to <function_name>.txt . You can run the pass you created as follows:

opt -load $LLVM_HOME/build/lib/LLVMcfg.so -hw3-cfg < bubble.bc

As a sanity check, you can compare the edges you found with those in the bubbleSort.pdf file you generated earlier.

Task 2: CFG Analysis (50 points)Task 2: CFG Analysis (50 points)

It should not be a surprise that there are various types of analysis/transformation techniques that can be performed on a CFG (e . g ., dead block elimination, loop detection/simplification, etc.). Given that you generated a CFG in task 1, task 2 requires that you identify all the k e y b l o c k s in a function. We define a k e y b l o c k as a basic block of a function such that every path from the entry of the function to the exit of the function MUSTMUST go through this basic block. Remember that you may assume that exactly oneone basic block will have ret as its terminator instruction in each function, and ret is considered to be the exit of a function.

Please use the OFile class to output the k e y b l o c k s you found. As before, the results will automatically be saved to <function_name>.txt . You can run the pass you created as follows: opt -load $LLVM_HOME/build/lib/LLVMcfg.so -hw3-cfg < bubble.bc A few important notes: 1. Note that the entry basic block and the basic block containing the ret instruction are also k e y b l o c k s, by definition. 2.

H i n t: a basic block is a k e y b l o c k if and only if “removing that block makes the exit of the function unreachable.” 3. You need to figure out which LLVM APIs you should use for this task. It

is also OKOK if you prefer to construct a graph structure yourself and analyze it manually.

Only one file should be submitted: src/hw3-cfg.cpp . Other files will be ignored when grading. Please make sure that your code is properly committed and pushed to the master branch.