Replies: 23 comments 6 replies
-
Summarization of workFor this task, I implemented global dead code elimination to remove the variables never used after definition, local optimization to remove redefinition of variables that are not used, and also local value numbering. For local value numbering, I tried to solve copy propagation by first finding the local number of the variable assigned by the id, and rewrite the id instruction using the variable of that local number or change it to a const. For common subexpression elimination, I checked the 'op' field for commutativity and sort the arguments if it is add or multiple. For constant folding, I only implemented it for add/mul/sub/div, and each time if the value is not stored in the table, I try to replace the args using the values stored in the table, and by induction it should be already be const value if it is calculated in the same block before, and then do the calculation, and replace the instruction with const operation. Finally I combine all the optimization together. I first run local optimization and local value numbering, and then run global dead code elimination until convergence. TestingI first manually run my program to produce bril files and run it using turnt to see if the results are correct. Then I run the test cases in core and examples folder using brench, and compare the profiling results from existing dce.py, lvn.py, and my implementation. I noticed that my implementation has better performance as the existing dce.py, but worse than the lvn.py. Using the test suites in examples/lvn, my implementation has similar performance as the provided lvn.py. But for more complex tests there are differences. My guess is that my lvn function did not implement all the optimization techniques and thus not optimizing as aggressively as the existing implementation. For instance for constant folding there are operations like eq, le that I didn't replace. Hardest partWhat was the hardest part of the task? How did you solve this problem? |
Beta Was this translation helpful? Give feedback.
-
Summarize what you did.
Explain how you know your implementation works—how did you test it? Which test inputs did you use? Do you have any quantitative results to report?
The What was the hardest part of the task? How did you solve this problem?
to
This was because both
Handling function calls as side effect operations fixed the issue. |
Beta Was this translation helpful? Give feedback.
-
|
Beta Was this translation helpful? Give feedback.
-
Summary
Implementation DetailsFor TDCE, I first did local dead code elimination and then performed the global, trivial pass. For LVN, I used four hashmaps. One corresponds to the environment which maps variable names to value numbers. The next two correspond to the 3-way table in the lecture. One maps value expressions to their value number, and a second maps value numbers to their home location. A final map was used for constant folding which maps value numbers to literals or Slightly different from in class, I decided that whenever an instruction would overwrite a variable that was the "home" to a value, I would first insert an For any instructions that I didn't support (function calls, loads, floating point), I gave them unique value numbers that could never be repeated to make sure they didn't break anything. Testing/EvaluationOnce again, I wrote a few manual tests, mostly for checking specific aspects of LVN and DCE, but mainly tested correctness by ensuring all preexisting benchmarks and benchmarks from the previous task worked using TURNT. To test LVN was actually doing something, I did a few ad-hoc tests manually by looking at the generated BRIL instructions and used brench to collect run statistics on all benchmarks. In my brench run, I followed up lvn with local and trivial dce. Overall, I found that all but 2 tests had reduced dynamic instructions. The slowdown here is likely due to the fact that I used a greedy algorithm to reconstruct CFGs back into programs, which may have found a worse ordering of basic blocks than the original program. I measured the speedup by computing
Many tests resulted in instruction counts similar to the original, however, some had large improvements such as DifficultiesOne issue is that, while I didn't support floating point arithmetic for value numbering, I did want it to not break those programs and still perform copy propagation. The issue I was running across is that integer constants could be used to initialize float variables. To solve this, I created a separate value key for A similarly related issue was figuring out how to treat things I wasn't/couldn't support such as function calls and floating point ops. Take this for example:
Here it needs to know that |
Beta Was this translation helpful? Give feedback.
-
Summary Implementation details/testing LVN was very complicated and is not currently working. For my own small toy examples that included the examples from class, the code was optimized and still gave the correct output. However, when I tried to test with brench on all the
I think in generating new instructions, I may have done poor type conversion. I'm not sure what is causing the first error, but it seems to be indicating that I'm accidentally deleting an argument somewhere. There was only one benchmark from the bril benchmarks folder that showed an improvement. The Difficulties
Second, pointers proved difficult because my LVN code wanted to treat two different pointer alloc's of the same size as the exact same "expression" when in reality the memory underneath the hood was different. I ended up just skipping all pointer instructions with the idea that since pointer values are under the surface, it wasn't safe to try to optimize. This led to instructions whose pointer values were not in the lvn table/environment. I ended up just throwing in any arguments that had not been defined as "noop" variables into the table. For example,
I handled function arguments the same since it's unknown what the expression for function arguments would be. |
Beta Was this translation helpful? Give feedback.
-
SummaryDetailsThe TDCE implementation follows the straightforward approach discussed in lecture. The LVN implementation includes the basic algorithm and some limited semantic extensions, namely, special handling for commutative operations and TestingTo test correctness, I used To test the quality of the optimizations, I first inspected their behavior on the test cases under Across all benchmarks, I observed a minimum speedup of 1.0× (which is to say, I never observed any slowdowns), and a maximum speedup of 1.97×. Average speedups for the various benchmark suites were as follows:
DifficultiesAs has been mentioned a lot already, there were a lot of edge cases to consider when applying the high-level algorithm to actual code. Some fun ones included instructions with side effects (or otherwise impure functions) and variables defined outside the basic block. Some edge cases only arose when testing with one of the various Bril extensions. To try to accommodate extensions without having to consider them each individually, I adopted a conservative "opt-in" approach to optimization, where only instructions that are explicitly supported will be rewritten. |
Beta Was this translation helpful? Give feedback.
-
Summary what you did Testing Difficulties |
Beta Was this translation helpful? Give feedback.
-
Summary
Implementation details
How did you test it? Which test inputs did you use? Do you have any quantitative results to report?
What was the hardest part of the task? How did you solve this problem?
|
Beta Was this translation helpful? Give feedback.
-
SummaryTrivial Dead Code Elimination DetailsThe trivial dead code elimination implementation was pretty straightforward, it was pretty much just the approach we covered in class. Local value numbering, on the other hand, was more complicated. I began implementation with very simple test cases in mind, and had a pretty good time. The general implementation involves an array of "table entries" each containing a value and a list of variables that refer to it (the first variable being treated as the canonical). I then created indexes for both values and variables to jump to the corresponding table entry. Finally, it was just a matter of creating temporary variable names for clobbered variables and basically updating the table/index data structure when appropriate. I also solved commutativity by sorting the arguments within each value. I had a lot of fun implementing constant folding! I didn't have time to implement all the operations that are possible to fold on, such as those involving booleans, but I believe the structure is there and implementing the rest should be a breeze. This felt like writing a super basic interpreter. As I started to test my optimization on more parts of the language, I realized there were a lot more considerations to take care of, such as function arguments and call instructions which I hadn't considered in the beginning. As a result, I ended up with some hacky approaches to these cases, such as when handling arguments that have been assigned to in the block being processed. TestingI started by writing a couple of extremely contrived examples and manually ran my optimizations to compare the outputs while implementing/fixing basic functionality. I then used the programs in Finally, I used DifficultiesI chose to implement LVN in TypeScript, which I thought struck a good balance for developer experience, having Bril typings available while also being able to run as a script without a compilation step. However, JavaScript is just totally messed up -- for example, the Map data structure has no way of checking deep equality, meaning that I cannot use "tuples" (ie lists) as or within keys. I managed to overcome this by stringifying my keys, but I do not feel good about this workaround at all. I think I'm going to migrate to some other language like rust for later projects, if I have the time to port all my util code over 🥲. One tricky "problem" (moreso aesthetic annoyance) I encountered and was not able to fully solve was using the "correct" variable naming when dealing with clobbered variable names. My implementation creates new temporary variable names for write instructions that eventually will be overwritten later, and since these replacement names come first, they end up being the canonical name for the value. For example:
becomes
while aesthetically I'd like it to be
One approach I tried was to update the canonical variable of a value with the latest variable for that value, however this would increase the number of instructions, making the optimization worse. |
Beta Was this translation helpful? Give feedback.
-
SummaryMe and @JohnDRubio worked on writing the trivial dead code elimination and LVN algorithms discussed in class. Our LVN algorithm follows the basic implementation along with a few extensions. Implementation details
Testing
Difficulties
|
Beta Was this translation helpful? Give feedback.
-
I worked with @obhalerao on this project. SummaryDetailsFor this assigment, we changed the C++ JSON parsing library we used from RapidJSON to nlohmann/json, as the latter had much cleaner syntax. We can safely say that we did not regret this decision. For the trivial dead code elimination assignment, we followed a similar algorithm as the one that was discussed in class: simply iterate backwards through the function, maintain a set of dead variables that are assigned to, and remove them from the set. We iterated until convergence here. Local value numbering was more tricky. The data structure we chose for the value table was a vector of pairs, the first of which contained a Value representation, and the second of which contained a queue of potential homes for each value, the first of which would be the canonical home. In our haste to write a Value representation, we overlooked how to process constant values, and came up with the admittedly janky solution of reusing the Value representation we made, but putting the value of the constant itself instead of the number of the relevant value. This also means that we do not support processing of float constants; this is something that we can easily change. We do, however, support commutativity of relevant operations. In addition to the value table, we also maintain maps from each variable to its value number and also from each Value itself to its value number, and update these on each iteration of the overall LVN pass. Then, after we update the table for each instruction, we rewrite the args of each instruction with the home of each value, after popping names off the queue corresponding to the value that are not actually its home. Lastly, after the LVN pass is performed for each basic block and all instructions are rewritten, we perform global DCE (the version that simply removes all unused assignments) until convergence on the entire program, and then the same local DCE implemented previously on all basic blocks. TestingWe tested our programs using Here are the average speedups we observed in all of the Bril benchmarks. DifficultiesThe primary difficulty we had with this assignment was, as per usual, getting used to C++. However, after working with it for an assignment at this scale, we feel as though we have a better handle of how it works now. In addition, there were many edge cases that we missed on our first pass through (e.g. how to handle numbering of |
Beta Was this translation helpful? Give feedback.
-
Summary:
Experiment results: Difficulties: |
Beta Was this translation helpful? Give feedback.
-
SummaryI started with LVN first. I felt it's more challenging and interesting. Details
TestingCurrent I only run on my trivial hand-made test cases and simple.bril and reassign.bril. DifficultiesAlthough I am kind of familiar with C style programming and pointers, C++ is pretty hard to manipulate. I was pretty used to Java and sometimes when I want to do polymorphism (eg. isinstanceof function) in C++, it's pretty annoying.
Generative AIOnly ask chatGpt for creating Makefile with given file directory structure. Since Makefile is pretty hard to write and read, it's helpful in this case. |
Beta Was this translation helpful? Give feedback.
-
SummaryTrivial Dead Code Elimination & LVN (CSE, Copy Prop) [repo] DetailsMy optimizations use the LVN was a lot more involved for me. To represent the table, I made a struct ChallengesTDCE was quite straightforward to implement. LVN/CSE were a little more complicated, but after wrapping my mind around and designing the data structure, it wasn't too bad. In this initial implementation, I used a last_def table to keep track of whether the variable an instruction def'd (if any) was the last_def of that variable. However, this ran into numerous issues when trying to do Copy Prop, as it wouldn't consider a variable defined outside a preceding basic block and redefined in the current block. To workaround this, I kept a list of all canonical names as I mentioned before. Testing & VerificationTo sanity check while coding, I tested against some simple examples designed for DCE/CSE/Copy Prop. Afterwards, I used Full Suite
By Suite (mean speedup)
|
Beta Was this translation helpful? Give feedback.
-
SummaryDetailsFor TDCE I followed the pseudocode outlined in the lecture. Did not have too much time this time round so fell back onto Python. I also added some types for the Bril to make it easier to work with. LVN was more challenging. Implementation-wise four hashtables mapping from num2val, val2var, var2num, and a num2var to handle the cases with clobbered names. I also batched the block mutations. Finally, after the LVN pass per block, I did a global + local DCE to clean everything up DifficultiesTDCE was not too bad, but LVN was much more challenging. My first few implementations fully modeled the pseudocode using the three-way hashtable but I ran into issues with variables being defined in a preceding block and used in a subsequent block. This resulted in patches upon patches and made everything quite messy. Thinking back, I should have perhaps spent more time rethinking the my structs. TestingI first tested my TDCE and LVN implementation on some small handwritten test cases (can be found under
|
Beta Was this translation helpful? Give feedback.
-
Partners@emwangs and I worked on this assignment together. SummaryImplementation Details
Challenges
Testing
|
Beta Was this translation helpful? Give feedback.
-
@xalbt and I worked together on lesson 3. SummaryImplementation
Difficulties
Testing
|
Beta Was this translation helpful? Give feedback.
-
Beta Was this translation helpful? Give feedback.
-
Summary
Implementation Details
Testing Strategy
ChallengesThe LVN algorithm is not easy to tackle with many details to polish, so it relies heavily on testing to find out flaws in my implementation. Therefore, I created unit tests, and kept adding test cases whenever anything goes wrong. I also made an effort to understand which kind of coding style in the program caused issues to my optimzation algorithms to create a minimal test case for it, instead of simply copying the whole benchmark program. |
Beta Was this translation helpful? Give feedback.
-
SummaryFor this task I implemented
Implementation & TestingI followed the template from class and carefully reconstructed a few examples myself to get the intuition behind how the solution worked, before starting to code it up. As we discussed in class, I used my tdce tool to clean-up unused instructions after my LVN pass. I implemented commutativity by canonicalizing the order of my arguments in the tuple for commutative operations. I initially implemented copy propagation, but was spending a lot of time working on "variable clobbering", so I left it for another day. After doing a few of these assignments, I don't think python is particularly well suited for these tasks, considering the types of annoying runtime errors I was encountering that a "smarter" compiler could point out upfront. This led me to write some pretty hacky code. I might either switch up which language I use for the future assignments or start annotating more types in my programs. To start with testing, I used a few small handcrafted examples from the bril repo, just to make sure my LVN program was sensible. Then I used brench on all of the core benchmarks. A few of them broke, so I manually compared 2 or 3 to my own implementation and fixed a couple subtle bugs. Then, my implementation worked on all the benchmarks. The speed-up was interesting. I suppose it heavily depends on the source bril program because some only had a couple instructions fewer, but some had a few hundred instructions fewer. I think the compiler generated bril programs are particularly amenable to LVN speedup. DifficultiesI found quite a few challenges with transforming the pseudo-code from class into a real implementation. There were lots of corner cases I had not considered, and I had to be extra careful with how I dealt with certain types of instructions or else the whole thing would break. Debugging the LVN solution was particularly frustrating. It took a while to compare the "correct" bril program with the output of my LVN tool to see what was going wrong. A particularly funny bug I found was that python treats True and 1 as equivalent when hashing, which was breaking some of the programs. I had to add logic to treat them differently. Finally, I didn't think hard about how to optimize things like Ptrs, but I didn't want my LVN to break those programs, so I explicitly state in the code which instructions I support optimizing and which I don't. |
Beta Was this translation helpful? Give feedback.
-
For this task, my summary is that I implemented
For trivial dead code elimination, it's rather easy to eliminate globally unused variables and to kill any instruction that is reassigned before used. TDCE was "trivial" except when I erase the unused instructions, nlohmann json will leave an empty json object behind, which results in printing |
Beta Was this translation helpful? Give feedback.
-
SummaryTDCE/LVN Implementation with Extensions For this assignment, I implemented TDCE and LVN as described in class with additional constant propagation, copy propagation, algebraic identities, and constant folding optimizations. I decided to pivot and implement these tasks in Haskell, which I came to believe was an excellent language for the job. Although the algorithms were presented in an imperative manner during lecture, I used the Implementation Challenges
Testing and ResultsI leveraged brench to test my optimizations over all benchmarks and ensure they were correct and effective. I also wrote several small sanity-checking diff tests with turnt to ensure that my optimizations behaved as intended. The distribution of benchmark speedups is below, where the horizontal axis is the ratio "optimized dynamic insts / original dynamic insts`, and the vertical axis is frequency. Overall, I achieved a mean 13.4% speedup across all benchmarks, which is pretty impressive! The speedups were much better in benchmarks generated from the typescript frontend. |
Beta Was this translation helpful? Give feedback.
-
SourceTrivial dead code elimination Implementation and issuesFor this assignment, I implemented a trivial dead code elimination and a local value numbering algorithm. I initially attempted a functional approach in OCaml, which wasn’t successful due to the imperative nature of code presented in class. I kept getting issues on how to deal with my state, which was trivially solved by converting all my code to Python for TDCE. However, on LVN, my issues in the imperative world persisted. I am not sure exactly why my python LVN implementation keeps getting timed out on larger examples, albeit it seems to at least be functional (although not optimal) on smaller test cases. I would prefer to take some more time and re-implement LVN in C++ with a closer attention to the pseudo-code to get it working further. Benchmark analysisThe trivial dead code elimination seems to have worked as expected! I used the entire benchmark directory under the bril source code and the Brench tool for automation. Consistently for larger examples,TDCE would be more efficient by a bit. However, my attempt at a working LVN didn’t succeed fully, either timing out for bigger examples or having the same performance as no optimizations.
|
Beta Was this translation helpful? Give feedback.
-
The tasks for this lesson include implementing basic dead-code elimination (DCE) and, as the main event, implementing local value numbering (LVN). I ❤️ LVN!
Beta Was this translation helpful? Give feedback.
All reactions