feat[venom]: multidimensional fencing #4066

Closed
wants to merge 71 commits into from
Changes from 64 commits
Commits
d552898
feat: multidimensional fencing
charles-cooper May 30, 2024
d4dad09
invalidate liveness
charles-cooper May 30, 2024
398a344
Merge branch 'master' into feat/fencing
charles-cooper May 30, 2024
a50473a
add log, revert to effects
charles-cooper May 30, 2024
c7da081
sha3 fence
charles-cooper May 30, 2024
80c5154
fix fence checker
charles-cooper May 30, 2024
6cf7b6b
feat[venom]: extract literals pass
charles-cooper May 30, 2024
04b53d0
don't reorder param instructions
charles-cooper May 30, 2024
8adf783
remove a comment
charles-cooper May 30, 2024
8506bbb
small perf
charles-cooper May 30, 2024
ef7c369
feat: store expansion pass
charles-cooper May 31, 2024
ff700b4
lint
charles-cooper May 31, 2024
ea9b1c5
remove inter-bb restriction
charles-cooper May 31, 2024
adbf01c
don't replace first use
charles-cooper May 31, 2024
f3acde1
fix terminator instruction
charles-cooper May 30, 2024
988a1a9
wip - fix fence
charles-cooper May 31, 2024
058c4db
fix can_reorder
charles-cooper May 31, 2024
edaf756
Merge branch 'master' into feat/fencing
charles-cooper May 31, 2024
3555fcb
force phi instructions first
charles-cooper May 31, 2024
3e096de
fix can_reorder(?)
charles-cooper May 31, 2024
7dc473d
clean up phi
charles-cooper May 31, 2024
e087c4f
update fence calculation
charles-cooper May 31, 2024
50e13cf
Revert "update fence calculation"
charles-cooper May 31, 2024
490c377
sort again
charles-cooper May 31, 2024
608bfaa
traverse out_vars
charles-cooper May 31, 2024
afb49a6
update can_reorder
charles-cooper May 31, 2024
3d0c4bb
fix a table
charles-cooper May 31, 2024
a44c0bd
wip - effects graph
charles-cooper May 31, 2024
f1bb354
update table
charles-cooper Jun 1, 2024
6e87f1f
minor cleanup
charles-cooper Jun 1, 2024
8b415ff
downstream_of data structure
charles-cooper Jun 1, 2024
eb55ab2
remove old fence member
charles-cooper Jun 1, 2024
733b1fb
lint
charles-cooper Jun 1, 2024
7e184e2
fix bad dependency
charles-cooper Jun 1, 2024
83ba491
traverse down effects graph
charles-cooper Jun 1, 2024
9ec2b4e
fix phi+param instructions
charles-cooper Jun 1, 2024
939d671
don't traverse downstream
charles-cooper Jun 1, 2024
6f370ca
cleanup
charles-cooper Jun 1, 2024
d50130e
reverse out_vars
charles-cooper Jun 1, 2024
881daf6
fiddle with stack layout, traversal order
charles-cooper Jun 1, 2024
aa2234c
fix bugs
charles-cooper Jun 1, 2024
b6b7aed
allow inter-bb
charles-cooper Jun 1, 2024
a71cad8
lint
charles-cooper Jun 1, 2024
ef8a56c
reverse use traversal
charles-cooper Jun 4, 2024
7065892
for debugging
charles-cooper Jun 4, 2024
2dad58b
add balance fence
charles-cooper Jun 4, 2024
20bd8b3
add balance fence
charles-cooper Jun 4, 2024
f89711e
Merge branch 'feat/store-expansion' into feat/fencing
charles-cooper Jun 4, 2024
7fa84b4
lift out prev=var
charles-cooper Jun 4, 2024
18192ce
fiddle with store expansion
charles-cooper Jun 5, 2024
372d007
tune order of passes
charles-cooper Jun 5, 2024
d83c886
add a degree of freedom
charles-cooper Jun 5, 2024
33f15e5
add a peephole optimization
charles-cooper Jun 5, 2024
6174be0
run dftpass twice!
charles-cooper Jun 5, 2024
c7d57f4
remove dead variables
charles-cooper Jun 5, 2024
eaa1c67
add returndata fencing
charles-cooper Jun 12, 2024
e385ecb
fix lint
charles-cooper Jun 12, 2024
993e875
fix returndata
charles-cooper Jun 12, 2024
1f8001f
Merge branch 'master' into feat/fencing
charles-cooper Jun 13, 2024
b84595a
improved sanity check
charles-cooper Jun 13, 2024
f7adb8a
Merge branch 'master' into feat/fencing
charles-cooper Sep 18, 2024
2a7fde7
use .first()
charles-cooper Sep 18, 2024
0d19fd5
another usage of first()
charles-cooper Sep 18, 2024
0582db5
wip - store-expand bb in vars
charles-cooper Sep 19, 2024
4f3b0ec
Merge branch 'master' into feat/fencing
charles-cooper Sep 20, 2024
04314e0
move normalization pass
charles-cooper Sep 19, 2024
32a5da5
debug show cost
charles-cooper Sep 19, 2024
35fe8ce
wip
charles-cooper Sep 20, 2024
f5b2609
revert cfg traversal order change
charles-cooper Sep 20, 2024
6129813
bring back store expansion
charles-cooper Sep 22, 2024
c6dd3ff
Merge branch 'master' into feat/fencing
charles-cooper Sep 28, 2024
3 changes: 3 additions & 0 deletions vyper/ir/compile_ir.py
@@ -1033,6 +1033,9 @@ def _stack_peephole_opts(assembly):
if assembly[i] == "SWAP1" and assembly[i + 1].lower() in COMMUTATIVE_OPS:
changed = True
del assembly[i]
if assembly[i] == "DUP1" and assembly[i + 1] == "SWAP1":
changed = True
del assembly[i + 1]
i += 1

return changed
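
Note on the hunk above: the new rule drops a `SWAP1` that directly follows a `DUP1`. After `DUP1` the top two stack items are equal, so swapping them changes nothing. A minimal sketch of that reasoning, assuming a toy three-opcode stack machine; `drop_swap_after_dup` and `run` are hypothetical helpers for illustration and are not part of `compile_ir.py`:

```python
def drop_swap_after_dup(assembly):
    """Return (new_assembly, changed), dropping any SWAP1 right after a DUP1."""
    out = []
    changed = False
    for op in assembly:
        # After DUP1 the two topmost items are identical, so SWAP1 is a no-op.
        if op == "SWAP1" and out and out[-1] == "DUP1":
            changed = True
            continue
        out.append(op)
    return out, changed


def run(ops, stack):
    """Toy interpreter (illustrative only); the top of the stack is the list end."""
    stack = list(stack)
    for op in ops:
        if op == "DUP1":
            stack.append(stack[-1])
        elif op == "SWAP1":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "POP":
            stack.pop()
        else:
            raise ValueError(f"unsupported opcode in this sketch: {op}")
    return stack


before = ["DUP1", "SWAP1", "POP"]
after, changed = drop_swap_after_dup(before)
```

Running both sequences through `run` on the same starting stack gives the same result, which is the property the peephole rule relies on.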
7 changes: 6 additions & 1 deletion vyper/venom/__init__.py
@@ -19,6 +19,7 @@
from vyper.venom.passes.sccp import SCCP
from vyper.venom.passes.simplify_cfg import SimplifyCFGPass
from vyper.venom.passes.store_elimination import StoreElimination
from vyper.venom.passes.store_expansion import StoreExpansionPass
from vyper.venom.venom_to_assembly import VenomCompiler

DEFAULT_OPT_LEVEL = OptimizationLevel.default()
@@ -54,8 +55,12 @@ def _run_passes(fn: IRFunction, optimize: OptimizationLevel) -> None:
SimplifyCFGPass(ac, fn).run_pass()
AlgebraicOptimizationPass(ac, fn).run_pass()
BranchOptimizationPass(ac, fn).run_pass()
ExtractLiteralsPass(ac, fn).run_pass()
RemoveUnusedVariablesPass(ac, fn).run_pass()

# reorder and prepare for stack scheduling
DFTPass(ac, fn).run_pass()
StoreExpansionPass(ac, fn).run_pass()
ExtractLiteralsPass(ac, fn).run_pass()
DFTPass(ac, fn).run_pass()


3 changes: 2 additions & 1 deletion vyper/venom/analysis/liveness.py
@@ -65,7 +65,8 @@ def _calculate_out_vars(self, bb: IRBasicBlock) -> bool:
bb.out_vars = OrderedSet()
for out_bb in bb.cfg_out:
target_vars = self.input_vars_from(bb, out_bb)
bb.out_vars = bb.out_vars.union(target_vars)
bb.out_vars |= target_vars

return out_vars != bb.out_vars

# calculate the input variables into self from source
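
Note on the hunk above: `_calculate_out_vars` rebuilds a block's out-set as the union of what its successors need and reports whether the set changed, which is what lets the liveness analysis iterate to a fixed point. A rough sketch of that pattern, assuming plain Python `set`s and dictionaries in place of `OrderedSet` and `IRBasicBlock`; the names `cfg_out`, `in_vars`, and `out_vars` here are illustrative stand-ins, and the sketch ignores the phi handling that `input_vars_from` performs:

```python
def calculate_out_vars(cfg_out, in_vars, out_vars, bb):
    """Recompute out_vars[bb]; return True if it differs from the old value."""
    previous = out_vars[bb]
    current = set()  # a fresh set, so the in-place |= cannot alias `previous`
    for successor in cfg_out[bb]:
        current |= in_vars[successor]
    out_vars[bb] = current
    return previous != current


cfg_out = {"entry": ["then", "else"]}
in_vars = {"then": {"%x"}, "else": {"%y"}}
out_vars = {"entry": set()}

first_changed = calculate_out_vars(cfg_out, in_vars, out_vars, "entry")
second_changed = calculate_out_vars(cfg_out, in_vars, out_vars, "entry")
```

The second call reports no change, which is the signal a worklist loop would use to stop.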
2 changes: 0 additions & 2 deletions vyper/venom/basicblock.py
@@ -211,7 +211,6 @@ class IRInstruction:
liveness: OrderedSet[IRVariable]
dup_requirements: OrderedSet[IRVariable]
parent: "IRBasicBlock"
fence_id: int
annotation: Optional[str]
ast_source: Optional[IRnode]
error_msg: Optional[str]
@@ -229,7 +228,6 @@ def __init__(
self.output = output
self.liveness = OrderedSet()
self.dup_requirements = OrderedSet()
self.fence_id = -1
self.annotation = None
self.ast_source = None
self.error_msg = None
236 changes: 189 additions & 47 deletions vyper/venom/passes/dft.py
@@ -1,81 +1,223 @@
from collections import defaultdict
from dataclasses import asdict, dataclass

from vyper.utils import OrderedSet
from vyper.venom.analysis.dfg import DFGAnalysis
from vyper.venom.analysis.liveness import LivenessAnalysis
from vyper.venom.basicblock import IRBasicBlock, IRInstruction, IRVariable
from vyper.venom.function import IRFunction
from vyper.venom.passes.base_pass import IRPass

_ALL = ("storage", "transient", "memory", "immutables", "balance", "returndata")

writes = {
"sstore": "storage",
"tstore": "transient",
"mstore": "memory",
"istore": "immutables",
"call": _ALL,
"delegatecall": _ALL,
"staticcall": "memory",
"create": _ALL,
"create2": _ALL,
"invoke": _ALL, # could be smarter, look up the effects of the invoked function
"dloadbytes": "memory",
"returndatacopy": "memory",
"calldatacopy": "memory",
"codecopy": "memory",
"extcodecopy": "memory",
"mcopy": "memory",
}
reads = {
"sload": "storage",
"tload": "transient",
"iload": "immutables",
"mload": "memory",
"mcopy": "memory",
"call": _ALL,
"delegatecall": _ALL,
"staticcall": _ALL,
"returndatasize": "returndata",
"returndatacopy": "returndata",
"balance": "balance",
"selfbalance": "balance",
"log": "memory",
"revert": "memory",
"return": "memory",
"sha3": "memory",
}


@dataclass
class Fence:
storage: int = 0
memory: int = 0
transient: int = 0
immutables: int = 0
balance: int = 0
returndata: int = 0


# effects graph
class EffectsG:
def __init__(self):
self._graph = defaultdict(list)

# not sure if this will be useful
self._outputs = defaultdict(list)

def analyze(self, bb):
fence = Fence()

read_groups = {}
terms = {}

for inst in bb.instructions:
reads = _get_reads(inst.opcode)
writes = _get_writes(inst.opcode)
for eff in reads:
fence_id = getattr(fence, eff)
group = read_groups.setdefault((eff, fence_id), [])
group.append(inst)

# collect writes in a separate dict
for eff in writes:
fence_id = getattr(fence, eff)
assert (eff, fence_id) not in terms
terms[(eff, fence_id)] = inst

fence = _compute_fence(inst.opcode, fence)

for (effect, fence_id), write_inst in terms.items():
reads = read_groups.get((effect, fence_id), [])
for read in reads:
if read == write_inst:
continue
self._graph[write_inst].append(read)

next_id = fence_id + 1

next_write = terms.get((effect, next_id))
if next_write is not None:
self._graph[next_write].append(write_inst)

next_reads = read_groups.get((effect, next_id), [])
for inst in next_reads:
self._graph[inst].append(write_inst)

# invert the graph, go the other way
for inst, dependencies in self._graph.items():
# sanity check the graph
assert inst not in dependencies, inst
for target in dependencies:
self._outputs[target].append(inst)

def required_by(self, inst):
return self._graph.get(inst, [])

def downstream_of(self, inst):
return self._outputs.get(inst, [])


def _get_reads(opcode):
ret = reads.get(opcode, ())
if not isinstance(ret, tuple):
ret = (ret,)
return ret


def _get_writes(opcode):
ret = writes.get(opcode, ())
if not isinstance(ret, tuple):
ret = (ret,)
return ret


def _compute_fence(opcode: str, fence: Fence) -> Fence:
if opcode not in writes:
return fence

effects = _get_writes(opcode)

tmp = asdict(fence)
for eff in effects:
tmp[eff] += 1

return Fence(**tmp)


class DFTPass(IRPass):
function: IRFunction
inst_order: dict[IRInstruction, int]
inst_order_num: int

def _process_instruction_r(self, bb: IRBasicBlock, inst: IRInstruction, offset: int = 0):
def _process_instruction_r(self, bb: IRBasicBlock, inst: IRInstruction):
if inst.parent != bb:
return
if inst in self.done:
return

for op in inst.get_outputs():
assert isinstance(op, IRVariable), f"expected variable, got {op}"
uses = self.dfg.get_uses(op)

for uses_this in uses:
if uses_this.parent != inst.parent or uses_this.fence_id != inst.fence_id:
# don't reorder across basic block or fence boundaries
continue

# if the instruction is a terminator, we need to place
# it at the end of the basic block
# along with all the instructions that "lead" to it
self._process_instruction_r(bb, uses_this, offset)
for use in reversed(uses):
self._process_instruction_r(bb, use)

if inst in self.visited_instructions:
if inst in self.started:
return
self.visited_instructions.add(inst)
self.inst_order_num += 1
self.started.add(inst)

if inst.is_bb_terminator:
offset = len(bb.instructions)

if inst.opcode == "phi":
# phi instructions stay at the beginning of the basic block
# and no input processing is needed
# bb.instructions.append(inst)
self.inst_order[inst] = 0
if inst.opcode in ("phi", "param"):
return

for op in inst.get_input_variables():
target = self.dfg.get_producing_instruction(op)
assert target is not None, f"no producing instruction for {op}"
if target.parent != inst.parent or target.fence_id != inst.fence_id:
# don't reorder across basic block or fence boundaries
continue
self._process_instruction_r(bb, target, offset)
self._process_instruction_r(bb, target)

for target in self._effects_g.required_by(inst):
self._process_instruction_r(bb, target)

self.inst_order[inst] = self.inst_order_num + offset
bb.instructions.append(inst)
self.done.add(inst)

def _process_basic_block(self, bb: IRBasicBlock) -> None:
self.function.append_basic_block(bb)
self._effects_g = EffectsG()
self._effects_g.analyze(bb)

for inst in bb.instructions:
inst.fence_id = self.fence_id
if inst.is_volatile:
self.fence_id += 1

# We go through the instructions and calculate the order in which they should be executed
# based on the data flow graph. This order is stored in the inst_order dictionary.
# We then sort the instructions based on this order.
self.inst_order = {}
self.inst_order_num = 0
for inst in bb.instructions:
instructions = bb.instructions.copy()
bb.instructions = [inst for inst in bb.instructions if inst.opcode in ("phi", "param")]

# start with out liveness
if len(bb.cfg_out) > 0:
next_bb = bb.cfg_out.first()

Code scanning / CodeQL warning (unreachable code): This statement is unreachable.
target_stack = self.liveness.input_vars_from(bb, next_bb)
for var in reversed(list(target_stack)):
inst = self.dfg.get_producing_instruction(var)
self._process_instruction_r(bb, inst)

for inst in instructions:
self._process_instruction_r(bb, inst)

bb.instructions.sort(key=lambda x: self.inst_order[x])
def key(inst):
if inst.is_bb_terminator:
return 2
return 1

bb.instructions.sort(key=key)

# sanity check: the instructions we started with are the same
# as we have now
assert set(bb.instructions) == set(instructions), (instructions, bb)

def run_pass(self) -> None:
self.dfg = self.analyses_cache.request_analysis(DFGAnalysis)
self.liveness = self.analyses_cache.request_analysis(LivenessAnalysis) # use out_vars

self.fence_id = 0
self.visited_instructions: OrderedSet[IRInstruction] = OrderedSet()
self.started: OrderedSet[IRInstruction] = OrderedSet()
self.done: OrderedSet[IRInstruction] = OrderedSet()

basic_blocks = list(self.function.get_basic_blocks())

self.function.clear_basic_blocks()
for bb in basic_blocks:
for bb in self.function.get_basic_blocks():
self._process_basic_block(bb)

# for repr
self.analyses_cache.force_analysis(LivenessAnalysis)
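
Note on `dft.py` above: the single global `fence_id` is replaced by one counter per effect (storage, memory, and so on), and ordering edges are only added between instructions that touch the same effect in the same or adjacent epochs. A simplified sketch of that idea, assuming trimmed-down effect tables and instructions represented by their list index; `WRITES`, `READS`, `bump`, and `effect_dependencies` are illustrative names, not the PR's API:

```python
from collections import defaultdict
from dataclasses import asdict, dataclass

# Hypothetical two-effect subset of the tables in the diff.
WRITES = {"sstore": ("storage",), "mstore": ("memory",)}
READS = {"sload": ("storage",), "mload": ("memory",)}


@dataclass
class Fence:
    storage: int = 0
    memory: int = 0


def bump(fence, opcode):
    """Advance the counter of every effect that `opcode` writes."""
    tmp = asdict(fence)
    for eff in WRITES.get(opcode, ()):
        tmp[eff] += 1
    return Fence(**tmp)


def effect_dependencies(opcodes):
    """Map instruction index -> indices that must be emitted before it."""
    fence = Fence()
    read_groups = defaultdict(list)  # (effect, epoch) -> reader indices
    writers = {}                     # (effect, epoch) -> index of closing write
    for idx, op in enumerate(opcodes):
        for eff in READS.get(op, ()):
            read_groups[(eff, getattr(fence, eff))].append(idx)
        for eff in WRITES.get(op, ()):
            writers[(eff, getattr(fence, eff))] = idx
        fence = bump(fence, op)

    deps = defaultdict(list)
    for (eff, epoch), write in writers.items():
        # A write must wait for every read that belongs to its own epoch.
        for read in read_groups.get((eff, epoch), []):
            if read != write:
                deps[write].append(read)
        # The next write and the next epoch's reads must wait for this write.
        next_write = writers.get((eff, epoch + 1))
        if next_write is not None:
            deps[next_write].append(write)
        for read in read_groups.get((eff, epoch + 1), []):
            deps[read].append(write)
    return dict(deps)


ops = ["sload", "mload", "sstore", "sload", "mstore"]
deps = effect_dependencies(ops)
```

In this toy block the `mstore` at index 4 depends only on the `mload` at index 1 and not on the `sstore` at index 2, so a scheduler is free to reorder memory and storage operations relative to each other. A single shared fence counter would have forbidden that.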
57 changes: 57 additions & 0 deletions vyper/venom/passes/store_expansion.py
@@ -0,0 +1,57 @@
from vyper.venom.analysis.cfg import CFGAnalysis
from vyper.venom.analysis.dfg import DFGAnalysis
from vyper.venom.analysis.liveness import LivenessAnalysis
from vyper.venom.basicblock import IRInstruction
from vyper.venom.passes.base_pass import IRPass


class StoreExpansionPass(IRPass):
"""
This pass expands variables to their uses through `store` instructions,
reducing pressure on the stack scheduler
"""

def run_pass(self):
dfg = self.analyses_cache.request_analysis(DFGAnalysis)
self.analyses_cache.request_analysis(CFGAnalysis)
liveness = self.analyses_cache.force_analysis(LivenessAnalysis)

Code scanning / CodeQL notice (unused local variable): Variable liveness is not used.

for bb in self.function.get_basic_blocks():
if len(bb.instructions) == 0:
continue

for var in bb.instructions[0].liveness:
self._process_var(dfg, bb, var, 0)

for idx, inst in enumerate(bb.instructions):
if inst.output is None:
continue

self._process_var(dfg, bb, inst.output, idx + 1)

bb.instructions.sort(key=lambda inst: inst.opcode not in ("phi", "param"))

self.analyses_cache.invalidate_analysis(LivenessAnalysis)
self.analyses_cache.invalidate_analysis(DFGAnalysis)

def _process_var(self, dfg, bb, var, idx):
"""
Process a variable, allocating a new variable for each use
and copying it to the new instruction
"""
uses = dfg.get_uses(var)

_cache = {}

Code scanning / CodeQL notice (unused local variable): Variable _cache is not used.

for use_inst in uses:
if use_inst.opcode == "phi":
continue
if use_inst.parent != bb:
continue

for i, operand in enumerate(use_inst.operands):
if operand == var:
new_var = self.function.get_next_variable()
new_inst = IRInstruction("store", [var], new_var)
bb.insert_instruction(new_inst, idx)
use_inst.operands[i] = new_var
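
Note on `store_expansion.py` above: the pass gives every use of a variable its own copy through a `store` placed right after the definition, so later scheduling can treat each use independently. A rough sketch on a toy instruction type, assuming a single basic block and ignoring live-in variables and phi nodes; `Inst` and `expand_stores` are hypothetical stand-ins for `IRInstruction` and the pass:

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass
class Inst:
    opcode: str
    operands: list
    output: Optional[str] = None


def expand_stores(block):
    """Return a new instruction list where each use reads a private copy."""
    fresh = count()
    out = []
    for pos, inst in enumerate(block):
        out.append(inst)
        if inst.output is None:
            continue
        # Rewrite every later operand occurrence to a fresh copy of the output.
        for user in block[pos + 1:]:
            for i, operand in enumerate(user.operands):
                if operand == inst.output:
                    copy = f"%copy{next(fresh)}"
                    out.append(Inst("store", [inst.output], copy))
                    user.operands[i] = copy
    return out


block = [
    Inst("sload", ["0"], "%x"),
    Inst("add", ["%x", "%x"], "%y"),
    Inst("sstore", ["1", "%y"]),
]
expanded = expand_stores(block)
```

Here `%x` is used twice by `add`, so two `store` copies follow the `sload`, and the `sstore` reads a copy of `%y` instead of `%y` itself.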