Mnemonic support & Improve Disassembler #21

Open
wants to merge 9 commits into
base: coredsl_exceptions

Conversation

PhilippvK
Member

Changes

  • Support Mnemonic field in CoreDSL grammar & metamodel
  • Remove support for legacy args_disass in CoreDSL2 grammar (is this fine?)
  • Rename disass to assembly everywhere
  • Disassembler
    • Add AsmFormatter class to parse assembly args content (enable using --format flag)
      • Add support for name() and fname() "intrinsic" to lookup register names
    • Use mnemonic instead of instruction name if available
    • Skip repeated DII (0x0000) instructions
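To illustrate the formatter item above, here is a minimal sketch of how an assembly string with name() and fname() intrinsics could be expanded. It is not the AsmFormatter from this PR: the function format_asm, the placeholder syntax {field} / {intrinsic(field)}, and the shortened name tables are assumptions made for illustration only.

```python
import re

# Hypothetical, shortened ABI name tables (assumption: the real tables
# would cover all 32 registers and ideally come from the model).
X_NAMES = {0: "zero", 1: "ra", 2: "sp", 10: "a0", 11: "a1"}
F_NAMES = {10: "fa0", 11: "fa1"}

# Intrinsics usable inside a placeholder, e.g. {name(rd)}.
# Unknown indices fall back to the raw register name.
INTRINSICS = {
    "name": lambda idx: X_NAMES.get(idx, f"x{idx}"),
    "fname": lambda idx: F_NAMES.get(idx, f"f{idx}"),
}

# Matches either {intrinsic(field)} or a plain {field}.
PLACEHOLDER = re.compile(r"\{\s*(?:(\w+)\(\s*(\w+)\s*\)|(\w+))\s*\}")


def format_asm(template, operands):
    """Expand placeholders in `template` using decoded operand values."""

    def substitute(match):
        func, func_arg, plain = match.groups()
        if plain is not None:
            # Plain field: print the decoded value as-is.
            return str(operands[plain])
        # Intrinsic call: look up the register name for the field value.
        return INTRINSICS[func](operands[func_arg])

    return PLACEHOLDER.sub(substitute, template)


print(format_asm("{name(rd)}, {name(rs1)}, {imm}", {"rd": 10, "rs1": 2, "imm": -16}))
# prints: a0, sp, -16
```

An unknown intrinsic or field raises a KeyError in this sketch; a real formatter would probably want a clearer error message.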

PhilippvK requested a review from wysiwyng on May 23, 2023 07:15
Comment on lines +23 to +91
NAMES = {
    0: "zero",
    1: "ra",
    2: "sp",
    3: "gp",
    4: "tp",
    5: "t0",
    6: "t1",
    7: "t2",
    8: "s0",
    9: "s1",
    10: "a0",
    11: "a1",
    12: "a2",
    13: "a3",
    14: "a4",
    15: "a5",
    16: "a6",
    17: "a7",
    18: "s2",
    19: "s3",
    20: "s4",
    21: "s5",
    22: "s6",
    23: "s7",
    24: "s8",
    25: "s9",
    26: "s10",
    27: "s11",
    28: "t3",
    29: "t4",
    30: "t5",
    31: "t6",
}

FNAMES = {
    0: "f0",
    1: "f1",
    2: "f2",
    3: "f3",
    4: "f4",
    5: "f5",
    6: "f6",
    7: "f7",
    8: "fs0",
    9: "fs1",
    10: "fa0",
    11: "fa1",
    12: "fa2",
    13: "fa3",
    14: "fa4",
    15: "fa5",
    16: "fa6",
    17: "fa7",
    18: "fs2",
    19: "fs3",
    20: "fs4",
    21: "fs5",
    22: "fs6",
    23: "fs7",
    24: "fs8",
    25: "fs9",
    26: "fs10",
    27: "fs11",
    28: "ft8",
    29: "ft9",
    30: "ft10",
    31: "ft11",
}
Collaborator

these should be pulled from the actual model, not hardcoded here

Member Author

good idea
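As a rough sketch of the suggestion above, a name table could be built from a register file description instead of being written out by hand. The RegisterFile class and its fields (prefix, size, aliases) are hypothetical stand-ins and do not reflect the actual M2-ISA-R metamodel API.

```python
from dataclasses import dataclass, field


@dataclass
class RegisterFile:
    """Hypothetical stand-in for a register file entry in the model."""

    prefix: str  # e.g. "x" or "f"
    size: int  # number of registers in the file
    aliases: dict = field(default_factory=dict)  # index -> alias name


def build_name_table(regfile):
    """Return index -> name, preferring aliases over generic names."""
    # Start with generic names such as x0..x31.
    names = {i: f"{regfile.prefix}{i}" for i in range(regfile.size)}
    # Overlay whatever aliases the model declares.
    names.update(regfile.aliases)
    return names


# Example with a few aliases only; a real model would supply its own set.
x_file = RegisterFile("x", 32, {0: "zero", 1: "ra", 2: "sp"})
table = build_name_table(x_file)
print(table[0], table[2], table[5])
```

Registers without an alias keep their generic name, so a model that declares no aliases still yields a complete table.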

Collaborator

@wysiwyng wysiwyng left a comment

i would not necessarily say this behavior is correct, how does e.g. objdump handle repeated 0x0000 codepoints?

@wysiwyng
Collaborator

args_disass removal is fine, this was left in for compatibility with legacy CoreDSL2 files.

@PhilippvK
Member Author

PhilippvK commented May 30, 2023

> i would not necessarily say this behavior is correct, how does e.g. objdump handle repeated 0x0000 codepoints?

objdump reads ELF files instead of raw .bin files, i.e. it only dumps the sections which make sense, skipping the zeros in between. Our disassembler also does not print the correct addresses (e.g. starting with 80000000) because that information is not contained in the binary.

RISCV objdump output with --disassemble-all for reference:

800003c0 <pass>:
800003c0:       0ff0000f                fence
800003c4:       00100193                li      gp,1
800003c8:       05d00893                li      a7,93
800003cc:       00000513                li      a0,0
800003d0:       00000073                ecall
800003d4:       c0001073                unimp
        ...

Disassembly of section .tohost:

80001000 <tohost>:
        ...

80001040 <fromhost>:
        ...

Disassembly of section .data:

80002000 <test_2_data>:
80002000:       0000                    .2byte  0x0
80002002:       4020                    .2byte  0x4020
80002004:       0000                    .2byte  0x0
80002006:       3f80                    .2byte  0x3f80
80002008:       0000                    .2byte  0x0
8000200a:       0000                    .2byte  0x0
8000200c:       0000                    .2byte  0x0
8000200e:       4060                    .2byte  0x4060

@wysiwyng
Collaborator

objdump can also read binary files, the handling of long sequences of equal instruction words remains the same:

23a0:   557d                    li      a0,-1
23a2:   b7c1                    j       0x2362
...
23b0:   1101                    addi    sp,sp,-32
23b2:   cc22                    sw      s0,24(sp)

the disassembler backend was initially a toy project to test how decoders would be implemented using information directly from the M2-ISA-R model. if we want to use this backend seriously, it probably also needs some kind of ELF support.

for this PR, i think expanding the duplicate handling to all codewords is enough.
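A minimal sketch of duplicate handling for all codewords, assuming a flat list of fixed-size instruction words: the first word of a run is printed and the remaining repeats are replaced by a single "..." line. The function collapse_repeats, the base_addr parameter, and the fixed word_size are assumptions for illustration; this is neither the PR's implementation nor objdump's exact algorithm, and it ignores mixed 16/32-bit encodings.

```python
def collapse_repeats(words, base_addr=0, word_size=2):
    """Format words as 'addr: word' lines, collapsing runs of equal words.

    base_addr is an assumed load address supplied by the caller, since a
    raw binary does not carry that information.
    """
    lines = []
    prev = None
    in_run = False
    for i, word in enumerate(words):
        addr = base_addr + i * word_size
        if word == prev:
            # Same word as before: emit one marker per run, then stay quiet.
            if not in_run:
                lines.append("...")
                in_run = True
            continue
        lines.append(f"{addr:8x}:   {word:04x}")
        prev = word
        in_run = False
    return lines


for line in collapse_repeats([0x557D, 0xB7C1, 0, 0, 0, 0x1101], base_addr=0x23A0):
    print(line)
```

Because the comparison is against the previous word rather than against 0x0000, the same code path covers padding with any repeated value.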
