Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal for refactored Intrinsics configuration #155

Open
PhilippvK opened this issue Dec 30, 2024 · 2 comments
Open

Proposal for refactored Intrinsics configuration #155

PhilippvK opened this issue Dec 30, 2024 · 2 comments
Assignees

Comments

@PhilippvK
Copy link
Member

Proposed changes:

  • Move intrinsics from settings.intrinsics to settings.extensions.{ext_name}.intrinsics (easier lookup)
  • Add signedness information to arguments and return type
  • Allow specifying allowed value ranges (i.e. [0, 31] or [-16, 15] for 5-bit operand exposed via uint8_t argument)
  • Use c_type instead of ir_type as the default format to define argument types (includes sign information)
  • Allow specifying defaults for intrinsics/builtins (on per-extension level)
  • Allow just generating an LLVM intrinsic without generating the respective Clang builtin, i.e. via builtin: null
  • Define/override the builtin/intrinsic prefix per-instruction or per-extension, i.e. via builtin.prefix: "__builtin_riscv_{arch}" or intrinsic.prefix: "llvm.int.riscv.{arch}"
  • Support intrinsic properties (IntrNoMem, IntrWillReturn, ...) and builtin attributes (NoThrow, Const, ...). Either explicitly or automatically inferred by Seal5 passes.

This is mostly not implemented in Seal5 yet, so I am happy about any comment.

In the following, I will provide examples to compare the new and old configuration.

Example 1 (before):

---
extensions:
  XCoreVAlu:
    # feature: XCValu
    # arch: xcvalu
    version: "1.0"
    experimental: false
    vendor: true
intrinsics:
  # See: https://github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md#listing-of-multiply-accumulate-builtins-xcvmac
  intrinsics:
  # unary
  - &ScalarCoreVAluGprIntrinsic
    args:
    - arg_name: rs1
      arg_type: i32
    set_name: XCoreVAlu
    instr_name: SEAL5_CV_ABS
    intrinsic_name: alu_abs
    ret_type: i32
  - <<: *ScalarCoreVAluGprIntrinsic
    instr_name: SEAL5_CV_EXTHS
    intrinsic_name: alu_exths
  - <<: *ScalarCoreVAluGprIntrinsic
    instr_name: SEAL5_CV_EXTHZ
    intrinsic_name: alu_exthz
  - <<: *ScalarCoreVAluGprIntrinsic
    instr_name: SEAL5_CV_EXTBS
    intrinsic_name: alu_extbs
  - <<: *ScalarCoreVAluGprIntrinsic
    instr_name: SEAL5_CV_EXTBZ
    intrinsic_name: alu_extbz
  # binary
  - &ScalarCoreVAluGprGprIntrinsic
    args:
    - arg_name: rs1
      arg_type: i32
    - arg_name: rs2
      arg_type: i32
    set_name: XCoreVAlu
    instr_name: SEAL5_CV_SLET
    intrinsic_name: alu_slet
    ret_type: i32
  - <<: *ScalarCoreVAluGprGprIntrinsic
    instr_name: SEAL5_CV_SLETU
    intrinsic_name: alu_sletu
  - <<: *ScalarCoreVAluGprGprIntrinsic
    instr_name: SEAL5_CV_MIN
    intrinsic_name: alu_min
  - <<: *ScalarCoreVAluGprGprIntrinsic
    instr_name: SEAL5_CV_MINU
    intrinsic_name: alu_minu
  - <<: *ScalarCoreVAluGprGprIntrinsic
    instr_name: SEAL5_CV_MAX
    intrinsic_name: alu_max
  - <<: *ScalarCoreVAluGprGprIntrinsic
    instr_name: SEAL5_CV_MAXU
    intrinsic_name: alu_maxu
  - <<: *ScalarCoreVAluGprGprIntrinsic
    instr_name: SEAL5_CV_CLIP
    intrinsic_name: alu_clip
  - <<: *ScalarCoreVAluGprGprIntrinsic
    instr_name: SEAL5_CV_CLIPU
    intrinsic_name: alu_clipu
  # ternary
  - &ScalarCoreVAluGprGprGprIntrinsic
    args:
    - arg_name: rs1
      arg_type: i32
    - arg_name: rs2
      arg_type: i32
    - arg_name: rs3  # TODO: Shouldn't this be Luimm5 and immediate: true?
      arg_type: i32
    # - arg_name: Is3
    #   arg_type: i32
    #   immediate: true
    set_name: XCoreVAlu
    instr_name: SEAL5_CV_ADDN
    intrinsic_name: alu_addn
    ret_type: i32
  - <<: *ScalarCoreVAluGprGprGprIntrinsic
    instr_name: SEAL5_CV_ADDUN
    intrinsic_name: alu_addun
  - <<: *ScalarCoreVAluGprGprGprIntrinsic
    instr_name: SEAL5_CV_ADDRN
    intrinsic_name: alu_addrn
  - <<: *ScalarCoreVAluGprGprGprIntrinsic
    instr_name: SEAL5_CV_ADDURN
    intrinsic_name: alu_addurn
  - <<: *ScalarCoreVAluGprGprGprIntrinsic
    instr_name: SEAL5_CV_SUBN
    intrinsic_name: alu_subn
  - <<: *ScalarCoreVAluGprGprGprIntrinsic
    instr_name: SEAL5_CV_SUBUN
    intrinsic_name: alu_subun
  - <<: *ScalarCoreVAluGprGprGprIntrinsic
    instr_name: SEAL5_CV_SUBRN
    intrinsic_name: alu_subrn
  - <<: *ScalarCoreVAluGprGprGprIntrinsic
    instr_name: SEAL5_CV_SUBURN
    intrinsic_name: alu_suburn

Example 2 (after, long):

---
extensions:
  XCoreVAlu:
    # feature: XCValu
    # arch: xcvalu
    version: "1.0"
    experimental: false
    vendor: true
    # See: https://github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md#listing-of-multiply-accumulate-builtins-xcvmac
    llvm:
      defaults:
        intrinsic: &default_intrinsic
          prefix: "llvm.int.riscv.{arch}"
          properties: &default_intrinsic_properties
          - IntrNoMem
          - IntrWillReturn
          builtin: &default_builtin
            prefix: "__builtin_riscv_{arch}"
            attributes: &default_builtin_attributes
              - NoThrow
              - Const
      intrinsics:
      # unary
      - instr_name: SEAL5_CV_ABS
        intrinsic: &ScalarCoreVAluGprIntrinsic
          <<: *default_intrinsic 
          # name: auto/null -> instr_name.lower().replace(".", "_") -> seal5.cv.abs -> seal_cv_abs?
          name: alu_abs
          args: &unary_args_signed
          - name: x
            type:
              c_type: int32_t
              # specifier: SZ
              # ir_type: i32
              # td_type: llvm_i32_ty
              # cdsl_type: signed<32>
              # cdsl_type:
              #   data_type: U
              #   size: 32
              #   width: 0
            # op_name: rs1
            # op_idx: 0
            # range: null
            # immediate: false
          ret:
            type:
              c_type: int32_t
      - instr_name: SEAL5_CV_EXTHS
        intrinsic:
          <<: *ScalarCoreVAluGprIntrinsic
          name: alu_exths
      - instr_name: SEAL5_CV_EXTHZ
        intrinsic:
          <<: *ScalarCoreVAluGprIntrinsic
          name: alu_exthz
      - instr_name: SEAL5_CV_EXTBS
        intrinsic:
          <<: *ScalarCoreVAluGprIntrinsic
          name: alu_extbs
      - instr_name: SEAL5_CV_EXTBZ
        intrinsic:
          <<: *ScalarCoreVAluGprIntrinsic
          name: alu_extbz
      # binary
      - instr_name: SEAL5_CV_SLET
        intrinsic: &ScalarCoreVAluGprGprIntrinsic
          <<: *default_intrinsic
          name: alu_slet
          args: &binary_args_signed
          - name: a
            type:
              c_type: int32_t
          - name: b
            type:
              c_type: int32_t
          ret:
            type:
              c_type: int32_t
      - instr_name: SEAL5_CV_SLETU
        intrinsic:
          <<: *ScalarCoreVAluGprGprUIntrinsic
          name: alu_sletu
          args: &binary_args_unsigned
          - name: a
            type:
              c_type: uint32_t
          - name: b
            type:
              c_type: uint32_t
      - instr_name: SEAL5_CV_MIN
        intrinsic:
          <<: *ScalarCoreVAluGprGprIntrinsic
          name: alu_min
      - instr_name: SEAL5_CV_MINU
        intrinsic:
          <<: *ScalarCoreVAluGprGprIntrinsic
          name: alu_minu
          args: *binary_args_unsigned
      - instr_name: SEAL5_CV_MAX
        intrinsic:
          <<: *ScalarCoreVAluGprGprIntrinsic
          name: alu_max
      - instr_name: SEAL5_CV_MAXU
        intrinsic:
          <<: *ScalarCoreVAluGprGprIntrinsic
          name: alu_maxu
      - instr_name: SEAL5_CV_CLIP
        intrinsic:
          <<: *ScalarCoreVAluGprGprIntrinsic
          name: alu_clip
      - instr_name: SEAL5_CV_CLIPU
        intrinsic:
          <<: *ScalarCoreVAluGprGprIntrinsic
          name: alu_clipu
      # ternary
      - instr_name: SEAL5_CV_ADDN
        intrinsic: &ScalarCoreVAluGprGprGprIntrinsic
          <<: *default_intrinsic
          name: alu_addn
          args: &ternary_args_signed
          - name: a
            type:
              c_type: int32_t
          - name: b
            type:
              c_type: int32_t
          - name: c
            type:
              c_type: uint32_t
              # c_type: uint8_t
            # immediate: true
            # range: [0, 31]
          ret:
            type:
              c_type: int32_t
      - instr_name: SEAL5_CV_ADDUN
        intrinsic:
          <<: &ScalarCoreVAluGprGprGprIntrinsic
          name: alu_addun
          args: &ternary_args_unsigned
          - name: a
            type:
              c_type: uint32_t
          - name: b
            type:
              c_type: uint32_t
          - name: c
            type:
              c_type: uint32_t
              # c_type: uint8_t
            # immediate: true
            # range: [0, 31]
      - instr_name: SEAL5_CV_ADDRN
        intrinsic:
          <<: *ScalarCoreVAluGprGprGprIntrinsic
          name: alu_addrn
      - instr_name: SEAL5_CV_ADDURN
        intrinsic:
          <<: *ScalarCoreVAluGprGprGprIntrinsic
          name: alu_addurn
          args: *ternary_args_unsigned
      - instr_name: SEAL5_CV_SUBN
        intrinsic:
          <<: *ScalarCoreVAluGprGprGprIntrinsic
          name: alu_subn
      - instr_name: SEAL5_CV_SUBUN
        intrinsic:
          <<: *ScalarCoreVAluGprGprGprIntrinsic
          name: alu_subun
          args: *ternary_args_unsigned
      - instr_name: SEAL5_CV_SUBRN
        intrinsic:
          <<: *ScalarCoreVAluGprGprGprIntrinsic
          name: alu_subrn
      - instr_name: SEAL5_CV_SUBURN
        intrinsic:
          <<: *ScalarCoreVAluGprGprGprIntrinsic
          name: alu_suburn
          args: *ternary_args_unsigned

Note that despite the compact syntax using YAML anchors, this is actually a bit longer than the original. However, this can partly be explained by the fact, that this Syntax includes more information (i.e. signedness/range of arguments)

I also came up with a "shorthand" syntax that lets you specify just the prototype for the builtins. The actual configuration would then need to be inferred by Seal5. This will work in some cases, but it should be possible to provide the full information as shown above, too.

Example 3 (after, short):

---
extensions:
  XCoreVAlu:
    # feature: XCValu
    # arch: xcvalu
    version: "1.0"
    experimental: false
    vendor: true
    # See: https://github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md#listing-of-multiply-accumulate-builtins-xcvmac
    llvm:
      # defaults:
      #   intrinsic: &default_intrinsic
      #     prefix: "llvm.int.riscv.{arch}"
      #     properties: &default_intrinsic_properties
      #     - IntrNoMem
      #     - IntrWillReturn
      #     builtin: &default_builtin
      #       prefix: "__builtin_riscv_{arch}"
      #       attributes: &default_builtin_attributes
      #         - NoThrow
      #         - Const
      intrinsics:
      # unary
      - instr_name: SEAL5_CV_ABS
        # Alternative with explicit mapping between arguments and operands
        # intrinsic: "int32_t __builtin_riscv_xcorevalu_alu_abs(int32_t x{:rs1})"
        intrinsic: "int32_t __builtin_riscv_xcorevalu_alu_abs(int32_t x)"
      - instr_name: SEAL5_CV_EXTHS
        intrinsic: "TODO"
      - instr_name: SEAL5_CV_EXTHZ
        intrinsic: "TODO"
      - instr_name: SEAL5_CV_EXTBS
        intrinsic: "TODO"
      - instr_name: SEAL5_CV_EXTBZ
        intrinsic: "TODO"
      # binary
      - instr_name: SEAL5_CV_SLET
        intrinsic: "TODO"
      - instr_name: SEAL5_CV_SLETU
        intrinsic: "TODO"
      - instr_name: SEAL5_CV_MIN
        intrinsic: "TODO"
      - instr_name: SEAL5_CV_MINU
        intrinsic: "TODO"
      - instr_name: SEAL5_CV_MAX
        intrinsic: "TODO"
      - instr_name: SEAL5_CV_MAXU
        intrinsic: "TODO"
      - instr_name: SEAL5_CV_CLIP
        intrinsic: "TODO"
      - instr_name: SEAL5_CV_CLIPU
        intrinsic: "TODO"
      # ternary
      - instr_name: SEAL5_CV_ADDN
        intrinsic:  "TODO"
      - instr_name: SEAL5_CV_ADDUN
        intrinsic: "TODO"
      - instr_name: SEAL5_CV_ADDRN
        intrinsic: "TODO"
      - instr_name: SEAL5_CV_ADDURN
        intrinsic: "TODO"
      - instr_name: SEAL5_CV_SUBN
        intrinsic: "TODO"
      - instr_name: SEAL5_CV_SUBUN
        intrinsic: "TODO"
      - instr_name: SEAL5_CV_SUBRN
        intrinsic: "TODO"
      - instr_name: SEAL5_CV_SUBURN
        intrinsic: "TODO"

This the compact syntax it should also be easy to embed the builtin settings (optionally of course) in the CDSL syntax via attributes. Here is an example:

SEAL5_CV_ABS [[builtin="int32_t __builtin_riscv_xcorevalu_alu_abs(int32_t x)"]] {
  encoding: ...
  assembly: ...
  behavior: {...}
}
@PhilippvK
Copy link
Member Author

@PhilippvK PhilippvK self-assigned this Dec 30, 2024
@thomasgoodfellow
Copy link
Collaborator

Looks ambitiously comprehensive, or comprehensively ambitious :-) Scattershot shallow comments...

  • Is the motivation for using c_type over ir_type principally to include the signed-ness info? Seems good anyway since the types are more obvious, maybe more dominant in future LLVM evolution
  • Number ranges might also be useful for subsetting - eg like restricting the register numbers in RV32E to [0, 15] (though that's a poor example though since those are also the RV32I instruction encodings so could only work if the toolchain is being built to only support RV32E)
  • Custom immediate handling in general is (was? did it get better in LLVM19?) regrettably ugly with the need to generate C++ rather than expressing purely through TableGen
  • With some different layers of config possible, e.g. the CoreDSL annotations from Support automatically generated intrinsics/builtins #69, prefix overrides here, clear diagnostics will be a blessing for understanding the results
  • As you note the wordiness of the configuration is an existing problem with too much information being duplicated from the CoreDSL types. And some of those anchor patterns like "binary_args_unsigned" might be common/generic enough to be worth promoting to some "standard_intrinsic_patterns.yml" (though

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants