Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for 64bit immediate with type 2 #820

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

Alan-Jowett
Copy link
Contributor

@Alan-Jowett Alan-Jowett commented Jan 7, 2025

From the ISA RFC:
5.4.  64-bit immediate instructions

   Instructions with the IMM 'mode' modifier use the wide instruction
   encoding defined in Instruction encoding (Section 3), and use the
   'src_reg' field of the basic instruction to hold an opcode subtype.

   The following table defines a set of {IMM, DW, LD} instructions with
   opcode subtypes in the 'src_reg' field, using new terms such as "map"
   defined further below:

    +=========+================================+==========+==========+
    | src_reg | pseudocode                     | imm type | dst type |
    +=========+================================+==========+==========+
    | 0x0     | dst = (next_imm << 32) | imm   | integer  | integer  |
    +---------+--------------------------------+----------+----------+
    | 0x1     | dst = map_by_fd(imm)           | map fd   | map      |
    +---------+--------------------------------+----------+----------+
    | 0x2     | dst = map_val(map_by_fd(imm))  | map fd   | data     |
    |         | + next_imm                     |          | address  |
    +---------+--------------------------------+----------+----------+
    | 0x3     | dst = var_addr(imm)            | variable | data     |
    |         |                                | id       | address  |
    +---------+--------------------------------+----------+----------+
    | 0x4     | dst = code_addr(imm)           | integer  | code     |
    |         |                                |          | address  |
    +---------+--------------------------------+----------+----------+
    | 0x5     | dst = map_by_idx(imm)          | map      | map      |
    |         |                                | index    |          |
    +---------+--------------------------------+----------+----------+
    | 0x6     | dst = map_val(map_by_idx(imm)) | map      | data     |
    |         | + next_imm                     | index    | address  |
    +---------+--------------------------------+----------+----------+

                 Table 12: 64-bit immediate instructions

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for loading map addresses and map file descriptors.
    • Enhanced instruction parsing for map-related operations.
    • Improved handling of global variable relocations.
  • Bug Fixes

    • Updated instruction type classification to include new load instructions.
    • Refined relocation logic for map and global variable instructions.
  • Documentation

    • Added test cases for new map-related functionality.
    • Updated YAML test configurations.
  • Tests

    • Expanded test coverage for map address and file descriptor loading.
    • Modified verification tests for global variables.

Copy link

coderabbitai bot commented Jan 7, 2025

Walkthrough

The pull request introduces support for a new instruction type LoadMapAddress in the eBPF verifier. This change extends the instruction set to handle loading map addresses with an offset, enhancing the flexibility of map-related operations. The modifications span multiple source files, adding new parsing, marshaling, and transformation logic to support this instruction type across the verification and compilation process.

Changes

File Change Summary
src/asm_syntax.hpp Added LoadMapAddress struct to instruction variant
src/asm_parse.cpp Added regex and parsing support for map value and map fd instructions
src/asm_marshal.cpp, src/asm_unmarshal.cpp Updated makeLddw to handle new instruction mode
src/ebpf_vm_isa.hpp Introduced new instruction mode constants
src/crab/ebpf_transformer.cpp Added method to process map address loading
src/test/* Added test cases for new map address functionality

Possibly related issues

  • vbpf/ebpf-verifier#763: This PR partially addresses the issue by implementing support for additional 64-bit immediate instruction modes, specifically type 2 (map value loading). The implementation aligns with the IETF BPF ISA draft's specification for map value address loading.

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 320dee7 and 10b9737.

📒 Files selected for processing (1)
  • src/crab/ebpf_transformer.cpp (3 hunks)
🧰 Additional context used
📓 Learnings (1)
src/crab/ebpf_transformer.cpp (2)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/crab/ebpf_transformer.cpp:1844-1858
Timestamp: 2025-01-23T17:17:50.711Z
Learning: The platform's `get_map_descriptor` method is responsible for map descriptor validation and throws `UnmarshalError` for invalid file descriptors. This is the appropriate layer for validation rather than in the transformer.
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/crab/ebpf_transformer.cpp:1844-1858
Timestamp: 2025-01-23T17:17:50.711Z
Learning: Map descriptor validation should be handled by the platform's `get_map_descriptor` method, which should throw an exception for invalid file descriptors.
⏰ Context from checks skipped due to timeout of 90000ms (10)
  • GitHub Check: build_windows (Release, library)
  • GitHub Check: build_ubuntu (Release, library)
  • GitHub Check: build_windows (Release, tests)
  • GitHub Check: build_ubuntu (Release, tests)
  • GitHub Check: build_windows (Debug, library)
  • GitHub Check: build_windows (Debug, tests)
  • GitHub Check: build_ubuntu (Debug, tests)
  • GitHub Check: build_ubuntu (Release)
  • GitHub Check: build_ubuntu (Debug)
  • GitHub Check: Analyze (cpp)
🔇 Additional comments (2)
src/crab/ebpf_transformer.cpp (2)

1844-1858: LGTM! The implementation correctly handles map address loading.

The method properly validates map types and sets up shared region parameters for data maps.


1860-1860: LGTM! Simple and correct delegation.

The implementation follows the established pattern of delegating to the core implementation method.

✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 46344ae and a864629.

📒 Files selected for processing (15)
  • external/libbtf (1 hunks)
  • src/asm_cfg.cpp (1 hunks)
  • src/asm_files.cpp (8 hunks)
  • src/asm_marshal.cpp (3 hunks)
  • src/asm_ostream.cpp (2 hunks)
  • src/asm_parse.cpp (2 hunks)
  • src/asm_syntax.hpp (2 hunks)
  • src/asm_unmarshal.cpp (2 hunks)
  • src/assertions.cpp (1 hunks)
  • src/crab/ebpf_transformer.cpp (3 hunks)
  • src/ebpf_vm_isa.hpp (1 hunks)
  • src/test/test_marshal.cpp (2 hunks)
  • src/test/test_verify.cpp (1 hunks)
  • src/test/test_yaml.cpp (1 hunks)
  • test-data/map.yaml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: build_windows (Release, library)
  • GitHub Check: build_ubuntu (Release, tests)
  • GitHub Check: build_windows (Release, tests)
  • GitHub Check: build_ubuntu (Debug, tests)
  • GitHub Check: build_windows (Debug, library)
  • GitHub Check: build_windows (Debug, tests)
  • GitHub Check: build_ubuntu (Release)
  • GitHub Check: Analyze (cpp)
  • GitHub Check: build_ubuntu (Debug)
🔇 Additional comments (29)
src/asm_files.cpp (7)

174-174: Assignment of INST_LD_MODE_MAP_FD

inst.src is correctly set to INST_LD_MODE_MAP_FD for loading map file descriptors.


199-223: Implementation of relocate_global_variable function

Function correctly handles the relocation of global variables.


350-350: Initialization of global_variable_section_indices

Declaration and initialization are appropriate.


370-377: Checking for old-style map sections

Loop correctly identifies the presence of old-style map sections.


Line range hint 380-416: Handling BTF data and global variable sections

Proper handling of BTF data and initialization of section indices for .maps, .data, .bss, and .rodata.


499-504: Relocating global variables

Relocation for global variables using relocate_global_variable is correctly implemented.


477-479: Verify inst_next assignment logic

Ensure that the calculation of inst_next handles boundary conditions correctly.

Review the assignment:

ebpf_inst& inst_next = offset / sizeof(ebpf_inst) + 1 < prog.prog.size()
                           ? prog.prog[offset / sizeof(ebpf_inst) + 1]
                           : inst;

Confirm that when offset / sizeof(ebpf_inst) + 1 equals prog.prog.size(), inst_next correctly references inst to avoid out-of-bounds access.

✅ Verification successful

Boundary condition handling is correct

The assignment logic safely handles the boundary case by falling back to inst when at the end of the program, preventing out-of-bounds access. This is consistent with the codebase's instruction handling patterns and is validated by the test suite.

🏁 Scripts executed

The following scripts were executed for the analysis:

Error: No shell scripts found


Script:

#!/bin/bash
# Get the complete context of the function containing this code
ast-grep --pattern 'function $FUNC_NAME($$$) {
  $$$
  inst_next = $$$
  $$$
}'

# Search for similar patterns and usage of prog.prog
rg "prog\.prog\[" -A 2 -B 2

# Look for test files
fd "test.*cpp$"

Length of output: 1097


Script:

#!/bin/bash
# Check test_marshal.cpp for relevant test cases
cat src/test/test_marshal.cpp

# Look for verification functions
rg "verify|validate" src/test/test_marshal.cpp -A 2 -B 2

Length of output: 44773

src/test/test_yaml.cpp (1)

34-34: Addition of map.yaml to YAML test suite

New test case YAML_CASE("test-data/map.yaml") added to the test suite.

src/ebpf_vm_isa.hpp (1)

91-93: Defining new load modes

New constants INST_LD_MODE_IMM, INST_LD_MODE_MAP_FD, and INST_LD_MODE_MAP_VALUE are appropriately defined and adhere to the instruction set.

src/asm_syntax.hpp (2)

96-103: LGTM: Clean struct definition.

The struct follows the established pattern with appropriate member types and a defaulted equality operator.


261-261: LGTM: Proper variant type update.

LoadMapAddress is correctly added to the Instruction variant type.

src/asm_marshal.cpp (3)

Line range hint 90-97: LGTM: Enhanced flexibility with type parameter.

The function now supports multiple operation modes through the type parameter instead of a boolean flag.


108-108: LGTM: Proper mode constant usage.

Correctly uses INST_LD_MODE_MAP_FD constant with the updated makeLddw function.


110-112: LGTM: Clean implementation of LoadMapAddress marshaling.

The implementation properly handles both mapfd and offset parameters using the correct mode constant.

src/assertions.cpp (1)

48-48: LGTM: Consistent with LoadMapFd behavior.

Correctly returns an empty vector of assertions, matching the pattern of similar operations.

src/asm_cfg.cpp (1)

402-403: LGTM: Consistent instruction type categorization.

LoadMapAddress is correctly categorized as an "assign" type instruction, matching the behavior of similar operations.

src/asm_ostream.cpp (1)

362-363: LGTM!

The string representation matches the assembly syntax defined by the MAP_VAL regex pattern.

src/asm_unmarshal.cpp (3)

Line range hint 432-441: LGTM!

The validation logic for instruction fields is correct and comprehensive.


442-449: LGTM!

The handling of map file descriptor mode is correct.


450-452: LGTM!

The creation of LoadMapAddress instruction is correct.

src/test/test_verify.cpp (1)

427-427: Verify global variable support in the verifier.

The change from TEST_SECTION_FAIL to TEST_SECTION indicates that global variables are now supported.

Run this script to verify global variable support:

✅ Verification successful

Global variable support is properly implemented

  • Implementation in src/asm_files.cpp handles global variable relocation, section management (.data, .bss, .rodata), and validates operations
  • The change from TEST_SECTION_FAIL to TEST_SECTION is consistent with the codebase
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if global variables are properly supported in the verifier
# Test: Search for global variable handling code
ast-grep --pattern $'class $_$ {
  $$$
  handle_global_variable($_, $_) {
    $$$
  }
  $$$
}'

Length of output: 95


Script:

#!/bin/bash
# Search for test files and code related to global variables
echo "=== Test files with 'global' in name ==="
fd -t f "global" --exec echo {}

echo -e "\n=== Code mentions of global variables ==="
rg -i "global.*variable" -A 2

echo -e "\n=== Global symbol handling ==="
ast-grep --pattern 'global_symbol'

Length of output: 2087

src/test/test_marshal.cpp (1)

44-45: LGTM: Added support for LDDW with src_reg=2.

The addition of {{0x18, DST, 2, 0, IMM}, bpf_conformance_groups_t::base64} to the instruction template table correctly implements support for 64-bit immediate with type 2.

src/crab/ebpf_transformer.cpp (2)

1844-1854: Implementation correctly handles map address loading.

The do_load_map_address method:

  1. Retrieves map descriptor and type
  2. Sets shared region size and offset
  3. Assigns valid pointer type (non-null)

1856-1857: LGTM: Simple delegation to do_load_map_address.

The operator correctly forwards the call to the implementation method.

external/libbtf (1)

1-1: Verify the submodule commit update

The submodule update needs verification to ensure it aligns with the 64-bit immediate instruction support.

Run this script to verify the commit:

✅ Verification successful

Submodule update aligns with 64-bit immediate support

The commit adds parsing support for data sections into array maps, which is consistent with the implementation of 64-bit immediate instruction support.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the submodule commit exists and examine its changes

# Check if the commit exists and is accessible
git ls-remote https://github.com/vbpf/libbtf.git 6aba1609ce53e11639cecb4fa8750dd01d5ce209

# Get the commit message and changed files
gh api repos/vbpf/libbtf/commits/6aba1609ce53e11639cecb4fa8750dd01d5ce209 --jq '.commit.message, .files[].filename'

Length of output: 448


Script:

#!/bin/bash
# Get the actual submodule URL and configuration
cat .gitmodules
git config --file .gitmodules --get-regexp url | grep libbtf
git submodule status external/libbtf

Length of output: 688


Script:

#!/bin/bash
# Verify commit in the correct repository
git ls-remote https://github.com/Alan-Jowett/libbtf.git 6aba1609ce53e11639cecb4fa8750dd01d5ce209

# Get commit details
gh api repos/Alan-Jowett/libbtf/commits/6aba1609ce53e11639cecb4fa8750dd01d5ce209 --jq '.commit.message, .files[].filename'

Length of output: 383

test-data/map.yaml (4)

1-2: License headers are properly formatted.


11-12: Document magic numbers and verify region size.

  1. Document the significance of 2147418112 in svalue array
  2. Confirm if shared_region_size=4 is sufficient for 64-bit operations mentioned in PR objectives

8-8: Verify map_val(1) existence.

Add precondition to verify that map_val(1) exists before the test.


23-26: Verify map_fd postconditions.

  1. Document the significance of svalue=[1, 2147418112]
  2. Add assertions to verify map_fd validity
✅ Verification successful

Map FD value range [1, 2147418112] ensures valid map operations.

  1. svalue=[1, 2147418112] represents valid map descriptor range:

    • Lower bound 1: Ensures map_fd is valid (0 is invalid)
    • Upper bound 2147418112 (0x7FFF_FFF0): Maximum system-allowed map FD
  2. Built-in validation:

    • Type system enforces map_fd type constraints
    • Range checks prevent invalid map operations
    • System validates map descriptors during creation
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for map_fd validation logic
rg "map_fd.*validation|verify" -A 5

Length of output: 14496


Script:

#!/bin/bash
# Search for map-related code and documentation
echo "=== Map Creation and Initialization ==="
ast-grep --pattern 'create_map'

echo -e "\n=== Map Descriptor and FD Handling ==="
rg "map.*descriptor|map.*fd" -A 3

echo -e "\n=== Map Value Documentation ==="
rg -l "\.md$" | xargs rg "map.*value|map.*fd"

Length of output: 71240

src/asm_files.cpp Outdated Show resolved Hide resolved
src/asm_parse.cpp Show resolved Hide resolved
src/asm_parse.cpp Show resolved Hide resolved
src/asm_parse.cpp Show resolved Hide resolved
test-data/map.yaml Show resolved Hide resolved
@coveralls
Copy link

coveralls commented Jan 7, 2025

Coverage Status

coverage: 88.133% (-0.04%) from 88.169%
when pulling 10b9737 on Alan-Jowett:load_map_address
into 15e1925 on vbpf:main.

src/ebpf_vm_isa.hpp Outdated Show resolved Hide resolved
src/test/test_verify.cpp Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a864629 and 76a6a71.

📒 Files selected for processing (3)
  • src/asm_files.cpp (7 hunks)
  • src/ebpf_vm_isa.hpp (1 hunks)
  • src/test/test_verify.cpp (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: build_windows (Release, library)
  • GitHub Check: build_windows (Release, tests)
  • GitHub Check: build_ubuntu (Release, tests)
  • GitHub Check: build_windows (Debug, library)
  • GitHub Check: build_windows (Debug, tests)
  • GitHub Check: build_ubuntu (Debug, tests)
  • GitHub Check: build_ubuntu (Release)
  • GitHub Check: Analyze (cpp)
  • GitHub Check: build_ubuntu (Debug)
🔇 Additional comments (7)
src/ebpf_vm_isa.hpp (1)

90-93: LGTM! Well-defined instruction load mode constants.

The new constants follow a logical sequence and are well-documented.

src/asm_files.cpp (4)

174-174: LGTM! Correct use of the new load mode constant.

The source register is properly set to INST_LD_MODE_MAP_FD for map relocations.


199-223: LGTM! Well-structured global variable relocation handling.

The function properly handles global variable relocations with appropriate error checking and uses the correct load mode constant.


368-374: LGTM! Clear separation of old-style map section detection.

The code cleanly identifies old-style map sections before processing.


400-414: LGTM! Comprehensive global variable section handling.

All standard global variable sections (.data, .bss, .rodata) are properly tracked.

src/test/test_verify.cpp (2)

528-528: Move this test case to the appropriate section.

Move this line down to around line 519 where the other TEST_SECTION("build", ...) lines are grouped.


528-528: Verify the change in test expectation.

The test case has been changed from TEST_SECTION_FAIL to TEST_SECTION, indicating that global variables are now expected to pass verification.

Run the following script to verify the change:

✅ Verification successful

Test change from TEST_SECTION_FAIL to TEST_SECTION is correct

The codebase shows proper implementation of global variable support, including section handling (.data, .bss, .rodata) and relocation logic. The test expectation change reflects this implementation.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the change in global variable handling.

# Test: Search for any related changes in global variable handling.
rg -A 5 'global_variable' --type cpp --type hpp

# Test: Search for any related test cases.
rg -A 5 'TEST_SECTION.*global_variable'

Length of output: 618


Script:

#!/bin/bash
# Find the TEST_SECTION_FAIL occurrences and macro definitions
rg -A 2 'TEST_SECTION_FAIL'

# Find the test macro definitions
rg -B 2 -A 5 '#define TEST_SECTION'

# Look for global variable related changes in recent commits
git log -n 5 --oneline --grep="global.*variable"

Length of output: 7945


Script:

#!/bin/bash
# Search for global variable verification implementation
ast-grep --pattern 'global' src/

# Search for recent changes in verifier code
rg -l 'global.*variable' src/

Length of output: 107


Script:

#!/bin/bash
# Check implementation in asm_files.cpp
rg -B 3 -A 3 'global.*variable' src/asm_files.cpp

# Look for recent changes in the file
git log -p src/asm_files.cpp | rg -B 3 -A 3 'global'

Length of output: 1651

src/asm_files.cpp Outdated Show resolved Hide resolved
src/asm_files.cpp Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 76a6a71 and 14b2b3f.

📒 Files selected for processing (1)
  • src/asm_files.cpp (7 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: build_windows (Release, library)
  • GitHub Check: build_ubuntu (Release, tests)
  • GitHub Check: build_windows (Release, tests)
  • GitHub Check: build_windows (Debug, library)
  • GitHub Check: build_ubuntu (Debug, tests)
  • GitHub Check: build_windows (Debug, tests)
  • GitHub Check: build_ubuntu (Release)
  • GitHub Check: Analyze (cpp)
  • GitHub Check: build_ubuntu (Debug)
🔇 Additional comments (1)
src/asm_files.cpp (1)

501-504: Potential inconsistency in symbol names

In relocate_global_variable, using reader.sections[symbol_section_index]->get_name() as symbol_name may not match the original symbol_name used elsewhere. Verify that the correct symbol name is passed to maintain consistency.

src/asm_files.cpp Show resolved Hide resolved
src/asm_files.cpp Outdated Show resolved Hide resolved
src/asm_files.cpp Show resolved Hide resolved
src/asm_files.cpp Outdated Show resolved Hide resolved
@Alan-Jowett
Copy link
Contributor Author

@coderabbitai review

Copy link

coderabbitai bot commented Jan 8, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 14b2b3f and 0eec122.

📒 Files selected for processing (1)
  • src/asm_files.cpp (7 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: build_ubuntu (Debug, tests)
  • GitHub Check: build_windows (Debug, tests)
  • GitHub Check: build_ubuntu (Debug)
🔇 Additional comments (5)
src/asm_files.cpp (5)

165-170: LGTM! Well-structured helper function.

The function effectively centralizes load instruction validation logic, reducing code duplication.


Line range hint 172-203: LGTM! Improved error handling and clarity.

The function is now more robust with better error messages and clearer parameter naming.


355-355: LGTM! Comprehensive global variable section handling.

The implementation correctly handles all standard global variable sections (.data, .bss, .rodata).

Also applies to: 411-421


Line range hint 481-516: LGTM! Robust relocation handling.

The relocation logic is well-structured with appropriate error checking and clear separation between map and global variable handling.


205-214: ⚠️ Potential issue

Add bounds check for next instruction access.

The function assumes the next instruction exists but doesn't verify this assumption before accessing it.

Apply this diff to add bounds checking:

     verify_load_instruction(reloc_inst, symbol_name, offset);
 
+    // Verify next instruction exists as this is a wide instruction
+    if (offset / sizeof(ebpf_inst) + 1 >= info.map_descriptors.size()) {
+        throw UnmarshalError("Missing next instruction for wide load at offset " +
+                            std::to_string(offset / sizeof(ebpf_inst)));
+    }
     // Copy the immediate value to the next instruction.
     next_reloc_inst.imm = reloc_inst.imm;

Likely invalid or redundant comment.

src/asm_files.cpp Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
src/asm_files.cpp (1)

499-504: ⚠️ Potential issue

Ensure next_reloc_inst is initialized before use

Before calling relocate_global_variable, next_reloc_inst should be properly defined to prevent undefined behavior. Currently, next_reloc_inst may not be initialized when relocate_global_variable is called.

Apply this diff to define next_reloc_inst before calling relocate_global_variable:

     ebpf_inst& reloc_inst = prog.prog[offset / sizeof(ebpf_inst)];
+    // Load instructions are two instructions long, so we need to check the next instruction.
+    if (prog.prog.size() <= offset / sizeof(ebpf_inst) + 1) {
+        throw UnmarshalError("Invalid relocation data");
+    }
+    ebpf_inst& next_reloc_inst = prog.prog[offset / sizeof(ebpf_inst) + 1];

     if (map_section_indices.contains(symbol_section_index)) {
         relocate_map(reloc_inst, symbol_name, map_record_size_or_map_offsets, info, offset, index, symbols);
         continue;
     }

     if (global_variable_section_indices.contains(symbol_section_index)) {
         relocate_global_variable(reloc_inst, next_reloc_inst,
                                  reader.sections[symbol_section_index]->get_name(), info,
                                  map_record_size_or_map_offsets, offset);
         continue;
     }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0eec122 and f863c07.

📒 Files selected for processing (15)
  • external/libbtf (1 hunks)
  • src/asm_cfg.cpp (1 hunks)
  • src/asm_files.cpp (7 hunks)
  • src/asm_marshal.cpp (3 hunks)
  • src/asm_ostream.cpp (2 hunks)
  • src/asm_parse.cpp (2 hunks)
  • src/asm_syntax.hpp (2 hunks)
  • src/asm_unmarshal.cpp (2 hunks)
  • src/assertions.cpp (1 hunks)
  • src/crab/ebpf_transformer.cpp (3 hunks)
  • src/ebpf_vm_isa.hpp (1 hunks)
  • src/test/test_marshal.cpp (2 hunks)
  • src/test/test_verify.cpp (1 hunks)
  • src/test/test_yaml.cpp (1 hunks)
  • test-data/map.yaml (1 hunks)
🧰 Additional context used
📓 Learnings (3)
src/asm_parse.cpp (3)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_parse.cpp:177-179
Timestamp: 2025-01-08T16:39:55.117Z
Learning: The ASM parser in src/asm_parse.cpp is only used for parsing hand-written assembly snippets from YAML test files, not for parsing untrusted input, so input validation is unnecessary.
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_parse.cpp:173-176
Timestamp: 2025-01-08T16:39:47.584Z
Learning: In ebpf-verifier, ASM parsing (parse_instruction) is specifically used in src/test/ebpf_yaml.cpp for parsing controlled test inputs from YAML files in the test-data directory, and is not exposed to external inputs that would require validation.
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_parse.cpp:173-176
Timestamp: 2025-01-08T16:39:47.584Z
Learning: In ebpf-verifier, ASM parsing is only used for parsing hand-written test snippets in YAML files, not for parsing arbitrary input from external sources. Therefore, extensive input validation is not required.
test-data/map.yaml (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: test-data/map.yaml:20-20
Timestamp: 2025-01-08T16:32:48.309Z
Learning: The eBPF verifier test framework does not support testing invalid file descriptors, making it impossible to add test cases for invalid map_fd values or non-existent file descriptors.
src/asm_files.cpp (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_files.cpp:205-229
Timestamp: 2025-01-08T17:37:07.223Z
Learning: Global variable relocations in eBPF are only supported with the new map types that use BTF data, not with the old-style map sections.
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: build_windows (Release, library)
  • GitHub Check: build_ubuntu (Release, tests)
  • GitHub Check: build_windows (Release, tests)
  • GitHub Check: build_ubuntu (Debug, tests)
  • GitHub Check: build_windows (Debug, tests)
  • GitHub Check: build_ubuntu (Release)
  • GitHub Check: build_ubuntu (Debug)
  • GitHub Check: Analyze (cpp)
🔇 Additional comments (23)
src/ebpf_vm_isa.hpp (1)

91-93: New load mode constants defined

Constants INST_LD_MODE_IMM, INST_LD_MODE_MAP_FD, and INST_LD_MODE_MAP_VALUE are added to represent new load modes.

src/asm_syntax.hpp (1)

96-104: LoadMapAddress struct integrated into instruction set

The LoadMapAddress struct is defined and included in the Instruction variant to support loading map addresses.

src/assertions.cpp (1)

48-48: LGTM!

The empty assertion vector for LoadMapAddress is consistent with LoadMapFd's behavior.

src/asm_cfg.cpp (1)

402-403: LGTM!

Categorizing LoadMapAddress as "assign" type is consistent with LoadMapFd handling.

src/asm_parse.cpp (2)

57-62: LGTM!

The regex patterns clearly define the expected formats for map value and map file descriptor instructions.


173-179: LGTM!

The instruction parsing logic correctly creates LoadMapAddress and LoadMapFd instructions from the matched patterns.

src/crab/ebpf_transformer.cpp (3)

47-47: LGTM!

The declaration follows the class's operator overloading pattern and is well-placed with other map-related operations.


1844-1854: LGTM!

The implementation correctly:

  • Retrieves map information
  • Sets up shared memory region
  • Initializes type and pointer validity

1856-1856: LGTM!

The operator follows the single responsibility principle by delegating to the implementation method.

external/libbtf (1)

1-1: Verify the submodule update impact.

The submodule update needs verification to ensure compatibility with the new 64-bit immediate support.

Run this script to analyze the changes between the old and new commits:

✅ Verification successful

Submodule update appears safe

The libbtf integration is straightforward, using standard BTF interfaces for debug information, line numbers, and map definitions. No version-specific dependencies found that could be affected by the update.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Analyze the submodule update impact

# Store the commit hashes
OLD_COMMIT="3c0cbd6ca0920e4d5df6fb88a892f0b5ac5425d7"
NEW_COMMIT="6aba1609ce53e11639cecb4fa8750dd01d5ce209"

# Check if the submodule exists and is initialized
if [ -d "external/libbtf/.git" ]; then
    cd external/libbtf

    # Fetch the commits if they don't exist
    git fetch origin $OLD_COMMIT $NEW_COMMIT

    # Show commit messages between the two versions
    echo "Commit messages between versions:"
    git log --oneline $OLD_COMMIT..$NEW_COMMIT

    # Show breaking changes in public APIs
    echo -e "\nPotential breaking changes in public headers:"
    git diff $OLD_COMMIT..$NEW_COMMIT -- '*.h'

    # Show changes in build system files
    echo -e "\nBuild system changes:"
    git diff $OLD_COMMIT..$NEW_COMMIT -- '*CMakeLists.txt' '*.cmake' 'Makefile*'
fi

Length of output: 141


Script:

#!/bin/bash
# Check submodule directory contents and integration
echo "Checking submodule directory contents:"
fd . external/libbtf -t f

echo -e "\nSearching for libbtf references in build files:"
rg -g 'CMakeLists.txt' -g 'Makefile' -g '*.cmake' 'libbtf'

echo -e "\nSearching for libbtf usage in source files:"
rg -t cpp -t c -t h 'libbtf|btf_' -A 2

Length of output: 10421

test-data/map.yaml (2)

4-14: LGTM! Comprehensive test case for map value loading.

The test case thoroughly verifies the map value loading functionality, including offset, region size, value array, type, and value matching.


16-26: LGTM! Comprehensive test case for map file descriptor loading.

The test case thoroughly verifies the map file descriptor loading functionality, including fd value, type, and value array.

src/test/test_yaml.cpp (1)

34-34: LGTM! Test case added in correct order.

The map.yaml test case is properly integrated into the test suite.

src/asm_marshal.cpp (4)

90-93: LGTM! Improved type safety with explicit mode constants.

The change from boolean flag to type parameter enhances code clarity and maintainability.


108-108: LGTM! Consistent use of mode constant.

The operator correctly uses INST_LD_MODE_MAP_FD, aligning with the new type system.


110-112: LGTM! Well-structured LoadMapAddress operator.

The operator follows the established pattern and correctly uses INST_LD_MODE_MAP_VALUE.


311-313: LGTM! Correct instruction size handling.

LoadMapAddress is properly handled as a double-word instruction.

src/asm_ostream.cpp (2)

362-363: LGTM! Clear and consistent output format.

The operator generates readable assembly syntax for LoadMapAddress instructions.


556-558: LGTM! Consistent size handling.

LoadMapAddress size handling matches the implementation in asm_marshal.cpp.

src/asm_unmarshal.cpp (3)

Line range hint 432-441: LGTM! Improved validation logic.

The validation correctly handles the extended range of source types with clear error messages.


442-449: LGTM! Clear LoadMapFd handling.

The code properly validates and creates LoadMapFd instructions with helpful comments.


450-452: LGTM! Complete LoadMapAddress handling.

The code properly creates LoadMapAddress instructions with all necessary fields.

src/test/test_marshal.cpp (1)

44-45: LGTM! Template addition aligns with ISA RFC.

The new instruction template for LDDW with src_reg=2 is correctly placed in sequence.

src/test/test_verify.cpp Show resolved Hide resolved
src/test/test_marshal.cpp Show resolved Hide resolved
@Alan-Jowett Alan-Jowett force-pushed the load_map_address branch 2 times, most recently from 831cc0a to 23edeee Compare January 13, 2025 18:37
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
src/asm_files.cpp (1)

195-195: 🧹 Nitpick (assertive)

Include instruction offset in error messages

Include the instruction offset in the error messages to aid in debugging.

Apply this diff:

-            throw UnmarshalError("Map descriptor not found for symbol " + symbol_name);
+            throw UnmarshalError("Map descriptor not found for symbol " + symbol_name + " at offset " + std::to_string(offset / sizeof(ebpf_inst)));

Also applies to: 222-222

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f863c07 and 23edeee.

📒 Files selected for processing (15)
  • external/libbtf (1 hunks)
  • src/asm_cfg.cpp (1 hunks)
  • src/asm_files.cpp (7 hunks)
  • src/asm_marshal.cpp (3 hunks)
  • src/asm_ostream.cpp (2 hunks)
  • src/asm_parse.cpp (2 hunks)
  • src/asm_syntax.hpp (2 hunks)
  • src/asm_unmarshal.cpp (2 hunks)
  • src/assertions.cpp (1 hunks)
  • src/crab/ebpf_transformer.cpp (3 hunks)
  • src/ebpf_vm_isa.hpp (1 hunks)
  • src/test/test_marshal.cpp (2 hunks)
  • src/test/test_verify.cpp (1 hunks)
  • src/test/test_yaml.cpp (1 hunks)
  • test-data/map.yaml (1 hunks)
🧰 Additional context used
📓 Learnings (4)
src/test/test_marshal.cpp (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/test/test_marshal.cpp:395-397
Timestamp: 2025-01-08T20:19:15.070Z
Learning: Test coverage improvements should be handled in separate PRs to maintain focused changes, as demonstrated in PR #820 where additional test cases for LoadMapAddress were deferred.
src/asm_parse.cpp (3)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_parse.cpp:173-176
Timestamp: 2025-01-08T16:39:47.584Z
Learning: In ebpf-verifier, ASM parsing (parse_instruction) is specifically used in src/test/ebpf_yaml.cpp for parsing controlled test inputs from YAML files in the test-data directory, and is not exposed to external inputs that would require validation.
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_parse.cpp:177-179
Timestamp: 2025-01-08T16:39:55.117Z
Learning: The ASM parser in src/asm_parse.cpp is only used for parsing hand-written assembly snippets from YAML test files, not for parsing untrusted input, so input validation is unnecessary.
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_parse.cpp:173-176
Timestamp: 2025-01-08T16:39:47.584Z
Learning: In ebpf-verifier, ASM parsing is only used for parsing hand-written test snippets in YAML files, not for parsing arbitrary input from external sources. Therefore, extensive input validation is not required.
test-data/map.yaml (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: test-data/map.yaml:20-20
Timestamp: 2025-01-08T16:32:48.309Z
Learning: The eBPF verifier test framework does not support testing invalid file descriptors, making it impossible to add test cases for invalid map_fd values or non-existent file descriptors.
src/asm_files.cpp (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_files.cpp:205-229
Timestamp: 2025-01-08T17:37:07.223Z
Learning: Global variable relocations in eBPF are only supported with the new map types that use BTF data, not with the old-style map sections.
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: build_ubuntu (Release, tests)
  • GitHub Check: build_windows (Release, library)
  • GitHub Check: build_windows (Release, tests)
  • GitHub Check: build_ubuntu (Debug, tests)
  • GitHub Check: build_windows (Debug, library)
  • GitHub Check: build_windows (Debug, tests)
  • GitHub Check: build_ubuntu (Release)
  • GitHub Check: build_ubuntu (Debug)
  • GitHub Check: Analyze (cpp)
🔇 Additional comments (26)
src/test/test_yaml.cpp (1)

34-34: Test case added correctly

The new test case for test-data/map.yaml is properly added and enhances test coverage.

src/asm_syntax.hpp (2)

96-103: LGTM: Clean struct definition for LoadMapAddress.

The struct follows the established pattern with appropriate member types and a default equality operator.


261-261: LGTM: Proper integration into Instruction variant.

LoadMapAddress is correctly added to the variant type while maintaining alphabetical ordering.

src/asm_marshal.cpp (3)

Line range hint 90-97: LGTM: Enhanced makeLddw signature for flexible operation modes.

The parameter change from bool to uint8_t enables support for multiple operation modes, aligning with the 64-bit immediate support requirements.


110-112: LGTM: Clean LoadMapAddress marshaling implementation.

The operator correctly uses makeLddw with appropriate mode and parameters.


311-313: LGTM: Consistent size handling for LoadMapAddress.

The size of 2 matches other similar instruction types like LoadMapFd.

src/assertions.cpp (1)

48-48: LGTM: Consistent assertion handling for LoadMapAddress.

The empty assertion set matches LoadMapFd's behavior, which is appropriate for this instruction type.

src/asm_cfg.cpp (1)

402-403: LGTM: Proper instruction type categorization for LoadMapAddress.

Categorizing LoadMapAddress as "assign" is consistent with LoadMapFd and other similar instructions.

src/asm_ostream.cpp (2)

362-363: LGTM!

The operator implementation correctly formats the LoadMapAddress instruction output.


556-558: LGTM!

The size() function correctly returns 2 for LoadMapAddress instructions, matching the LDDW instruction format.

src/asm_unmarshal.cpp (3)

Line range hint 432-441: LGTM!

The validation logic correctly checks src field values and offset field.


442-449: LGTM!

The LoadMapFd handling is correctly implemented.


450-452: LGTM!

The LoadMapAddress handling is correctly implemented.

src/test/test_verify.cpp (1)

528-528: LGTM!

Test case updated to expect successful verification of global_variable.o.

src/test/test_marshal.cpp (2)

44-45: LGTM!

Added instruction template for LDDW with src_reg = 2.


395-397: LGTM!

Basic test coverage for LoadMapAddress instruction.

Run the following script to verify test coverage:

✅ Verification successful

Test coverage for LoadMapAddress is sufficient

The test covers marshaling and unmarshaling of all fields in the LoadMapAddress struct. The implementation is straightforward with no complex edge cases requiring additional tests.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify test coverage for LoadMapAddress instruction.

# Check if there are any untested code paths
rg -A 5 'LoadMapAddress' | grep -v 'test'

Length of output: 4496

src/crab/ebpf_transformer.cpp (3)

117-117: Method declaration follows class conventions.

The declaration is properly placed with other similar methods in the private section.


1848-1852: Ensure proper initialization of shared region.

The shared region size and offset are correctly set based on the map descriptor.


1844-1854: Verify map descriptor and type access.

The implementation needs bounds checking for the map file descriptor and offset.

Run this script to check map descriptor access patterns:

✅ Verification successful

Map descriptor access is properly validated.

The implementation is safe through platform-level validation. Map descriptor access is protected by:

  • Type system validation of map_fd
  • Platform bounds checking in get_map_descriptor_linux()
  • Runtime descriptor validation in find_map_descriptor()
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if map descriptor access is validated before use
# Look for validation patterns before accessing map descriptors
ast-grep --pattern 'if ($_ < 0) { $$$; return; }
  const EbpfMapDescriptor& desc = $_->get_map_descriptor($_);'

Length of output: 113


Script:

#!/bin/bash
# Search for all map descriptor access patterns
ast-grep --pattern 'get_map_descriptor($_)'

# Search for mapfd validation patterns
rg "mapfd|map_fd" -A 2 -B 2

# Look for error handling around map operations
rg "throw.*map|return.*map.*error" -A 1

Length of output: 54322

external/libbtf (1)

1-1: Verify the submodule update impact.

The submodule update might introduce breaking changes.

Run this script to analyze the changes between the old and new commits:

✅ Verification successful

Submodule update is backward compatible

The changes add support for global variable sections while maintaining existing map parsing functionality. No breaking changes detected.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Analyze changes in the libbtf submodule between commits.

# Check if the submodule exists and contains both commits
cd external/libbtf 2>/dev/null || {
  echo "Error: Submodule directory not found"
  exit 1
}

# Fetch the commit details
git cat-file -e 3c0cbd6ca0920e4d5df6fb88a892f0b5ac5425d7 2>/dev/null && \
git cat-file -e 760a80490a4c67ca03046797868691cb81454d81 2>/dev/null && {
  # Show commit messages and changed files between the commits
  git log --oneline --no-merges 3c0cbd6ca0920e4d5df6fb88a892f0b5ac5425d7..760a80490a4c67ca03046797868691cb81454d81
  echo "Changed files:"
  git diff --name-status 3c0cbd6ca0920e4d5df6fb88a892f0b5ac5425d7..760a80490a4c67ca03046797868691cb81454d81
} || echo "Error: One or both commits not found"

Length of output: 162


Script:

#!/bin/bash
# Initialize and update the submodule before analyzing changes
git submodule init external/libbtf
git submodule update external/libbtf
cd external/libbtf || exit 1
git fetch --all
git log --oneline --no-merges 3c0cbd6ca0920e4d5df6fb88a892f0b5ac5425d7..760a80490a4c67ca03046797868691cb81454d81 || echo "Failed to get commit history"
echo "Changed files:"
git diff --name-status 3c0cbd6ca0920e4d5df6fb88a892f0b5ac5425d7..760a80490a4c67ca03046797868691cb81454d81 || echo "Failed to get changes"

Length of output: 863


Script:

#!/bin/bash
cd external/libbtf || exit 1
echo "=== Changes in libbtf/btf_map.cpp ==="
git diff 3c0cbd6ca0920e4d5df6fb88a892f0b5ac5425d7..760a80490a4c67ca03046797868691cb81454d81 -- libbtf/btf_map.cpp

echo -e "\n=== Changes in test/test.cpp ==="
git diff 3c0cbd6ca0920e4d5df6fb88a892f0b5ac5425d7..760a80490a4c67ca03046797868691cb81454d81 -- test/test.cpp

Length of output: 7337

test-data/map.yaml (2)

4-14: LGTM!

The test case thoroughly verifies the map address loading operation with appropriate postconditions.


16-26: LGTM!

The test case thoroughly verifies the map file descriptor loading operation with appropriate postconditions.

src/asm_parse.cpp (4)

57-58: LGTM!

The regex pattern correctly matches the map_val(fd) + offset format used in the test cases.


60-61: LGTM!

The regex pattern correctly matches the map_fd fd format used in the test cases.


173-176: LGTM!

The parsing logic correctly extracts and converts the components for LoadMapAddress instruction.


177-179: LGTM!

The parsing logic correctly extracts and converts the components for LoadMapFd instruction.

src/ebpf_vm_isa.hpp Show resolved Hide resolved
src/crab/ebpf_transformer.cpp Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 23edeee and 462cc3c.

📒 Files selected for processing (15)
  • external/libbtf (1 hunks)
  • src/asm_cfg.cpp (1 hunks)
  • src/asm_files.cpp (7 hunks)
  • src/asm_marshal.cpp (3 hunks)
  • src/asm_ostream.cpp (2 hunks)
  • src/asm_parse.cpp (2 hunks)
  • src/asm_syntax.hpp (2 hunks)
  • src/asm_unmarshal.cpp (2 hunks)
  • src/assertions.cpp (1 hunks)
  • src/crab/ebpf_transformer.cpp (3 hunks)
  • src/ebpf_vm_isa.hpp (1 hunks)
  • src/test/test_marshal.cpp (2 hunks)
  • src/test/test_verify.cpp (1 hunks)
  • src/test/test_yaml.cpp (1 hunks)
  • test-data/map.yaml (1 hunks)
🧰 Additional context used
📓 Learnings (4)
src/test/test_marshal.cpp (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/test/test_marshal.cpp:395-397
Timestamp: 2025-01-08T20:19:15.070Z
Learning: Test coverage improvements should be handled in separate PRs to maintain focused changes, as demonstrated in PR #820 where additional test cases for LoadMapAddress were deferred.
src/asm_parse.cpp (3)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_parse.cpp:177-179
Timestamp: 2025-01-08T16:39:55.117Z
Learning: The ASM parser in src/asm_parse.cpp is only used for parsing hand-written assembly snippets from YAML test files, not for parsing untrusted input, so input validation is unnecessary.
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_parse.cpp:173-176
Timestamp: 2025-01-08T16:39:47.584Z
Learning: In ebpf-verifier, ASM parsing (parse_instruction) is specifically used in src/test/ebpf_yaml.cpp for parsing controlled test inputs from YAML files in the test-data directory, and is not exposed to external inputs that would require validation.
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_parse.cpp:173-176
Timestamp: 2025-01-08T16:39:47.584Z
Learning: In ebpf-verifier, ASM parsing is only used for parsing hand-written test snippets in YAML files, not for parsing arbitrary input from external sources. Therefore, extensive input validation is not required.
test-data/map.yaml (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: test-data/map.yaml:20-20
Timestamp: 2025-01-08T16:32:48.309Z
Learning: The eBPF verifier test framework does not support testing invalid file descriptors, making it impossible to add test cases for invalid map_fd values or non-existent file descriptors.
src/asm_files.cpp (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_files.cpp:205-229
Timestamp: 2025-01-08T17:37:07.223Z
Learning: Global variable relocations in eBPF are only supported with the new map types that use BTF data, not with the old-style map sections.
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: build_windows (Release, library)
  • GitHub Check: build_windows (Release, tests)
  • GitHub Check: build_ubuntu (Release, tests)
  • GitHub Check: build_windows (Debug, library)
  • GitHub Check: build_windows (Debug, tests)
  • GitHub Check: build_ubuntu (Debug, tests)
  • GitHub Check: build_ubuntu (Release)
  • GitHub Check: Analyze (cpp)
  • GitHub Check: build_ubuntu (Debug)
🔇 Additional comments (25)
src/asm_files.cpp (2)

165-170: Refactored opcode check into helper function

The introduction of verify_load_instruction reduces code duplication and improves maintainability.


205-229: Ensure validity of next_reloc_inst

In relocate_global_variable, ensure that next_reloc_inst is a valid instruction. Although the caller checks for the instruction length before passing next_reloc_inst, consider adding an assertion or comment to clarify this assumption within the function.

src/test/test_yaml.cpp (1)

34-34: Test case added for map functionality

Including YAML_CASE("test-data/map.yaml") enhances test coverage for map-related features.

src/ebpf_vm_isa.hpp (1)

91-93: Avoid naming confusion between INST_MODE_IMM and INST_LD_MODE_IMM

INST_LD_MODE_IMM may be confused with INST_MODE_IMM, as both constants have similar names but different purposes. Consider renaming INST_LD_MODE_IMM to enhance clarity.

src/asm_syntax.hpp (1)

261-261: LGTM!

The LoadMapAddress instruction is correctly integrated into the Instruction variant type.

src/asm_marshal.cpp (3)

90-93: LGTM!

The function signature change from boolean to type constant improves code clarity and extensibility.


110-112: LGTM!

The LoadMapAddress operator implementation correctly marshals the instruction.


311-313: LGTM!

The size function correctly handles the LoadMapAddress instruction type.

src/assertions.cpp (1)

48-48: LGTM!

The LoadMapAddress operator correctly returns an empty vector, consistent with LoadMapFd behavior.

src/asm_cfg.cpp (1)

402-403: LGTM!

The LoadMapAddress instruction is correctly categorized as "assign" type.

src/asm_parse.cpp (2)

173-176: LGTM!

The LoadMapAddress instruction creation correctly extracts the destination register, map file descriptor and offset from the regex matches.


177-179: LGTM!

The LoadMapFd instruction creation correctly extracts the destination register and map file descriptor from the regex matches.

src/asm_ostream.cpp (2)

362-363: LGTM!

The operator correctly formats the LoadMapAddress instruction back to its assembly representation.


556-558: LGTM!

The size function correctly returns 2 for LoadMapAddress instructions, consistent with other similar instructions like LoadMapFd.

src/asm_unmarshal.cpp (2)

Line range hint 432-441: LGTM!

The validation logic correctly checks:

  • inst.src against INST_LD_MODE_MAP_VALUE
  • inst.offset is zero
  • inst.dst is within valid register range

442-452: LGTM!

The instruction creation logic correctly handles:

  • LoadMapFd when inst.src is INST_LD_MODE_MAP_FD
  • LoadMapAddress when inst.src is INST_LD_MODE_MAP_VALUE
src/test/test_verify.cpp (1)

528-528: LGTM!

The test case update reflects that global_variable.o now passes verification.

src/test/test_marshal.cpp (1)

395-397: Test case only covers basic functionality.

The test case only verifies basic marshaling/unmarshaling. However, based on the learnings from past reviews, additional test coverage should be handled in a separate PR.

src/crab/ebpf_transformer.cpp (3)

47-47: LGTM!

The LoadMapAddress operator is correctly declared in the operator list.


1856-1856: LGTM!

The LoadMapAddress operator correctly forwards to do_load_map_address.


1844-1854: 🧹 Nitpick (assertive)

Add validation for mapfd and offset parameters.

The implementation directly uses mapfd and offset without validation. Consider adding parameter validation to ensure they are within valid ranges.

 void ebpf_transformer::do_load_map_address(const Reg& dst_reg, const int mapfd, int32_t offset) {
+    // Validate parameters
+    if (mapfd < 0) {
+        m_inv.set_to_bottom();
+        return;
+    }
+
     const EbpfMapDescriptor& desc = thread_local_program_info->platform->get_map_descriptor(mapfd);
     const EbpfMapType& type = thread_local_program_info->platform->get_map_type(desc.type);
 
     // Set the shared region size and offset for the map.
     type_inv.assign_type(m_inv, dst_reg, T_SHARED);
     const reg_pack_t& dst = reg_pack(dst_reg);
     m_inv.assign(dst.shared_offset, offset);
     m_inv.assign(dst.shared_region_size, desc.value_size);
     assign_valid_ptr(dst_reg, false);
 }

Likely invalid or redundant comment.

external/libbtf (1)

1-1: Verify the libbtf dependency update.

Let's check the changes between the old and new commits to ensure compatibility with the LoadMapAddress functionality.

test-data/map.yaml (3)

1-3: LGTM!

Standard MIT license header with correct attribution.


16-26: LGTM! Noted framework limitations for FD testing.

Test case correctly verifies map FD loading functionality within framework constraints.


11-11: Verify the shared_region_size value.

The shared_region_size=4 seems restrictive. Verify if this size is sufficient for all potential map value operations.

src/asm_syntax.hpp Outdated Show resolved Hide resolved
src/asm_parse.cpp Show resolved Hide resolved
src/test/test_marshal.cpp Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 462cc3c and 8373d84.

📒 Files selected for processing (14)
  • src/asm_cfg.cpp (1 hunks)
  • src/asm_files.cpp (7 hunks)
  • src/asm_marshal.cpp (3 hunks)
  • src/asm_ostream.cpp (2 hunks)
  • src/asm_parse.cpp (2 hunks)
  • src/asm_syntax.hpp (2 hunks)
  • src/asm_unmarshal.cpp (2 hunks)
  • src/assertions.cpp (1 hunks)
  • src/crab/ebpf_transformer.cpp (3 hunks)
  • src/ebpf_vm_isa.hpp (1 hunks)
  • src/test/test_marshal.cpp (2 hunks)
  • src/test/test_verify.cpp (1 hunks)
  • src/test/test_yaml.cpp (1 hunks)
  • test-data/map.yaml (1 hunks)
🧰 Additional context used
📓 Learnings (4)
src/test/test_marshal.cpp (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/test/test_marshal.cpp:395-397
Timestamp: 2025-01-08T20:19:15.070Z
Learning: Test coverage improvements should be handled in separate PRs to maintain focused changes, as demonstrated in PR #820 where additional test cases for LoadMapAddress were deferred.
test-data/map.yaml (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: test-data/map.yaml:20-20
Timestamp: 2025-01-08T16:32:48.309Z
Learning: The eBPF verifier test framework does not support testing invalid file descriptors, making it impossible to add test cases for invalid map_fd values or non-existent file descriptors.
src/asm_parse.cpp (3)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_parse.cpp:173-176
Timestamp: 2025-01-08T16:39:47.584Z
Learning: In ebpf-verifier, ASM parsing (parse_instruction) is specifically used in src/test/ebpf_yaml.cpp for parsing controlled test inputs from YAML files in the test-data directory, and is not exposed to external inputs that would require validation.
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_parse.cpp:177-179
Timestamp: 2025-01-08T16:39:55.117Z
Learning: The ASM parser in src/asm_parse.cpp is only used for parsing hand-written assembly snippets from YAML test files, not for parsing untrusted input, so input validation is unnecessary.
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_parse.cpp:173-176
Timestamp: 2025-01-08T16:39:47.584Z
Learning: In ebpf-verifier, ASM parsing is only used for parsing hand-written test snippets in YAML files, not for parsing arbitrary input from external sources. Therefore, extensive input validation is not required.
src/asm_files.cpp (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_files.cpp:205-229
Timestamp: 2025-01-08T17:37:07.223Z
Learning: Global variable relocations in eBPF are only supported with the new map types that use BTF data, not with the old-style map sections.
⏰ Context from checks skipped due to timeout of 90000ms (10)
  • GitHub Check: build_windows (Release, library)
  • GitHub Check: build_ubuntu (Release, library)
  • GitHub Check: build_windows (Release, tests)
  • GitHub Check: build_windows (Debug, library)
  • GitHub Check: build_ubuntu (Release, tests)
  • GitHub Check: build_windows (Debug, tests)
  • GitHub Check: build_ubuntu (Debug, tests)
  • GitHub Check: build_ubuntu (Release)
  • GitHub Check: build_ubuntu (Debug)
  • GitHub Check: Analyze (cpp)
🔇 Additional comments (22)
src/asm_files.cpp (1)

412-421: ⚠️ Potential issue

Check for the existence of global variable sections before insertion

When inserting into global_variable_section_indices, ensure that the section pointers returned by reader.sections[".data"], .bss, and .rodata are not null to prevent potential null pointer dereferences.

Apply this diff to add null checks:

     if (reader.sections[".maps"]) {
         map_section_indices.insert(reader.sections[".maps"]->get_index());
     }

-    if (reader.sections[".data"]) {
+    if (reader.sections[".data"] != nullptr) {
         global_variable_section_indices.insert(reader.sections[".data"]->get_index());
     }

-    if (reader.sections[".bss"]) {
+    if (reader.sections[".bss"] != nullptr) {
         global_variable_section_indices.insert(reader.sections[".bss"]->get_index());
     }

-    if (reader.sections[".rodata"]) {
+    if (reader.sections[".rodata"] != nullptr) {
         global_variable_section_indices.insert(reader.sections[".rodata"]->get_index());
     }

Likely invalid or redundant comment.

src/test/test_yaml.cpp (1)

34-34: Ensure the new YAML test case is valid

Verify that test-data/map.yaml exists and contains valid test cases to prevent test failures.

src/ebpf_vm_isa.hpp (1)

91-93: Prevent naming confusion between INST_MODE_IMM and INST_LD_MODE_IMM

Consider renaming INST_LD_MODE_IMM to avoid confusion with INST_MODE_IMM, as they serve different purposes but have similar names.

src/asm_syntax.hpp (2)

261-261: LGTM!

LoadMapAddress is correctly added to the Instruction variant type.


96-103: 🧹 Nitpick (assertive)

Add bounds documentation for offset field.

Document the valid range for the offset field and any alignment requirements.

 // Load the address of a map value into a register.
 struct LoadMapAddress {
     Reg dst;          // Destination register to store the address of the map value.
     int32_t mapfd{};  // File descriptor of the map to load the address from.
-    int32_t offset{}; // Offset within the map, must be within bounds.
+    int32_t offset{}; // Offset within the map value in bytes. Must be non-negative and
+                      // aligned to the access size. The sum of offset and access size
+                      // must not exceed the map value size.

Likely invalid or redundant comment.

src/asm_marshal.cpp (2)

90-93: LGTM!

The change from bool to uint8_t for the type parameter improves type safety and extensibility.


110-112: LGTM!

The LoadMapAddress operator implementation correctly marshals the instruction.

src/assertions.cpp (1)

48-48: LGTM!

The LoadMapAddress operator correctly returns no assertions, consistent with LoadMapFd behavior.

src/asm_cfg.cpp (1)

402-403: LGTM!

LoadMapAddress is correctly categorized as an "assign" instruction type.

src/asm_parse.cpp (3)

173-176: LGTM! The LoadMapAddress instruction creation looks correct.

The implementation properly extracts the destination register, map file descriptor, and offset from the regex match groups.


177-179: LGTM! The LoadMapFd instruction creation looks correct.

The implementation properly extracts the destination register and map file descriptor from the regex match groups.


57-62: 🧹 Nitpick (assertive)

Add comments to explain the regex pattern constraints.

The regex patterns need documentation to explain:

  • MAP_VAL: Only matches positive integers for fd and offset
  • MAP_FD: Only matches positive integers for fd
 // Match map_val(fd) + offset
+// fd and offset must be positive integers
 #define MAP_VAL R"_(\s*map_val\((\d+)\)\s*\+\s*(\d+)\s*)_"
 
 // Match map_fd fd
+// fd must be a positive integer
 #define MAP_FD R"_(\s*map_fd\s+(\d+)\s*)_"

Likely invalid or redundant comment.

src/asm_ostream.cpp (2)

362-363: LGTM! The LoadMapAddress output formatting looks correct.

The implementation follows the established pattern for instruction formatting.


556-558: LGTM! The size calculation for LoadMapAddress looks correct.

The implementation correctly returns size 2, consistent with similar instructions like LoadMapFd.

src/asm_unmarshal.cpp (2)

Line range hint 432-442: LGTM! The validation logic looks correct.

The implementation properly validates the instruction source register against the allowed modes.


450-452: LGTM! The LoadMapAddress creation looks correct.

The implementation properly creates the LoadMapAddress instruction with the correct fields.

src/test/test_verify.cpp (1)

528-528: LGTM! The test case addition looks correct.

The test case is properly added using the TEST_SECTION macro.

src/test/test_marshal.cpp (2)

395-397: LGTM!

Basic test coverage for LoadMapAddress marshal/unmarshal functionality.


44-45: LGTM!

The instruction template follows the established pattern.

test-data/map.yaml (1)

1-26: LGTM!

Test cases properly validate type assignments, offsets, and value propagation for both map_fd and map_val operations.

src/crab/ebpf_transformer.cpp (2)

1856-1856: LGTM!

Clean and straightforward operator implementation.


1844-1854: 🧹 Nitpick (assertive)

Add validation for mapfd parameter.

The method directly uses mapfd to retrieve the map descriptor without validation.

 void ebpf_transformer::do_load_map_address(const Reg& dst_reg, const int mapfd, int32_t offset) {
+    if (mapfd < 0) {
+        m_inv.set_to_bottom();
+        return;
+    }
     const EbpfMapDescriptor& desc = thread_local_program_info->platform->get_map_descriptor(mapfd);
     const EbpfMapType& type = thread_local_program_info->platform->get_map_type(desc.type);

Likely invalid or redundant comment.

src/asm_files.cpp Show resolved Hide resolved
src/asm_files.cpp Show resolved Hide resolved
src/asm_files.cpp Outdated Show resolved Hide resolved
@Alan-Jowett
Copy link
Contributor Author

@dthaler and @elazarg any chance I can get this reviewed please?

@elazarg
Copy link
Collaborator

elazarg commented Jan 21, 2025

@dthaler and @elazarg any chance I can get this reviewed please?

I'm sorry, I really don't have the capacity, probably until March. I'll try to find some time sooner but it's unlikely.

Signed-off-by: Alan Jowett <[email protected]>
Copy link
Collaborator

@elazarg elazarg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general.

Is it better to have a Load instruction with variant value? I thought it would be, but I see that the assignment is not the same operation, so your implementation is probably the right way. Just wanted to make this an explicit choice.

src/asm_files.cpp Outdated Show resolved Hide resolved
src/crab/ebpf_transformer.cpp Outdated Show resolved Hide resolved
src/crab/ebpf_transformer.cpp Show resolved Hide resolved
src/asm_files.cpp Outdated Show resolved Hide resolved
src/asm_ostream.cpp Outdated Show resolved Hide resolved
src/asm_unmarshal.cpp Outdated Show resolved Hide resolved
From the ISA RFC:
5.4.  64-bit immediate instructions

   Instructions with the IMM 'mode' modifier use the wide instruction
   encoding defined in Instruction encoding (Section 3), and use the
   'src_reg' field of the basic instruction to hold an opcode subtype.

   The following table defines a set of {IMM, DW, LD} instructions with
   opcode subtypes in the 'src_reg' field, using new terms such as "map"
   defined further below:

    +=========+================================+==========+==========+
    | src_reg | pseudocode                     | imm type | dst type |
    +=========+================================+==========+==========+
    | 0x0     | dst = (next_imm << 32) | imm   | integer  | integer  |
    +---------+--------------------------------+----------+----------+
    | 0x1     | dst = map_by_fd(imm)           | map fd   | map      |
    +---------+--------------------------------+----------+----------+
    | 0x2     | dst = map_val(map_by_fd(imm))  | map fd   | data     |
    |         | + next_imm                     |          | address  |
    +---------+--------------------------------+----------+----------+
    | 0x3     | dst = var_addr(imm)            | variable | data     |
    |         |                                | id       | address  |
    +---------+--------------------------------+----------+----------+
    | 0x4     | dst = code_addr(imm)           | integer  | code     |
    |         |                                |          | address  |
    +---------+--------------------------------+----------+----------+
    | 0x5     | dst = map_by_idx(imm)          | map      | map      |
    |         |                                | index    |          |
    +---------+--------------------------------+----------+----------+
    | 0x6     | dst = map_val(map_by_idx(imm)) | map      | data     |
    |         | + next_imm                     | index    | address  |
    +---------+--------------------------------+----------+----------+

                 Table 12: 64-bit immediate instructions

Signed-off-by: Alan Jowett <[email protected]>
Signed-off-by: Alan Jowett <[email protected]>
Signed-off-by: Alan Jowett <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (2)
src/asm_files.cpp (2)

165-170: 🧹 Nitpick (assertive)

Extend validation in verify_load_instruction

Currently, verify_load_instruction only checks if the opcode class is INST_CLS_LD. Enhance the validation to check for specific load opcodes or include additional conditions to prevent invalid or unexpected instructions from passing through.


490-492: 🧹 Nitpick (assertive)

Use >= in boundary check for clarity

In the condition checking vector bounds, use >= instead of <= for better readability and consistency.

Apply this diff:

-    if (prog.prog.size() <= offset / sizeof(ebpf_inst) + 1) {
+    if (offset / sizeof(ebpf_inst) + 1 >= prog.prog.size()) {
         throw UnmarshalError("Invalid relocation data");
     }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8373d84 and 320dee7.

📒 Files selected for processing (15)
  • src/asm_cfg.cpp (1 hunks)
  • src/asm_files.cpp (7 hunks)
  • src/asm_marshal.cpp (4 hunks)
  • src/asm_ostream.cpp (1 hunks)
  • src/asm_parse.cpp (2 hunks)
  • src/asm_syntax.hpp (3 hunks)
  • src/asm_unmarshal.cpp (2 hunks)
  • src/assertions.cpp (1 hunks)
  • src/crab/ebpf_transformer.cpp (3 hunks)
  • src/crab/split_dbm.hpp (1 hunks)
  • src/ebpf_vm_isa.hpp (1 hunks)
  • src/test/test_marshal.cpp (2 hunks)
  • src/test/test_verify.cpp (1 hunks)
  • src/test/test_yaml.cpp (1 hunks)
  • test-data/map.yaml (1 hunks)
🧰 Additional context used
📓 Learnings (4)
test-data/map.yaml (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: test-data/map.yaml:20-20
Timestamp: 2025-01-08T16:32:48.309Z
Learning: The eBPF verifier test framework does not support testing invalid file descriptors, making it impossible to add test cases for invalid map_fd values or non-existent file descriptors.
src/asm_parse.cpp (3)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_parse.cpp:177-179
Timestamp: 2025-01-08T16:39:55.117Z
Learning: The ASM parser in src/asm_parse.cpp is only used for parsing hand-written assembly snippets from YAML test files, not for parsing untrusted input, so input validation is unnecessary.
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_parse.cpp:173-176
Timestamp: 2025-01-08T16:39:47.584Z
Learning: In ebpf-verifier, ASM parsing (parse_instruction) is specifically used in src/test/ebpf_yaml.cpp for parsing controlled test inputs from YAML files in the test-data directory, and is not exposed to external inputs that would require validation.
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_parse.cpp:173-176
Timestamp: 2025-01-08T16:39:47.584Z
Learning: In ebpf-verifier, ASM parsing is only used for parsing hand-written test snippets in YAML files, not for parsing arbitrary input from external sources. Therefore, extensive input validation is not required.
src/test/test_marshal.cpp (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/test/test_marshal.cpp:395-397
Timestamp: 2025-01-08T20:19:15.070Z
Learning: Test coverage improvements should be handled in separate PRs to maintain focused changes, as demonstrated in PR #820 where additional test cases for LoadMapAddress were deferred.
src/asm_files.cpp (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#820
File: src/asm_files.cpp:205-229
Timestamp: 2025-01-08T17:37:07.223Z
Learning: Global variable relocations in eBPF are only supported with the new map types that use BTF data, not with the old-style map sections.
⏰ Context from checks skipped due to timeout of 90000ms (10)
  • GitHub Check: build_windows (Release, library)
  • GitHub Check: build_ubuntu (Release, tests)
  • GitHub Check: build_windows (Release, tests)
  • GitHub Check: build_ubuntu (Debug, library)
  • GitHub Check: build_windows (Debug, library)
  • GitHub Check: build_ubuntu (Debug, tests)
  • GitHub Check: build_windows (Debug, tests)
  • GitHub Check: build_ubuntu (Release)
  • GitHub Check: Analyze (cpp)
  • GitHub Check: build_ubuntu (Debug)
🔇 Additional comments (26)
src/asm_marshal.cpp (5)

90-93: Updated makeLddw function handles type parameter correctly

The function signature now accepts a uint8_t type parameter and assigns it to .src. The implementation is correct.


108-109: Correct usage of INST_LD_MODE_MAP_FD in LoadMapFd operator

The operator() for LoadMapFd uses INST_LD_MODE_MAP_FD appropriately in the call to makeLddw.


110-112: Proper implementation of LoadMapAddress operator

The new operator() for LoadMapAddress correctly constructs the instructions with INST_LD_MODE_MAP_VALUE.


119-119: Immediate load uses INST_LD_MODE_IMM accurately

The immediate load operation now correctly utilizes INST_LD_MODE_IMM in the makeLddw call.


311-313: asm_syntax::size function updated for LoadMapAddress

The function correctly returns 2 for LoadMapAddress instructions, ensuring proper instruction sizing.

src/asm_files.cpp (3)

405-411: Global variable sections collected correctly

The code accurately identifies and collects indices of global variable sections from .rodata, .data, and .bss.


216-223: Proper handling of relocation values in relocate_global_variable

The function correctly retrieves and validates the relocation value for global variables, throwing an exception if the map descriptor is not found.


225-229: Validation of relocation value bounds

The check ensures that reloc_value is within the bounds of info.map_descriptors, preventing out-of-range access.

src/test/test_yaml.cpp (1)

34-34: Added test case for map.yaml

The new test case YAML_CASE("test-data/map.yaml") is appropriately added to the test suite.

src/ebpf_vm_isa.hpp (1)

91-93: LGTM!

The new instruction mode constants are correctly defined with sequential values and clear comments.

src/crab/split_dbm.hpp (1)

212-213: LGTM!

The parameter name is correctly removed to prevent unused parameter warnings.

src/asm_syntax.hpp (2)

96-103: LGTM!

The LoadMapAddress struct is well-documented with clear field descriptions.


386-386: Provide implementation details for the size function.

The function declaration lacks documentation about its purpose and return value semantics.

Run the following script to find the implementation:

✅ Verification successful

Add documentation for the size function

/// Calculates the size of an instruction in instruction words.
/// @param inst The instruction to measure
/// @return 2 for double-word instructions (lddw Bin, LoadMapFd, LoadMapAddress),
///         1 for all other instructions
int size(const Instruction& inst);
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find the implementation of the size function.
rg -A 5 "int size\(const Instruction& inst\)"

Length of output: 270


Script:

#!/bin/bash
# Search for implementation files
fd "asm_syntax\.(cpp|cc|cxx)"

# Search for size function implementation with more context
rg -g "!{build,test}/*" "size.*\(.*Instruction.*\)" -A 10

# Use ast-grep to find potential inline implementations
ast-grep --pattern 'int size(const Instruction& inst) { $___ }'

Length of output: 1096


Script:

#!/bin/bash
# Get complete implementation with more context
rg -A 15 "^int asm_syntax::size\(const Instruction& inst\)" src/asm_marshal.cpp

Length of output: 472

src/assertions.cpp (1)

48-48: LGTM!

The operator correctly returns an empty vector, consistent with the LoadMapFd overload.

src/asm_cfg.cpp (1)

402-403: LGTM!

Correctly categorizes LoadMapAddress as "assign" type, consistent with LoadMapFd.

src/asm_parse.cpp (2)

57-62: LGTM!

Regex patterns correctly match the expected formats:

  • MAP_VAL: map_val(fd) + offset
  • MAP_FD: map_fd fd

173-179: LGTM!

Correctly parses and constructs LoadMapAddress and LoadMapFd instructions from the matched patterns.

src/asm_unmarshal.cpp (2)

Line range hint 432-441: LGTM!

Proper validation of instruction source and offset values.


442-461: LGTM!

Clean switch statement implementation handling all load modes with proper error handling.

src/test/test_verify.cpp (1)

528-528: LGTM!

Test case correctly updated to expect successful verification.

src/crab/ebpf_transformer.cpp (2)

47-47: LGTM!

The operator declaration follows the established pattern and is correctly placed with other similar operators.


1860-1860: LGTM!

The operator correctly delegates to do_load_map_address with proper parameter passing.

test-data/map.yaml (1)

16-26: LGTM!

The test case correctly verifies map fd loading and register properties.

src/asm_ostream.cpp (1)

362-363: LGTM!

The operator implementation follows the established pattern and produces consistent output format.

src/test/test_marshal.cpp (2)

44-45: LGTM! Addition of LDDW with src_reg=2 aligns with ISA RFC Section 5.4.

The instruction template correctly implements support for 64-bit immediate instructions with type 2.


395-397: Basic test coverage for LoadMapAddress looks good.

The test verifies the basic marshaling and unmarshaling functionality.

Run this script to verify the test coverage:

✅ Verification successful

Basic test coverage for LoadMapAddress is sufficient and follows established patterns

The test case properly verifies marshaling and unmarshaling using the same proven approach as other instructions in the codebase.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check test coverage for LoadMapAddress functionality
# Test: Look for other test cases that might be testing LoadMapAddress
rg -l "LoadMapAddress" | grep "_test"

Length of output: 35


Script:

#!/bin/bash
# Try case-insensitive search with broader test file patterns
rg -i "loadmapaddress|load_map_address" -g "*test*"

# Use ast-grep to find test cases
ast-grep --pattern 'SECTION($$$) { $$$LoadMapAddress$$$}'

Length of output: 294


Script:

#!/bin/bash
# Check how compare_marshal_unmarshal is used in other tests
rg "compare_marshal_unmarshal" src/test/test_marshal.cpp -B 1 -A 1

Length of output: 4731

src/crab/ebpf_transformer.cpp Outdated Show resolved Hide resolved
src/crab/ebpf_transformer.cpp Outdated Show resolved Hide resolved
test-data/map.yaml Show resolved Hide resolved
Signed-off-by: Alan Jowett <[email protected]>
@Alan-Jowett
Copy link
Contributor Author

Looks good in general.

Is it better to have a Load instruction with variant value? I thought it would be, but I see that the assignment is not the same operation, so your implementation is probably the right way. Just wanted to make this an explicit choice.

I am not sure that really makes sense as the three load operations are quite different.

  1. Load just a 64bit immediate.
  2. Load the address of a map into the register.
  3. Load the address of the first value in the map into the register with offset.

I think keeping them distinct makes more sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants