Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for is_valid_before #825

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Alan-Jowett
Copy link
Contributor

@Alan-Jowett Alan-Jowett commented Jan 16, 2025

The uBPF fuzzer needs a version of the API that checks against the pre-conditions at the label instead of the post conditions.

Summary by CodeRabbit

  • New Features
    • Added a new method to validate invariant states before a specific label, enhancing state verification capabilities.
    • Updated the existing method for validating states after a specific label to improve its logic.
  • Bug Fixes
    • Modified method signature to avoid unused parameter warnings, ensuring cleaner code.

Copy link

coderabbitai bot commented Jan 16, 2025

Walkthrough

The pull request introduces a new method is_valid_before to the Invariants class across two files. This method checks the validity of a state before a specified label, similar to the existing is_valid_after functionality. The implementation involves constructing an abstract state and comparing it against the pre-invariant. Additionally, the logic for the is_valid_after method has been modified to change its return statement. Furthermore, a method signature in the SplitDBM class has been updated to remove an unused parameter.

Changes

File Change Summary
src/crab_verifier.hpp Added public method bool is_valid_before(const label_t& label, const string_invariant& state) const to Invariants class
src/crab_verifier.cpp Implemented is_valid_before method; modified return statement logic in is_valid_after method
src/crab/split_dbm.hpp Updated method signature of widening_thresholds to remove named parameter ts, making it anonymous
✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@elazarg
Copy link
Collaborator

elazarg commented Jan 16, 2025

@Alan-Jowett any idea why the code coverage failed?

@Alan-Jowett
Copy link
Contributor Author

Converting this to draft for now. Fuzzer integration is failing for reasons I don't understand yet.

@Alan-Jowett Alan-Jowett marked this pull request as draft January 16, 2025 20:09
@Alan-Jowett Alan-Jowett force-pushed the verifier_fuzzing_support branch from 7cbf769 to c861682 Compare January 16, 2025 20:16
@Alan-Jowett
Copy link
Contributor Author

@Alan-Jowett any idea why the code coverage failed?

It looks like it's failing for all PRs.

@Alan-Jowett Alan-Jowett marked this pull request as ready for review January 21, 2025 21:13
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7cbf769 and fed3462.

📒 Files selected for processing (2)
  • src/crab_verifier.cpp (1 hunks)
  • src/crab_verifier.hpp (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: build_ubuntu (Release, tests)
  • GitHub Check: build_windows (Release, library)
  • GitHub Check: build_windows (Release, tests)
  • GitHub Check: build_windows (Debug, library)
  • GitHub Check: build_windows (Debug, tests)
  • GitHub Check: build_ubuntu (Debug, tests)
  • GitHub Check: build_ubuntu (Release)
  • GitHub Check: build_ubuntu (Debug)
  • GitHub Check: Analyze (cpp)
🔇 Additional comments (2)
src/crab_verifier.cpp (2)

27-41: LGTM! Clear explanation of validation logic.

The comment block effectively explains the validation outcomes and justifies the implementation change.


42-46: Verify the impact of changing the validation logic.

The implementation has changed from using <= comparison to using intersection check.

Run this script to find all callers of is_valid_after:

✅ Verification successful

The validation logic change is safe and localized.

The method is internal to the Invariants class with no external callers. The new intersection-based implementation provides more precise state validation for abstract interpretation.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Find all callers of is_valid_after to assess impact
rg -l "is_valid_after" | xargs -I {} rg -A 2 "is_valid_after" {}

Length of output: 466


Script:

#!/bin/bash
# Case-insensitive search for is_valid_after
rg -i "is_valid_after" -A 2

# Find usage of Invariants class
rg "class Invariants" -A 5

# Find files containing Invariants:: to see all method calls
rg "Invariants::" -A 2

Length of output: 2928

@@ -58,6 +58,7 @@ class Invariants final {
Invariants(const Invariants& invariants) = default;

bool is_valid_after(const label_t& label, const string_invariant& state) const;
bool is_valid_before(const label_t& label, const string_invariant& state) const;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Add documentation for is_valid_before method.

Document the method's purpose, parameters, return value, and relationship to is_valid_after. This is essential given the complex validation logic explained in the implementation file.

+    /// Checks if the given state is valid before the specified label by comparing
+    /// against the pre-invariant.
+    /// @param label The program point to check
+    /// @param state The state to validate
+    /// @return true if state is valid, false if it violates constraints
     bool is_valid_before(const label_t& label, const string_invariant& state) const;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
bool is_valid_before(const label_t& label, const string_invariant& state) const;
/// Checks if the given state is valid before the specified label by comparing
/// against the pre-invariant.
/// @param label The program point to check
/// @param state The state to validate
/// @return true if state is valid, false if it violates constraints
bool is_valid_before(const label_t& label, const string_invariant& state) const;

@Alan-Jowett
Copy link
Contributor Author

@elazarg This PR contains a change on how is_valid_after and is_valid_before operate. The specific issue is that the usage of the <= operator doesn't work for the uBPF use cases.

Specifically, the uBPF doesn't know when the verifier havocs a register. If the abstract state from the uBPF concrete state contains constraints based on valid register state that verifier has havoced then the <= fails.

Let me know if I am not understanding this correctly.

@Alan-Jowett Alan-Jowett force-pushed the verifier_fuzzing_support branch from 45016a2 to b3b36e1 Compare January 21, 2025 21:18
@Alan-Jowett
Copy link
Contributor Author

@elazarg This PR contains a change on how is_valid_after and is_valid_before operate. The specific issue is that the usage of the <= operator doesn't work for the uBPF use cases.

Specifically, the uBPF doesn't know when the verifier havocs a register. If the abstract state from the uBPF concrete state contains constraints based on valid register state that verifier has havoced then the <= fails.

Let me know if I am not understanding this correctly.

Note, I understand you are busy, so no rush in responding to this. This was just notes for why I made the change for when you get a chance to review it.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fed3462 and b3b36e1.

📒 Files selected for processing (2)
  • src/crab_verifier.cpp (1 hunks)
  • src/crab_verifier.hpp (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: build_windows (Release, library)
  • GitHub Check: build_windows (Release, tests)
  • GitHub Check: build_windows (Debug, library)
  • GitHub Check: build_ubuntu (Release, tests)
  • GitHub Check: build_windows (Debug, tests)
  • GitHub Check: build_ubuntu (Debug, tests)
  • GitHub Check: build_ubuntu (Release)
  • GitHub Check: build_ubuntu (Debug)
  • GitHub Check: Analyze (cpp)
🔇 Additional comments (2)
src/crab_verifier.hpp (1)

61-61: Add documentation for is_valid_before method.

Document the method's purpose, parameters, return value, and relationship to is_valid_after.

+    /// Checks if the given state is valid before the specified label by comparing
+    /// against the pre-invariant.
+    /// @param label The program point to check
+    /// @param state The state to validate
+    /// @return true if state is valid, false if it violates constraints
     bool is_valid_before(const label_t& label, const string_invariant& state) const;
src/crab_verifier.cpp (1)

27-41: LGTM! Clear explanation of check outcomes.

The comment block effectively explains the rationale behind the implementation change from <= to intersection checks.

Comment on lines 42 to +46
bool Invariants::is_valid_after(const label_t& label, const string_invariant& state) const {
const ebpf_domain_t abstract_state =
ebpf_domain_t::from_constraints(state.value(), thread_local_options.setup_constraints);
return abstract_state <= invariants.at(label).post;
return !(abstract_state & invariants.at(label).post).is_bottom();
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Add inline comment explaining the intersection check.

Add a brief comment explaining why we're checking if the intersection is not bottom.

 bool Invariants::is_valid_after(const label_t& label, const string_invariant& state) const {
     const ebpf_domain_t abstract_state =
         ebpf_domain_t::from_constraints(state.value(), thread_local_options.setup_constraints);
+    // Return false only if abstract_state violates post-invariant constraints
     return !(abstract_state & invariants.at(label).post).is_bottom();
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
bool Invariants::is_valid_after(const label_t& label, const string_invariant& state) const {
const ebpf_domain_t abstract_state =
ebpf_domain_t::from_constraints(state.value(), thread_local_options.setup_constraints);
return abstract_state <= invariants.at(label).post;
return !(abstract_state & invariants.at(label).post).is_bottom();
}
bool Invariants::is_valid_after(const label_t& label, const string_invariant& state) const {
const ebpf_domain_t abstract_state =
ebpf_domain_t::from_constraints(state.value(), thread_local_options.setup_constraints);
// Return false only if abstract_state violates post-invariant constraints
return !(abstract_state & invariants.at(label).post).is_bottom();
}

Comment on lines +48 to +51
bool Invariants::is_valid_before(const label_t& label, const string_invariant& state) const {
const ebpf_domain_t abstract_state =
ebpf_domain_t::from_constraints(state.value(), thread_local_options.setup_constraints);
return !(abstract_state & invariants.at(label).pre).is_bottom();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Add inline comment explaining the intersection check.

Add a brief comment explaining why we're checking if the intersection is not bottom.

 bool Invariants::is_valid_before(const label_t& label, const string_invariant& state) const {
     const ebpf_domain_t abstract_state =
         ebpf_domain_t::from_constraints(state.value(), thread_local_options.setup_constraints);
+    // Return false only if abstract_state violates pre-invariant constraints
     return !(abstract_state & invariants.at(label).pre).is_bottom();
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
bool Invariants::is_valid_before(const label_t& label, const string_invariant& state) const {
const ebpf_domain_t abstract_state =
ebpf_domain_t::from_constraints(state.value(), thread_local_options.setup_constraints);
return !(abstract_state & invariants.at(label).pre).is_bottom();
bool Invariants::is_valid_before(const label_t& label, const string_invariant& state) const {
const ebpf_domain_t abstract_state =
ebpf_domain_t::from_constraints(state.value(), thread_local_options.setup_constraints);
// Return false only if abstract_state violates pre-invariant constraints
return !(abstract_state & invariants.at(label).pre).is_bottom();
}

Comment on lines 42 to +51
bool Invariants::is_valid_after(const label_t& label, const string_invariant& state) const {
const ebpf_domain_t abstract_state =
ebpf_domain_t::from_constraints(state.value(), thread_local_options.setup_constraints);
return abstract_state <= invariants.at(label).post;
return !(abstract_state & invariants.at(label).post).is_bottom();
}

bool Invariants::is_valid_before(const label_t& label, const string_invariant& state) const {
const ebpf_domain_t abstract_state =
ebpf_domain_t::from_constraints(state.value(), thread_local_options.setup_constraints);
return !(abstract_state & invariants.at(label).pre).is_bottom();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Add unit tests for is_valid_before and is_valid_after methods

  • Add test cases in src/test/test_verify.cpp covering:
    • Valid pre-conditions
    • Invalid pre-conditions
    • Valid post-conditions
    • Invalid post-conditions
  • Current coverage is only through integration tests in ebpf_yaml.cpp
🔗 Analysis chain

Verify test coverage for both methods.

Both is_valid_before and is_valid_after need test coverage to ensure they handle all four cases described in the comments.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check test coverage for both methods

# Search for test files containing is_valid_before
echo "Searching for is_valid_before tests..."
rg "is_valid_before" "test/"

# Search for test files containing is_valid_after
echo "Searching for is_valid_after tests..."
rg "is_valid_after" "test/"

# Look for test files that might contain relevant test cases
echo "Searching for potential test files..."
fd -t f "test_.*\.(cpp|hpp)" "test/"

Length of output: 531


Script:

#!/bin/bash
# First, find any test-related files or directories
echo "Looking for test files and directories..."
fd -t d "test"
fd -t f "test"

# Search for files containing test in their name
echo -e "\nSearching for test-related files..."
fd -t f "(test|spec|check)"

# Search for files mentioning the class name
echo -e "\nSearching for files mentioning Invariants..."
rg -l "class.*Invariants" 

Length of output: 954


Script:

#!/bin/bash
# Check test_verify.cpp content
echo "Checking test_verify.cpp content..."
rg -A 5 -B 5 "is_valid_(before|after)|Invariants::" "src/test/test_verify.cpp"

# Search for Invariants usage in all test files
echo -e "\nSearching for Invariants usage in tests..."
rg -A 3 -B 3 "Invariants" "src/test/"

# Look for related test patterns
echo -e "\nSearching for related verification tests..."
rg -l "TEST.*verify|verify.*test" "src/test/"

Length of output: 1642

@coveralls
Copy link

coveralls commented Jan 21, 2025

Coverage Status

coverage: 88.133% (-0.04%) from 88.169%
when pulling 0d2ce9e on Alan-Jowett:verifier_fuzzing_support
into 15e1925 on vbpf:main.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b3b36e1 and 0d2ce9e.

📒 Files selected for processing (1)
  • src/crab/split_dbm.hpp (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: build_windows (Debug, tests)
  • GitHub Check: build_ubuntu (Debug, tests)
  • GitHub Check: build_ubuntu (Release)
  • GitHub Check: build_ubuntu (Debug)

src/crab/split_dbm.hpp Show resolved Hide resolved
@elazarg
Copy link
Collaborator

elazarg commented Jan 21, 2025

@elazarg This PR contains a change on how is_valid_after and is_valid_before operate. The specific issue is that the usage of the <= operator doesn't work for the uBPF use cases.

Specifically, the uBPF doesn't know when the verifier havocs a register. If the abstract state from the uBPF concrete state contains constraints based on valid register state that verifier has havoced then the <= fails.

Let me know if I am not understanding this correctly.

I think we're not using the same terminology, and maybe I chose overly generic names. I'll try to write what I understand in my own words.

  • The verifier computes abstract states from abstract states.
  • Invariant is an entire abstract state that the verifier finds to be unchanged during the execution of the abstract interpreter (fixpoint).
  • Abstract states are essentially weighted graphs, printed as sets of constraints - linear inequalities - and stored as a matrix (ignoring intricacies of our ebpf_domain). Each abstract state represents (uniquely) a set of concrete states.
  • Lifting takes a concrete state C and turns it into an abstract state C#, ideally representing only the single concrete state C.
  • The api looks at an invariant V#, asks whether C# represents a subset of the concrete states represented by the invariant V#.

The last sentence is exactly the test C# <= V#. Another way to write the same test is (C# /\ V#) = C#. Your proposed solution tests whether C# and V# have any common states. These questions differ only when C# represents more concrete states than C. If your fix is correct, then the lifting from C to C# is wrong. It might be due to missing constraints, or due to an overapproxination/bug. In general using the meet operator & is somewhat error prone, and I see it as a warning sign.

The invariants found by the verifier are intentionally overapproxinations. Havocing should not be a problem by itself since it only yields larger abstract states, and smaller abstract states are never guaranteed by the verifier.

Do you know what is the source of overapproximation?

(I really hope I'm not writing nonsense here. If I am, we can return to the discussion when I'm back. Alternatively we may want to consult with @caballa).

@Alan-Jowett
Copy link
Contributor Author

To make things easier to reason about, I will re-introduce my changes to ebpf-yaml.cpp to make it easier to check via yaml test definitions.

@elazarg
Copy link
Collaborator

elazarg commented Jan 22, 2025

It might be easier with a C++ unit test

@elazarg
Copy link
Collaborator

elazarg commented Jan 22, 2025

A better approach might be to pass a set of simple constraints.

@Alan-Jowett
Copy link
Contributor Author

I guess I simply don't understand what the <= operator does.

Domain 1 only has the register values, while domain 2 has register values + type.
It looks like (domain 1 <= domain 2) is false whereas (domain 2 <= domain 1) is true.
This seems backwards to me, but I am probably not understanding it.
I also included the case where domain 1 is outside of the range of domain 3.
This returns the expected result of (domain 1 <= domain 3) is false and (domain 3 <= domain 1) is false.

Previously I was using & to test for the case where values in domain 1 are outside of the expected values in domain 3.

Consider the following one register example:

TEST_CASE("domain_operations", "[verify]") {
    std::set<std::string> invariant1({
        "r0.uvalue=0", 
        "r0.svalue=0",
    });

    std::set<std::string> invariant2({
        "r0.type=number",
        "r0.uvalue=0",
        "r0.svalue=0",
    });

    std::set<std::string> invariant3({
        "r0.type=number",
        "r0.uvalue=[1,10]",
        "r0.svalue=[1,10]",
    });

    const auto domain1 = crab::ebpf_domain_t::from_constraints(invariant1, false);
    const auto domain2 = crab::ebpf_domain_t::from_constraints(invariant2, false);
    const auto domain3 = crab::ebpf_domain_t::from_constraints(invariant3, false);

    std::cerr << "domain 1 " << domain1 << std::endl;
    std::cerr << "domain 2 " << domain2 << std::endl;
    std::cerr << "domain 3 " << domain3 << std::endl;
    std::cerr << "(domain1 <= domain2) " << (domain1 <= domain2) << std::endl;
    std::cerr << "(domain2 <= domain1) " << (domain2 <= domain1) << std::endl;
    std::cerr << "(domain1 & domain2) " << (domain1 & domain2) << std::endl;
    std::cerr << "(domain1 <= domain3) " << (domain1 <= domain3) << std::endl;
    std::cerr << "(domain3 <= domain1) " << (domain3 <= domain1) << std::endl;
    std::cerr << "(domain1 & domain3) " << (domain1 & domain3) << std::endl;
}

This outputs:

domain 1 [
    r0.svalue=0, r0.uvalue=0]
Stack: Numbers -> {}
domain 2 [
    r0.svalue=0, r0.type=number, r0.uvalue=0]
Stack: Numbers -> {}
domain 3 [
    r0.svalue=[1, 10], r0.type=number, r0.uvalue=[1, 10]]
Stack: Numbers -> {}
(domain1 <= domain2) 0
(domain2 <= domain1) 1
(domain1 & domain2) [
    r0.svalue=0, r0.type=number, r0.uvalue=0]
Stack: Numbers -> {}
(domain1 <= domain3) 0
(domain3 <= domain1) 0
(domain1 & domain3) _|_

Use case I am shooting for:
Before or after executing each instruction in the interpreter, check interpreter register values against the pre or post-invariants to detect cases where the interpreter's register values violate the pre or post-invariants.

Interpreter builds a set of invariants based on the register state and any knowledge it has about context, stack and shared pointers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants