test(integration): fix flaky integration tests with proper health checks #16

Merged: 4 commits, Dec 5, 2024

jim380 (Contributor) commented Nov 27, 2024

Fix Flaky Integration Tests with Proper Container Health Checks

Problem

  1. Tests were failing intermittently due to fixed sleep timers
  2. JWT token handling was duplicated between test and production code
  3. Resource cleanup wasn't properly ordered
  4. Test organization needed improvement:
    • Multiple container setups for related tests
    • No exit status checking for subtests
    • Complex health check using configuration negotiation

Changes

Test Reliability

  • Replaced the fixed sleep with active health checks driven by a timer
  • Optimized the HTTP client and timer timeouts
  • Switched health checks to the stateless engine_getClientVersionV1 call instead of configuration negotiation

Code Organization

  • Added shared JWT token utility function used by both test and production code
  • Moved invalid timestamp test into lifecycle test to reduce container spawns
  • Added proper subtest exit status checking to fail fast on errors

Resource Management

  • Register cleanup handlers immediately after resource creation
  • Proper cleanup ordering for JWT files, Docker client, and containers
  • Added .gitkeep to maintain jwttoken directory structure
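
To illustrate why cleanup is registered immediately after each resource is created, here is a small sketch using defer, which runs in last-in, first-out order. The resource names and their creation order are assumptions for illustration only; the PR's actual setup code is not reproduced here.

```go
package main

import (
	"errors"
	"fmt"
)

// setup simulates creating three resources in order. The cleanup for each
// resource is registered right after that resource is created, so a failure
// in a later step still releases everything created so far, newest first.
// failAt names the resource whose creation should fail ("" means none).
func setup(failAt string) (log []string, err error) {
	for _, name := range []string{"jwt-file", "docker-client", "container"} {
		if name == failAt {
			return log, errors.New("failed to create " + name)
		}
		log = append(log, "create "+name)
		name := name // capture the per-iteration value for the closure
		defer func() { log = append(log, "cleanup "+name) }()
	}
	return log, nil
}

func main() {
	log, err := setup("container")
	fmt.Println(log, err)
}
```

In a Go test the same ordering applies to handlers registered with t.Cleanup, which also run last-added first.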

File Structure

  • Simplified path constants and directory handling
  • Improved JWT file management in tests
  • Better path handling using filepath.Join

Testing

The changes have been tested with multiple runs of the integration tests. The tests now:

  • Fail fast with clear error messages
  • Clean up resources properly even when tests fail
  • Run more efficiently with optimized timeouts and container reuse
  • Maintain proper isolation while sharing resources where appropriate

Summary by CodeRabbit

  • Bug Fixes

    • Improved reliability of the test setup by replacing fixed sleep duration with dynamic polling for Reth container readiness.
  • New Features

    • Introduced a new function for generating a dynamic JWT secret, enhancing security during integration tests.
    • Added a new function to encapsulate JWT token generation, simplifying error handling and improving code organization.

coderabbitai bot commented Nov 27, 2024

Walkthrough

The changes in this pull request focus on modifications to the integration_test.go and execution.go files to enhance the testing setup for a Reth engine container. Key updates include the addition of new imports for handling JWT tokens and HTTP requests, the introduction of the generateJWTSecret and waitForRethContainer functions, and adjustments to the setupTestRethEngine function for improved reliability. Additionally, a new getAuthToken function is introduced in execution.go to streamline JWT token generation and error handling.

Changes

File: integration_test.go
  • Added new imports: crypto/rand, encoding/hex, fmt, net/http, strings.
  • Introduced function generateJWTSecret() (string, error) to create a dynamic JWT secret.
  • Introduced function waitForRethContainer(t *testing.T, jwtSecret string) error to replace the fixed sleep with polling for Reth endpoint readiness.
  • Modified setupTestRethEngine(t *testing.T) to return the generated JWT secret and call generateJWTSecret.
  • Simplified Docker path handling with a DOCKER_PATH constant.
  • Adjusted the cleanup process to remove the dynamically created JWT secret file.

File: execution.go
  • Added function getAuthToken(jwtSecret string) (string, error) to encapsulate JWT token generation logic.
  • Updated NewEngineAPIExecutionClient to use getAuthToken for generating the authToken.

Possibly related PRs

  • chore: initial EVM Execution Client Implementation #10: The changes in integration_test.go regarding the setup of the Reth engine container and JWT secret generation are directly related to the implementation of the execution client in this PR, which also involves JWT handling.
  • test: fix integration & unit tests #14: The modifications in integration_test.go to improve the lifecycle tests for the execution client align with the objectives of this PR, which aims to fix integration and unit tests related to the execution client.

Suggested reviewers

  • gupadhyaya
  • Manav-Aggarwal
  • tzdybal

Poem

🐰 In the garden of code, we hop and play,
With tests that are ready, come what may.
A function to wait, no more sleep in sight,
Reth engine's now ready, all set for the fight!
With JWTs in hand, we’re nimble and spry,
Let’s test with a leap, oh my, oh my! 🌼


@RollkitBot RollkitBot requested review from a team, tzdybal, gupadhyaya and MSevey and removed request for a team November 27, 2024 00:15
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (3)
integration_test.go (3)

134-136: Consider making timeout configurable

The HTTP client timeout is hardcoded to 1 second. Consider making this configurable or using a constant to improve maintainability.

+const httpClientTimeout = 1 * time.Second

 client := &http.Client{
-    Timeout: 1 * time.Second,
+    Timeout: httpClientTimeout,
 }

181-181: Consider making polling interval configurable

The sleep duration of 500ms is hardcoded. Consider making this configurable or using a constant for better maintainability.

+const pollInterval = 500 * time.Millisecond

-time.Sleep(500 * time.Millisecond)
+time.Sleep(pollInterval)

127-184: Consider adding unit tests for waitForRethContainer

While the function is used in integration tests, consider adding unit tests to verify its behavior with mocked HTTP responses, timeout scenarios, and error conditions.

Would you like me to help generate unit tests for this function?

🧰 Tools
🪛 golangci-lint (1.62.2)

147-147: Error return value of resp.Body.Close is not checked

(errcheck)


174-174: Error return value of resp.Body.Close is not checked

(errcheck)

🪛 GitHub Check: lint / golangci-lint

[failure] 147-147:
Error return value of resp.Body.Close is not checked (errcheck)


[failure] 174-174:
Error return value of resp.Body.Close is not checked (errcheck)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 3f9dcf5 and 94e4ebf.

📒 Files selected for processing (1)
  • integration_test.go (3 hunks)
🔇 Additional comments (1)
integration_test.go (1)

114-115: LGTM: Good replacement of fixed sleep with health checks

The change properly replaces the fixed sleep duration with active health checks, improving test reliability.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
integration_test.go (1)

174-177: Simplify JWT secret decoding by removing unnecessary prefix trimming

In the line where the JWT secret is decoded, the strings.TrimPrefix(jwtSecret, "0x") is unnecessary because jwtSecret does not include a "0x" prefix. You can simplify the code by removing the TrimPrefix call.

Apply this diff to simplify the code:

-	jwtSecretBytes, err := hex.DecodeString(strings.TrimPrefix(jwtSecret, "0x"))
+	jwtSecretBytes, err := hex.DecodeString(jwtSecret)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 94e4ebf and 417599f.

📒 Files selected for processing (1)
  • integration_test.go (7 hunks)
🧰 Additional context used
🪛 golangci-lint (1.62.2)
integration_test.go

[medium] 65-65: G301: Expect directory permissions to be 0750 or less

(gosec)


165-165: Error return value of resp.Body.Close is not checked

(errcheck)


192-192: Error return value of resp.Body.Close is not checked

(errcheck)

🪛 GitHub Check: lint / golangci-lint
integration_test.go

[failure] 165-165:
Error return value of resp.Body.Close is not checked (errcheck)


[failure] 192-192:
Error return value of resp.Body.Close is not checked (errcheck)

🔇 Additional comments (2)
integration_test.go (2)

165-165: Ensure proper handling of response body closure

The error return value of resp.Body.Close() is not checked, as flagged by the static analysis tool (errcheck). Ignoring it does not by itself leak the body, but a failed close would go unnoticed.

Please refer to the previous review comment suggesting to use defer to properly close the response body and check for errors. This ensures resources are released appropriately and aligns with best practices.

Also applies to: 192-192


159-159: Enhance timeout error message with duration information

The timeout error message can be more informative by including the duration that was waited. This can help with debugging and understanding the context of the timeout.

Please refer to the previous review comment suggesting the inclusion of the timeout duration in the error message. Here's how you can adjust it:

-	return fmt.Errorf("timeout waiting for reth container to be ready")
+	return fmt.Errorf("timeout after %v waiting for reth container to be ready", 30*time.Second)

Alternatively, use the actual duration from the context to avoid hardcoding the timeout value.

ProgramCpp (Contributor) commented Nov 28, 2024

If you are moving to dockertest, ignore this.

The Docker Go SDK lets you enable API version negotiation, which automatically selects an API version supported by both the client and the Docker Engine in use.

Ref: https://docs.docker.com/reference/api/engine/

MSevey (Collaborator) left a comment

agree with @ProgramCpp review comments

- Replace fixed sleep with active health checks using timer
- Use stateless engine_getClientVersionV1 for health checks
- Move JWT token generation to shared utility function
- Register cleanup handlers immediately after resource creation
- Check subtest exit status to fail fast
- Move invalid timestamp test into lifecycle test
- Optimize timeouts for faster test execution

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 417599f and cfddec3.

📒 Files selected for processing (2)
  • execution.go (3 hunks)
  • integration_test.go (7 hunks)
🧰 Additional context used
🪛 golangci-lint (1.62.2)
integration_test.go

[medium] 64-64: G301: Expect directory permissions to be 0750 or less

(gosec)


172-172: Error return value of resp.Body.Close is not checked

(errcheck)


190-190: Error return value of resp.Body.Close is not checked

(errcheck)

🪛 GitHub Check: lint / golangci-lint
integration_test.go

[failure] 172-172:
Error return value of resp.Body.Close is not checked (errcheck)


[failure] 190-190:
Error return value of resp.Body.Close is not checked (errcheck)

🔇 Additional comments (6)
integration_test.go (4)

45-53: LGTM! Secure implementation of JWT secret generation.

The function correctly uses crypto/rand for secure random number generation and properly handles errors.


156-158: LGTM! Appropriate timeout value.

The 100ms timeout is suitable for local network calls and aligns with the observed container readiness time.


160-161: LGTM! Proper timer implementation.

The code correctly implements the suggested timer-based approach instead of using context for timeout handling.


64-64: 🛠️ Refactor suggestion

Adjust directory permissions to enhance security

The directory permissions (0755) are more permissive than necessary. Consider using 0750 for better security.

-err = os.MkdirAll(jwtPath, 0755)
+err = os.MkdirAll(jwtPath, 0750)

Likely invalid or redundant comment.

execution.go (2)

287-307: LGTM! Well-implemented JWT token generation.

The function follows JWT best practices with proper:

  • Empty secret handling
  • Hex decoding with "0x" prefix support
  • Token expiration
  • Comprehensive error handling

62-65: LGTM! Proper error handling and resource cleanup.

The code correctly handles token generation errors and ensures proper cleanup of resources by closing the ethClient on error.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
integration_test.go (2)

41-42: Consider grouping related constants together

Consider grouping the path-related constants (DOCKER_PATH, JWT_FILENAME) with other related constants for better organization.

const (
	TEST_ETH_URL    = "http://localhost:8545"
	TEST_ENGINE_URL = "http://localhost:8551"

+	DOCKER_PATH  = "./docker"
+	JWT_FILENAME = "testsecret.hex"
+
	CHAIN_ID          = "1234"
	GENESIS_HASH      = "0x8bf225d50da44f60dee1c4ee6f810fe5b44723c76ac765654b6692d50459f216"
	GENESIS_STATEROOT = "0x362b7d8a31e7671b0f357756221ac385790c25a27ab222dc8cbdd08944f5aea4"
	TEST_PRIVATE_KEY  = "cece4f25ac74deb1468965160c7185e07dff413f23fcadb611b05ca37ab0a52e"
	TEST_TO_ADDRESS   = "0x944fDcD1c868E3cC566C78023CcB38A32cDA836E"
)

152-203: Consider improving error handling in the health check loop

While the implementation is good, the error handling in the health check loop could be more informative. Consider capturing and including specific error messages in the timeout error.

 func waitForRethContainer(t *testing.T, jwtSecret string) error {
     t.Helper()
+    var lastErr error
     
     client := &http.Client{
         Timeout: 100 * time.Millisecond,
     }
     
     timer := time.NewTimer(500 * time.Millisecond)
     defer timer.Stop()
     
     for {
         select {
         case <-timer.C:
-            return fmt.Errorf("timeout waiting for reth container to be ready")
+            return fmt.Errorf("timeout waiting for reth container to be ready: %v", lastErr)
         default:
             // check :8545 is ready
             rpcReq := strings.NewReader(`{"jsonrpc":"2.0","method":"net_version","params":[],"id":1}`)
             resp, err := client.Post(TEST_ETH_URL, "application/json", rpcReq)
+            if err != nil {
+                lastErr = fmt.Errorf("HTTP endpoint not ready: %v", err)
+                time.Sleep(100 * time.Millisecond)
+                continue
+            }
-            if err == nil {
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between cfddec3 and ff282d0.

📒 Files selected for processing (1)
  • integration_test.go (7 hunks)
🔇 Additional comments (3)
integration_test.go (3)

45-53: LGTM! Secure implementation of JWT secret generation

The implementation correctly uses crypto/rand for secure random number generation and includes proper error handling.


Line range hint 55-150: LGTM! Well-structured setup with proper resource management

The implementation follows best practices:

  • Immediate cleanup registration after resource creation
  • Secure file permissions (0750 for directory, 0600 for file)
  • Proper error handling and resource cleanup
🧰 Tools
🪛 Gitleaks (8.21.2)

38-38: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)


228-242: LGTM! Well-structured test cases with proper error validation

The test cases are well-organized and include proper validation of error conditions. The use of require.True(t.Run()) ensures proper test termination on failure.

@tzdybal tzdybal requested a review from MSevey December 4, 2024 20:29
tzdybal (Member) left a comment

LGTM

tzdybal (Member) commented Dec 4, 2024

Created follow-up issue: #18

MSevey (Collaborator) left a comment

utACK, defer to @tzdybal

@tzdybal tzdybal merged commit 8016a54 into rollkit:main Dec 5, 2024
8 of 9 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Dec 6, 2024