[BugFix] Fix missing min/max alpha clamps in losses #2684

Merged: 1 commit, Jan 9, 2025

@vmoens (Contributor) commented Jan 9, 2025

Closes #2683
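
The PR body only links the issue, so the following is a reader's illustration rather than the patch itself: a minimal sketch, assuming the affected losses keep an entropy coefficient `alpha` in log-space and accept optional `min_alpha` / `max_alpha` bounds. The function name `clamp_alpha` and its signature are hypothetical and are not taken from TorchRL.

```python
import math
from typing import Optional


def clamp_alpha(
    log_alpha: float,
    min_alpha: Optional[float] = None,
    max_alpha: Optional[float] = None,
) -> float:
    """Return alpha = exp(log_alpha), restricted to [min_alpha, max_alpha].

    Hypothetical helper: a bound set to None leaves that side unclamped.
    """
    alpha = math.exp(log_alpha)
    if min_alpha is not None:
        alpha = max(alpha, min_alpha)  # enforce the lower bound
    if max_alpha is not None:
        alpha = min(alpha, max_alpha)  # enforce the upper bound
    return alpha


# Usage: an unbounded value passes through, out-of-range values are clipped.
unclamped = clamp_alpha(0.0)
clipped_high = clamp_alpha(math.log(10.0), max_alpha=2.0)
clipped_low = clamp_alpha(-10.0, min_alpha=0.1)
```

If either clamp is skipped, `alpha` can drift outside the configured range, which is the kind of omission the PR title describes.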

pytorch-bot bot commented Jan 9, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2684

Note: Links to docs will display an error until the docs builds have been completed.

❌ 37 New Failures, 3 Pending

As of commit 1dca119 with merge base f672c70:

NEW FAILURES - The following jobs have failed:

  • Build Aarch64 Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl) / upload / wheel-py3_9-cpu-aarch64 (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Build Aarch64 Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl) / upload / wheel-py3_9-cuda-aarch64cuda-aarch64 (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl) / manywheel-py3_9-cpu (gh)
    This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl) / upload / manywheel-py3_9-cpu (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl) / upload / manywheel-py3_9-cuda11_8 (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl) / upload / manywheel-py3_9-cuda12_4 (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl) / upload / manywheel-py3_9-cuda12_6 (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl) / upload / manywheel-py3_9-rocm6_2_4 (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl) / upload / manywheel-py3_9-rocm6_3 (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Build M1 Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl) / upload / wheel-py3_9-cpu (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Build M1 Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl) / wheel-py3_9-cpu (gh)
    This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Build Windows Wheels / pytorch/rl / upload / wheel-py3_9-cpu (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Build Windows Wheels / pytorch/rl / upload / wheel-py3_9-cuda11_8 (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Build Windows Wheels / pytorch/rl / upload / wheel-py3_9-cuda12_4 (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Build Windows Wheels / pytorch/rl / upload / wheel-py3_9-cuda12_6 (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Generate documentation / build-docs (3.10, 12.1) / linux-job (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/. This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Habitat Tests on Linux / tests (3.9, 12.1) / linux-job (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/. This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Libs Tests on Linux / unittests-gym (3.9, 12.1) / linux-job (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/. This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Libs Tests on Linux / unittests-sklearn (3.9, 12.1) / linux-job (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/. This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Lint / c-source / linux-job (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/. This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Lint / python-source-and-configs / linux-job (gh)
  • RLHF Tests on Linux / unittests (3.9, 12.1) / linux-job (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/. This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • SOTA Tests on Linux / tests (3.9, 12.1) / linux-job (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/. This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Unit-tests on Linux / tests-cpu (3.10) / linux-job (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/. This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Unit-tests on Linux / tests-cpu (3.11) / linux-job (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/. This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Unit-tests on Linux / tests-cpu (3.12) / linux-job (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/. This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Unit-tests on Linux / tests-cpu (3.9) / linux-job (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/. This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Unit-tests on Linux / tests-cpu-oldget (3.12) / linux-job (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/. This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Unit-tests on Linux / tests-gpu (3.11, 12.1) / linux-job (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/. This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Unit-tests on Linux / tests-olddeps (3.8, 11.6) / linux-job (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/. This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Unit-tests on Linux / tests-optdeps (3.11, 12.1) / linux-job (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/. This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Unit-tests on Linux / tests-stable-gpu (3.10, 11.8) / linux-job (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/. This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Unit-tests on Windows / unittests-cpu / windows-job (gh)
    This request has been automatically failed because it uses a deprecated version of actions/download-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/. This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Wheels / build-wheel-windows (3.10, 3.10.3) (gh)
    This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Wheels / build-wheel-windows (3.11, 3.11) (gh)
    This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
  • Wheels / build-wheel-windows (3.12, 3.12) (gh)
  • Wheels / build-wheel-windows (3.9, 3.9) (gh)
    This request has been automatically failed because it uses a deprecated version of actions/upload-artifact: v3. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label Jan 9, 2025
@vmoens added the "bug" label Jan 9, 2025
github-actions bot commented Jan 9, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}18$. Worsened: $\large\color{#d91a1a}2$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_simple | 0.5340s | 0.4460s | 2.2419 Ops/s | 2.2180 Ops/s | $\color{#35bf28}+1.08\%$ |
| test_transformed | 0.6055s | 0.6010s | 1.6638 Ops/s | 1.5763 Ops/s | $\textbf{\color{#35bf28}+5.55\%}$ |
| test_serial | 1.4494s | 1.3590s | 0.7358 Ops/s | 0.7200 Ops/s | $\color{#35bf28}+2.20\%$ |
| test_parallel | 1.3235s | 1.2167s | 0.8219 Ops/s | 0.8121 Ops/s | $\color{#35bf28}+1.20\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 0.2232ms | 29.7508μs | 33.6125 KOps/s | 32.2131 KOps/s | $\color{#35bf28}+4.34\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 55.9840μs | 17.6694μs | 56.5951 KOps/s | 54.3825 KOps/s | $\color{#35bf28}+4.07\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 46.5360μs | 16.7514μs | 59.6964 KOps/s | 57.0018 KOps/s | $\color{#35bf28}+4.73\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 44.2430μs | 9.8938μs | 101.0737 KOps/s | 95.3087 KOps/s | $\textbf{\color{#35bf28}+6.05\%}$ |
| test_step_mdp_speed[True-True-False-True-True] | 71.7430μs | 31.8938μs | 31.3540 KOps/s | 30.3264 KOps/s | $\color{#35bf28}+3.39\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 70.5710μs | 19.5752μs | 51.0850 KOps/s | 49.0984 KOps/s | $\color{#35bf28}+4.05\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 46.2160μs | 18.6159μs | 53.7175 KOps/s | 51.4295 KOps/s | $\color{#35bf28}+4.45\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 45.6750μs | 11.6418μs | 85.8977 KOps/s | 80.6426 KOps/s | $\textbf{\color{#35bf28}+6.52\%}$ |
| test_step_mdp_speed[True-False-True-True-True] | 77.2340μs | 33.6401μs | 29.7265 KOps/s | 28.8264 KOps/s | $\color{#35bf28}+3.12\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 63.6380μs | 21.3891μs | 46.7529 KOps/s | 45.1849 KOps/s | $\color{#35bf28}+3.47\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 61.8350μs | 18.6352μs | 53.6618 KOps/s | 51.0137 KOps/s | $\textbf{\color{#35bf28}+5.19\%}$ |
| test_step_mdp_speed[True-False-True-False-False] | 45.3340μs | 11.7179μs | 85.3392 KOps/s | 81.2761 KOps/s | $\color{#35bf28}+5.00\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 78.5060μs | 35.7174μs | 27.9976 KOps/s | 27.3613 KOps/s | $\color{#35bf28}+2.33\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 50.7040μs | 23.1998μs | 43.1038 KOps/s | 41.6372 KOps/s | $\color{#35bf28}+3.52\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 0.1756ms | 20.6711μs | 48.3768 KOps/s | 46.7971 KOps/s | $\color{#35bf28}+3.38\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 62.4870μs | 13.3707μs | 74.7904 KOps/s | 70.8814 KOps/s | $\textbf{\color{#35bf28}+5.51\%}$ |
| test_step_mdp_speed[False-True-True-True-True] | 93.1830μs | 34.1724μs | 29.2634 KOps/s | 28.6607 KOps/s | $\color{#35bf28}+2.10\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 49.0610μs | 21.5929μs | 46.3114 KOps/s | 45.0549 KOps/s | $\color{#35bf28}+2.79\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 72.6250μs | 21.6446μs | 46.2010 KOps/s | 44.7112 KOps/s | $\color{#35bf28}+3.33\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 54.1000μs | 13.2024μs | 75.7436 KOps/s | 72.4362 KOps/s | $\color{#35bf28}+4.57\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 0.2428ms | 35.4463μs | 28.2117 KOps/s | 27.4839 KOps/s | $\color{#35bf28}+2.65\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 0.1171ms | 23.1565μs | 43.1845 KOps/s | 41.5592 KOps/s | $\color{#35bf28}+3.91\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 2.9636ms | 23.3728μs | 42.7847 KOps/s | 41.7162 KOps/s | $\color{#35bf28}+2.56\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 43.4500μs | 14.9074μs | 67.0810 KOps/s | 63.7956 KOps/s | $\textbf{\color{#35bf28}+5.15\%}$ |
| test_step_mdp_speed[False-False-True-True-True] | 73.2760μs | 37.4415μs | 26.7083 KOps/s | 26.0253 KOps/s | $\color{#35bf28}+2.62\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 61.8440μs | 25.1211μs | 39.8072 KOps/s | 38.3610 KOps/s | $\color{#35bf28}+3.77\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 74.7090μs | 22.9615μs | 43.5511 KOps/s | 41.6565 KOps/s | $\color{#35bf28}+4.55\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 52.4970μs | 14.8790μs | 67.2088 KOps/s | 64.4446 KOps/s | $\color{#35bf28}+4.29\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 82.4130μs | 38.5267μs | 25.9560 KOps/s | 25.2220 KOps/s | $\color{#35bf28}+2.91\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 59.3600μs | 26.7556μs | 37.3754 KOps/s | 36.5130 KOps/s | $\color{#35bf28}+2.36\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 57.0860μs | 24.6431μs | 40.5793 KOps/s | 38.8700 KOps/s | $\color{#35bf28}+4.40\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 61.9950μs | 16.5535μs | 60.4101 KOps/s | 57.8790 KOps/s | $\color{#35bf28}+4.37\%$ |
| test_values[generalized_advantage_estimate-True-True] | 10.2859ms | 9.8047ms | 101.9915 Ops/s | 100.0430 Ops/s | $\color{#35bf28}+1.95\%$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 36.6447ms | 33.9344ms | 29.4686 Ops/s | 27.1951 Ops/s | $\textbf{\color{#35bf28}+8.36\%}$ |
| test_values[td0_return_estimate-False-False] | 0.2371ms | 0.1778ms | 5.6250 KOps/s | 5.6634 KOps/s | $\color{#d91a1a}-0.68\%$ |
| test_values[td1_return_estimate-False-False] | 47.6116ms | 25.5683ms | 39.1109 Ops/s | 40.6675 Ops/s | $\color{#d91a1a}-3.83\%$ |
| test_values[vec_td1_return_estimate-False-False] | 36.0552ms | 33.8462ms | 29.5454 Ops/s | 27.1264 Ops/s | $\textbf{\color{#35bf28}+8.92\%}$ |
| test_values[td_lambda_return_estimate-True-False] | 35.6899ms | 34.6629ms | 28.8493 Ops/s | 28.0703 Ops/s | $\color{#35bf28}+2.78\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 36.6136ms | 34.0486ms | 29.3698 Ops/s | 27.0942 Ops/s | $\textbf{\color{#35bf28}+8.40\%}$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 9.1912ms | 8.6157ms | 116.0676 Ops/s | 115.6720 Ops/s | $\color{#35bf28}+0.34\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.4031ms | 1.8786ms | 532.3107 Ops/s | 514.4223 Ops/s | $\color{#35bf28}+3.48\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.4606ms | 0.3666ms | 2.7278 KOps/s | 2.7364 KOps/s | $\color{#d91a1a}-0.31\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 41.7749ms | 38.1052ms | 26.2431 Ops/s | 22.8233 Ops/s | $\textbf{\color{#35bf28}+14.98\%}$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 3.7612ms | 3.0312ms | 329.9058 Ops/s | 330.6545 Ops/s | $\color{#d91a1a}-0.23\%$ |
| test_dqn_speed[False-None] | 6.5098ms | 1.4405ms | 694.2190 Ops/s | 703.3331 Ops/s | $\color{#d91a1a}-1.30\%$ |
| test_dqn_speed[False-backward] | 2.0193ms | 1.9414ms | 515.0831 Ops/s | 523.6935 Ops/s | $\color{#d91a1a}-1.64\%$ |
| test_dqn_speed[True-None] | 0.7654ms | 0.4840ms | 2.0660 KOps/s | 2.0315 KOps/s | $\color{#35bf28}+1.70\%$ |
| test_dqn_speed[True-backward] | 1.0173ms | 0.9526ms | 1.0497 KOps/s | 798.4149 Ops/s | $\textbf{\color{#35bf28}+31.48\%}$ |
| test_dqn_speed[reduce-overhead-None] | 0.7751ms | 0.4885ms | 2.0471 KOps/s | 2.0486 KOps/s | $\color{#d91a1a}-0.07\%$ |
| test_dqn_speed[reduce-overhead-backward] | 0.9971ms | 0.9351ms | 1.0694 KOps/s | 1.0413 KOps/s | $\color{#35bf28}+2.70\%$ |
| test_ddpg_speed[False-None] | 4.0080ms | 2.9745ms | 336.1865 Ops/s | 345.3742 Ops/s | $\color{#d91a1a}-2.66\%$ |
| test_ddpg_speed[False-backward] | 5.8489ms | 4.1344ms | 241.8705 Ops/s | 246.8611 Ops/s | $\color{#d91a1a}-2.02\%$ |
| test_ddpg_speed[True-None] | 1.2034ms | 1.0228ms | 977.7446 Ops/s | 948.2952 Ops/s | $\color{#35bf28}+3.11\%$ |
| test_ddpg_speed[True-backward] | 2.6197ms | 2.0042ms | 498.9437 Ops/s | 475.3374 Ops/s | $\color{#35bf28}+4.97\%$ |
| test_ddpg_speed[reduce-overhead-None] | 1.4660ms | 1.0216ms | 978.8855 Ops/s | 980.4723 Ops/s | $\color{#d91a1a}-0.16\%$ |
| test_ddpg_speed[reduce-overhead-backward] | 1.9984ms | 1.9378ms | 516.0531 Ops/s | 508.2706 Ops/s | $\color{#35bf28}+1.53\%$ |
| test_sac_speed[False-None] | 8.6891ms | 8.2566ms | 121.1151 Ops/s | 121.5331 Ops/s | $\color{#d91a1a}-0.34\%$ |
| test_sac_speed[False-backward] | 11.5791ms | 10.9463ms | 91.3552 Ops/s | 91.5110 Ops/s | $\color{#d91a1a}-0.17\%$ |
| test_sac_speed[True-None] | 2.1627ms | 1.8614ms | 537.2407 Ops/s | 531.0052 Ops/s | $\color{#35bf28}+1.17\%$ |
| test_sac_speed[True-backward] | 3.7550ms | 3.5483ms | 281.8220 Ops/s | 274.9433 Ops/s | $\color{#35bf28}+2.50\%$ |
| test_sac_speed[reduce-overhead-None] | 2.0709ms | 1.8585ms | 538.0541 Ops/s | 533.0535 Ops/s | $\color{#35bf28}+0.94\%$ |
| test_sac_speed[reduce-overhead-backward] | 3.7483ms | 3.5323ms | 283.0991 Ops/s | 281.5365 Ops/s | $\color{#35bf28}+0.56\%$ |
| test_redq_speed[False-None] | 15.4218ms | 13.1542ms | 76.0214 Ops/s | 74.4965 Ops/s | $\color{#35bf28}+2.05\%$ |
| test_redq_speed[False-backward] | 25.1596ms | 22.9017ms | 43.6649 Ops/s | 43.8114 Ops/s | $\color{#d91a1a}-0.33\%$ |
| test_redq_speed[True-None] | 5.5836ms | 4.8461ms | 206.3510 Ops/s | 197.4163 Ops/s | $\color{#35bf28}+4.53\%$ |
| test_redq_speed[True-backward] | 12.8729ms | 12.1759ms | 82.1292 Ops/s | 77.6730 Ops/s | $\textbf{\color{#35bf28}+5.74\%}$ |
| test_redq_speed[reduce-overhead-None] | 5.7070ms | 4.6447ms | 215.3002 Ops/s | 197.1906 Ops/s | $\textbf{\color{#35bf28}+9.18\%}$ |
| test_redq_speed[reduce-overhead-backward] | 13.2109ms | 12.2453ms | 81.6639 Ops/s | 76.6120 Ops/s | $\textbf{\color{#35bf28}+6.59\%}$ |
| test_redq_deprec_speed[False-None] | 14.7917ms | 13.1709ms | 75.9251 Ops/s | 74.6311 Ops/s | $\color{#35bf28}+1.73\%$ |
| test_redq_deprec_speed[False-backward] | 20.2768ms | 19.0006ms | 52.6299 Ops/s | 50.8256 Ops/s | $\color{#35bf28}+3.55\%$ |
| test_redq_deprec_speed[True-None] | 3.9778ms | 3.6302ms | 275.4705 Ops/s | 247.7937 Ops/s | $\textbf{\color{#35bf28}+11.17\%}$ |
| test_redq_deprec_speed[True-backward] | 8.6136ms | 8.1475ms | 122.7371 Ops/s | 117.9622 Ops/s | $\color{#35bf28}+4.05\%$ |
| test_redq_deprec_speed[reduce-overhead-None] | 4.0852ms | 3.6333ms | 275.2286 Ops/s | 270.2232 Ops/s | $\color{#35bf28}+1.85\%$ |
| test_redq_deprec_speed[reduce-overhead-backward] | 9.4656ms | 8.2580ms | 121.0947 Ops/s | 119.0741 Ops/s | $\color{#35bf28}+1.70\%$ |
| test_td3_speed[False-None] | 10.3919ms | 8.2530ms | 121.1680 Ops/s | 120.4474 Ops/s | $\color{#35bf28}+0.60\%$ |
| test_td3_speed[False-backward] | 11.6413ms | 10.6385ms | 93.9980 Ops/s | 93.0172 Ops/s | $\color{#35bf28}+1.05\%$ |
| test_td3_speed[True-None] | 1.9145ms | 1.7492ms | 571.6829 Ops/s | 554.7039 Ops/s | $\color{#35bf28}+3.06\%$ |
| test_td3_speed[True-backward] | 3.6701ms | 3.4098ms | 293.2709 Ops/s | 295.9548 Ops/s | $\color{#d91a1a}-0.91\%$ |
| test_td3_speed[reduce-overhead-None] | 2.0343ms | 1.7516ms | 570.9053 Ops/s | 560.7088 Ops/s | $\color{#35bf28}+1.82\%$ |
| test_td3_speed[reduce-overhead-backward] | 3.5308ms | 3.3588ms | 297.7225 Ops/s | 293.1911 Ops/s | $\color{#35bf28}+1.55\%$ |
| test_cql_speed[False-None] | 42.3899ms | 38.0112ms | 26.3080 Ops/s | 26.3435 Ops/s | $\color{#d91a1a}-0.13\%$ |
| test_cql_speed[False-backward] | 49.3516ms | 47.0535ms | 21.2524 Ops/s | 20.9775 Ops/s | $\color{#35bf28}+1.31\%$ |
| test_cql_speed[True-None] | 17.3218ms | 15.7667ms | 63.4246 Ops/s | 62.6952 Ops/s | $\color{#35bf28}+1.16\%$ |
| test_cql_speed[True-backward] | 24.3549ms | 22.7126ms | 44.0284 Ops/s | 43.5952 Ops/s | $\color{#35bf28}+0.99\%$ |
| test_cql_speed[reduce-overhead-None] | 16.7885ms | 15.8943ms | 62.9156 Ops/s | 61.7768 Ops/s | $\color{#35bf28}+1.84\%$ |
| test_cql_speed[reduce-overhead-backward] | 29.9552ms | 23.1611ms | 43.1759 Ops/s | 43.2370 Ops/s | $\color{#d91a1a}-0.14\%$ |
| test_a2c_speed[False-None] | 8.0302ms | 7.2167ms | 138.5679 Ops/s | 135.7390 Ops/s | $\color{#35bf28}+2.08\%$ |
| test_a2c_speed[False-backward] | 15.9663ms | 14.3140ms | 69.8619 Ops/s | 68.4768 Ops/s | $\color{#35bf28}+2.02\%$ |
| test_a2c_speed[True-None] | 4.8295ms | 4.2783ms | 233.7364 Ops/s | 231.5337 Ops/s | $\color{#35bf28}+0.95\%$ |
| test_a2c_speed[True-backward] | 11.9795ms | 10.8602ms | 92.0795 Ops/s | 92.3268 Ops/s | $\color{#d91a1a}-0.27\%$ |
| test_a2c_speed[reduce-overhead-None] | 5.2563ms | 4.2701ms | 234.1881 Ops/s | 235.6826 Ops/s | $\color{#d91a1a}-0.63\%$ |
| test_a2c_speed[reduce-overhead-backward] | 11.5659ms | 10.8266ms | 92.3649 Ops/s | 91.6241 Ops/s | $\color{#35bf28}+0.81\%$ |
| test_ppo_speed[False-None] | 8.7363ms | 7.5263ms | 132.8671 Ops/s | 129.1571 Ops/s | $\color{#35bf28}+2.87\%$ |
| test_ppo_speed[False-backward] | 17.3722ms | 15.0458ms | 66.4639 Ops/s | 66.5128 Ops/s | $\color{#d91a1a}-0.07\%$ |
| test_ppo_speed[True-None] | 4.1059ms | 3.6928ms | 270.7998 Ops/s | 263.0148 Ops/s | $\color{#35bf28}+2.96\%$ |
| test_ppo_speed[True-backward] | 11.0528ms | 9.7386ms | 102.6844 Ops/s | 101.5335 Ops/s | $\color{#35bf28}+1.13\%$ |
| test_ppo_speed[reduce-overhead-None] | 3.9743ms | 3.7067ms | 269.7848 Ops/s | 262.2882 Ops/s | $\color{#35bf28}+2.86\%$ |
| test_ppo_speed[reduce-overhead-backward] | 10.4515ms | 9.8024ms | 102.0159 Ops/s | 101.7425 Ops/s | $\color{#35bf28}+0.27\%$ |
| test_reinforce_speed[False-None] | 8.8666ms | 6.6094ms | 151.3001 Ops/s | 151.4432 Ops/s | $\color{#d91a1a}-0.09\%$ |
| test_reinforce_speed[False-backward] | 10.1694ms | 9.9736ms | 100.2645 Ops/s | 99.0803 Ops/s | $\color{#35bf28}+1.20\%$ |
| test_reinforce_speed[True-None] | 3.0381ms | 2.6675ms | 374.8868 Ops/s | 370.0996 Ops/s | $\color{#35bf28}+1.29\%$ |
| test_reinforce_speed[True-backward] | 9.0720ms | 8.6489ms | 115.6221 Ops/s | 113.9729 Ops/s | $\color{#35bf28}+1.45\%$ |
| test_reinforce_speed[reduce-overhead-None] | 3.0515ms | 2.6750ms | 373.8386 Ops/s | 369.0659 Ops/s | $\color{#35bf28}+1.29\%$ |
| test_reinforce_speed[reduce-overhead-backward] | 9.6655ms | 8.6538ms | 115.5558 Ops/s | 113.7867 Ops/s | $\color{#35bf28}+1.55\%$ |
| test_iql_speed[False-None] | 35.0965ms | 32.8164ms | 30.4725 Ops/s | 30.0833 Ops/s | $\color{#35bf28}+1.29\%$ |
| test_iql_speed[False-backward] | 48.3924ms | 46.1066ms | 21.6889 Ops/s | 21.4696 Ops/s | $\color{#35bf28}+1.02\%$ |
| test_iql_speed[True-None] | 12.6636ms | 10.8792ms | 91.9185 Ops/s | 89.3779 Ops/s | $\color{#35bf28}+2.84\%$ |
| test_iql_speed[True-backward] | 29.2132ms | 22.7885ms | 43.8817 Ops/s | 44.5058 Ops/s | $\color{#d91a1a}-1.40\%$ |
| test_iql_speed[reduce-overhead-None] | 12.3159ms | 11.0083ms | 90.8409 Ops/s | 90.1333 Ops/s | $\color{#35bf28}+0.79\%$ |
| test_iql_speed[reduce-overhead-backward] | 23.8094ms | 22.0053ms | 45.4435 Ops/s | 45.1271 Ops/s | $\color{#35bf28}+0.70\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.6937ms | 4.9222ms | 203.1624 Ops/s | 198.9543 Ops/s | $\color{#35bf28}+2.12\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8400ms | 0.5186ms | 1.9282 KOps/s | 619.4676 Ops/s | $\textbf{\color{#35bf28}+211.27\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7976ms | 0.4954ms | 2.0185 KOps/s | 1.9782 KOps/s | $\color{#35bf28}+2.04\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 7.6022ms | 4.8416ms | 206.5433 Ops/s | 204.4686 Ops/s | $\color{#35bf28}+1.01\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.4299ms | 0.5138ms | 1.9462 KOps/s | 1.9113 KOps/s | $\color{#35bf28}+1.83\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.8047ms | 0.4849ms | 2.0622 KOps/s | 2.0626 KOps/s | $\color{#d91a1a}-0.02\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.0815ms | 1.6541ms | 604.5618 Ops/s | 600.0380 Ops/s | $\color{#35bf28}+0.75\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 2.4831ms | 1.5822ms | 632.0284 Ops/s | 631.2655 Ops/s | $\color{#35bf28}+0.12\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.1206ms | 4.9193ms | 203.2823 Ops/s | 199.0237 Ops/s | $\color{#35bf28}+2.14\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.0227ms | 0.6530ms | 1.5313 KOps/s | 1.5066 KOps/s | $\color{#35bf28}+1.64\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 1.0775ms | 0.6348ms | 1.5753 KOps/s | 1.5847 KOps/s | $\color{#d91a1a}-0.59\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.0840ms | 4.8846ms | 204.7269 Ops/s | 205.7215 Ops/s | $\color{#d91a1a}-0.48\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8157ms | 0.5205ms | 1.9214 KOps/s | 1.8526 KOps/s | $\color{#35bf28}+3.71\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7863ms | 0.4985ms | 2.0061 KOps/s | 2.0117 KOps/s | $\color{#d91a1a}-0.28\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 7.6473ms | 4.7726ms | 209.5272 Ops/s | 204.8642 Ops/s | $\color{#35bf28}+2.28\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.8351ms | 0.5134ms | 1.9477 KOps/s | 1.9555 KOps/s | $\color{#d91a1a}-0.40\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7374ms | 0.4836ms | 2.0680 KOps/s | 2.0094 KOps/s | $\color{#35bf28}+2.92\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.0077ms | 4.8255ms | 207.2333 Ops/s | 199.7407 Ops/s | $\color{#35bf28}+3.75\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.6056ms | 0.6543ms | 1.5283 KOps/s | 1.5386 KOps/s | $\color{#d91a1a}-0.66\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8332ms | 0.6307ms | 1.5855 KOps/s | 1.5576 KOps/s | $\color{#35bf28}+1.79\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.4782s | 13.7055ms | 72.9633 Ops/s | 235.2192 Ops/s | $\textbf{\color{#d91a1a}-68.98\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 8.2932ms | 2.3880ms | 418.7589 Ops/s | 428.7878 Ops/s | $\color{#d91a1a}-2.34\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 5.3722ms | 1.3274ms | 753.3542 Ops/s | 781.5442 Ops/s | $\color{#d91a1a}-3.61\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 5.3791ms | 4.1212ms | 242.6477 Ops/s | 247.3352 Ops/s | $\color{#d91a1a}-1.90\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 7.6256ms | 2.4068ms | 415.4849 Ops/s | 413.6492 Ops/s | $\color{#35bf28}+0.44\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 4.4860ms | 1.2794ms | 781.6438 Ops/s | 752.7881 Ops/s | $\color{#35bf28}+3.83\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.4130s | 12.5493ms | 79.6860 Ops/s | 228.7087 Ops/s | $\textbf{\color{#d91a1a}-65.16\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 8.5730ms | 2.5936ms | 385.5592 Ops/s | 397.2090 Ops/s | $\color{#d91a1a}-2.93\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 1.8976ms | 1.4101ms | 709.1926 Ops/s | 37.0868 Ops/s | $\textbf{\color{#35bf28}+1812.25\%}$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 16.2109ms | 13.2747ms | 75.3313 Ops/s | 71.5659 Ops/s | $\textbf{\color{#35bf28}+5.26\%}$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 16.7979ms | 15.0755ms | 66.3329 Ops/s | 63.5246 Ops/s | $\color{#35bf28}+4.42\%$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 23.5279ms | 22.0008ms | 45.4528 Ops/s | 44.1808 Ops/s | $\color{#35bf28}+2.88\%$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 16.6076ms | 15.2230ms | 65.6902 Ops/s | 63.5805 Ops/s | $\color{#35bf28}+3.32\%$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 24.2559ms | 22.0069ms | 45.4403 Ops/s | 43.6167 Ops/s | $\color{#35bf28}+4.18\%$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 17.7969ms | 16.5228ms | 60.5225 Ops/s | 58.3509 Ops/s | $\color{#35bf28}+3.72\%$ |

@vmoens
Contributor Author

vmoens commented Jan 9, 2025

CI outage; tests pass locally. Merging since this is a minor change.

@vmoens vmoens merged commit ed656a1 into main Jan 9, 2025
30 of 78 checks passed

github-actions bot commented Jan 9, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}16$. Worsened: $\large\color{#d91a1a}8$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.7201s 0.7190s 1.3908 Ops/s 1.3487 Ops/s $\color{#35bf28}+3.12\%$
test_transformed 0.9705s 0.9693s 1.0316 Ops/s 1.0090 Ops/s $\color{#35bf28}+2.25\%$
test_serial 2.2375s 2.1468s 0.4658 Ops/s 0.4672 Ops/s $\color{#d91a1a}-0.29\%$
test_parallel 1.9223s 1.8451s 0.5420 Ops/s 0.5428 Ops/s $\color{#d91a1a}-0.15\%$
test_step_mdp_speed[True-True-True-True-True] 0.1377ms 40.1557μs 24.9031 KOps/s 24.6194 KOps/s $\color{#35bf28}+1.15\%$
test_step_mdp_speed[True-True-True-True-False] 51.0700μs 23.6738μs 42.2407 KOps/s 42.4824 KOps/s $\color{#d91a1a}-0.57\%$
test_step_mdp_speed[True-True-True-False-True] 56.2010μs 22.4848μs 44.4744 KOps/s 45.2080 KOps/s $\color{#d91a1a}-1.62\%$
test_step_mdp_speed[True-True-True-False-False] 44.9600μs 12.8804μs 77.6376 KOps/s 76.3347 KOps/s $\color{#35bf28}+1.71\%$
test_step_mdp_speed[True-True-False-True-True] 80.6020μs 43.2530μs 23.1198 KOps/s 23.5177 KOps/s $\color{#d91a1a}-1.69\%$
test_step_mdp_speed[True-True-False-True-False] 57.2410μs 25.4606μs 39.2764 KOps/s 39.0145 KOps/s $\color{#35bf28}+0.67\%$
test_step_mdp_speed[True-True-False-False-True] 57.3410μs 24.7519μs 40.4009 KOps/s 39.8589 KOps/s $\color{#35bf28}+1.36\%$
test_step_mdp_speed[True-True-False-False-False] 53.6400μs 15.4173μs 64.8620 KOps/s 64.6502 KOps/s $\color{#35bf28}+0.33\%$
test_step_mdp_speed[True-False-True-True-True] 83.4920μs 45.3851μs 22.0337 KOps/s 22.0081 KOps/s $\color{#35bf28}+0.12\%$
test_step_mdp_speed[True-False-True-True-False] 59.3810μs 28.1692μs 35.4998 KOps/s 35.3420 KOps/s $\color{#35bf28}+0.45\%$
test_step_mdp_speed[True-False-True-False-True] 54.6510μs 25.1636μs 39.7399 KOps/s 41.1576 KOps/s $\color{#d91a1a}-3.44\%$
test_step_mdp_speed[True-False-True-False-False] 48.5310μs 15.3147μs 65.2967 KOps/s 64.3825 KOps/s $\color{#35bf28}+1.42\%$
test_step_mdp_speed[True-False-False-True-True] 0.1396ms 45.9191μs 21.7774 KOps/s 20.5462 KOps/s $\textbf{\color{#35bf28}+5.99\%}$
test_step_mdp_speed[True-False-False-True-False] 60.8110μs 29.9388μs 33.4015 KOps/s 33.0574 KOps/s $\color{#35bf28}+1.04\%$
test_step_mdp_speed[True-False-False-False-True] 56.9710μs 27.0616μs 36.9528 KOps/s 36.8440 KOps/s $\color{#35bf28}+0.30\%$
test_step_mdp_speed[True-False-False-False-False] 46.3610μs 17.6094μs 56.7878 KOps/s 56.8445 KOps/s $\color{#d91a1a}-0.10\%$
test_step_mdp_speed[False-True-True-True-True] 79.9210μs 45.2321μs 22.1082 KOps/s 21.9693 KOps/s $\color{#35bf28}+0.63\%$
test_step_mdp_speed[False-True-True-True-False] 61.7710μs 28.1587μs 35.5130 KOps/s 35.3758 KOps/s $\color{#35bf28}+0.39\%$
test_step_mdp_speed[False-True-True-False-True] 60.1110μs 28.6699μs 34.8798 KOps/s 35.1456 KOps/s $\color{#d91a1a}-0.76\%$
test_step_mdp_speed[False-True-True-False-False] 45.8800μs 16.9828μs 58.8831 KOps/s 57.9960 KOps/s $\color{#35bf28}+1.53\%$
test_step_mdp_speed[False-True-False-True-True] 83.5420μs 47.6097μs 21.0041 KOps/s 20.7684 KOps/s $\color{#35bf28}+1.14\%$
test_step_mdp_speed[False-True-False-True-False] 73.1410μs 30.2616μs 33.0451 KOps/s 32.8443 KOps/s $\color{#35bf28}+0.61\%$
test_step_mdp_speed[False-True-False-False-True] 3.0884ms 30.7501μs 32.5203 KOps/s 31.6246 KOps/s $\color{#35bf28}+2.83\%$
test_step_mdp_speed[False-True-False-False-False] 47.7510μs 19.3256μs 51.7448 KOps/s 51.4538 KOps/s $\color{#35bf28}+0.57\%$
test_step_mdp_speed[False-False-True-True-True] 93.3120μs 49.6451μs 20.1430 KOps/s 19.8051 KOps/s $\color{#35bf28}+1.71\%$
test_step_mdp_speed[False-False-True-True-False] 87.0110μs 32.9856μs 30.3162 KOps/s 30.3734 KOps/s $\color{#d91a1a}-0.19\%$
test_step_mdp_speed[False-False-True-False-True] 79.6120μs 31.0765μs 32.1786 KOps/s 32.3856 KOps/s $\color{#d91a1a}-0.64\%$
test_step_mdp_speed[False-False-True-False-False] 45.2610μs 19.1205μs 52.2999 KOps/s 51.0113 KOps/s $\color{#35bf28}+2.53\%$
test_step_mdp_speed[False-False-False-True-True] 81.8110μs 51.5098μs 19.4138 KOps/s 19.0775 KOps/s $\color{#35bf28}+1.76\%$
test_step_mdp_speed[False-False-False-True-False] 77.0810μs 34.4128μs 29.0589 KOps/s 28.5161 KOps/s $\color{#35bf28}+1.90\%$
test_step_mdp_speed[False-False-False-False-True] 61.0310μs 32.7574μs 30.5274 KOps/s 30.8626 KOps/s $\color{#d91a1a}-1.09\%$
test_step_mdp_speed[False-False-False-False-False] 53.4510μs 21.2649μs 47.0258 KOps/s 46.1936 KOps/s $\color{#35bf28}+1.80\%$
test_values[generalized_advantage_estimate-True-True] 25.5300ms 25.0181ms 39.9711 Ops/s 39.5931 Ops/s $\color{#35bf28}+0.95\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1007s 2.9185ms 342.6382 Ops/s 298.0776 Ops/s $\textbf{\color{#35bf28}+14.95\%}$
test_values[td0_return_estimate-False-False] 0.1087ms 81.2704μs 12.3046 KOps/s 12.3978 KOps/s $\color{#d91a1a}-0.75\%$
test_values[td1_return_estimate-False-False] 56.3606ms 56.0875ms 17.8293 Ops/s 17.7318 Ops/s $\color{#35bf28}+0.55\%$
test_values[vec_td1_return_estimate-False-False] 1.4423ms 1.0897ms 917.6762 Ops/s 917.8804 Ops/s $\color{#d91a1a}-0.02\%$
test_values[td_lambda_return_estimate-True-False] 89.6272ms 89.2223ms 11.2080 Ops/s 11.1689 Ops/s $\color{#35bf28}+0.35\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3954ms 1.0853ms 921.4230 Ops/s 918.3770 Ops/s $\color{#35bf28}+0.33\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 25.3584ms 24.5256ms 40.7736 Ops/s 40.4003 Ops/s $\color{#35bf28}+0.92\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0413ms 0.7633ms 1.3101 KOps/s 1.2981 KOps/s $\color{#35bf28}+0.93\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7709ms 0.6739ms 1.4839 KOps/s 1.4723 KOps/s $\color{#35bf28}+0.79\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5393ms 1.4878ms 672.1148 Ops/s 673.3586 Ops/s $\color{#d91a1a}-0.18\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7312ms 0.6896ms 1.4501 KOps/s 1.4408 KOps/s $\color{#35bf28}+0.65\%$
test_dqn_speed[False-None] 1.6058ms 1.5137ms 660.6175 Ops/s 663.7681 Ops/s $\color{#d91a1a}-0.47\%$
test_dqn_speed[False-backward] 2.1871ms 2.1217ms 471.3181 Ops/s 469.6497 Ops/s $\color{#35bf28}+0.36\%$
test_dqn_speed[True-None] 0.6386ms 0.5550ms 1.8017 KOps/s 1.7872 KOps/s $\color{#35bf28}+0.81\%$
test_dqn_speed[True-backward] 1.2573ms 1.2174ms 821.4357 Ops/s 891.6852 Ops/s $\textbf{\color{#d91a1a}-7.88\%}$
test_dqn_speed[reduce-overhead-None] 0.7146ms 0.5777ms 1.7310 KOps/s 1.7374 KOps/s $\color{#d91a1a}-0.37\%$
test_dqn_speed[reduce-overhead-backward] 1.1375ms 1.0757ms 929.6452 Ops/s 1.0206 KOps/s $\textbf{\color{#d91a1a}-8.91\%}$
test_ddpg_speed[False-None] 3.2691ms 2.8690ms 348.5521 Ops/s 345.9835 Ops/s $\color{#35bf28}+0.74\%$
test_ddpg_speed[False-backward] 4.4929ms 4.2332ms 236.2267 Ops/s 240.2307 Ops/s $\color{#d91a1a}-1.67\%$
test_ddpg_speed[True-None] 1.5095ms 1.0952ms 913.1053 Ops/s 911.1346 Ops/s $\color{#35bf28}+0.22\%$
test_ddpg_speed[True-backward] 2.3564ms 2.3096ms 432.9829 Ops/s 447.9563 Ops/s $\color{#d91a1a}-3.34\%$
test_ddpg_speed[reduce-overhead-None] 1.5322ms 1.1147ms 897.0711 Ops/s 888.7642 Ops/s $\color{#35bf28}+0.93\%$
test_ddpg_speed[reduce-overhead-backward] 1.8534ms 1.7888ms 559.0453 Ops/s 596.7655 Ops/s $\textbf{\color{#d91a1a}-6.32\%}$
test_sac_speed[False-None] 8.5619ms 8.0794ms 123.7714 Ops/s 123.1578 Ops/s $\color{#35bf28}+0.50\%$
test_sac_speed[False-backward] 11.7314ms 11.2798ms 88.6537 Ops/s 90.2792 Ops/s $\color{#d91a1a}-1.80\%$
test_sac_speed[True-None] 1.9712ms 1.5598ms 641.0938 Ops/s 625.2355 Ops/s $\color{#35bf28}+2.54\%$
test_sac_speed[True-backward] 3.5172ms 3.4438ms 290.3804 Ops/s 288.9262 Ops/s $\color{#35bf28}+0.50\%$
test_sac_speed[reduce-overhead-None] 22.5793ms 12.5014ms 79.9910 Ops/s 78.6437 Ops/s $\color{#35bf28}+1.71\%$
test_sac_speed[reduce-overhead-backward] 1.6327ms 1.5317ms 652.8591 Ops/s 653.6638 Ops/s $\color{#d91a1a}-0.12\%$
test_redq_speed[False-None] 8.3166ms 7.5671ms 132.1509 Ops/s 131.3168 Ops/s $\color{#35bf28}+0.64\%$
test_redq_speed[False-backward] 12.4185ms 11.7205ms 85.3203 Ops/s 84.6042 Ops/s $\color{#35bf28}+0.85\%$
test_redq_speed[True-None] 2.4001ms 2.0072ms 498.1982 Ops/s 495.0736 Ops/s $\color{#35bf28}+0.63\%$
test_redq_speed[True-backward] 4.3001ms 3.9062ms 256.0007 Ops/s 269.3527 Ops/s $\color{#d91a1a}-4.96\%$
test_redq_speed[reduce-overhead-None] 2.4427ms 2.0088ms 497.8089 Ops/s 495.0611 Ops/s $\color{#35bf28}+0.56\%$
test_redq_speed[reduce-overhead-backward] 4.3214ms 3.8927ms 256.8931 Ops/s 268.3423 Ops/s $\color{#d91a1a}-4.27\%$
test_redq_deprec_speed[False-None] 9.5281ms 9.0780ms 110.1569 Ops/s 108.9177 Ops/s $\color{#35bf28}+1.14\%$
test_redq_deprec_speed[False-backward] 12.7096ms 12.3335ms 81.0797 Ops/s 81.6395 Ops/s $\color{#d91a1a}-0.69\%$
test_redq_deprec_speed[True-None] 2.8076ms 2.3862ms 419.0809 Ops/s 417.0623 Ops/s $\color{#35bf28}+0.48\%$
test_redq_deprec_speed[True-backward] 4.2498ms 4.0635ms 246.0941 Ops/s 234.2912 Ops/s $\textbf{\color{#35bf28}+5.04\%}$
test_redq_deprec_speed[reduce-overhead-None] 2.7739ms 2.3746ms 421.1238 Ops/s 420.8840 Ops/s $\color{#35bf28}+0.06\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.1514ms 4.0444ms 247.2546 Ops/s 229.8169 Ops/s $\textbf{\color{#35bf28}+7.59\%}$
test_td3_speed[False-None] 8.0841ms 7.9723ms 125.4339 Ops/s 124.7124 Ops/s $\color{#35bf28}+0.58\%$
test_td3_speed[False-backward] 10.7572ms 10.2971ms 97.1148 Ops/s 96.0846 Ops/s $\color{#35bf28}+1.07\%$
test_td3_speed[True-None] 1.6386ms 1.6143ms 619.4533 Ops/s 623.0392 Ops/s $\color{#d91a1a}-0.58\%$
test_td3_speed[True-backward] 3.2213ms 3.1631ms 316.1457 Ops/s 316.1359 Ops/s $+0.00\%$
test_td3_speed[reduce-overhead-None] 51.4698ms 26.1959ms 38.1739 Ops/s 38.2527 Ops/s $\color{#d91a1a}-0.21\%$
test_td3_speed[reduce-overhead-backward] 1.3911ms 1.3123ms 761.9922 Ops/s 758.2847 Ops/s $\color{#35bf28}+0.49\%$
test_cql_speed[False-None] 17.4807ms 16.9058ms 59.1513 Ops/s 58.8169 Ops/s $\color{#35bf28}+0.57\%$
test_cql_speed[False-backward] 23.0845ms 22.1132ms 45.2219 Ops/s 44.9954 Ops/s $\color{#35bf28}+0.50\%$
test_cql_speed[True-None] 3.0593ms 2.9622ms 337.5813 Ops/s 320.8929 Ops/s $\textbf{\color{#35bf28}+5.20\%}$
test_cql_speed[True-backward] 5.3485ms 5.1293ms 194.9572 Ops/s 187.8562 Ops/s $\color{#35bf28}+3.78\%$
test_cql_speed[reduce-overhead-None] 0.3629s 14.9460ms 66.9075 Ops/s 75.4335 Ops/s $\textbf{\color{#d91a1a}-11.30\%}$
test_cql_speed[reduce-overhead-backward] 1.6078ms 1.5443ms 647.5271 Ops/s 648.0577 Ops/s $\color{#d91a1a}-0.08\%$
test_a2c_speed[False-None] 3.4971ms 3.2564ms 307.0837 Ops/s 304.8803 Ops/s $\color{#35bf28}+0.72\%$
test_a2c_speed[False-backward] 6.5918ms 6.1079ms 163.7214 Ops/s 160.0655 Ops/s $\color{#35bf28}+2.28\%$
test_a2c_speed[True-None] 1.1226ms 1.0266ms 974.0454 Ops/s 950.3908 Ops/s $\color{#35bf28}+2.49\%$
test_a2c_speed[True-backward] 2.7201ms 2.6196ms 381.7397 Ops/s 378.7023 Ops/s $\color{#35bf28}+0.80\%$
test_a2c_speed[reduce-overhead-None] 21.3331ms 11.2843ms 88.6183 Ops/s 87.2974 Ops/s $\color{#35bf28}+1.51\%$
test_a2c_speed[reduce-overhead-backward] 1.0339ms 0.9796ms 1.0208 KOps/s 865.6128 Ops/s $\textbf{\color{#35bf28}+17.93\%}$
test_ppo_speed[False-None] 3.8124ms 3.7122ms 269.3837 Ops/s 268.7802 Ops/s $\color{#35bf28}+0.22\%$
test_ppo_speed[False-backward] 7.2679ms 6.8540ms 145.9012 Ops/s 140.2961 Ops/s $\color{#35bf28}+4.00\%$
test_ppo_speed[True-None] 1.0055ms 0.9599ms 1.0418 KOps/s 1.0224 KOps/s $\color{#35bf28}+1.90\%$
test_ppo_speed[True-backward] 2.9986ms 2.5753ms 388.3043 Ops/s 362.6719 Ops/s $\textbf{\color{#35bf28}+7.07\%}$
test_ppo_speed[reduce-overhead-None] 0.6495ms 0.5343ms 1.8717 KOps/s 68.5044 Ops/s $\textbf{\color{#35bf28}+2632.24\%}$
test_ppo_speed[reduce-overhead-backward] 1.0244ms 0.9718ms 1.0291 KOps/s 842.2531 Ops/s $\textbf{\color{#35bf28}+22.18\%}$
test_reinforce_speed[False-None] 2.4321ms 2.2804ms 438.5141 Ops/s 437.8973 Ops/s $\color{#35bf28}+0.14\%$
test_reinforce_speed[False-backward] 3.7504ms 3.3149ms 301.6711 Ops/s 289.6741 Ops/s $\color{#35bf28}+4.14\%$
test_reinforce_speed[True-None] 0.9525ms 0.8496ms 1.1770 KOps/s 1.1348 KOps/s $\color{#35bf28}+3.72\%$
test_reinforce_speed[True-backward] 2.4674ms 2.4295ms 411.6069 Ops/s 379.5906 Ops/s $\textbf{\color{#35bf28}+8.43\%}$
test_reinforce_speed[reduce-overhead-None] 0.2951s 12.0375ms 83.0736 Ops/s 88.7442 Ops/s $\textbf{\color{#d91a1a}-6.39\%}$
test_reinforce_speed[reduce-overhead-backward] 1.0839ms 1.0420ms 959.6872 Ops/s 820.4901 Ops/s $\textbf{\color{#35bf28}+16.97\%}$
test_iql_speed[False-None] 9.9711ms 9.4525ms 105.7921 Ops/s 106.7355 Ops/s $\color{#d91a1a}-0.88\%$
test_iql_speed[False-backward] 13.9552ms 13.1589ms 75.9940 Ops/s 74.4104 Ops/s $\color{#35bf28}+2.13\%$
test_iql_speed[True-None] 1.8507ms 1.7996ms 555.6700 Ops/s 539.5735 Ops/s $\color{#35bf28}+2.98\%$
test_iql_speed[True-backward] 4.4619ms 4.2991ms 232.6044 Ops/s 231.8376 Ops/s $\color{#35bf28}+0.33\%$
test_iql_speed[reduce-overhead-None] 19.7760ms 11.4066ms 87.6688 Ops/s 68.0128 Ops/s $\textbf{\color{#35bf28}+28.90\%}$
test_iql_speed[reduce-overhead-backward] 1.5571ms 1.4241ms 702.2177 Ops/s 683.8853 Ops/s $\color{#35bf28}+2.68\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.9983ms 6.4787ms 154.3531 Ops/s 154.0687 Ops/s $\color{#35bf28}+0.18\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5127ms 0.3363ms 2.9733 KOps/s 3.4668 KOps/s $\textbf{\color{#d91a1a}-14.24\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5835ms 0.3244ms 3.0823 KOps/s 3.8634 KOps/s $\textbf{\color{#d91a1a}-20.22\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4778ms 6.1768ms 161.8949 Ops/s 161.5215 Ops/s $\color{#35bf28}+0.23\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0373ms 0.2932ms 3.4108 KOps/s 3.5369 KOps/s $\color{#d91a1a}-3.56\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4727ms 0.2647ms 3.7773 KOps/s 3.5257 KOps/s $\textbf{\color{#35bf28}+7.13\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.9721ms 1.3040ms 766.8664 Ops/s 739.8456 Ops/s $\color{#35bf28}+3.65\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5580ms 1.2177ms 821.2278 Ops/s 795.1116 Ops/s $\color{#35bf28}+3.28\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4021ms 6.3201ms 158.2252 Ops/s 156.1536 Ops/s $\color{#35bf28}+1.33\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.9569ms 0.4426ms 2.2595 KOps/s 2.1611 KOps/s $\color{#35bf28}+4.56\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6834ms 0.4486ms 2.2293 KOps/s 2.2809 KOps/s $\color{#d91a1a}-2.26\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2939ms 6.1935ms 161.4590 Ops/s 160.7046 Ops/s $\color{#35bf28}+0.47\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8464ms 0.3154ms 3.1703 KOps/s 3.3405 KOps/s $\textbf{\color{#d91a1a}-5.09\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6205ms 0.2976ms 3.3597 KOps/s 3.3281 KOps/s $\color{#35bf28}+0.95\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3537ms 6.0861ms 164.3083 Ops/s 162.8519 Ops/s $\color{#35bf28}+0.89\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.5187ms 0.3225ms 3.1012 KOps/s 3.2551 KOps/s $\color{#d91a1a}-4.73\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5867ms 0.2967ms 3.3704 KOps/s 3.3964 KOps/s $\color{#d91a1a}-0.77\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3984ms 6.2762ms 159.3327 Ops/s 157.1012 Ops/s $\color{#35bf28}+1.42\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.9840ms 0.4684ms 2.1350 KOps/s 2.2138 KOps/s $\color{#d91a1a}-3.56\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6484ms 0.4567ms 2.1895 KOps/s 2.2659 KOps/s $\color{#d91a1a}-3.37\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.1008ms 5.4866ms 182.2628 Ops/s 186.2485 Ops/s $\color{#d91a1a}-2.14\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.0394ms 2.0486ms 488.1384 Ops/s 500.6798 Ops/s $\color{#d91a1a}-2.50\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.5337ms 1.2038ms 830.6751 Ops/s 843.5925 Ops/s $\color{#d91a1a}-1.53\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.3056ms 5.4515ms 183.4361 Ops/s 185.1198 Ops/s $\color{#d91a1a}-0.91\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 9.0637ms 2.0953ms 477.2690 Ops/s 432.6712 Ops/s $\textbf{\color{#35bf28}+10.31\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.2283ms 0.9812ms 1.0192 KOps/s 844.5484 Ops/s $\textbf{\color{#35bf28}+20.68\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4922s 15.4814ms 64.5935 Ops/s 32.9808 Ops/s $\textbf{\color{#35bf28}+95.85\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 11.2418ms 2.2788ms 438.8344 Ops/s 420.2029 Ops/s $\color{#35bf28}+4.43\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.2600ms 1.2629ms 791.8105 Ops/s 778.3364 Ops/s $\color{#35bf28}+1.73\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 15.7146ms 15.3009ms 65.3555 Ops/s 64.3523 Ops/s $\color{#35bf28}+1.56\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.9433ms 17.6627ms 56.6166 Ops/s 56.6720 Ops/s $\color{#d91a1a}-0.10\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 20.3895ms 19.6389ms 50.9194 Ops/s 49.4949 Ops/s $\color{#35bf28}+2.88\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.0392ms 17.7718ms 56.2689 Ops/s 55.9645 Ops/s $\color{#35bf28}+0.54\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 19.7400ms 19.4821ms 51.3291 Ops/s 50.0560 Ops/s $\color{#35bf28}+2.54\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.1982ms 19.3558ms 51.6641 Ops/s 51.5384 Ops/s $\color{#35bf28}+0.24\%$

Labels
bug: Something isn't working
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] max_alpha only works in DiscreteSACLoss if min_alpha is provided as well
2 participants