If there's a simple way to do this without breaking the API / while still adhering to the Gymnasium VectorEnv API, then sure. The current proposed fix breaks the API, however.
Thanks for the feedback about API compatibility. Based on the VectorEnv implementation in gymnasium/vector/vector_env.py, I'd like to propose an alternative solution that maintains API compatibility:
Solution: Add dones to the infos dictionary
In `ManiSkillVectorEnv.step()`:

```python
def step(self, actions):
    obs, rew, terminations, truncations, infos = self._env.step(actions)
    dones = torch.logical_or(terminations, truncations)
    infos['dones'] = dones  # Add dones to infos dictionary
    return obs, rew, terminations, truncations, infos
```
In `ppo.py`, update to use the dones from infos:

```python
next_obs, reward, terminations, truncations, infos = envs.step(clip_action(action))
next_done = infos['dones'].to(torch.float32)  # Get dones from infos
```
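To make the pattern concrete, here is a minimal, self-contained sketch of the wrapper behavior, using plain Python lists in place of torch tensors. The `DummyEnv` and `DonesWrapper` names are hypothetical stand-ins for illustration, not the actual ManiSkill classes:

```python
class DummyEnv:
    """Hypothetical stand-in for the underlying vectorized env."""
    def step(self, actions):
        obs = [0.0] * len(actions)
        rew = [1.0] * len(actions)
        terminations = [False, True, False]
        truncations = [False, False, True]
        return obs, rew, terminations, truncations, {}

class DonesWrapper:
    """Adds pre-computed dones to infos while keeping the
    standard 5-tuple step() return signature intact."""
    def __init__(self, env):
        self._env = env

    def step(self, actions):
        obs, rew, terminations, truncations, infos = self._env.step(actions)
        # Element-wise OR, standing in for torch.logical_or
        infos['dones'] = [t or tr for t, tr in zip(terminations, truncations)]
        return obs, rew, terminations, truncations, infos

env = DonesWrapper(DummyEnv())
obs, rew, terminations, truncations, infos = env.step([0, 0, 0])
print(infos['dones'])  # → [False, True, True]
```

Because the wrapper only adds a key to `infos` and leaves the 5-tuple shape untouched, existing Gymnasium-style consumers keep working unchanged.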
Benefits:

- Eliminates redundant computation
- Maintains Gymnasium VectorEnv API compatibility
- Follows the pattern shown in vector_env.py for adding additional info
- Clean and efficient solution
Would this approach work better? I would be happy to submit a PR if this looks good.
There seems to be a redundant computation in the current implementation, where `dones` is calculated twice: once in `ManiSkillVectorEnv.step()` and once in `ppo.py`.

Proposed Solution

We can eliminate this redundancy by:

- updating `ManiSkillVectorEnv.step()` to return the pre-computed `dones`
- updating `ppo.py` to use the returned `dones`
I'm happy to help submit a PR implementing these changes if this proposal seems reasonable.