Reinforcement Learning Module, Part 2 #98
Open
gabriel-trigo wants to merge 34 commits into google:copybara_push from
s2t2 (Collaborator) reviewed on Jun 12, 2025 and left a comment:
Hi @gabriel-trigo, thanks for the new PR. I have done a first-pass review and left some comments. I will pull down the code and make some updates to resolve my review comments, and I will let you know if I have any more comments or questions.
@@ -0,0 +1,268 @@
# -*- coding: utf-8 -*-
Collaborator
@gabriel-trigo is the MultiEpisodeWrapper used anywhere?
smart_control/reinforcement_learning/visualization/trajectory_plotter.py
Outdated
Collaborator
Let's move the TD3 network logic here, or delete the empty file.
smart_control/reinforcement_learning/observers/trajectory_recorder_observer.py
Outdated
smart_control/reinforcement_learning/scripts/generate_gin_config_files.py
Outdated
s2t2 pushed a commit to s2t2/sbsim that referenced this pull request on Jun 19, 2025:
This commit introduces unit tests for parts of the reinforcement learning code added in PR google#98. Here's a summary of what I did:

1. I reviewed PR google#98, analyzing changes related to new RL scripts (eval, gin generation), TD3/DDPG agents, and visualization.
2. I fetched the code from PR google#98.
3. I attempted to run the RL scripts:
   - `generate_gin_config_files.py` ran successfully.
   - `train.py` failed with a `TypeError` in `tf_agents.policies.policy_saver.PolicySaver`, which prevented training and a full evaluation of `eval.py`. This indicates an issue with the TF-Agents setup or its usage in the PR.
4. I created unit tests for:
   - `smart_control/reinforcement_learning/scripts/generate_gin_config_files.py`: These tests cover reading the base configuration, substituting parameters, and generating output files.
   - `smart_control/reinforcement_learning/visualization/trajectory_plotter.py`: These tests cover the plotting methods for actions, rewards, and cumulative rewards, including how timestamps and empty data are handled.

The tests for these two modules pass. I didn't pursue further testing of agent-specific code or environment wrappers due to the blocking issue with `train.py` and the TF-Agents environment.
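For readers who want a feel for the kind of check described for the config generator (read a base configuration, substitute parameters, write output files), here is a minimal sketch. It is not the code from the PR or the commit: `generate_config_files`, the `$placeholder` syntax, the parameter names, and the `config_<i>.gin` file naming are all assumptions made for illustration, and it uses only the Python standard library.

```python
import tempfile
from pathlib import Path
from string import Template


def generate_config_files(base_text, param_sets, out_dir):
    """Hypothetical helper: write one config file per parameter set.

    This is NOT the real generate_gin_config_files.py; the $placeholder
    syntax and the output file names are assumptions for this sketch.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for i, params in enumerate(param_sets):
        # Template.substitute raises KeyError if a placeholder has no value,
        # so an incomplete parameter set fails loudly instead of silently.
        text = Template(base_text).substitute(params)
        path = out_dir / f"config_{i}.gin"
        path.write_text(text)
        written.append(path)
    return written


def run_generation_check():
    """Generate configs into a temporary directory and return their contents."""
    base = (
        "train.learning_rate = $learning_rate\n"
        "train.batch_size = $batch_size\n"
    )
    param_sets = [
        {"learning_rate": 0.001, "batch_size": 64},
        {"learning_rate": 0.0003, "batch_size": 256},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        paths = generate_config_files(base, param_sets, tmp)
        return [p.read_text() for p in paths]
```

Writing into a temporary directory keeps such a test from touching the repository tree, and asserting on the file contents covers the substitution and the file generation in one pass.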
Updates the reinforcement learning module to include additional functionality.
Details:
`step_interval` and the BaseBuilding's `time_step_sec` property. Fixes "Ambiguity between BaseBuilding's `time_step_sec` property and Environment's `step_interval` values" #20.
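One way to guard against the kind of ambiguity named in issue #20 is to validate the two values against each other when the environment is constructed. The sketch below is only an illustration under stated assumptions: `FakeBuilding` and `check_step_interval` are hypothetical names, both values are assumed to be expressed in seconds, and nothing here reflects how the PR actually resolves the issue.

```python
from dataclasses import dataclass


@dataclass
class FakeBuilding:
    """Hypothetical stand-in for a building exposing time_step_sec."""

    time_step_sec: float


def check_step_interval(step_interval_sec, building, tol=1e-9):
    """Return step_interval_sec if it matches the building's time step.

    Assumption for this sketch: both values are plain numbers in seconds.
    Raises ValueError when they disagree by more than `tol`.
    """
    if abs(step_interval_sec - building.time_step_sec) > tol:
        raise ValueError(
            f"step_interval ({step_interval_sec}s) does not match "
            f"building time_step_sec ({building.time_step_sec}s)"
        )
    return step_interval_sec
```

For example, `check_step_interval(300.0, FakeBuilding(time_step_sec=300.0))` passes the value through, while a building configured with a 60-second step would raise a `ValueError` at construction time rather than producing mismatched simulation steps later.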