Reinforcement Learning Module, Part 2 by gabriel-trigo · Pull Request #98 · google/sbsim

gabriel-trigo · 2025-06-11T19:13:24Z

Updates the reinforcement learning module to include additional functionality.

Details:

Adds "eval.py" script to evaluate policies.
Adds "generate_gin_config_files.py" script to generate variations of gin environment config files.
Adds visualization module with features to save plots of evaluation results.
Adds implementations of the TD3 and DDPG agents.
Removes the ambiguity between the Environment's step_interval and the BaseBuilding's time_step_sec property. Fixes #20 (Ambiguity between BaseBuilding's time_step_sec property and Environment's step_interval values).
Refactors filepath constants and temperature conversion functions. Closes #25 (Reorganize Temperature Conversion Functions and Constants).
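
For readers unfamiliar with the temperature conversion refactor mentioned above, here is a minimal sketch of what a consolidated set of conversion helpers might look like. The function and constant names are assumptions for illustration only and are not taken from the PR; the actual module may use different names and signatures.

```python
# Hypothetical helper names; the functions refactored in this PR may differ.
KELVIN_OFFSET = 273.15  # 0 degrees Celsius expressed in kelvin


def celsius_to_kelvin(temp_c: float) -> float:
    """Converts degrees Celsius to kelvin."""
    return temp_c + KELVIN_OFFSET


def kelvin_to_celsius(temp_k: float) -> float:
    """Converts kelvin to degrees Celsius."""
    return temp_k - KELVIN_OFFSET


def fahrenheit_to_kelvin(temp_f: float) -> float:
    """Converts degrees Fahrenheit to kelvin."""
    return (temp_f - 32.0) * 5.0 / 9.0 + KELVIN_OFFSET


def kelvin_to_fahrenheit(temp_k: float) -> float:
    """Converts kelvin to degrees Fahrenheit."""
    return (temp_k - KELVIN_OFFSET) * 9.0 / 5.0 + 32.0
```

Keeping the offset in a single named constant means every conversion shares one definition, which is the kind of duplication a refactor like this typically removes.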

s2t2

Hi @gabriel-trigo, thanks for the new PR. I have done a first-pass review and left some comments. I will pull down the code, make some updates to resolve my review comments, and let you know if I have any further comments or questions.

poetry.lock

pyproject.toml

s2t2 · 2025-06-12T19:38:06Z

smart_control/reinforcement_learning/utils/MultiEpisodeWrapper.py

@@ -0,0 +1,268 @@
+# -*- coding: utf-8 -*-


@gabriel-trigo is the MultiEpisodeWrapper used anywhere?

smart_control/simulator/constants.py

smart_control/reinforcement_learning/visualization/trajectory_plotter.py

s2t2 · 2025-06-12T19:14:54Z

smart_control/reinforcement_learning/agents/networks/td3_networks.py

let's move the td3 network logic here, or delete the empty file.

smart_control/reinforcement_learning/observers/trajectory_recorder_observer.py

smart_control/reinforcement_learning/scripts/eval.py

smart_control/reinforcement_learning/scripts/generate_gin_config_files.py

This commit introduces unit tests for parts of the reinforcement learning code added in PR google#98. Here's a summary of what I did:

1. I reviewed PR google#98, analyzing changes related to new RL scripts (eval, gin generation), TD3/DDPG agents, and visualization.
2. I fetched the code from PR google#98.
3. I attempted to run the RL scripts:
   - `generate_gin_config_files.py` ran successfully.
   - `train.py` failed with a `TypeError` in `tf_agents.policies.policy_saver.PolicySaver`, which prevented training and a full evaluation of `eval.py`. This indicates an issue with the TF-Agents setup or its usage in the PR.
4. I created unit tests for:
   - `smart_control/reinforcement_learning/scripts/generate_gin_config_files.py`: These tests cover reading the base configuration, substituting parameters, and generating output files.
   - `smart_control/reinforcement_learning/visualization/trajectory_plotter.py`: These tests cover the plotting methods for actions, rewards, and cumulative rewards, including how timestamps and empty data are handled.

The tests for these two modules pass. I didn't pursue further testing of agent-specific code or environment wrappers due to the blocking issue with `train.py` and the TF-Agents environment.
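
The three steps those config-generation tests cover (read a base configuration, substitute parameters, write output files) can be sketched roughly as follows. This is an illustrative assumption, not the logic of `generate_gin_config_files.py`: the `$name` placeholder format, the `generate_configs` function, and the parameter names are all hypothetical.

```python
import itertools
import pathlib
import string

# Hypothetical base config. The "$name" placeholders and the parameter
# names are illustrative only, not bindings from the project's gin files.
BASE_CONFIG = """\
example.time_step_sec = $time_step_sec
example.num_days = $num_days
"""


def generate_configs(base_text, grid, out_dir):
    """Writes one config file per combination of values in `grid`.

    Args:
      base_text: Template text containing $name placeholders.
      grid: Dict mapping each placeholder name to a list of candidate values.
      out_dir: Directory to write the generated files into.

    Returns:
      The list of written file paths, in generation order.
    """
    out_dir = pathlib.Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    names = sorted(grid)  # fixed ordering keeps file names deterministic
    paths = []
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        # substitute() raises KeyError if a placeholder has no value.
        text = string.Template(base_text).substitute(params)
        stem = "_".join(f"{name}-{value}" for name, value in params.items())
        path = out_dir / f"{stem}.gin"
        path.write_text(text)
        paths.append(path)
    return paths
```

Calling `generate_configs(BASE_CONFIG, {"time_step_sec": [300, 600], "num_days": [1]}, "configs")` under these assumptions would write two files, one per `time_step_sec` value. Writing into a temporary directory, as the later "use setup, teardown, and temp dir" commit suggests, keeps tests of this kind from leaving files behind.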

s2t2 changed the title from "feat: reinforcement learning PR#2; several additions/improvements to rl pipeline" to "Reinforcement Learning Module, Part 2" on Jun 12, 2025

s2t2 mentioned this pull request Jun 12, 2025

Reinforcement Learning Module, Part 2 #22

Closed

1 task

s2t2 reviewed Jun 12, 2025

s2t2 mentioned this pull request Jun 19, 2025

Jules - RL 2 Tests s2t2/sbsim#18

Draft

s2t2 force-pushed the PR_rl2-gabriel branch from 51a2d88 to 1d0363f on June 23, 2025 20:31

s2t2 force-pushed the PR_rl2-gabriel branch from 8956cfa to e7604b1 on July 10, 2025 18:37

s2t2 mentioned this pull request Jul 14, 2025

Fix spelling issues #106

Open

s2t2 mentioned this pull request Jul 28, 2025

Gin Config Issue - Get Histogram Path #115

Open

gabriel-trigo and others added 20 commits August 15, 2025 16:19

feat: reinforcement learning PR#2; several additions/improvements to …

8b8900b

…rl pipeline

Update pyproject.toml

e35953e

fix: fix linting errors of previous commit

385d7ee

Update PR Template

02cea62

Update PR Template

e377076

Restore original formatting

28d016c

Restore original formatting

792ca1d

Clean top of files

c65bc25

Refactor filepaths

c20efcb

Refactor filepaths

e9c2f34

Refactor and test temp conversion functions; closes google#25

ebbae9c

Refactor temp conversion tests

4ac8181

Review eval script

959728b

Remove redundant variable setting

e29eeb1

Fix failing test

b1a48ad

Repro generate configs script; use absl flags because argparse not wo…

8031d66

…rking

Update gitignore

763f60e

Test config file generation

27dae87

Test read config file

a7a127a

Fix file names - remove quote

5235615

s2t2 added 11 commits August 15, 2025 16:19

Describe the config generation script

7a8f1d2

Flags WIP

baa6ed4

Attempt to reproduce starter buffer script; fix google#115

a45f6cd

Test starter buffer population

adeacfc

Refactor test: use setup, teardown, and temp dir

3224585

WIP - reproduce train script, run into known issue

3b6f3a8

Hotfix known issue

1503fce

Generate example starter buffers for training and testing

0b36870

WIP - refactor and test RL agent trainer

034af1f

Regenerate starter buffer for testing

00072d7

Decrease number of training steps when testing

5cb19bf

s2t2 force-pushed the PR_rl2-gabriel branch from ce3a48c to 5cb19bf on August 15, 2025 16:20

s2t2 added 3 commits August 15, 2025 17:20

WIP - reproducing eval script - encounter env config errors

3d22490

Reproduce eval script

ea1d3aa

WIP - refactor eval script; need to save schedule policy results char…

3c53a8c

…ts as well

s2t2 mentioned this pull request Sep 3, 2025

Enhanced Occupancy Model #123

Merged

yuktakul04 mentioned this pull request Oct 14, 2025

Yukta/rl module finish #127

Open

gabriel-trigo wants to merge 34 commits into google:copybara_push from gabriel-trigo:PR_rl2-gabriel
