Changes from all commits

34 commits
8b8900b
feat: reinforcement learning PR#2; several additions/improvements to …
gabriel-trigo Jun 11, 2025
e35953e
Update pyproject.toml
s2t2 Jun 12, 2025
385d7ee
fix: fix linting errors of previous commit
gabriel-trigo Jun 12, 2025
02cea62
Update PR Template
s2t2 Jun 23, 2025
e377076
Update PR Template
s2t2 Jun 23, 2025
28d016c
Restore original formatting
s2t2 Jun 23, 2025
792ca1d
Restore original formatting
s2t2 Jun 23, 2025
c65bc25
Clean top of files
s2t2 Jun 24, 2025
c20efcb
Refactor filepaths
s2t2 Jun 24, 2025
e9c2f34
Refactor filepaths
s2t2 Jun 24, 2025
ebbae9c
Refactor and test temp conversion functions; closes #25
s2t2 Jun 24, 2025
4ac8181
Refactor temp conversion tests
s2t2 Jun 24, 2025
959728b
Review eval script
s2t2 Jun 24, 2025
e29eeb1
Remove redundant variable setting
s2t2 Jun 24, 2025
b1a48ad
Fix failing test
s2t2 Jun 24, 2025
8031d66
Repro generate configs script; use absl flags because argparse not wo…
s2t2 Jun 26, 2025
763f60e
Update gitignore
s2t2 Jun 26, 2025
27dae87
Test config file generation
s2t2 Jun 26, 2025
a7a127a
Test read config file
s2t2 Jun 26, 2025
5235615
Fix file names - remove quote
s2t2 Jul 10, 2025
7a8f1d2
Describe the config generation script
s2t2 Jul 10, 2025
baa6ed4
Flags WIP
s2t2 Jul 11, 2025
a45f6cd
Attempt to reproduce starter buffer script; fix #115
s2t2 Jul 28, 2025
adeacfc
Test starter buffer population
s2t2 Jul 29, 2025
3224585
Refactor test: use setup, teardown, and temp dir
s2t2 Jul 29, 2025
3b6f3a8
WIP - reproduce train script, run into known issue
s2t2 Aug 11, 2025
1503fce
Hotfix known issue
s2t2 Aug 11, 2025
0b36870
Generate example starter buffers for training and testing
s2t2 Aug 12, 2025
034af1f
WIP - refactor and test RL agent trainer
s2t2 Aug 12, 2025
00072d7
Regenerate starter buffer for testing
s2t2 Aug 13, 2025
5cb19bf
Decrease number of training steps when testing
s2t2 Aug 13, 2025
3d22490
WIP - reproducing eval script - encounter env config errors
s2t2 Aug 15, 2025
ea1d3aa
Reproduce eval script
s2t2 Aug 22, 2025
3c53a8c
WIP - refactor eval script; need to save schedule policy results char…
s2t2 Aug 22, 2025
8 changes: 3 additions & 5 deletions .github/ISSUE_TEMPLATE.md
Original file line number Diff line number Diff line change
@@ -1,16 +1,14 @@
## Expected Behavior


## Actual Behavior


## Steps to Reproduce the Problem

1.
1.
1.
2.
3.

## Specifications

- Version:
- Platform:
- Platform:
30 changes: 26 additions & 4 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -1,6 +1,28 @@
Fixes #<issue_number_goes_here>
## Description

> It's a good idea to open an issue first for discussion.
[Provide a one sentence summary of the changes implemented.]

- [ ] Tests pass
- [ ] Appropriate changes to documentation are included in the PR
[Link to related issues (e.g. "Closes #123", "Resolves #456").]

## Details

Details:

- [Provide additional details, as applicable.]

- [Provide additional details, as applicable.]

- [Provide additional details, as applicable.]

## Checklist

- [ ] I have read the [Contributor's Guide](https://google.github.io/sbsim/contributing/).
- [ ] I have signed the [Contributor License Agreement](https://cla.developers.google.com/) (first time contributors only).
- [ ] I have set up [pre-commit hooks](https://google.github.io/sbsim/contributing/#pre-commit-hooks) by running `pre-commit install` (one time only), and the pre-commit hooks pass.
- [ ] I have added appropriate [unit tests](https://google.github.io/sbsim/contributing/#testing), and the tests pass.
- [ ] I have added [docstrings](https://google.github.io/sbsim/contributing/#documentation) and updated the documentation as necessary, and I have previewed the [documentation site](https://google.github.io/sbsim/docs-site/) locally to make sure things look good.
- [ ] I have self-reviewed my code (especially important if using AI agents).

---

**Thank you for your contribution!**
29 changes: 21 additions & 8 deletions .gitignore
@@ -21,15 +21,28 @@ data/sb1.zip
data/sb1/

# results files:
*/**/output_data/
*/**/metrics/
**/videos/
**/train/
**/eval/
smart_control/learning/
#*/**/output_data/
#*/**/metrics/
#**/videos/
#**/train/
#**/eval/

smart_control/configs/resources/sb1/train_sim_configs/generated/
# todo: use temp dir instead:
smart_control/configs/resources/sb1/train_sim_configs/generation_test/

smart_control/simulator/videos
smart_control/refactor/data/
smart_control/refactor/experiment_results/

smart_control/reinforcement_learning/data/starter_buffers/*
!smart_control/reinforcement_learning/data/starter_buffers/.gitkeep
!smart_control/reinforcement_learning/data/starter_buffers/default
!smart_control/reinforcement_learning/data/starter_buffers/test

smart_control/reinforcement_learning/data/experiment_results/*
!smart_control/reinforcement_learning/data/experiment_results/.gitkeep

smart_control/reinforcement_learning/data/experiment_eval/*
!smart_control/reinforcement_learning/data/experiment_eval/.gitkeep

# jupyter notebook checkpoints:
smart_control/notebooks/.ipynb_checkpoints/
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -37,4 +37,4 @@ repos:
rev: 0.7.22
hooks:
- id: mdformat
exclude: ^docs/api/
exclude: ^docs/api/|^\.github/
4 changes: 4 additions & 0 deletions docs/api/reinforcement_learning/scripts.md
@@ -1,5 +1,9 @@
# Scripts

::: smart_control.reinforcement_learning.scripts.generate_gin_configs

::: smart_control.reinforcement_learning.scripts.populate_starter_buffer

::: smart_control.reinforcement_learning.scripts.train

::: smart_control.reinforcement_learning.scripts.eval
4 changes: 4 additions & 0 deletions docs/contributing.md
@@ -97,6 +97,10 @@ pytest --disable-pytest-warnings -k your_test_name_here
# ignore specific test files and directories:
pytest --ignore=path/to/your/test.py --ignore=path/to/other/

# display more logs:
pytest --disable-pytest-warnings -s --log-cli-level=INFO path/to/your/test.py
# display all logs:
pytest --disable-pytest-warnings -s --log-cli-level=DEBUG path/to/your/test.py
```

## Linting
125 changes: 125 additions & 0 deletions docs/guides/reinforcement_learning/scripts.md
@@ -0,0 +1,125 @@
# Reinforcement Learning Scripts

## Configuration Generation

By default, the RL agent training script uses the configuration options defined
in the base gin config file (see
"smart_control/configs/resources/\<dataset_id>/sim_config.gin").

However, if you would like to use different configuration options, you can use
the configuration generation script to flexibly create alternative config files
with slight modifications to the base config file.

Generate different configuration files to use during training:

```sh
python -m smart_control.reinforcement_learning.scripts.generate_gin_configs
```

By default, the script will use the following parameter grid:

- `time_steps`: `['300']`
- `num_days`: `['1', '7', '14', '30']`
- `start_timestamps`: `['2023-07-06']`

Optionally pass any of these command-line flags to customize the parameter grid:

```sh
python -m smart_control.reinforcement_learning.scripts.generate_gin_configs \
--time_steps 300,600,900 \
--num_days 1,7,14 \
--start_timestamps 2023-07-06,2023-08-06,2023-10-06
```

This script generates a separate file for each combination of the parameter
values you specify. The files are written to the
"smart_control/configs/resources/\<dataset_id>/train_sim_configs/generated"
directory, and each file name encodes the chosen parameter values (e.g.
"step_300_days_1_start_20230706.gin").
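
The grid expansion and file naming can be sketched in a few lines. This is a toy illustration only; the function and variable names below are hypothetical, not the script's actual internals:

```python
from itertools import product

# Illustrative defaults mirroring the documented parameter grid:
time_steps = ["300"]
num_days = ["1", "7", "14", "30"]
start_timestamps = ["2023-07-06"]

def config_filenames(time_steps, num_days, start_timestamps):
    """Yield one config filename per combination of parameter values."""
    for step, days, start in product(time_steps, num_days, start_timestamps):
        # e.g. "step_300_days_1_start_20230706.gin"
        yield f"step_{step}_days_{days}_start_{start.replace('-', '')}.gin"

names = list(config_filenames(time_steps, num_days, start_timestamps))
print(len(names))  # 1 * 4 * 1 = 4 files
print(names[0])    # step_300_days_1_start_20230706.gin
```

With the default grid, only `num_days` varies, so four files are produced.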

## Starter Buffer Population

Populate a replay buffer with initial exploration data, to provide a starting
point when training RL agents:

```sh
python -m smart_control.reinforcement_learning.scripts.populate_starter_buffer
```

Optionally pass flags to customize the buffer name or the simulation config:

```sh
python -m smart_control.reinforcement_learning.scripts.populate_starter_buffer \
--buffer_name buffer_xyz \
--config_path smart_control/configs/resources/sb1/sim_config.gin
```

This creates a directory corresponding to the buffer name within
"smart_control/reinforcement_learning/data/starter_buffers".

A "default" starter buffer has been created for example purposes:

```sh
python -m smart_control.reinforcement_learning.scripts.populate_starter_buffer \
--buffer_name default \
--num_runs 5 \
--capacity 50000 \
--steps_per_run 100 \
--sequence_length 2
```

A "test" starter buffer has been created for testing purposes:

```sh
python -m smart_control.reinforcement_learning.scripts.populate_starter_buffer \
--buffer_name test \
--num_runs 1 \
--steps_per_run 1 \
--capacity 100 \
--sequence_length 2
```
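
Conceptually, the script collects `num_runs * steps_per_run` exploration transitions into a capacity-bounded buffer. The real script steps the building simulator; the sketch below is only a toy illustration of the parameter bookkeeping, with a made-up state and reward:

```python
import random
from collections import deque

def populate_starter_buffer(num_runs, steps_per_run, capacity, seed=0):
    """Collect random exploration transitions into a capacity-bounded buffer."""
    rng = random.Random(seed)
    buffer = deque(maxlen=capacity)  # oldest transitions are evicted first
    for _ in range(num_runs):
        state = 0.0
        for _ in range(steps_per_run):
            action = rng.uniform(-1.0, 1.0)  # stand-in for a random policy
            next_state = state + action
            reward = -abs(next_state)        # stand-in reward signal
            buffer.append((state, action, reward, next_state))
            state = next_state
    return buffer

# Mirrors the "default" example: 5 runs x 100 steps = 500 transitions,
# well under the 50,000-transition capacity.
buf = populate_starter_buffer(num_runs=5, steps_per_run=100, capacity=50_000)
print(len(buf))  # 500
```

The "test" buffer above is the degenerate case: one run of one step, so a single transition.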

## RL Agent Training

Train a reinforcement learning agent, choosing a unique name for the experiment:

```sh
python -m smart_control.reinforcement_learning.scripts.train \
--experiment_name="my-experiment-1"
```

Optionally pass flags to customize the agent type, starter buffer, and training
parameters:

```sh
python -m smart_control.reinforcement_learning.scripts.train \
--experiment_name="my-experiment-2" \
--starter_buffer_name="default" \
--agent_type="sac" \
--learner_iterations=3 \
--train_iterations=10 \
--collect_steps_per_training_iteration=5
```

This will generate a new experiment results directory under
"smart_control/reinforcement_learning/data/experiment_results/`experiment_name`",
containing the following files and directories:

- "collect" directory
- "eval" directory
- "metrics" directory
- "replay_buffer" directory
- "experiment_parameters.json" file
- "experiment_parameters.txt" file
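
As an illustration, the layout above could be created with a few lines of stdlib code. This is a hypothetical helper for orientation, not the train script's actual implementation:

```python
import json
from pathlib import Path
from tempfile import mkdtemp

def init_experiment_dir(root, experiment_name, params):
    """Create the documented results layout and record the run parameters."""
    exp_dir = Path(root) / experiment_name
    for sub in ("collect", "eval", "metrics", "replay_buffer"):
        (exp_dir / sub).mkdir(parents=True, exist_ok=True)
    (exp_dir / "experiment_parameters.json").write_text(json.dumps(params, indent=2))
    (exp_dir / "experiment_parameters.txt").write_text(
        "\n".join(f"{k}={v}" for k, v in params.items())
    )
    return exp_dir

root = mkdtemp()  # stand-in for the real experiment_results path
exp = init_experiment_dir(root, "my-experiment-1",
                          {"agent_type": "sac", "train_iterations": 10})
print(sorted(p.name for p in exp.iterdir()))
# ['collect', 'eval', 'experiment_parameters.json',
#  'experiment_parameters.txt', 'metrics', 'replay_buffer']
```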

## Evaluation

Evaluate a previously trained agent, specifying an experiment name that
references an existing experiment results directory:

```sh
python -m smart_control.reinforcement_learning.scripts.eval \
--eval_experiment_name my-experiment-1
```

You can also evaluate a specific saved policy against a different gin config:

```sh
python scripts/eval.py \
--policy-dir experiment_results/ddpg_train_run-july-6th_2025_04_07-12:50:40/policies/ \
--gin-config /home/gabriel-user/projects/sbsim/smart_control/configs/resources/sb1/generated_configs/config_timestepsec-900_numdaysinepisode-14_starttimestamp-2023-11-06.gin \
--experiment-name ddpg_train-summer_eval-winter
```
4 changes: 2 additions & 2 deletions docs/setup/linux.md
@@ -120,7 +120,7 @@ cd ../..

By default, simulation videos are stored in the "simulator/videos" directory
(which is ignored from version control). If you would like to customize this
location, use the `SIM_VIDEOS_DIRPATH` environment variable.
location, use the `SIM_VIDEOS_DIR` environment variable.

You can pass environment variable(s) at runtime, or create a local ".env" file
and set your desired value(s) there:
@@ -129,7 +129,7 @@ and set your desired value(s) there:
# this is the ".env" file...

# customizing the directory where simulation videos are stored:
SIM_VIDEOS_DIRPATH="/cns/oz-d/home/smart-buildings-control-team/smart-buildings/geometric_sim_videos/"
SIM_VIDEOS_DIR="/cns/oz-d/home/smart-buildings-control-team/smart-buildings/geometric_sim_videos/"
```
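
As an illustration, a script might resolve the videos directory like this (a hedged sketch: the helper name and default path here are assumptions, not the library's actual code):

```python
import os

# Illustrative default, per the docs: videos land in "simulator/videos".
DEFAULT_VIDEOS_DIR = "smart_control/simulator/videos"

def resolve_videos_dir():
    """Return SIM_VIDEOS_DIR if set, otherwise the documented default."""
    return os.environ.get("SIM_VIDEOS_DIR", DEFAULT_VIDEOS_DIR)

os.environ["SIM_VIDEOS_DIR"] = "/tmp/sim_videos"
print(resolve_videos_dir())  # /tmp/sim_videos

del os.environ["SIM_VIDEOS_DIR"]
print(resolve_videos_dir())  # smart_control/simulator/videos
```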

## Notebook Setup
4 changes: 2 additions & 2 deletions docs/setup/mac.md
@@ -121,7 +121,7 @@ cd ../..

By default, simulation videos are stored in the "simulator/videos" directory
(which is ignored from version control). If you would like to customize this
location, use the `SIM_VIDEOS_DIRPATH` environment variable.
location, use the `SIM_VIDEOS_DIR` environment variable.

You can pass environment variable(s) at runtime, or create a local ".env" file
and set your desired value(s) there:
@@ -130,7 +130,7 @@ and set your desired value(s) there:
# this is the ".env" file...

# customizing the directory where simulation videos are stored:
SIM_VIDEOS_DIRPATH="/cns/oz-d/home/smart-buildings-control-team/smart-buildings/geometric_sim_videos/"
SIM_VIDEOS_DIR="/cns/oz-d/home/smart-buildings-control-team/smart-buildings/geometric_sim_videos/"
```

## Notebook Setup