Skip to content

Conversation

@gabriel-trigo
Copy link
Contributor

Fixes #20 (along with other changes)

This PR should be merged AFTER #18 and #19, as it builds on top of them. Consequently, it has the commits from those PRs, as well as new commits on top

The new commits add:

  1. eval.py script to evaluate policies
  2. generate_gin_config_files.py script to generate variations of gin environment config files
  3. Added visualization module with features to save plots of evaluation runs to eval results
  4. Added implementation of td3 and ddpg agents
  5. Other qol changes (see commit descriptions)
  • Tests pass

…tory is to have all reinforcement learning related code, and scripts to train and evaluate agents
…gin config file, and generates variations changing imporant parameters like time step length, start date and number of days on episode. Also added .bash script to illustrate how to use it
…ntations that were added in previous commits. Changed some default argument values, same for populate_starter_buffer.py script
…visualization module, which this observer uses to plot the graphs
…r of steps instead of episode steps to calculate percentage, leading to > 100% values
… Added saved_model_policy.py file to policies directory -- this file has a class which is used to load and interact with policies saved during training
…eter was redundant with the base_building's time_step_sec
…fore, it was wrongly using the first checkpoint, which gave the untrained agent performance)
@s2t2 s2t2 force-pushed the copybara_push branch from cfad6e2 to da325d2 Compare May 16, 2025 16:27
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this file is empty we can delete

logger = log.getLogger(__name__)

def all_actions_accepted(action_response: ActionResponse) -> bool:
logger = log.getLogger(__name__)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a duplicate logger from line 84?

train_step_counter=train_step_counter
)

return td3_agent_obj No newline at end of file
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you save the file in VS code with the suggested formatting extension it should automatically add the newline at the end of the file.

This comment was marked as off-topic.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's use snake case for the file name

@@ -0,0 +1,8 @@
#!/bin/bash

# Run the generator script

This comment was marked as outdated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually let's have the Python code use default directories that exist inside the repo and are git-ignored. Let's move all these bash files to markdown commands listed in the docs. And hopefully we won't have to pass any directory paths because we can use the gitignored default directories.

first environment loaded.
"""
def __init__(self, scenario_config_paths: collections.abc.Sequence[str], create_env_fn: collections.abc.Callable):
"""
Copy link
Collaborator

@s2t2 s2t2 May 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like all these files are using four spaces instead of two. If you touch them with VS code after the suggested extensions are installed, it should automatically re-format

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that Pyink tool was fixed, the touching and saving in vs code should now work #95


logger = logging.getLogger(__name__)

class TrajectoryPlotter:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be cool to add a screenshot of the plots produced by each of the various plotting utilities introduced in this PR. we can then include the screenshots in the docs.

Copy link
Collaborator

@s2t2 s2t2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General comments from initial review:

  • let's fix all the merge conflicts
  • let's update the formatting of all files - should happen automatically when using the recommended VS code extensions
  • let's see if we can convert the bash files to README commands if possible. we will probably want to tweak various parameters (including filepaths as necessary), so checking in these commands may cause many updates
  • for import statements let's import a given function on a single line (can be very long - we should be allowing that right now with the updated formatting rules)
  • also! some tests would be ideal / nice to have, even if they are very simple

Let me know when you would like me to review the next iteration.

Thank you for all your contributions!

@gabriel-trigo
Copy link
Contributor Author

Hey, my bad, a pretty silly error here: I forgot to save the files after solving the merge conflicts.

The latest commits should address that. I also used isort to make styling improvements, so hopefully that is according to the google standards now @s2t2

Copy link
Collaborator

@s2t2 s2t2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @gabriel-trigo thanks for fixing the merge conflicts.

The biggest changes needed right now are to:

  1. Fix the file formatting in all files (tab using two spaces instead of four). You should be able to leverage the linting tools - it may help to install pre-commit hooks
  2. Remove the bash files, and move the Python commands in them to new or existing markdown files in the docs/ directory.
  3. Change all directory path references to use default directory paths that are relative to the repository root directory. we can leverage existing paths in "smart_control/reinforcement_learning/utils/config.py" and/or create new paths there as applicable

Ping me again when these changes are implemented and I can take another look.

Thank you!!

known_first_party = ["smart_control"]
skip_glob = ['smart_control/proto/*']


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might need to re-add tqdm to the toml.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This td3 networks file looks empty. let's move the network classes here, or delete the file.


def __init__(
self,
input_tensor_spec,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be helpful to add type hints

)

def call(self, observations, step_type=None, network_state=(), training=False):
del step_type # Unused.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove this unused variable from the function signature ?

)

# Create critic networks if not provided
# Create critic networks if not provided
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a duplicate comment

)
logger = logging.getLogger(__name__)

def find_latest_checkpoint(policy_dir):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use a default directory here that is gitignored inside the repo?

# If we're here, either there's no checkpoints dir or no checkpoints in it
return None

def create_merged_saved_model(policy_dir):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

echo comments for using a default directory here as the default parameter value

return temp_dir

def evaluate_policy(
policy_dir,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

echo comments for using a default directory

parser = argparse.ArgumentParser(description='Evaluate a trained reinforcement learning policy')
parser.add_argument('--policy-dir', type=str, required=True, help='Path to the directory containing the saved policy. To \
use schedule policy, just type `schedule`')
parser.add_argument('--gin-config', type=str, default="/home/gabriel-user/projects/sbsim/smart_control/configs/resources/sb1/generated_configs/config_timestepsec-900_numdaysinepisode-7_starttimestamp-2023-07-06.gin", help='Path to the .gin config file')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's avoid using a user's personal path, and use a relative reference to a place in this repo instead

gin_config_path=gin_config_path,
experiment_name=args.experiment_name,
num_eval_episodes=args.num_eval_episodes
) No newline at end of file
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need line at end of file

@s2t2 s2t2 mentioned this pull request Jun 6, 2025
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this file has some runtime errors

@s2t2 s2t2 changed the title Reinforcement learning2 gabriel Reinforcement Learning Module, Part 2 Jun 6, 2025
@s2t2
Copy link
Collaborator

s2t2 commented Jun 12, 2025

superseded by #98

@s2t2 s2t2 closed this Jun 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Ambiguity between BaseBuilding's time_step_sec property and Environment's step_interval values

2 participants