Release v05 #461
Draft: daphne-cornelisse wants to merge 50 commits into main from dev
Conversation
* Configurable goal behavior
* Make goal state feature optional
* Fix network
* Set default training settings
* Correct the flow of things
* Trigger goal behaviour only if goal reached
---------
Co-authored-by: Aarav Pandya <ap7641@nyu.edu>
* raw untested changes
* cleanup artifact
* tensor conversion fix
* working amortization script
* amortization fixes
* replace examples with vbd
* fixes
* revert to 32 agents
* variable agent count
* amortized womd agents
* Improve formatting tutorial 8
* Visualize rollouts with different rewards.
* Make human-replay agents slightly darker
* Add option for storing behavioral metrics
* WIP
* Analyze agent diversity
* Analyze agent diversity v2
* Full diversity analysis
* Merge in main
* wip
* Sync
* Implement waypoint following agent
* WIP
* Set defaults
* fix network
* Small setting updates
* Improve and extend options for waypoint following rewards
* Eval new model
* Formatting
* Set default agent_type for fixed condition mode
* minor
* Apply reward weight sharing across environments for memory efficiency
* Add condition mode to wrapper
* Reduce max road points for sim speed up
* Add agent with separate actor and critic network
* Bug fix: checkpointing
* wip
* Set training defaults to best params
* Add separate waypoint following agent
* Merge dev into branch
* Fix waypoint following implementation
* Sbatch
* Can successfully learn waypoint following agent
* Add goal state to ego state by default, so that agents know when the goal is reached.
* Increase log window size
* Remove the goal reward when following waypoints
* Set roadpoints to default to avoid switching
* Implement reference path in reward and observation
* Bug fixes
* Working kinematic metrics
* Minor
* Make logging realism metrics optional
* WIP
* Add simple agent
* Update settings
* Add condition in dones such that agents are not allowed to terminate before the log end
* Update realism metrics and support for adding the reference speed
* Settings
* Add average displacement error
* Set reward for reaching the goal
* Minor
* New defaults
* Bug fix: Zero-out the waypoint distance computations for time steps where the reference logs are invalid.
* More stable realism metrics by averaging over larger batches
* Control all agent types by default
* Add option for jerk penalties
* Change: Agents cannot be terminated before end of episode length
* Remove distance to last expert position from ego state
* Update number of ego state constants accordingly
* Update visualizer to match new conditions
* Batch global -> local reference frame transformation
* typo
* Minor logging fix
* New defaults
* Replace jerk with single param
* Condition on previous action if present
* Name change
* Faster resets
* Cleanup
* Integrate fb
* Fix all reference-path-related bugs
* Formatting
* Useful debug notebook
* Better default
* Decrease steering angle ub from pi to pi/3
* Add agent obs to logging
* Linting
* Set group
* Fix config
* initial commit for wosac eval
* Force data processing to be on CPU
* Add readme
* fix the data extraction script
* Force jax run on cpu only
* Fixes and add data processing file
* add wosac original eval script
* agent init fix?
* Add agent to test with
* Ensure episodelen = 91 for wosac compatibility
* Fixes to ensure WOSAC compatibility with env
* Revert episode length to T-1
* Add trained policy and wosac eval baseline comparison pipeline
* path fix
* Clean up eval pipeline
* Rename
---------
Co-authored-by: Daphne Cornelisse <cor.daphne@gmail.com>
Co-authored-by: kevin <kevinwinston184@gmail.com>
* initial commit for wosac eval
* Force data processing to be on CPU
* Add readme
* fix the data extraction script
* Force jax run on cpu only
* Fixes and add data processing file
* add wosac original eval script
* agent init fix?
* Feat/vbd amortize (#409)
* Improved reward conditioning and waypoint following support (#391)
* partial fix
* inverse action fix
* warmup impl
* womd init fix
* cleanup
---------
Co-authored-by: Zixu Zhang <zixuz@princeton.edu>
Co-authored-by: Daphne Cornelisse <cor.daphne@gmail.com>
Co-authored-by: Zixu Zhang <zixu@umich.edu>
Co-authored-by: Daphne Cornelisse <33460159+daphne-cornelisse@users.noreply.github.com>
* Update settings
* Add warning to catch unnormalized features
* Clean done logic.
* Restructure expert suggestions under _guidance_
* Update WOSAC eval readme
* Provide agents with full speed guidance trajs
* Add support for headings
* Add vel_xy
* Improve code efficiency
* Small shape bug fix
* Improve wosac eval setting
* Update best 1-scene policy
* Refactor gym env with unified guidance mode.
* Update visualizer with unified plotting of reference traj
* Minor
* Transform reference headings to local coordinate frame.
* Give value network a bit more capacity
* Add dataframe code
* Minor
* Add optional video logging to wosac script
* Update init mode
* Visualizer bug fix
* Minor
* wosac eval updates and new cpts
* detect parked script
* default arg change
* parked vehicle mask
* train and eval init modes
* Minor fixes
---------
Co-authored-by: kevin <kevinwinston184@gmail.com>
* Improve agent pov plotting utils
* Delete old unused function
* Fix
* Add lidar obs option to plot agent observation
* wip
* Revert defaults
* Delete obsolete vbd functions
* Format
* Remove all instances of use_vbd and alike -> unified guidance mode
* Create dataclass struct for VBD online predictions
* Delete world_time_steps as it is no longer used
* Leaving a todo
* Make sure to demean the VBD predicted trajectories
* Data analysis and minor changes
* Guidance data analysis notebook
* Script to process guidance data
* Fixes
* Update nb
* Fixes and make sure to always wrap the yaws.
* Example data to work with
…#427)
* Intended usage for z-axis
* Access average z pos (elevation) from logs
* Fix intended usage of avg_z
* fixes
* z pos fixes
* Fix bug by converting all vals in tensor to floats
* remove print statement
* Cleanup
---------
Co-authored-by: kevin <kevinwinston184@gmail.com>
* Small improvements
* Minor training improvements
* Rebase over dev
* Bookkeeping
* Small fixes for gpu
* wip: config
* Make adding the action optional
* wip
* Analysis notebooks
* WIP: new reward mode
* Frequency
* Make sure to normalize rewards and other changes
* Minor
* Update model
* Add bonus at end of episode mechanism
* Fixes
* Improve renderer
* Rmv one-hot encoding
* Fix render
* Adding back the whole shazam
* Naming stuff
* Add data folder as cmdline arg
* Minor bug fixes
* Cleanup and better error message
* Align reward components scale
* model looks kinda okay?
* think we added the steer angle correctly to the env
* Small bug fix
---------
Co-authored-by: Daphne Cornelisse <cor.daphne@gmail.com>
* add support for model switching
* remove memory pinning
* revert changed values
* Bug fix: Ensure there is always a warmup period of at least 10 steps when using vbd_online
* online guidance fixes
* minor fix
* changes to get 91 steps
* Keep episodeLen at 90, and don't use it to read in the logs
* Add back warnings / idiot proofing
* tiny fix
* fix online guidance
* Set max agents to 32
---------
Co-authored-by: kevin <kevinwinston184@gmail.com>
* Push models
* Viz improvements
* Data analysis
* Add guidance dropout mask
* Add dropout as option for training
* Bug fix: only count valid points
* Increase network size to 200K params
* Bug fix
* Larger net
* Set controlled agent default to 32
* Minor improvements
* WIP
* Make agents their original size
* Minor
* Remove collision state from ego
* Plotting stuff
* wip
* More plotting stuff
* Minor
* Fixes
* Reward improvements
* Give agent a bit more road graph information
* Visualizer bug fix: clean up axis before generating new plot
* Formatting
* Change dropout mechanism such that guidance_dropout_prob represents the maximum dropout probability [cover wide range].
* Add mechanism to discourage agents from turning around for bonus
* Fix sbatch generation script
* Add speed penalty if end of traj is reached
* wip
* Some eval fixes
* Eval wip
* Fix action space bug
* Fix defaults
* VBD amortize shape bug fix
* rmv speed penalty
* Settings
* Fig
* huh?
* Replace end of route bonus with small jerk penalty
* Many improvements
* Exclude data
* Delete checkpoints
* Minor
* Fix: ensure vbd_online can be used on cuda and cpu.
* Small updates
* WOSAC defaults
* Dynamics model fix
* Minor
* dataset script
* wosac dataset script
* parallelized wosac script
* pkl support
* vbd cuda
---------
* Update sbatch
* Code to make latex wosac tables
* pkl file
* Add smoothness
* Add new dropout mode
* WOSAC eval improvements
* Add figures
* Increment progress based on visible route points only.
* Files
* Some small updates
* Minor
* WOSAC eval
---------
Co-authored-by: Pragnay Mandavilli <pm3881@ga014.hpc.nyu.edu>
* Testing agent type in obs
* Implement type-aware action space
* Remove padding agent
* Cleanup
* Comment out cone view line for now
* Update angle ranges
* fix empty geometry handling and out of bounds indices
* add safer access and overflow prevention
* allow flexible number of elements
Add occlusion to observation
Fix: pass config parameters into EnvConfig object in PufferEnv
* Add tl_states to extraction script.
* WIP
* Add back metadata
* Add tl_states struct and access through simulator.
* Add time index
* Cleanup
* Small fix
* Add tl state data struct
* Add minimum test script
* [Work in progress] Fixed initialization and made TL a singleton tensor
* Fix json init and export
* mean centering
* Test and omit unnecessary code in tl obs
* Changed data access format in python code
* Added more interpretable positional element arrays
* Unpack tl_states correctly
* Add tl state plotting function in visualizer
* mini bug
* Fix traffic lights by exporting everything as float32
* Improve colors
* Remove test file
---------
Co-authored-by: Aarav Pandya <ap7641@nyu.edu>
Co-authored-by: Eugene Vinitsky <eugenevinitsky@users.noreply.github.com>
Co-authored-by: Pragnay Mandavilli <pm3881@ga034.hpc.nyu.edu>
Co-authored-by: Pragnay Mandavilli <pm3881@ga014.hpc.nyu.edu>
Co-authored-by: Ellington Kirby <ellingtonkirby@gmail.com>
* add head_tilt_actions config as alternative to linspaced action values
* fix parameters for action handling
* remove view cone if view cone is 360° + make compatible with traffic lights
Description
GPUDrive v0.04 → v0.05
This update aims to support both:
* `main` (reliable sim agents)
* `dev`

The cleanest approach may be to create separate PPO run scripts for each setup, or to use a single PPO script with different configs.
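As a rough illustration of the single-script option, here is a minimal sketch; all names (`PPORunConfig`, `early_termination`, `guidance_mode`, `run_ppo`) are hypothetical, not identifiers from the repo.

```python
# Minimal sketch of "one PPO script, two configs" (all names are hypothetical,
# not actual GPUDrive identifiers).
from dataclasses import dataclass
from typing import Optional


@dataclass
class PPORunConfig:
    # main-style runs let agents terminate early; dev-style runs hold agents
    # in the episode until the end of the logged horizon.
    early_termination: bool
    # dev adds a unified guidance mode (e.g., reference paths); main does not.
    guidance_mode: Optional[str] = None


CONFIGS = {
    "main": PPORunConfig(early_termination=True),
    "dev": PPORunConfig(early_termination=False, guidance_mode="reference_path"),
}


def run_ppo(setup: str) -> None:
    """Single entry point that switches behavior based on the chosen setup."""
    cfg = CONFIGS[setup]
    print(f"Training with the {setup} setup: {cfg}")
```

Either way, the branching surface stays small: one switch for termination semantics and one for guidance.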
Todo
Resolve Incompatibilities
* `dev` assumes episodes do not terminate early, while `main` does
  ↳ This discrepancy exists in both the C++ environment and the PPO wrapper; see the sketch below.
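A minimal sketch of the two done-logic conventions, assuming per-agent boolean flags (the function names are hypothetical, not from the codebase):

```python
import numpy as np


def dones_main(goal_reached: np.ndarray, collided: np.ndarray) -> np.ndarray:
    # main convention: an agent's episode may end early, as soon as it
    # reaches its goal or collides.
    return goal_reached | collided


def dones_dev(step: int, episode_len: int, num_agents: int) -> np.ndarray:
    # dev convention: no early termination; every agent is marked done only
    # once the episode reaches the end of the logged horizon.
    return np.full(num_agents, step >= episode_len - 1)
```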
Documentation
* `gym` environment README

Validation
Other