Release v05 #461
Draft: daphne-cornelisse wants to merge 50 commits into main from dev
Conversation
* Configurable goal behavior
* Make goal state feature optional
* Fix network
* Set default training settings
* Correct the flow of things
* Trigger goal behaviour only if goal reached
---------
Co-authored-by: Aarav Pandya <ap7641@nyu.edu>
* raw untested changes
* cleanup artifact
* tensor conversion fix
* working amortization script
* amortization fixes
* replace examples with vbd
* fixes
* revert to 32 agents
* variable agent count
* amortized womd agents
* Improve formatting tutorial 8
* Visualize rollouts with different rewards.
* Make human-replay agents slightly darker
* Add option for storing behavioral metrics
* WIP
* Analyze agent diversity
* Analyze agent diversity v2
* Full diversity analysis
* Merge in main
* wip
* Sync
* Implement waypoint following agent
* WIP
* Set defaults
* fix network
* Small setting updates
* Improve and extend options for waypoint following rewards
* Eval new model
* Formatting
* Set default agent_type for fixed condition mode
* minor
* Apply reward weight sharing across environments for memory efficiency
* Add condition mode to wrapper
* Reduce max road points for sim speed up
* Add agent with separate actor and critic network
* Bug fix: checkpointing
* wip
* Set training defaults to best params
* Add separate waypoint following agent
* Merge dev into branch
* Fix waypoint following implementation
* Sbatch
* Can successfully learn waypoint following agent
* Add goal state to ego state by default, so that agents know when the goal is reached.
* Increase log window size
* Remove the goal reward when following waypoints
* Set roadpoints to default to avoid switching
* Implement reference path in reward and observation
* Bug fixes
* Working kinematic metrics
* Minor
* Make logging realism metrics optional
* WIP
* Add simple agent
* Update settings
* Add condition in dones such that agents are not allowed to terminate before the log end
* Update realism metrics and support for adding the reference speed
* Settings
* Add average displacement error
* Set reward for reaching the goal
* Minor
* New defaults
* Bug fix: Zero-out the waypoint distance computations for time steps where the reference logs are invalid.
* More stable realism metrics by averaging over larger batches
* Control all agent types by default
* Add option for jerk penalties
* Change: Agents cannot be terminated before end of episode length
* Remove distance to last expert position from ego state
* Update number of ego state constants accordingly
* Update visualizer to match new conditions
* Batch global -> local reference frame transformation
* typo
* Minor logging fix
* New defaults
* Replace jerk with single param
* Condition on previous action if present
* Name change
* Faster resets
* Cleanup
* Integrate fb
* Fix all reference-path-related bugs
* Formatting
* Useful debug notebook
* Better default
* Decrease steering angle ub from pi to pi/3
* Add agent obs to logging
* Linting
* Set group
* Fix config
* initial commit for wosac eval
* Force data processing to be on CPU
* Add readme
* fix the data extraction script
* Force jax run on cpu only
* Fixes and add data processing file
* add wosac original eval script
* agent init fix?
* Add agent to test with
* Ensure episodelen = 91 for wosac compatibility
* Fixes to ensure WOSAC compatibility with env
* Revert episode length to T-1
* Add trained policy and wosac eval baseline comparison pipeline
* path fix
* Clean up eval pipeline
* Rename
---------
Co-authored-by: Daphne Cornelisse <cor.daphne@gmail.com>
Co-authored-by: kevin <kevinwinston184@gmail.com>
* initial commit for wosac eval
* Force data processing to be on CPU
* Add readme
* fix the data extraction script
* Force jax run on cpu only
* Fixes and add data processing file
* add wosac original eval script
* agent init fix?
* Feat/vbd amortize (#409)
* Improved reward conditioning and waypoint following support (#391)
* partial fix
* inverse action fix
* warmup impl
* womd init fix
* cleanup
---------
Co-authored-by: Zixu Zhang <zixuz@princeton.edu>
Co-authored-by: Daphne Cornelisse <cor.daphne@gmail.com>
Co-authored-by: Zixu Zhang <zixu@umich.edu>
Co-authored-by: Daphne Cornelisse <33460159+daphne-cornelisse@users.noreply.github.com>
* Update settings
* Add warning to catch unnormalized features
* Clean done logic.
* Restructure expert suggestions under _guidance_
* Update WOSAC eval readme
* Provide agents with full speed guidance trajs
* Add support for headings
* Add vel_xy
* Improve code efficiency
* Small shape bug fix
* Improve wosac eval setting
* Update best 1-scene policy
* Refactor gym env with unified guidance mode.
* Update visualizer with unified plotting of reference traj
* Minor
* Transform reference headings to local coordinate frame.
* Give value network a bit more capacity
* Add dataframe code
* Minor
* Add optional video logging to wosac script
* Update init mode
* Visualizer bug fix
* Minor
* wosac eval updates and new cpts
* detect parked script
* default arg change
* parked vehicle mask
* train and eval init modes
* Minor fixes
---------
Co-authored-by: kevin <kevinwinston184@gmail.com>
* Improve agent pov plotting utils
* Delete old unused function
* Fix
* Add lidar obs option to plot agent observation
* wip
* Revert defaults
* Delete obsolete vbd functions
* Format
* Remove all instances of use_vbd and alike -> unified guidance mode
* Create dataclass struct for VBD online predictions
* Delete world_time_steps as it is no longer used
* Leaving a todo
* Make sure to demean the VBD predicted trajectories
* Data analysis and minor changes
* Guidance data analysis notebook
* Script to process guidance data
* Fixes
* Update nb
* Fixes and make sure to always wrap the yaws.
* Example data to work with
…#427)
* Intended usage for z-axis
* Access average z pos (elevation) from logs
* Fix intended usage of avg_z
* fixes
* z pos fixes
* Fix bug by converting all vals in tensor to floats
* remove print statement
* Cleanup
---------
Co-authored-by: kevin <kevinwinston184@gmail.com>
* Small improvements
* Minor training improvements
* Rebase over dev
* Bookkeeping
* Small fixes for gpu
* wip: config
* Make adding the action optional
* wip
* Analysis notebooks
* WIP: new reward mode
* Frequency
* Make sure to normalize rewards and other changes
* Minor
* Update model
* Add bonus at end of episode mechanism
* Fixes
* Improve renderer
* Rmv one-hot encoding
* Fix render
* Adding back the whole shazam
* Naming stuff
* Add data folder as cmdline arg
* Minor bug fixes
* Cleanup and better error message
* Align reward components scale
* model looks kinda okay?
* think we added the steer angle correctly to the env
* Small bug fix
---------
Co-authored-by: Daphne Cornelisse <cor.daphne@gmail.com>
* add support for model switching
* remove memory pinning
* revert changed values
* Bug fix: Ensure there is always a warmup period of at least 10 steps when using vbd_online
* online guidance fixes
* minor fix
* changes to get 91 steps
* Keep episodeLen at 90, and don't use it to read in the logs
* Add back warnings / idiot proofing
* tiny fix
* fix online guidance
* Set max agents to 32
---------
Co-authored-by: kevin <kevinwinston184@gmail.com>
* Push models
* Viz improvements
* Data analysis
* Add guidance dropout mask
* Add dropout as option for training
* Bug fix: only count valid points
* Increase network size to 200K params
* Bug fix
* Larger net
* Set controlled agent default to 32
* Minor improvements
* WIP
* Make agents their original size
* Minor
* Remove collision state from ego
* Plotting stuff
* wip
* More plotting stuff
* Minor
* Fixes
* Reward improvements
* Give agent a bit more road graph information
* Visualizer bug fix: clean up axis before generating new plot
* Formatting
* Change dropout mechanism such that guidance_dropout_prob represents the maximum dropout probability [cover wide range].
* Add mechanism to discourage agents from turning around for bonus
* Fix sbatch generation script
* Add speed penalty if end of traj is reached
* wip
* Some eval fixes
* Eval wip
* Fix action space bug
* Fix defaults
* VBD amortize shape bug fix
* rmv speed penalty
* Settings
* Fig
* huh?
* Replace end of route bonus with small jerk penalty
* Many improvements
* Exclude data
* Delete checkpoints
* Minor
* Fix: ensure vbd_online can be used on cuda and cpu.
* Small updates
* WOSAC defaults
* Dynamics model fix
* Minor
* dataset script
* wosac dataset script
* parallelized wosac script
* pkl support
* vbd cuda
---------
* Update sbatch
* Code to make latex wosac tables
* pkl file
* Add smoothness
* Add new dropout mode
* WOSAC eval improvements
* Add figures
* Increment progress based on visible route points only.
* Files
* Some small updates
* Minor
* WOSAC eval
---------
Co-authored-by: Pragnay Mandavilli <pm3881@ga014.hpc.nyu.edu>
* Testing agent type in obs
* Implement type-aware action space
* Remove padding agent
* Cleanup
* Comment out cone view line for now
* Update angle ranges
* fix empty geometry handling and out of bounds indices
* add safer access and overflow prevention
* allow flexible number of elements
Add occlusion to observation
Fix: pass config parameters into EnvConfig object in PufferEnv
* Add tl_states to extraction script.
* WIP
* Add back metadata
* Add tl_states struct and access through simulator.
* Add time index
* Cleanup
* Small fix
* Add tl state data struct
* Add minimum test script
* [Work in progress] Fixed initialization and made TL a singleton tensor
* Fix json init and export
* mean centering
* Test and omit unnecessary code in tl obs
* Changed data access format in python code
* Added more interpretable positional element arrays
* Unpack tl_states correctly
* Add tl state plotting function in visualizer
* mini bug
* Fix traffic lights by exporting everything as float32
* Improve colors
* Remove test file
---------
Co-authored-by: Aarav Pandya <ap7641@nyu.edu>
Co-authored-by: Eugene Vinitsky <eugenevinitsky@users.noreply.github.com>
Co-authored-by: Pragnay Mandavilli <pm3881@ga034.hpc.nyu.edu>
Co-authored-by: Pragnay Mandavilli <pm3881@ga014.hpc.nyu.edu>
Co-authored-by: Ellington Kirby <ellingtonkirby@gmail.com>
* add head_tilt_actions config as alternative to linspaced action values
* fix parameters for action handling
* remove view cone if view cone is 360° + make compatible with traffic lights
Description
GPUDrive v0.04 → v0.05
This update aims to support both:
* `main` (reliable sim agents)
* `dev`

The cleanest approach may be to create separate PPO run scripts for each setup, or to use a single PPO script with different configs.
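As a rough illustration of the single-script option, here is a minimal sketch; all names (`PPORunConfig`, `early_termination`, `guidance_mode`, `run_ppo`) are hypothetical, not identifiers from the repo.

```python
# Minimal sketch of "one PPO script, two configs" (all names are hypothetical,
# not actual GPUDrive identifiers).
from dataclasses import dataclass
from typing import Optional


@dataclass
class PPORunConfig:
    # main-style runs let agents terminate early; dev-style runs hold agents
    # in the episode until the end of the logged horizon.
    early_termination: bool
    # dev adds a unified guidance mode (e.g., reference paths); main does not.
    guidance_mode: Optional[str] = None


CONFIGS = {
    "main": PPORunConfig(early_termination=True),
    "dev": PPORunConfig(early_termination=False, guidance_mode="reference_path"),
}


def run_ppo(setup: str) -> None:
    """Single entry point that switches behavior based on the chosen setup."""
    cfg = CONFIGS[setup]
    print(f"Training with the {setup} setup: {cfg}")
```

Either way, the branching surface stays small: one switch for termination semantics and one for guidance.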
Todo
Resolve Incompatibilities
* `dev` assumes episodes do not terminate early, while `main` does
  ↳ This discrepancy exists in both the C++ environment and the PPO wrapper; see the sketch below.
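A minimal sketch of the two done-logic conventions, assuming per-agent boolean flags (the function names are hypothetical, not from the codebase):

```python
import numpy as np


def dones_main(goal_reached: np.ndarray, collided: np.ndarray) -> np.ndarray:
    # main convention: an agent's episode may end early, as soon as it
    # reaches its goal or collides.
    return goal_reached | collided


def dones_dev(step: int, episode_len: int, num_agents: int) -> np.ndarray:
    # dev convention: no early termination; every agent is marked done only
    # once the episode reaches the end of the logged horizon.
    return np.full(num_agents, step >= episode_len - 1)
```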
Documentation
* `gym` environment README

Validation
Other