Open
Description
Tasks
- write a readme for deploying bots (@slundqui)
- include a section on using AWS to deploy bots
- "one-click" solution for deploying checkpoint & invariant bots (tbd after readme)
- set up rollbar monitoring & notifications
- write up a playbook for when things fail (@jrhea)
- set up credential storage (@mcclurejt)
- set up & distribute lastpass credentials for pauser (@jalextowle)
- make sure everyone has access (@wakamex)
- review invariant checks (@jalextowle @slundqui)
responsibility
- Core bot & infra team: @ryangoree @slundqui @jalextowle @mcclurejt @jrhea
- Secondary team: @sentilesdal @dpaiton @wakamex
All people listed should
- know how to (& have credentials to) restart and/or deploy bots
- monitor bot-related rollbar notifications; check that any critical bugs are being addressed
- understand error prioritization and know the failure playbook
importance (priority)
- invariant check fails (page @jalextowle @jrhea @mcclurejt)
- checkpoint bot tries to checkpoint & fails
- checkpoint bot goes down
- invariant check bot goes down
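The ordering above could be encoded so that alerts are triaged consistently. This is a minimal sketch, assuming the list is ordered from most to least urgent; `FailureKind`, `PAGE_ON`, and `triage` are hypothetical names, not part of the existing bots.

```python
from enum import IntEnum


class FailureKind(IntEnum):
    """Failure types from the priority list; a lower value is more urgent."""

    INVARIANT_FAILS = 1
    CHECKPOINT_ATTEMPT_FAILS = 2
    CHECKPOINT_BOT_DOWN = 3
    INVARIANT_BOT_DOWN = 4


# Only the top-priority failure names people to page in the list above.
PAGE_ON = {
    FailureKind.INVARIANT_FAILS: ("@jalextowle", "@jrhea", "@mcclurejt"),
}


def triage(failures):
    """Return (kind, people_to_page) pairs, most urgent first, without duplicates."""
    return [(kind, PAGE_ON.get(kind, ())) for kind in sorted(set(failures))]
```

For example, `triage([FailureKind.CHECKPOINT_BOT_DOWN, FailureKind.INVARIANT_FAILS])` puts the invariant failure first and attaches the three people to page; the other failure kinds come back with an empty tuple because the list does not say who handles them.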
top priorities for mainnet
- checkpoint bot & invariant check bot
  - runs
  - reporting system for when it goes down
- secure credential management
- documentation on how to (re)deploy bots
bots to consider
- checkpoint
- invariant check
- lpandarb
  - this should be added after the other two are working well
documentation
- README.md in infra
uptime monitoring
- easily-accessible location for cloud machine address & status
- easily-accessible portal to view all deployed bot wallets
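One way a status page could decide whether a bot is "up" is to compare each bot's last heartbeat against a maximum age. This is only a sketch under that assumption; the heartbeat source, the `stale_bots` helper, and the bot names in the example are hypothetical.

```python
import time


def stale_bots(last_seen, max_age_seconds, now=None):
    """Return the names of bots whose last heartbeat is older than max_age_seconds.

    last_seen maps a bot name to the Unix timestamp of its last heartbeat.
    """
    now = time.time() if now is None else now
    return sorted(
        name for name, seen_at in last_seen.items() if now - seen_at > max_age_seconds
    )
```

Calling `stale_bots({"checkpoint": 100.0, "invariant": 190.0}, 60, now=200.0)` flags only `checkpoint`, since its heartbeat is 100 seconds old.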
error reporting & notifications
- notifications to critical team when bots go down (rollbar?)
- system in place to assign responsibility for who should handle errors
easy start & restart
- minimal steps to deploy new bots on a pool
- ideally, we would be able to run this against a mainnet fork on an AWS instance
containerized deployment
- setup flag for "service bots"
invariant checks
- rollbar filters for each check type
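For filters to work per check type, each report needs a stable field to filter on. A minimal sketch, assuming every failed check is reported with its type as a string: `build_invariant_report` and the dictionary keys are hypothetical and do not mirror the Rollbar client's API; the resulting dict would still need to be mapped onto whatever reporting call the bots already use.

```python
def build_invariant_report(check_type, details):
    """Build a report whose message and fingerprint are keyed by check type."""
    if not check_type:
        raise ValueError("check_type must be a non-empty string")
    return {
        "message": f"invariant check failed: {check_type}",
        "level": "critical",
        # A stable per-check key so reports group and filter by check type.
        "fingerprint": f"invariant-check:{check_type}",
        # check_type is written last so caller-supplied details cannot override it.
        "extra_data": {**details, "check_type": check_type},
    }
```

Keeping the check type in both the fingerprint and the extra data means one filter rule per check can match on either field.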
credentials storage
- privileged access to private keys for bots
- whoever sets this up can make the calls -- let's prioritize "easy" and "safe"
- ideally use a free service, but a paid one is fine if needed
- easiest to use env vars
- lastpass credentials for pauser
continuous deployment
- nice to have
- when infra pushes a release we deploy bots on a mainnet fork in AWS?
- almost-continuous deployment -- make it easy for a dev to manually test deployment
current status -- checkpoint bot:
- running in docker container
- docker can restart automatically on failure (easily set up)
- passes credentials via env variables set in infra repo
  - registry address, rpc uri (points to anvil node), private key, rollbar api key
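A bot could validate these four settings at startup so that a missing credential fails fast instead of surfacing mid-run. This is a sketch, not the checkpoint bot's actual code: the environment variable names, `BotConfig`, and `load_config` are all assumptions.

```python
import os
from dataclasses import dataclass

# Hypothetical variable names; the infra repo may use different ones.
REQUIRED_VARS = ("REGISTRY_ADDRESS", "RPC_URI", "PRIVATE_KEY", "ROLLBAR_API_KEY")


@dataclass(frozen=True, repr=False)
class BotConfig:
    registry_address: str
    rpc_uri: str
    private_key: str
    rollbar_api_key: str

    def __repr__(self):
        # Keep secrets out of logs and tracebacks.
        return (
            f"BotConfig(registry_address={self.registry_address!r}, "
            f"rpc_uri={self.rpc_uri!r}, private_key='***', rollbar_api_key='***')"
        )


def load_config(env=None):
    """Read bot settings from the environment, failing if any are missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return BotConfig(
        registry_address=env["REGISTRY_ADDRESS"],
        rpc_uri=env["RPC_URI"],
        private_key=env["PRIVATE_KEY"],
        rollbar_api_key=env["ROLLBAR_API_KEY"],
    )
```

Passing a plain dict as `env` makes the loader easy to test without touching the real environment, and the masked `__repr__` keeps the private key out of any logged config.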