Productizing bot deployment #1541

@dpaiton

Tasks

  • write a readme for deploying bots (@slundqui)
    • include a section on using AWS to deploy bots
  • "one-click" solution to deploying checkpoint & invariant bots (tbd after readme)
  • set up rollbar monitoring & notifications
    • add a flag or label so these errors can be filtered (@slundqui)
    • discord channel that ingests notifications (@wakamex)
  • write up a playbook for when things fail (@jrhea)
  • set up credential storage (@mcclurejt)
  • set up & distribute lastpass credentials for pauser (@jalextowle)
  • make sure everyone has access (@wakamex)
  • review invariant checks (@jalextowle @slundqui)

responsibility

Everyone listed above should:

  1. know how to (& have credentials to) restart and/or deploy bots
  2. monitor bot-related rollbar notifications; check that any critical bugs are being addressed
  3. understand error prioritization and know the failure playbook

importance (priority)

  1. invariant fails (page @jalextowle @jrhea @mcclurejt)
  2. checkpoint bot tries to checkpoint & fails
  3. checkpoint bot goes down
  4. invariant bot goes down
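
The ordering above could be encoded so that alerts are triaged consistently. This is only a sketch: the enum names and the `triage` helper are hypothetical, and only the top priority names people to page in this issue.

```python
from enum import IntEnum


class Priority(IntEnum):
    """Lower value means more urgent, matching the numbered list above."""

    INVARIANT_FAILS = 1
    CHECKPOINT_ATTEMPT_FAILS = 2
    CHECKPOINT_BOT_DOWN = 3
    INVARIANT_BOT_DOWN = 4


# Only the first priority lists people to page; the rest have no page list here.
PAGE_LIST = {
    Priority.INVARIANT_FAILS: ("@jalextowle", "@jrhea", "@mcclurejt"),
}


def triage(events):
    """Return events sorted most-urgent first, each paired with who to page."""
    return [(event, PAGE_LIST.get(event, ())) for event in sorted(events)]
```

For example, `triage([Priority.CHECKPOINT_BOT_DOWN, Priority.INVARIANT_FAILS])` puts the invariant failure first along with its page list.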

top priorities for mainnet

  • checkpoint bot & invariance check bot
    • runs
    • reporting system for when it goes down
    • secure credential management
    • documentation on how to (re)deploy bots

bots to consider

  • checkpoint
  • invariance check
  • lpandarb
    • this should be added after the other two are working well

documentation

  • README.md in infra

uptime monitoring

  • easily-accessible location for cloud machine address & status
  • easily-accessible portal to view all deployed bot wallets

error reporting & notifications

  • notifications to critical team when bots go down (rollbar?)
  • system in place to assign responsibility for who should handle errors

easy start & restart

  • minimal steps to deploy new bots on a pool
  • ideally, bots could be run against a mainnet fork on an AWS instance

containerized deployment

  • set up a flag for "service bots"

invariant checks

  • rollbar filters for each check type
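
One way to make per-check filtering possible is to attach the same metadata to every error a service bot reports. The sketch below only builds that metadata; the label value, the key names, and the check type string are all hypothetical, and how the dictionary is handed to rollbar (for instance as extra data on a report) is an assumption left out of the code.

```python
SERVICE_BOT_LABEL = "service-bot"  # hypothetical label value


def build_error_context(bot_name, check_type=None):
    """Build metadata to attach to an error report so it can be filtered.

    `check_type` is only set for invariant checks; key names are illustrative.
    """
    context = {"label": SERVICE_BOT_LABEL, "bot": bot_name}
    if check_type is not None:
        context["check_type"] = check_type
    return context
```

A checkpoint bot error would carry only the label and bot name, while an invariant failure would also carry its check type, so a filter can match on either field.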

credentials storage

  • privileged access to private keys for bots
  • whoever sets this up is free to make the calls -- let's prioritize "easy" and "safe"
    • ideally use a free service, but a paid one is fine if needed
  • easiest to use env vars
  • lastpass credentials for pauser

continuous deployment

  • nice to have
  • when infra pushes a release we deploy bots on a mainnet fork in AWS?
  • almost-continuous deployment -- make it easy for a dev to manually test deployment

current status -- checkpoint bot:

  • running in docker container
    • docker can restart automatically on failure (easily set up)
  • passes credentials via env variables set in infra repo
    • registry address, rpc uri (points to anvil node), private key, rollbar api key
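
A minimal sketch of reading those four values from the environment and failing fast when one is missing. The variable names and the `BotConfig` class are assumptions for illustration; the issue lists the values but not the names used in the infra repo.

```python
import os
from dataclasses import dataclass

# Hypothetical variable names; the actual names in the infra repo may differ.
REQUIRED_VARS = ("REGISTRY_ADDRESS", "RPC_URI", "PRIVATE_KEY", "ROLLBAR_API_KEY")


@dataclass(frozen=True, repr=False)
class BotConfig:
    registry_address: str
    rpc_uri: str
    private_key: str
    rollbar_api_key: str

    def __repr__(self):
        # Mask secrets so they do not end up in logs or tracebacks.
        return (
            f"BotConfig(registry_address={self.registry_address!r}, "
            f"rpc_uri={self.rpc_uri!r}, private_key='***', rollbar_api_key='***')"
        )


def load_config(env=None):
    """Load bot settings from `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return BotConfig(
        registry_address=env["REGISTRY_ADDRESS"],
        rpc_uri=env["RPC_URI"],
        private_key=env["PRIVATE_KEY"],
        rollbar_api_key=env["ROLLBAR_API_KEY"],
    )
```

Checking all variables at startup means a misconfigured container exits immediately with a clear message, rather than failing later on its first transaction.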
