Discussion + Planning: How to unittest an agent? #72

@DannyWeitekamp

Description

We have run into a lot of bugs caused by different behavior across different implementations/generations of the code, and these issues are pretty hard to track down. I would like to move toward having a way to unit test agents.

The way we 'test' agents right now involves running them with altrain (which spins up another server that hosts a tutor in the browser). In the short term we look at the general behavior of the agent transactions as they are printed in the terminal, and maybe additionally print out some of what is happening internally in the agent. In the long term we look at the learning curves, which usually involves first doing some pre-processing/KC labeling and uploading to DataShop.

It would be nice to have unit tests at the level of "the agent is at this point in training with these skills, we give it these interactions, and we expect this to happen".
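
As a rough sketch of what that could look like (the import path, constructor arguments, and the exact request/train signatures below are assumptions, not the current API):

```python
import unittest

# Assumed import path; adjust to wherever ModularAgent actually lives.
from apprentice.agents.ModularAgent import ModularAgent


class TestPartiallyTrainedAgent(unittest.TestCase):
    def test_expected_action_after_demo(self):
        agent = ModularAgent()  # hypothetical constructor arguments omitted

        # Bring the agent to a known point in training by replaying a fixed
        # (state, selection-action-input, reward) example. The dict/tuple
        # shapes here are placeholders for whatever altrain actually sends.
        demo_state = {"field_a": {"value": "3"}, "field_b": {"value": "4"}}
        demo_sai = ("answer", "UpdateTextField", {"value": "7"})
        agent.train(demo_state, demo_sai, reward=1)

        # Now hand it a structurally similar problem and assert on the response.
        test_state = {"field_a": {"value": "2"}, "field_b": {"value": "5"}}
        response = agent.request(test_state)
        self.assertEqual(response["inputs"]["value"], "7")


if __name__ == "__main__":
    unittest.main()
```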

Some impediments to doing this:

  1. Modularity: At the moment, the most straightforward way to run AL is via altrain, but this is more of an integration test; it requires spinning up a new process that uses a completely different library (which we have kept separate from AL_Core for good reason).
  2. Performance: Right now (ballpark estimate) AL_Core accounts for about 50% of training time and AL_Train for the other 50%, and all of that back and forth adds up to several minutes per agent. It would be nice if our tests ran on the order of seconds so iterating on the code is faster.
  3. No robust way to construct skills: Skills, at least in the ModularAgent right now, are kind of a hodgepodge of learning mechanisms linked together. They are also learned, not defined. It would be nice if there were a language for writing and representing Skills.
  4. Randomness: Parts of the skill-learning process are stochastic, so this may need to be controlled in some way (see the seeding sketch after this list).
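
For the randomness point, one low-effort option is to pin the global seeds in the test setup. This only helps if all of AL_Core's stochastic choices go through these generators; any learning mechanism with its own RNG would need a seed hook of its own (that is an assumption on my part):

```python
import random

import numpy as np


def make_deterministic(seed: int = 0):
    """Pin the global RNGs so learned skills come out the same on every run."""
    random.seed(seed)
    np.random.seed(seed)
```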

I've been flirting with the idea of having a sort of virtual tutor that is just a knowledge base that gets updated in Python, executes the tutor logic, and calls request/train directly. This would at least address 1) and maybe also 2).
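
A minimal sketch of that idea, assuming request() returns a selection/action/inputs response (or nothing when no skill applies) and that the tutor logic is wrapped in a small problem object with start_state/next_demo/apply/is_correct/is_done methods (all hypothetical names):

```python
class VirtualTutor:
    """In-process stand-in for the browser tutor: holds the problem state,
    applies the tutor logic, and drives the agent via request/train directly."""

    def __init__(self, agent, problem):
        self.agent = agent
        self.problem = problem                  # hypothetical problem object
        self.state = problem.start_state()

    def run(self, max_steps=50):
        for _ in range(max_steps):
            response = self.agent.request(self.state)
            if not response:
                # No applicable skill: demonstrate the next correct step.
                sai = self.problem.next_demo(self.state)
                self.agent.train(self.state, sai, reward=1)
                self.state = self.problem.apply(self.state, sai)
            else:
                sai = (response["selection"], response["action"], response["inputs"])
                reward = 1 if self.problem.is_correct(self.state, sai) else -1
                self.agent.train(self.state, sai, reward=reward)
                if reward == 1:
                    self.state = self.problem.apply(self.state, sai)
            if self.problem.is_done(self.state):
                break
        return self.state
```

Because everything stays in one process with no browser or extra server, a test driven this way should run in seconds rather than minutes.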
