Bani's Model Training
Creating the question-answering agent begins with training a pretrained model on the FAQ dataset using the Bani package: https://github.com/captanlevi/Bani. Training is run with the training script, and loading the Bani bot runs as a separate process.
- Ensure docker is available on your machine.
- Create 2 docker volumes (this can be done via the docker app or from the command line).
  - 1 volume: stores the FAQ store files (.pkl) produced when the FAQ classes are built.
  - 1 volume: stores the generated model files once bot training has completed.
- Place the FAQ files in the csv_folder.
- Change into the project directory.
- Build the docker image: docker build -t bani_training_script .
- Run the docker image with the created volumes mounted:
docker run -v $(pwd):/bani_training -it --mount source=model_vol_name,target=/model --mount source=faq_vol_name,target=/faq_store bani_training_script
FAQ file:
- Each FAQ file contains the questions and answers for 1 topic.
- The first column is the question and the second column is the answer. The data begins on the first line (no header row).
- Files must be in CSV format.
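As an illustration of the layout described above, a FAQ file for one topic could be created as follows. This is only a sketch: the file name example_topic.csv and the questions and answers are made up, and only csv_folder comes from the setup steps. The fields are quoted so that commas inside an answer are not read as column separators, which assumes the files are read by a standard CSV parser.

```shell
# Hypothetical FAQ file for a single topic: no header row,
# question in the first column, answer in the second.
mkdir -p csv_folder
cat > csv_folder/example_topic.csv <<'EOF'
"What are the opening hours?","We are open from 9am to 5pm, Monday to Friday."
"How do I reset my password?","Use the reset link on the login page."
EOF
```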
Volume:
- Retraining FAQ classes with the same file name overwrites the corresponding FAQ files in the faq_store volume.
- If a later training run no longer includes a topic that was trained previously, remove that topic's files from the volume. Leftover files may cause problems when the bot is loaded for answering.
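The volume handling described in this document can be sketched with a few helper functions. This is a minimal sketch under stated assumptions, not part of the Bani project: the volume names are the placeholders from the run command above, the alpine image is assumed here only as a small throwaway container for inspecting the volume, and the DOCKER variable is a hypothetical hook for previewing the commands without running them.

```shell
#!/bin/sh
# Placeholder volume names, as used in the run command above.
MODEL_VOL="model_vol_name"
FAQ_VOL="faq_vol_name"
# Hypothetical hook: set DOCKER=echo to print the commands instead of running them.
DOCKER="${DOCKER:-docker}"

# Create the two named volumes needed before the first training run.
create_volumes() {
    "$DOCKER" volume create "$MODEL_VOL"
    "$DOCKER" volume create "$FAQ_VOL"
}

# List the files currently in the FAQ store volume via a throwaway container.
# The alpine image is an assumption; any image that provides ls and rm would do.
list_faq_store() {
    "$DOCKER" run --rm --mount "source=$FAQ_VOL,target=/faq_store" alpine ls /faq_store
}

# Remove one file (first argument) from the FAQ store volume,
# for example a topic that is no longer part of training.
remove_faq_file() {
    "$DOCKER" run --rm --mount "source=$FAQ_VOL,target=/faq_store" alpine rm -f "/faq_store/$1"
}
```

A possible workflow is to call create_volumes once before the first training run, then call list_faq_store to see the actual file names, followed by remove_faq_file with the name of each stale file. A name such as old_topic.pkl would be purely illustrative; the real names are whatever the listing shows.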