Thanks for your great work. When I run DistServe/examples/offline.py for inference with the input prompt "Give me three tips for healthy life.", the generated result is empty. I suspected this was caused by stop=["\n"], so I changed it to stop=[""]; the output then becomes a sentence that repeats endlessly. Trying various temperature and top_p values did not solve the problem. Do you know how to fix it?
The model I use is llama2-7B.
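For context, here is a minimal, self-contained sketch of why stop=["\n"] could produce an empty result. This is not DistServe's actual implementation; the helper name is hypothetical and only illustrates the usual behavior of cutting generated text at the first stop-string match:

```python
# Hypothetical illustration of stop-string truncation (not DistServe's real code):
# the engine cuts the generated text at the earliest occurrence of any stop string.
def truncate_at_stop(generated: str, stop: list[str]) -> str:
    """Return `generated` truncated at the first match of any stop string."""
    cut = len(generated)
    for s in stop:
        if s:  # an empty stop string never matches, so it effectively disables stopping
            idx = generated.find(s)
            if idx != -1:
                cut = min(cut, idx)
    return generated[:cut]

# A base (non-chat) model often begins its completion with a newline, so
# stop=["\n"] truncates everything and the visible result is empty:
print(repr(truncate_at_stop("\n1. Eat well. 2. Sleep. 3. Exercise.", ["\n"])))

# With stop=[""] no stop string can ever match, so generation runs until the
# token limit, which is where a base model can fall into repeating itself:
print(repr(truncate_at_stop("Tip one. Tip one. Tip one.", [""])))
```

If this matches what the engine does, the empty output and the endless repetition would both be sampling-parameter effects rather than a bug in the model itself.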