[docs] Add gpt oss deployment example #56400
Code Review
This PR adds a new example for deploying gpt-oss models. The changes are mostly good, but there are a few issues. There's a critical copy-paste error in the Anyscale service configuration for gpt-oss-120b in the example notebook and README. There are also some inconsistencies in the documentation links, Docker image versions, and configuration values in the examples. The Dockerfile could also be improved for better practices. I've provided detailed comments and suggestions to address these points.
doc/source/serve/tutorials/deployment-serve-llm/gpt-oss/README.md
doc/source/serve/tutorials/deployment-serve-llm/gpt-oss/notebook.ipynb
kunling-anyscale
left a comment
.
37d016d to 2c574a5
doc/source/serve/tutorials/deployment-serve-llm/gpt-oss/README.md
kunling-anyscale
left a comment
Added a few suggestions and comments.
I noticed the ray-llm nightly image uses vllm==0.10.2, so we can probably start the engineering review with the next release.
For context, our latest release, ray==2.49.2, breaks when combined with vllm 0.10.2. This appears to be fixed in nightly already, so we can change that requirement once the next release is out (mid-October). Meanwhile, we can continue and publish with ray==2.49.0 and vllm==0.10.1.
doc/source/serve/tutorials/deployment-serve-llm/gpt-oss/serve_gpt_oss.py
Signed-off-by: Aydin Abiar <aydin@anyscale.com>
695fdda to b25fe07
doc/source/serve/tutorials/deployment-serve-llm/gpt-oss/notebook.ipynb
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com> Signed-off-by: Aydin Abiar <62435714+Aydin-ab@users.noreply.github.com>
Signed-off-by: Josh Kodi <joshkodi@gmail.com>
Signed-off-by: xgui <xgui@anyscale.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Why are these changes needed?
Adds new examples for gpt-oss, which will later be released as Anyscale templates as well.
Note:
Our latest release, ray==2.49.2, breaks when combined with vllm 0.10.2. This appears to be fixed in nightly already, so we can change that requirement once the next release is out (mid-October).
Also, our current ray-llm image uses vllm 0.10.0, so we have to build a new image with vllm>=0.10.1. This makes the Anyscale service launch fairly long (~5 min).
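As an illustration of the vllm>=0.10.1 requirement in the note above, here is a minimal sketch of a check that could run inside a freshly built image before launching the service. It is not part of this PR. The helper names (`parse_version`, `meets_minimum`, `check_environment`) are hypothetical, and the sketch only compares numeric release components, so a pre-release such as `0.10.1rc1` is treated the same as `0.10.1`. It also does not detect the ray==2.49.2 + vllm 0.10.2 breakage described above.

```python
from importlib import metadata


def parse_version(text):
    """Turn a dotted release string such as '0.10.1' into a tuple of ints.

    Parsing stops at the first component with no leading digits, and any
    suffix on a component (for example '1+cu128' or '1rc1') is ignored.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed release is at least the minimum release."""
    return parse_version(installed) >= parse_version(minimum)


def check_environment(minimums):
    """Map each package name to (installed_version_or_None, requirement_met)."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is absent from the environment, so the floor is not met.
            report[name] = (None, False)
            continue
        report[name] = (installed, meets_minimum(installed, minimum))
    return report


if __name__ == "__main__":
    # Floor taken from the note above: the image needs vllm>=0.10.1.
    for name, (installed, ok) in check_environment({"vllm": "0.10.1"}).items():
        print(name, installed, "ok" if ok else "needs attention")
```

Running the script in the image built from the stock ray-llm image (vllm 0.10.0, per the note) would flag vllm as needing attention, while an image rebuilt with vllm 0.10.1 or newer would pass.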
Checks
I've signed off every commit (git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
If I've added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.