releasing 1.4.0 #275
Merged
…etadata added VRAM profile for CUDA device in view metadata
Refactored appRunningTime and appRunningHardware to be nested under a new appProfiling parent field for better metadata organization.
- appRunningTime -> appProfiling.runningTime
- appRunningHardware -> appProfiling.hardware
Closes #261
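The renaming in this commit can be illustrated with a small sketch. The helper name `nest_profiling_fields` is hypothetical and is not part of the SDK; it only shows how a consumer holding view metadata with the old flat keys might map it to the nested layout, assuming the metadata is available as a plain dict.

```python
def nest_profiling_fields(view_metadata: dict) -> dict:
    """Return a copy of view metadata with the old flat profiling keys
    moved under an ``appProfiling`` parent (hypothetical helper)."""
    migrated = dict(view_metadata)
    # Start from any existing nested block so already-migrated data survives.
    profiling = dict(migrated.get("appProfiling", {}))
    if "appRunningTime" in migrated:
        profiling["runningTime"] = migrated.pop("appRunningTime")
    if "appRunningHardware" in migrated:
        profiling["hardware"] = migrated.pop("appRunningHardware")
    if profiling:
        migrated["appProfiling"] = profiling
    return migrated


# The values below are made up for illustration only.
old = {"app": "example-app", "appRunningTime": "0:00:01.5", "appRunningHardware": "cpu"}
new = nest_profiling_fields(old)
```

The input dict is left untouched, so the same function can be applied safely to metadata that has already been migrated.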
…EY6KQfeTXckc9b Address issue #251
* all gunicorn workers are now discarded after processing 1 request
* this prevents the VRAM leak caused by zombie workers that hold VRAM while doing nothing
* the trade-off is some extra loading time at worker initialization (e.g. model loading)
* this can be overridden by setting `max_requests` passed to `serve_production()` in app.py
* the number of gunicorn workers is set reasonably low when the app declares GPU usage
* app metadata now has two new fields that can be used to declare GPU usage
* for GPU apps, first-time VRAM usage for each parameter combination is now recorded in the local cache directory (see `$XDG_CACHE_HOME/clams`)
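Two of the points above can be sketched in a few lines. This is a minimal illustration, not the SDK's code: `clams_cache_dir` and `worker_recycling_options` are hypothetical names, the fallback to `~/.cache` follows the XDG Base Directory convention for an unset or empty `$XDG_CACHE_HOME`, and `max_requests` is the standard gunicorn setting that restarts a worker after it has served that many requests.

```python
import os
from pathlib import Path


def clams_cache_dir(env=None) -> Path:
    """Resolve `$XDG_CACHE_HOME/clams`, falling back to `~/.cache/clams`
    when the variable is unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    base = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(base) / "clams"


def worker_recycling_options(max_requests: int = 1) -> dict:
    """Build a gunicorn option mapping that recycles each worker after
    `max_requests` requests; a value of 1 discards a worker after every request."""
    if max_requests < 0:
        raise ValueError("max_requests must be non-negative")
    return {"max_requests": max_requests}


cache_dir = clams_cache_dir({"XDG_CACHE_HOME": "/tmp/xdg"})
options = worker_recycling_options()
```

A larger `max_requests` trades the VRAM-leak protection for less frequent model reloading, which is the trade-off the commit message describes.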
updated cuda memory handling and related documentation
…me pydantic warnings
overwriting timestamp values of all "current" app's views after `_annotate()`
…rsions added a new metadata field for multi-model/family app
implemented new docs builder
Overview
This release introduces three new fields to the app metadata to improve the orchestration and handling of large models within apps.
Additions
* `analyzer_versions`: A new app metadata field (complementing the existing `analyzer_version`) designed to store version strings for multiple models wrapped in a single CLAMS app (see "`analyzer_version` when using HF model family" #251 (comment) for discussion).
* `est_gpu_mem_min` and `est_gpu_mem_typ`: New app metadata fields that allow developers to specify estimated GPU memory usage (minimum and typical) for better resource management.
Changes
* `mmif-python` SDK (1.2.1)
* Profiling information is now nested under the `appProfiling` field in the output view metadata ("Refactor metadata structure to nest profiling and runtime metrics" #261).
* The `timestamp` property of views is now updated based on the time when the `_annotate()` call completes. This ensures all views from a single app execution share a consistent "marker" for easier identification ("Implement "run ID" property for views to distinguish app executions" #269).
* An `est_gpu_mem_min` value in production mode (`--production`) will now automatically limit the number of Gunicorn workers spawned, to prevent GPU out-of-memory (OOM) errors.
* GPU apps using `torch` now cache VRAM usage statistics in the local disk cache (typically in `$XDG_CACHE_HOME`) ("gunicorn, torch, and cuda" #243). Currently, this information is for logging only, and we plan to add a retrieval API in the future.
clams-python/clams/app/__init__.py
Lines 348 to 361 in 671560b
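The worker-limiting behaviour listed under Changes can be sketched as follows. This is only an illustration under stated assumptions, not the code referenced above: the function name `max_gpu_workers`, the use of megabytes as the unit, and the "at least one worker" floor are all assumptions made for the example.

```python
def max_gpu_workers(total_vram_mb: int, est_gpu_mem_min_mb: int, default_workers: int) -> int:
    """Cap the worker count so that every worker can hold at least the
    declared minimum GPU memory (hypothetical helper, units assumed to be MB)."""
    if est_gpu_mem_min_mb <= 0:
        # No GPU usage declared: keep whatever the default would have been.
        return default_workers
    # How many workers fit in VRAM if each one needs the declared minimum.
    fit = total_vram_mb // est_gpu_mem_min_mb
    # Never exceed the default, and always keep one worker alive.
    return max(1, min(default_workers, fit))


workers = max_gpu_workers(total_vram_mb=24000, est_gpu_mem_min_mb=6000, default_workers=9)
```

With 24000 MB of VRAM and a declared minimum of 6000 MB, at most four workers fit, so the default of nine is reduced to four.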