18 commits
f0684cb
docs: sync CONTRIBUTING.md with latest code
pr-test1[bot] Sep 16, 2025
501414e
docs: sync about.mdx with latest code
pr-test1[bot] Sep 16, 2025
e5c2fd9
docs: sync concepts/contributing.mdx with latest code
pr-test1[bot] Sep 16, 2025
43661e0
docs: sync concepts/deployment.mdx with latest code
pr-test1[bot] Sep 16, 2025
7b66f55
docs: sync concepts/fine-tuning.mdx with latest code
pr-test1[bot] Sep 16, 2025
f865987
docs: sync concepts/models.mdx with latest code
pr-test1[bot] Sep 16, 2025
5cb4781
docs: sync configuration/AWS.mdx with latest code
pr-test1[bot] Sep 16, 2025
bfd4022
docs: sync configuration/Azure.mdx with latest code
pr-test1[bot] Sep 16, 2025
6e920f2
docs: sync configuration/Environment.mdx with latest code
pr-test1[bot] Sep 16, 2025
584be61
docs: sync getting_started.md with latest code
pr-test1[bot] Sep 16, 2025
159d91a
docs: sync installation.mdx with latest code
pr-test1[bot] Sep 16, 2025
67b73c8
docs: sync mint.json with latest code
pr-test1[bot] Sep 16, 2025
f7fc473
docs: sync tutorials/deploying-llama-3-to-aws.mdx with latest code
pr-test1[bot] Sep 16, 2025
3545ff9
docs: sync tutorials/deploying-llama-3-to-azure.mdx with latest code
pr-test1[bot] Sep 16, 2025
889126b
docs: sync tutorials/deploying-llama-3-to-gcp.mdx with latest code
pr-test1[bot] Sep 16, 2025
6e42678
docs: sync updated_readme.md with latest code
pr-test1[bot] Sep 16, 2025
99600eb
docs: create concepts/jumpstart-custom-models.mdx
pr-test1[bot] Sep 16, 2025
277c5e6
docs: create concepts/openai-proxy.mdx
pr-test1[bot] Sep 16, 2025
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -65,4 +65,4 @@ By contributing, you agree that your contributions will be licensed under the Ap

## Questions?

Feel free to contact us at [support@slashml.com](mailto:support@slashml.com) if you have any questions about contributing!
Feel free to contact us at [support@slashml.com](mailto:support@slashml.com) if you have any questions about contributing!
6 changes: 3 additions & 3 deletions about.mdx
@@ -22,9 +22,9 @@ Do submit your feature requests at https://magemaker.featurebase.app/
- Querying within Magemaker currently only works with text-based models
- Deleting a model is not instant, it may show up briefly after deletion
- Deploying the same model within the same minute will break
- Hugging-face models on Azure have different Ids than their Hugging-face counterparts. Follow the steps specified in the quick-start guide to find the relevant models
- For Azure deploying models other than Hugging-face is not supported yet.
- Python3.13 is not supported because of an open-issue by Azure. https://github.com/Azure/azure-sdk-for-python/issues/37600
- Hugging Face models on Azure have different IDs than their Hugging Face counterparts. Follow the steps specified in the quick-start guide to find the relevant models.
- For Azure, deploying models other than Hugging Face is not supported yet.
- Python 3.12 is **not** supported because of an open issue in the Azure SDK (see https://github.com/Azure/azure-sdk-for-python/issues/37600).


If there is anything we missed, do point them out at https://magemaker.featurebase.app/
2 changes: 1 addition & 1 deletion concepts/contributing.mdx
@@ -165,4 +165,4 @@ We are committed to providing a welcoming and inclusive experience for everyone.

## License

By contributing to Magemaker, you agree that your contributions will be licensed under the Apache 2.0 License.
By contributing to Magemaker, you agree that your contributions will be licensed under the Apache 2.0 License.
10 changes: 5 additions & 5 deletions concepts/deployment.mdx
@@ -62,8 +62,8 @@ deployment: !Deployment
destination: gcp
endpoint_name: opt-125m-gcp
instance_count: 1
machine_type: n1-standard-4
accelerator_type: NVIDIA_TESLA_T4
instance_type: n1-standard-4
accelerator_type: NVIDIA_TESLA_T4 # or NVIDIA_L4
accelerator_count: 1

models:
@@ -113,6 +113,7 @@ deployment: !Deployment
instance_count: 1
instance_type: ml.g5.12xlarge
num_gpus: 4
# quantization: bitsandbytes # Optional

models:
- !Model
@@ -202,10 +203,9 @@ Choose your instance type based on your model's requirements:
4. Set up monitoring and alerting for your endpoints

<Warning>
Make sure you setup budget monitory and alerts to avoid unexpected charges.
Make sure you set up budget monitoring and alerts to avoid unexpected charges.
</Warning>


## Troubleshooting Deployments

Common issues and their solutions:
@@ -225,4 +225,4 @@ Common issues and their solutions:
- Verify model ID and version
- Check instance memory requirements
- Validate Hugging Face token if required
- Endpoing deployed but deployment failed. Check the logs, and do report this to us if you see this issue.
- Endpoint deployed but deployment failed. Check the logs, and report this to us if you see this issue.
115 changes: 76 additions & 39 deletions concepts/fine-tuning.mdx
@@ -5,7 +5,7 @@ description: Guide to fine-tuning models with Magemaker

## Fine-tuning Overview

Fine-tuning allows you to adapt pre-trained models to your specific use case. Magemaker simplifies this process through YAML configuration.
Fine-tuning allows you to adapt pre-trained models to your specific use case. Currently, Magemaker supports fine-tuning **on AWS SageMaker only** (GCP and Azure fine-tuning are on the roadmap). The workflow is entirely YAML-driven so it can be automated in CI/CD pipelines.

### Basic Command

@@ -19,19 +19,25 @@ magemaker --train .magemaker_config/train-config.yaml

```yaml
training: !Training
destination: aws
instance_type: ml.p3.2xlarge
destination: aws # only "aws" is supported for now
instance_type: ml.p3.2xlarge # GPU instance for training
instance_count: 1
training_input_path: s3://your-bucket/training-data.csv
training_input_path: s3://your-bucket/training-data.csv # points to your dataset

models:
- !Model
id: your-model-id
source: huggingface
- !Model
id: your-model-id
source: huggingface
```

- **destination** – must be `aws` at the moment.
- **training_input_path** – S3 URI that Magemaker will pass directly to SageMaker.
- **instance_type / instance_count** – any SageMaker training instance type is supported.

### Advanced Configuration

Beyond the basics, you can supply custom hyperparameters. If omitted, Magemaker will attempt to infer sensible defaults based on the model family (see `get_hyperparameters_for_model()` in the codebase).

```yaml
training: !Training
destination: aws
@@ -49,20 +55,38 @@ training: !Training
save_steps: 1000
```

<Note>
If you omit `hyperparameters`, Magemaker will fall back to task-specific defaults. For example, text-generation models automatically receive the hyperparameters returned by `get_hyperparameters_for_model()`.
</Note>

### Optional Parameters

The `Training` schema also supports the following optional fields, all of which have sensible defaults:

| Field | Description |
| -------------------- | ------------------------------------------------------------------------- |
| `output_path` | S3 URI where training artifacts should be stored |
| `max_run` | Maximum training job runtime in seconds |
| `volume_size_in_gb` | Size of the EBS volume attached to the training instance |
| `spot` | `true/false` – use SageMaker Spot Training to save costs |
| `checkpoint_s3_uri` | S3 URI for incremental checkpoints (only relevant if `spot: true`) |

*(See the `Training` Pydantic model in `magemaker/schemas/training.py` for the full list.)*
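
As an illustrative sketch (the field names come from the table above; the bucket paths and values are assumptions), a fuller training config might look like this:

```yaml
training: !Training
  destination: aws
  instance_type: ml.p3.2xlarge
  instance_count: 1
  training_input_path: s3://your-bucket/training-data.csv
  output_path: s3://your-bucket/training-output/      # where SageMaker stores artifacts
  max_run: 86400                                      # stop the job after 24 hours
  volume_size_in_gb: 100                              # EBS volume attached to the instance
  spot: true                                          # use Spot Training to save costs
  checkpoint_s3_uri: s3://your-bucket/checkpoints/    # lets interrupted Spot jobs resume
```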

## Data Preparation

### Supported Formats

<CardGroup>
<Card title="CSV Format" icon="file-csv">
- Simple tabular data
- Easy to prepare
- Simple tabular data<br />
- Easy to prepare<br />
- Good for classification tasks
</Card>

<Card title="JSON Lines" icon="file-code">
- Flexible data format
- Good for complex inputs
- Flexible data format<br />
- Good for complex inputs<br />
- Supports nested structures
</Card>
</CardGroup>
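
For reference, a JSON Lines file is simply one JSON object per line. The `prompt`/`completion` field names below are illustrative assumptions; use whatever schema your target model expects:

```json
{"prompt": "Summarize: Magemaker deploys models to SageMaker.", "completion": "Magemaker simplifies SageMaker deployments."}
{"prompt": "Summarize: Fine-tuning adapts a pre-trained model.", "completion": "Fine-tuning customizes pre-trained models."}
```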
@@ -71,48 +95,43 @@ training: !Training

<Steps>
<Step title="Prepare Data">
Format your data according to model requirements
Format your dataset according to the model requirements (e.g., one JSON line per training example).
</Step>
<Step title="Upload to S3">
Use AWS CLI or console to upload data
Use the AWS CLI or console to upload your dataset:
```bash
aws s3 cp local/path/to/data.csv s3://your-bucket/data.csv
```
</Step>
<Step title="Configure Path">
Specify S3 path in training configuration
Reference the S3 URI in `training_input_path` of your YAML file.
</Step>
</Steps>

## Instance Selection

### Training Instance Types

Choose based on:
- Dataset size
- Model size
- Training time requirements
- Cost constraints
Choosing the right training instance impacts both training time and cost.

Popular choices:
- ml.p3.2xlarge (1 GPU)
- ml.p3.8xlarge (4 GPUs)
- ml.p3.16xlarge (8 GPUs)

## Hyperparameter Tuning
| Instance | GPUs | Typical use-case |
| ------------------- | ---- | -------------------------------------------- |
| `ml.p3.2xlarge` | 1 | Small to medium models (<7B parameters) |
| `ml.p3.8xlarge` | 4 | Larger models / shorter turnaround |
| `ml.p3.16xlarge` | 8 | Large-scale training / distributed workloads |

### Basic Parameters
<Warning>
Always check your current SageMaker GPU quota and request increases if necessary.
</Warning>
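
One way to check your current limits (an illustration using the AWS CLI, not part of Magemaker itself; exact quota names vary by account and region) is the Service Quotas API:

```bash
aws service-quotas list-service-quotas \
  --service-code sagemaker \
  --query "Quotas[?contains(QuotaName, 'ml.p3.2xlarge')].{Name:QuotaName,Value:Value}"
```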

```yaml
hyperparameters: !Hyperparameters
epochs: 3
learning_rate: 2e-5
batch_size: 32
```
## Hyperparameter Tuning

### Advanced Tuning
While you can hard-code values, you can also pass ranges to let SageMaker perform hyperparameter tuning (HPO). Use the `min`, `max`, or `values` keys as shown below:

```yaml
hyperparameters: !Hyperparameters
epochs: 3
learning_rate:
learning_rate:
min: 1e-5
max: 1e-4
scaling: log
@@ -122,9 +141,27 @@

## Monitoring Training

### CloudWatch Metrics
Magemaker streams CloudWatch metrics for every training job. Key metrics include:

- `Train/Loss`
- `Eval/Loss`
- `LearningRate`
- `GPUUtilization`

You can access logs directly in the SageMaker console or via the AWS CLI:

Available metrics:
- Loss
- Learning rate
- GPU utilization
```bash
aws logs tail /aws/sagemaker/TrainingJobs --follow --since 1h
```

<Note>
Job status (Started, InProgress, Completed, Failed) is also surfaced in the CLI output.
</Note>
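
You can also poll a specific job yourself with the AWS CLI; the job name below is a placeholder:

```bash
aws sagemaker describe-training-job \
  --training-job-name <your-training-job-name> \
  --query TrainingJobStatus
```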

## Cleaning Up

Training jobs store artifacts (model checkpoints, logs) in S3. Delete these objects when no longer needed to avoid storage costs:

```bash
aws s3 rm --recursive s3://your-bucket/<training-job-name>
```
88 changes: 88 additions & 0 deletions concepts/jumpstart-custom-models.mdx
@@ -0,0 +1,88 @@
---
title: JumpStart & Custom Models
description: Deploy AWS JumpStart marketplace and custom models with Magemaker
---

## Overview
Besides Hugging Face models, Magemaker can now deploy two additional model types **on AWS SageMaker**:

1. **JumpStart models** – pre-packaged models provided by Amazon or third-party sellers.
2. **Custom models** – your own fine-tuned artifacts (.tar.gz or directory) stored locally or in S3.

<GithubContribTag />

---
## 1 · Deploying a JumpStart Model

### 1.1 Find the *model_id*
Browse the [JumpStart Model Zoo](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-model-zoo.html) or call:
```python
from sagemaker.jumpstart.notebook_utils import list_jumpstart_models
list_jumpstart_models()
```

### 1.2 Create a YAML
```yaml
models:
- !Model
id: huggingface-text2text-flan-t5-large # JumpStart model_id
source: sagemaker

deployment: !Deployment
destination: aws
instance_type: ml.g5.2xlarge
instance_count: 1
```

### 1.3 Deploy
```bash
magemaker --deploy .magemaker_config/flan-t5.yaml
```
Magemaker handles the EULA acceptance (`accept_eula=True`) automatically.
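
Under the hood this is roughly equivalent to the following SageMaker SDK calls. This is a sketch for intuition, not Magemaker's exact implementation:

```python
from sagemaker.jumpstart.model import JumpStartModel

# Illustrative only: deploy a JumpStart model directly with the SageMaker SDK.
# Assumes AWS credentials and a SageMaker execution role are already configured.
model = JumpStartModel(model_id="huggingface-text2text-flan-t5-large")
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    accept_eula=True,  # gated models require explicit EULA acceptance
)
```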

---
## 2 · Deploying a Custom Model
Custom models are useful when you have already trained a model locally or with SageMaker Training.

### 2.1 Package Artifacts
- For **Hugging Face**-style models, create a directory containing `config.json`, `tokenizer.json`, etc.
- Optionally compress to `model.tar.gz`.
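
For example, assuming your fine-tuned files live in `./artifacts/distilbert/` (a hypothetical path), you could package and optionally upload them yourself:

```bash
# Archive from inside the model directory so the files sit at the tarball root
tar -czvf distilbert.tar.gz -C ./artifacts/distilbert .

# Optional: upload manually instead of letting Magemaker do it
aws s3 cp distilbert.tar.gz s3://your-bucket/models/distilbert.tar.gz
```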

### 2.2 Upload (optional)
If your artifact is not yet on S3, Magemaker will upload it for you.

### 2.3 YAML Example
```yaml
models:
- !Model
id: my-distilbert-finetuned
source: huggingface
location: ./artifacts/distilbert.tar.gz # or s3://bucket/key

deployment: !Deployment
destination: aws
instance_type: ml.m5.xlarge
```

### 2.4 Deploy
```bash
magemaker --deploy .magemaker_config/my-distilbert.yaml
```
Magemaker will:
1. Upload the local artifact to `s3://<default-bucket>/models/my-distilbert-finetuned/` (if needed)
2. Create a `HuggingFaceModel` pointing to that S3 path
3. Spin up the endpoint.
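
Conceptually, steps 2 and 3 map onto the SageMaker Python SDK roughly as follows. The container versions are assumptions; this is a sketch, not Magemaker's actual code:

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # or an explicit IAM role ARN outside SageMaker

# Point the model at the uploaded artifact (path assumed for illustration)
model = HuggingFaceModel(
    model_data="s3://your-bucket/models/my-distilbert-finetuned/model.tar.gz",
    role=role,
    transformers_version="4.37",
    pytorch_version="2.1",
    py_version="py310",
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
)
```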

---
## Tips & Quotas
- **GPU quota** – JumpStart LLMs often need a GPU instance (`ml.g5.*`). Request a quota increase first.
- **Endpoint names** – If `endpoint_name` is omitted, Magemaker appends a timestamp to ensure uniqueness.
- **Cost control** – Delete endpoints when not in use: `magemaker --cloud aws → Delete a Model Endpoint`.

---
## FAQ
**Q: Do JumpStart models work on GCP / Azure?**
*A: Not yet. JumpStart is an AWS-specific marketplace.*

**Q: Can I pass inference parameters to a JumpStart model?**
*A: Yes. Use the same `predict:` block as with Hugging Face models.*