diff --git a/README.md.mdx b/README.md.mdx new file mode 100644 index 0000000..845af5d --- /dev/null +++ b/README.md.mdx @@ -0,0 +1,70 @@ +# Magemaker + +**Magemaker** helps you deploy, query, and manage LLMs across AWS, GCP, and Azure. Now with improved Docker/Dockerfile support, credential testing, and clear configuration path guidance. + +--- + +## 🚀 Features +- Lightweight wrapper for SageMaker, Vertex AI, and Azure +- Deploy models from HuggingFace, custom containers, and more +- New: Improved Dockerfile with multi-cloud CLI support +- New: Automatic credential environment variable setup in Docker +- New: Pre-flight script now tests AWS/GCP credentials after setup +- New: Clear configuration file path output after deployment; required for querying +- CLI and programmatic interface + +--- + +## Install + +```sh +pip install magemaker +``` + +Or build the Docker image: + +```sh +git clone https://github.com/slashml/magemaker.git +cd magemaker +docker build -f Dockerfile-server -t magemaker-server . +``` + +## Cloud Credential Setup + +See the [installation](installation) and [configuration](configuration/AWS) pages for details on mounting credentials with Docker, or setting up local CLI authentication. + +## Deploying a Model + +```sh +magemaker deploy --provider aws --model-id google-bert/bert-base-uncased +``` + +After deployment, note the **configuration file path** printed in the output: + +``` +Important: Configuration saved at: /app/configs/.yml +You'll need this path for querying the model later. +``` + +## Querying a Model + +```sh +magemaker query --endpoint +``` + +If the config file for your endpoint is missing, Magemaker now prints an error showing the **expected config location** (and how to fix it). + +For advanced usage, see the `/test_query.yml` example in the source repo. 
+ +## Pre-flight Checks + +Run the bundled pre-flight script to check credential setup: + +```sh +bash magemaker/scripts/preflight.sh +``` + +The script will also help install cloud CLIs as needed and run connectivity tests. + +## License +[Apache 2.0](LICENSE) diff --git a/concepts/deployment.mdx b/concepts/deployment.mdx index 66ca7a9..6468c0f 100644 --- a/concepts/deployment.mdx +++ b/concepts/deployment.mdx @@ -1,228 +1,45 @@ ---- -title: Deployment -description: Learn how to deploy models using Magemaker ---- +# Deployment -## Deployment Methods +Deploying LLMs with Magemaker involves providing a deployment configuration, authenticating with your chosen cloud provider, and managing your service endpoints. -Magemaker offers multiple ways to deploy your models to AWS, GCP and Azure. Choose the method that best fits your workflow. +## Deployment Process -### Interactive Deployment +1. **Choose your provider:** + - AWS SageMaker + - Google Cloud Vertex AI + - Azure -When you run the `magemaker --cloud [aws|gcp|azure|all]` command, you'll get an interactive menu that walks you through the deployment process: +2. **Provide credentials:** + - See provider configuration docs for how to securely provide credentials (now with Docker environment variable detection for AWS/GCP!) -```sh -magemaker --cloud [aws|gcp|azure|all] -``` - -This method is great for: - -- First-time users -- Exploring available models -- Testing different configurations - -### YAML-based Deployment - -For reproducible deployments and CI/CD integration, use YAML configuration files: - -```sh -magemaker --deploy .magemaker_config/your-model.yaml -``` - -This is recommended for: - -- Production deployments -- CI/CD pipelines -- Infrastructure as Code (IaC) -- Team collaborations - -## Multi-Cloud Deployment - -Magemaker supports deployment to AWS SageMaker, GCP Vertex AI, and Azure ML. Here's how to deploy the same model (facebook/opt-125m) to different cloud providers: - -### AWS (SageMaker) +3. 
**Run the deployment command:** -```yaml -deployment: !Deployment - destination: aws - endpoint_name: opt-125m-aws - instance_count: 1 - instance_type: ml.m5.xlarge + ```bash + magemaker deploy --provider aws --model-id google-bert/bert-base-uncased + ``` -models: - - !Model - id: facebook/opt-125m - source: huggingface -``` - -### GCP (Vertex AI) - -```yaml -deployment: !Deployment - destination: gcp - endpoint_name: opt-125m-gcp - instance_count: 1 - machine_type: n1-standard-4 - accelerator_type: NVIDIA_TESLA_T4 - accelerator_count: 1 - -models: - - !Model - id: facebook/opt-125m - source: huggingface -``` - -### Azure ML +4. **Save your configuration:** + - After successful deployment, Magemaker prints the path to a configuration YAML file for your model: -```yaml -deployment: !Deployment - destination: azure - endpoint_name: opt-125m-azure - instance_count: 1 - instance_type: Standard_DS3_v2 + ``` + Important: Configuration saved at: /config/.yml + You'll need this path for querying the model later. + ``` + - Keep this path—you must supply it for later queries. 
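The deploy step above can also be scripted. A small sketch that builds the exact command shown in step 3 and (optionally) runs it with `subprocess`; the wrapper function names are our own, not part of Magemaker's API:

```python
import shlex
import subprocess

SUPPORTED_PROVIDERS = {"aws", "gcp", "azure"}

def build_deploy_command(provider: str, model_id: str) -> list[str]:
    """Build the `magemaker deploy` invocation shown in step 3."""
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
    return ["magemaker", "deploy", "--provider", provider, "--model-id", model_id]

def deploy(provider: str, model_id: str) -> str:
    """Run the deployment and return its output (needs magemaker and cloud credentials)."""
    result = subprocess.run(
        build_deploy_command(provider, model_id),
        capture_output=True, text=True, check=True,
    )
    return result.stdout

if __name__ == "__main__":
    # Print the command instead of running it, so this sketch works offline.
    print(shlex.join(build_deploy_command("aws", "google-bert/bert-base-uncased")))
```

Keeping command construction separate from execution makes the provider validation testable without cloud access.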
-models: - - !Model - id: facebook-opt-125m - source: huggingface -``` - -## YAML Configuration Reference - -### Basic Deployment - -```yaml -deployment: !Deployment - destination: aws - endpoint_name: test-bert-uncased - instance_count: 1 - instance_type: ml.m5.xlarge - -models: - - !Model - id: google-bert/bert-base-uncased - source: huggingface -``` -### Advanced Configuration +## Querying -```yaml -deployment: !Deployment - destination: aws - endpoint_name: test-llama3-8b - instance_count: 1 - instance_type: ml.g5.12xlarge - num_gpus: 4 +Pass the correct config (path or endpoint name) to `magemaker query`: -models: - - !Model - id: meta-llama/Meta-Llama-3-8B-Instruct - source: huggingface - predict: - temperature: 0.9 - top_p: 0.9 - top_k: 20 - max_new_tokens: 250 +```bash +magemaker query --endpoint ``` -## Cloud-Specific Instance Types - -### AWS SageMaker Types - -Choose your instance type based on your model's requirements: - - - - Good for smaller models like BERT-base - - 4 vCPU - - 16 GB Memory - - Available in free tier - - - - Required for larger models like LLaMA - - 48 vCPU - - 192 GB Memory - - 4 NVIDIA A10G GPUs - - - - - Remember to deactivate unused endpoints to avoid unnecessary charges! - - -### GCP Vertex AI Types - - - - Good for smaller models - - 4 vCPU - - 15 GB Memory - - Cost-effective option - - - - For larger models - - 12 vCPU - - 85 GB Memory - - 1 NVIDIA A100 GPU - - - -### Azure ML Types - - - - Good for smaller models - - 4 vCPU - - 14 GB Memory - - Balanced performance - - - - For GPU workloads - - 6 vCPU - - 112 GB Memory - - 1 NVIDIA V100 GPU - - - -## Deployment Best Practices - -1. Use meaningful endpoint names that include: - - - Model name/version - - Environment (dev/staging/prod) - - Team identifier - -2. Start with smaller instance types and scale up as needed - -3. Always version your YAML configurations - -4. 
Set up monitoring and alerting for your endpoints - - -Make sure you setup budget monitory and alerts to avoid unexpected charges. - - - -## Troubleshooting Deployments - -Common issues and their solutions: - -1. **Deployment Timeout** - - - Check instance quota limits - - Verify network connectivity - -2. **Instance Not Available** +If the config can't be found, Magemaker will show you the expected path (e.g.: `/app/configs/.yml`). Place the deployment config at that path or correct your command. - - Try a different region - - Request quota increase - - Use an alternative instance type +## Credentials -3. **Model Loading Failure** - - Verify model ID and version - - Check instance memory requirements - - Validate Hugging Face token if required - - Endpoing deployed but deployment failed. Check the logs, and do report this to us if you see this issue. +- Credentials are supplied as per provider docs +- Docker users: see [Installation](../installation) +- Magemaker now auto-detects and sets AWS and GCP env vars if mounted in Docker diff --git a/configuration/AWS.mdx b/configuration/AWS.mdx index cdc4b9f..ef5f63a 100644 --- a/configuration/AWS.mdx +++ b/configuration/AWS.mdx @@ -1,82 +1,77 @@ ---- -title: AWS ---- +# AWS Configuration -### AWS CLI +To use AWS services through Magemaker—especially AWS SageMaker deployments—you must provide AWS credentials with sufficient permissions. -To install Azure SDK on MacOS, you need to have the latest OS and you need to use Rosetta terminal. Also, make sure you have the latest version of Xcode tools installed. +## Setting up AWS Credentials -Follow this guide to install the latest AWS CLI +Magemaker looks for standard AWS CLI credentials, either locally or in the container. -https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html +### 1. Creating an IAM User +1. Sign in to the AWS Console +2. Go to **IAM > Users**, and create a user with programmatic access +3. 
Attach the following permissions: + - `AmazonSageMakerFullAccess` + - `AmazonS3FullAccess` + - Any other required policies -Once you have the CLI installed and working, follow these steps +4. Download or note your `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`. +### 2. Save Credentials (Locally) -### AWS Account +Configure your local AWS CLI: - - -Register for an [AWS account](https://aws.amazon.com/) and sign-in to the [console](https://console.aws.amazon.com/). - +```bash +aws configure +``` - -From the console, use the Search bar to find and select IAM (***do not use IAM Identity Center***, which is confusingly similar but a totally different system). +This creates a credentials file at `~/.aws/credentials`. -![Enter image alt description](../Images/muJ_Image_1.png) +### 3. Using Credentials in Docker -You should see the following screen after clicking IAM. +With Magemaker v0.3.0+, you can securely inject credentials into the container: -![Enter image alt description](../Images/ldC_Image_2.png) - +```bash +docker run -it \ + -v $HOME/.aws:/root/.aws:ro \ + magemaker-server +``` - -1. Select `Users` in the side panel - -![Enter image alt description](../Images/QX4_Image_3.png) +The container entrypoint will detect `/root/.aws/credentials` and set `AWS_SHARED_CREDENTIALS_FILE` automatically. -2. Create a user if you don't already have one +**Tip:** You may use a different profile by editing `~/.aws/credentials` or exporting `AWS_PROFILE`. -![Enter image alt description](../Images/ly3_Image_4.png) - +### 4. Testing Credentials - -1. Click on "Add permissions" - -![Enter image alt description](../Images/E7x_Image_5.png) +Use the `preflight.sh` script to validate connectivity: -2. Select "Attach policies directly". Under permission policies, search for and tick the boxes for: - - `AmazonSagemakerFullAccess` - - `IAMFullAccess` - - `ServiceQuotasFullAccess` +```bash +bash magemaker/scripts/preflight.sh +``` -Then click Next. 
+It checks your permissions and prompts to install the AWS CLI if needed. -![Enter image alt description](../Images/01X_Image_6.png) +## Environment Variables -The final list should look like the following: +Magemaker will use: -![Enter image alt description](../Images/Dfp_Image_7.png) +- `AWS_SHARED_CREDENTIALS_FILE` if set +- Otherwise, the default credentials search path -Click "Create user" on the following screen. - +Example (if not using the Docker entrypoint): - -1. Click the name of the user you've just created (or one that already exists) -2. Go to "Security Credentials" tab -3. Scroll down to "Access Keys" section -4. Click "Create access key" -5. Select Command Line Interface then click next +```bash +export AWS_SHARED_CREDENTIALS_FILE=/path/to/credentials +``` -![Enter image alt description](../Images/BPP_Image_8.png) +## Notes -Enter a description (this is optional, can leave blank). Then click next. +- Make sure your user is in the correct AWS region (default: `us-west-2`; can be overridden by env var `AWS_REGION`) +- The deployment output will show where to find the configuration for querying the model: -![Enter image alt description](../Images/gMD_Image_9.png) + ``` + Important: Configuration saved at: /app/configs/.yml + You'll need this path for querying the model later. + ``` -**Store BOTH the Access Key and the Secret access key for the next step. Once you've saved both keys, click Done.** - -![Enter image alt description](../Images/Gjw_Image_10.png) - - \ No newline at end of file + Save the exported config path from your deployment logs! diff --git a/configuration/Environment.mdx b/configuration/Environment.mdx index 0781ec3..64f280b 100644 --- a/configuration/Environment.mdx +++ b/configuration/Environment.mdx @@ -1,30 +1,41 @@ ---- -title: Environment Variables ---- +# Environment Configuration -### Required Config File -A `.env` file is automatically created when you run `magemaker --cloud `. 
This file contains the necessary environment variables for your cloud provider(s). +Magemaker can be configured using environment variables, `.env` files, and standard cloud credential files. -By default, Magemaker will look for a `.env` file in your project root with the following variables based on which cloud provider(s) you plan to use: + +## Standard Environment Variables + +### AWS +- `AWS_REGION` – The AWS region. Default: `us-west-2` +- `AWS_SHARED_CREDENTIALS_FILE` – Use this to override the default AWS credentials path (the Docker container sets this automatically if `.aws/credentials` is present at `/root/.aws/credentials`). +- `AWS_PROFILE` – Specify which AWS CLI profile to use + +### GCP +- `GOOGLE_APPLICATION_CREDENTIALS` – Path to Service Account JSON credentials (the Docker container sets this automatically if present). + +### General +- `HUGGING_FACE_HUB_KEY` – If using Hugging Face Hub models, set this in your `.env` file for private or gated models. + + Example `.env` file: + ```dotenv + HUGGING_FACE_HUB_KEY=hf_xxxxxxxxxxxxxxxx + ``` + +Magemaker checks for a `.env` file at startup and loads variables if present. + +## Configuration File Locations + +When you deploy a model, Magemaker saves the deployment configuration YAML to: ```bash -# AWS Configuration -AWS_ACCESS_KEY_ID="your-access-key" # Required for AWS -AWS_SECRET_ACCESS_KEY="your-secret-key" # Required for AWS -SAGEMAKER_ROLE="arn:aws:iam::..." # Required for AWS - -# GCP Configuration -PROJECT_ID="your-project-id" # Required for GCP -GCLOUD_REGION="us-central1" # Required for GCP - -# Azure Configuration -AZURE_SUBSCRIPTION_ID="your-sub-id" # Required for Azure -AZURE_RESOURCE_GROUP="ml-resources" # Required for Azure -AZURE_WORKSPACE_NAME="ml-workspace" # Required for Azure -AZURE_REGION="eastus" # Required for Azure - -# Optional configurations -HUGGING_FACE_HUB_KEY="your-hf-token" # Required for gated HF models like llama +/configs/.yml ```
+**The tool prints the exact path after deployment. Save this!** + +## Running with Docker and Environment Variables + +If using Docker, mount your cloud credentials as volumes for AWS and GCP. The entrypoint script will set the appropriate environment variables automatically for: +- AWS (`AWS_SHARED_CREDENTIALS_FILE`) +- GCP (`GOOGLE_APPLICATION_CREDENTIALS`) + +For Azure, use the standard Azure CLI approach; see [Azure configuration](./Azure) for details. diff --git a/configuration/GCP.mdx b/configuration/GCP.mdx index c9cd369..69f6ca9 100644 --- a/configuration/GCP.mdx +++ b/configuration/GCP.mdx @@ -1,38 +1,62 @@ ---- -title: GCP ---- - - -Visit [Google Cloud Console](https://cloud.google.com/?hl=en) to create your account. - +# Google Cloud Platform (GCP) Configuration + +To use GCP services with Magemaker (e.g., deploying/querying Vertex AI endpoints), you must supply GCP credentials. + +## Setting Up GCP Credentials + +### 1. Create or Use a Google Cloud Service Account + +- Go to **IAM & Admin > Service Accounts** +- Create a service account with Vertex AI and Storage Admin permissions +- Download the JSON key file + +### 2. Install Google Cloud CLI + +Install via the [official guide](https://cloud.google.com/sdk/docs/install), or let Magemaker's Dockerfile install it for you. - - Once you have created your account, create a new project. If this is your first time the default project is "My First Project". You can create a new project by clicking this button and then selecting "New Project". +### 3. Provide the Credentials - ![Enter image alt description](../Images/google_new_project.png) +- **Locally:** + 1. Place your JSON key at `~/.config/gcloud/application_default_credentials.json` + 2. 
Authenticate: + ```bash + gcloud auth application-default login + # or set env var + export GOOGLE_APPLICATION_CREDENTIALS="/absolute/path/to/key.json" + ``` +- **Docker:** + - Mount your gcloud credentials: + ```bash + docker run -it \ + -v $HOME/.config/gcloud:/root/.config/gcloud:ro \ + magemaker-server + ``` + - The Docker entrypoint will automatically set `GOOGLE_APPLICATION_CREDENTIALS` if your application default credentials exist. - +### 4. Testing Credentials - -1. Follow the installation guide at [Google Cloud SDK Installation Documentation](https://cloud.google.com/sdk/docs/install-sdk) -2. Initialize the SDK by running: - ```bash - gcloud init - ``` - +The `preflight.sh` script provides automated checking: -3. During initialization: - - Create login credentials when prompted - - Create a new project or select an existing one - To make sure the initialization worked, run: - ```bash - gcloud auth application-default login - ``` +```bash +bash magemaker/scripts/preflight.sh +``` + +If everything is set up properly, it will check access to your GCP project. + + +--- + +## Environment Variable Summary + +- `GOOGLE_APPLICATION_CREDENTIALS` — Path to your Service Account JSON key + +--- - -Navigate to the APIs & Services on the dashboard and enable the Vertex AI API for your project. -![Enter image alt description](../Images/QrB_Image_11.png) - +**Deployment Tip:** After deploying, you'll see a message like this: +```bash +Important: Configuration saved at: /app/configs/.yml +You'll need this path for querying the model later. +``` - \ No newline at end of file +Save or copy this path to use with Magemaker's query command. 
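Before mounting a key file into the container, it can be worth sanity-checking the JSON itself. A small sketch that verifies the fields every GCP service-account key contains; the helper name is ours, not part of Magemaker:

```python
import json
from pathlib import Path

# Fields present in every GCP service-account key file.
REQUIRED_FIELDS = {"type", "project_id", "private_key", "client_email"}

def check_service_account_key(path: str) -> str:
    """Validate a service-account JSON key and return its project id."""
    data = json.loads(Path(path).read_text())
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"{path} is missing fields: {sorted(missing)}")
    if data["type"] != "service_account":
        raise ValueError(f"{path} is not a service-account key (type={data['type']!r})")
    return data["project_id"]
```

A check like this catches the common mistake of pointing `GOOGLE_APPLICATION_CREDENTIALS` at a user credential file instead of a service-account key.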
diff --git a/installation.mdx b/installation.mdx index 1d843eb..fbef171 100644 --- a/installation.mdx +++ b/installation.mdx @@ -1,158 +1,76 @@ ---- -title: Installation -description: Configure Magemaker for your cloud provider ---- +# Installation +Follow these instructions to install and set up Magemaker, including optional Docker-based usage and new credential handling introduced in v0.3.0. - - For Macs, maxOS >= 13.6.6 is required. Apply Silicon devices (M1) must use Rosetta terminal. You can verify, your terminals architecture by running `arch`. It should print `i386` for Rosetta terminal. - +## Prerequisites +- Python 3.11+ +- `git`, `curl`, and build tools (for Docker containers) +- Cloud provider CLI tools (AWS CLI, Google Cloud SDK, Azure CLI)—see below for details -Install via pip: +## Install Magemaker (Locally) -```sh +```bash pip install magemaker ``` +## Using Docker -## Cloud Account Setup - -### AWS Configuration - -- Follow this detailed guide for setting up AWS credentials: - [AWS Setup Guide](/configuration/AWS) - -Once you have your AWS credentials, you can configure Magemaker by running: +To use Magemaker in a Dockerized environment (now supports easy credential injection for AWS/GCP), build with the included Dockerfile: ```bash -magemaker --cloud aws +git clone https://github.com/slashml/magemaker.git +cd magemaker +docker build -f Dockerfile-server -t magemaker-server . ``` -It will prompt you for aws credentials and set up the necessary configurations. 
- - -### GCP (Vertex AI) Configuration - -- Follow this detailed guide for setting up GCP credentials: - [GCP Setup Guide](/configuration/GCP) +### Dockerfile Updates (v0.3.0+) +- Switched to `python:3.11-slim` +- Installs AWS CLI, Google Cloud SDK, Azure CLI (in container) +- Mounts `.aws/credentials` and GCP application default credentials from your host for authentication +- Custom `entrypoint.sh` assigns credential env vars at container start - -once you have your GCP credentials, you can configure Magemaker by running: +#### Running the container with credentials ```bash -magemaker --cloud gcp +docker run -it \ + -v $HOME/.aws:/root/.aws:ro \ + -v $HOME/.config/gcloud:/root/.config/gcloud:ro \ + magemaker-server ``` -### Azure Configuration +*You can omit any credential directories for providers you do not use.* -- Follow this detailed guide for setting up Azure credentials: - [GCP Setup Guide](/configuration/Azure) +## Provider CLI Install (Locally) +You will need to install the relevant cloud CLI if you are not using Docker: -Once you have your Azure credentials, you can configure Magemaker by running: +- **AWS CLI** ([install guide](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html)) +- **Google Cloud SDK** ([install guide](https://cloud.google.com/sdk/docs/install)) +- **Azure CLI** ([install guide](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli)) -```bash -magemaker --cloud azure -``` +You may use `magemaker/scripts/preflight.sh` to assist in installing CLI tools and testing credentials; see [below](#checking-cloud-credentials). +## Checking Cloud Credentials -### All three cloud providers - -If you have configured all three cloud providers, you can verify your configuration by running: +Magemaker's `scripts/preflight.sh` script will check and help configure your credentials for AWS, GCP, and Azure, and test connectivity. 
Run: ```bash -magemaker --cloud all +bash magemaker/scripts/preflight.sh ``` +Follow prompts as needed. -### Required Config File -By default, Magemaker will look for a `.env` file in your project root with the following variables based on which cloud provider(s) you plan to use: - - -```bash -# AWS Configuration -AWS_ACCESS_KEY_ID="your-access-key" # Required for AWS -AWS_SECRET_ACCESS_KEY="your-secret-key" # Required for AWS -SAGEMAKER_ROLE="arn:aws:iam::..." # Required for AWS - -# GCP Configuration -PROJECT_ID="your-project-id" # Required for GCP -GCLOUD_REGION="us-central1" # Required for GCP - -# Azure Configuration -AZURE_SUBSCRIPTION_ID="your-sub-id" # Required for Azure -AZURE_RESOURCE_GROUP="ml-resources" # Required for Azure -AZURE_WORKSPACE_NAME="ml-workspace" # Required for Azure -AZURE_REGION="eastus" # Required for Azure - -# Optional configurations -HUGGING_FACE_HUB_KEY="your-hf-token" # Required for gated HF models like llama -``` - -Never commit your .env file to version control! +**Credential Testing**: The pre-flight script now explicitly runs a test connection check for AWS credentials after setup. - - For gated models like llama-3.1 from Meta, you might have to accept terms of use for model on hugging face and adding Hugging face token to the environment are necessary for deployment to go through. - +## Summary of Credential Injection -{/* ## Verification +- For **Docker** users, mount your cloud credential folders as shown above. +- For **local** installs, your CLI credentials are automatically picked up from their standard locations. +- At runtime, Magemaker uses: + - `AWS_SHARED_CREDENTIALS_FILE` for AWS (auto-set by Docker entrypoint if available) + - `GOOGLE_APPLICATION_CREDENTIALS` for GCP (auto-set by Docker entrypoint if present) -To verify your configuration: - -```bash -magemaker verify -``` */} - -## Best Practices - -1. **Resource Management** - - Monitor quota limits - - Clean up unused resources - - Set up cost alerts - -2. 
**Environment Management** - - - Use separate configurations for dev/prod - - Regularly rotate access keys - - Use environment-specific roles - -3. **Security** - - - Follow principle of least privilege - - Use service accounts where possible - - Enable audit logging - - - -## Troubleshooting - -Common configuration issues: - -1. **AWS Issues** - - - Check IAM role permissions - - Verify SageMaker quota - - Confirm region settings - -2. **GCP Issues** - - - Verify service account permissions - - Check Vertex AI API enablement - - Confirm project ID +--- -3. **Azure Issues** - - Check resource provider registration status: - ```bash - az provider show -n Microsoft.MachineLearningServices - az provider show -n Microsoft.ContainerRegistry - az provider show -n Microsoft.KeyVault - az provider show -n Microsoft.Storage - az provider show -n Microsoft.Insights - az provider show -n Microsoft.ContainerService - az provider show -n Microsoft.PolicyInsights - az provider show -n Microsoft.Cdn - ``` - - Verify workspace access - - Confirm subscription status - - Ensure all required providers are registered +See [Configuration: AWS](configuration/AWS) and [Configuration: GCP](configuration/GCP) for more details on each provider's authentication. diff --git a/quick-start.mdx b/quick-start.mdx index 5853ef8..3353928 100644 --- a/quick-start.mdx +++ b/quick-start.mdx @@ -1,186 +1,56 @@ ---- -title: Quick Start -"og:title": "Magemaker" ---- - - Make sure you have followed the [installation](installation) steps before proceeding. - +# Quick Start -## Interactive View +Get up and running with Magemaker for cloud LLM deployment in just a few steps! Now with automated credential testing, Dockerfile improvements, and clearer config guidance. -1. Run Magemaker with your desired cloud provider: +## 1. 
Install Magemaker -```sh -magemaker --cloud [aws|gcp|azure|all] +```bash +pip install magemaker ``` -Supported providers: - -- `--cloud aws` AWS SageMaker deployment -- `--cloud gcp` Google Cloud Vertex AI deployment -- `--cloud azure` Azure Machine Learning deployment -- `--cloud all` Configure all three providers at the same time - - -### List Models - -From the dropdown, select `Show Acitve Models` to see the list of endpoints deployed. +## 2. Set up Your Cloud Credentials -![Acitve Endpoints](../Images/active-1.png) +- **AWS**: Set up with `aws configure` (or mount `~/.aws` folder in Docker) +- **GCP**: Authenticate with `gcloud auth application-default login` (or mount `$HOME/.config/gcloud`) +- **Azure**: See [Azure config docs](configuration/Azure) -### Delete Models +## 3. Check Your Setup (Recommended) -From the dropdown, select `Delete a Model Endpoint` to see the list of models endpoints. Press space to select the endpoints you want to delete +Run the pre-flight script to test credentials and ensure required cloud CLIs are installed: -![Delete Endpoints](../Images/delete-1.png) - - -### Querying Models - -From the dropdown, select `Query a Model Endpoint` to see the list of models endpoints. Press space to select the endpoints you want to query. Enter the query in the text box and press enter to get the response. - -![Query Endpoints](../Images/query-1.png) +```bash +bash magemaker/scripts/preflight.sh +``` +Follow the prompts. The script checks and (if needed) installs required CLI tools and verifies cloud access. -### YAML-based Deployment (Recommended) +## 4. 
Deploy a Model -For reproducible deployments, use YAML configuration: +Deploy a HuggingFace model to AWS SageMaker: -```sh -magemaker --deploy .magemaker_config/your-model.yaml +```bash +magemaker deploy --provider aws --model-id google-bert/bert-base-uncased ``` -Example YAML for AWS deployment: - -```yaml -deployment: !Deployment - destination: aws - endpoint_name: facebook-opt-test - instance_count: 1 - instance_type: ml.m5.xlarge - num_gpus: null - quantization: null -models: - - !Model - id: facebook/opt-125m - location: null - predict: null - source: huggingface - task: text-generation - version: null -``` +The CLI will print logs, including (new!): -For GCP Vertex AI: - -```yaml -deployment: !Deployment - destination: gcp - endpoint_name: facebook-opt-test - accelerator_count: 1 - instance_type: g2-standard-12 - accelerator_type: NVIDIA_L4 - num_gpus: null - quantization: null - -models: - - !Model - id: facebook/opt-125m - location: null - predict: null - source: huggingface - task: null - version: null +```bash +Important: Configuration saved at: /your/path/configs/.yml +You'll need this path for querying the model later. ``` -For Azure ML: - -```yaml -deployment: !Deployment - destination: azure - endpoint_name: facebook-opt-test - instance_count: 1 - instance_type: Standard_DS3_v2 -models: - - !Model - id: facebook--opt-125m - location: null - predict: null - source: huggingface - task: text-generation - version: null -``` - - The model ids for Azure are different from AWS and GCP. Make sure to use the one provided by Azure in the Azure Model Catalog. +**Note:** Save/copy the config file path shown after deployment. - To find the relevant model id, follow the following steps - - - Find the workpsace in the Azure portal and click on the studio url provided. Click on the `Model Catalog` on the left side bar - ![Azure ML Creation](../Images/workspace-studio.png) - +## 5. Query the Model - - Select Hugging-Face from the collections list. 
The id of the model card is the id you need to use in the yaml file - ![Azure ML Creation](../Images/hugging-face.png) - +You can now query your deployed model by referring to the config file. For example: - - - - - -### Model Fine-tuning - -Fine-tune models using the `train` command: - -```sh -magemaker --train .magemaker_config/train-config.yaml -``` - -Example training configuration: - -```yaml -training: !Training - destination: aws # or gcp, azure - instance_type: ml.p3.2xlarge # varies by cloud provider - instance_count: 1 - training_input_path: s3://your-bucket/data.csv - hyperparameters: !Hyperparameters - epochs: 3 - per_device_train_batch_size: 32 - learning_rate: 2e-5 +```bash +magemaker query --endpoint ``` -{/* -### Recommended Models - - - Fill Mask: tries to complete your sentence like Madlibs. Query format: text - string with [MASK] somewhere in it. - +If the config file for your endpoint is missing, Magemaker will now print a detailed error message with the expected file location. - - Feature extraction: turns text into a 384d vector embedding for semantic - search / clustering. Query format: "type out a sentence like this one." - - */} - - - Remember to deactivate unused endpoints to avoid unnecessary charges! - - - -## Contact - -You can reach us, faizan & jneid, at [support@slashml.com](mailto:support@slashml.com). - - -If anything doesn't make sense or you have suggestions, do point them out at [magemaker.featurebase.app](https://magemaker.featurebase.app/). +--- -We'd love to hear from you! We're excited to learn how we can make this more valuable for the community and welcome any and all feedback and suggestions. +For full details, see the [installation guide](installation) and [provider configuration pages](configuration/AWS). 
diff --git a/tutorials/deploying-llama-3-to-aws.mdx b/tutorials/deploying-llama-3-to-aws.mdx index 46f0659..a210305 100644 --- a/tutorials/deploying-llama-3-to-aws.mdx +++ b/tutorials/deploying-llama-3-to-aws.mdx @@ -1,110 +1,49 @@ ---- -title: Deploying Llama 3 to SageMaker ---- +# Tutorial: Deploy Llama 3 to AWS SageMaker -## Introduction -This tutorial guides you through deploying Llama 3 to AWS SageMaker using Magemaker and querying it using the interactive dropdown menu. Ensure you have followed the [installation](installation) steps before proceeding. +This tutorial demonstrates deploying Llama 3 to SageMaker using Magemaker. Now updated to walk you through new CLI messages, config file output, and credential testing tools. -## Step 1: Setting Up Magemaker for AWS +## Prerequisites +- AWS account with SageMaker and S3 permissions +- [AWS CLI configured](../configuration/AWS) +- Optional: Docker (with mounted credentials) +- Python 3.11+ -Run the following command to configure Magemaker for AWS SageMaker deployment: +## Step 1: Check Credentials & Install CLI + +Run the preflight script: ```sh -magemaker --cloud aws +bash magemaker/scripts/preflight.sh ``` -This initializes Magemaker with the necessary configurations for deploying models to SageMaker. - -## Step 2: YAML-based Deployment +This checks that the AWS CLI is installed and verifies your credentials. 
-For reproducible deployments, use YAML configuration: +## Step 2: Deploy ```sh -magemaker --deploy .magemaker_config/your-model.yaml +magemaker deploy --provider aws --model-id meta-llama/Meta-Llama-3-8B-Instruct ``` -Example YAML for AWS deployment: - -```yaml -deployment: !Deployment - destination: aws - endpoint_name: llama3-endpoint - instance_count: 1 - instance_type: ml.g5.2xlarge - num_gpus: 1 - quantization: null - -models: - - !Model - id: meta-llama/Meta-Llama-3-8B-Instruct - location: null - predict: null - source: huggingface - task: text-generation - version: null +## Step 3: Save the Deployment Config Path + +After deployment, Magemaker prints: + +``` +Important: Configuration saved at: /app/configs/.yml +You'll need this path for querying the model later. +``` - - For gated models like llama from Meta, you have to accept terms of use for model on hugging face and adding Hugging face token to the environment are necessary for deployment to go through. - - - -You may need to request a quota increase for specific machine types and GPUs in the region where you plan to deploy the model. Check your AWS quotas before proceeding. - - -## Step 3: Querying the Deployed Model - -Once the deployment is complete, note down the endpoint id. - -You can use the interactive dropdown menu to quickly query the model. - -### Querying Models - -From the dropdown, select `Query a Model Endpoint` to see the list of model endpoints. Press space to select the endpoint you want to query. Enter your query in the text box and press enter to get the response. 
- -![Query Endpoints](../Images/query-1.png) - -Or you can use the following code: -```python -from sagemaker.huggingface.model import HuggingFacePredictor -import sagemaker - -def query_huggingface_model(endpoint_name: str, query: str): - # Initialize a SageMaker session - sagemaker_session = sagemaker.Session() - - # Create a HuggingFace predictor - predictor = HuggingFacePredictor( - endpoint_name=endpoint_name, - sagemaker_session=sagemaker_session - ) - - # Prepare the input - input_data = { - "inputs": query - } - - try: - # Make prediction - result = predictor.predict(input_data) - print(result) - return result - except Exception as e: - print(f"Error making prediction: {str(e)}") - raise e - -# Example usage -if __name__ == "__main__": - # Replace with your actual endpoint name - ENDPOINT_NAME = "your-deployed-endpoint" - - # Your test question - question = "what are you?" - - # Make prediction - response = query_huggingface_model(ENDPOINT_NAME, question) +**Copy the file path!** You'll use it for queries. + +## Step 4: Query the Model + +```sh +magemaker query --endpoint ``` -## Conclusion -You have successfully deployed and queried Llama 3 on AWS SageMaker using Magemaker's interactive dropdown menu. For any questions or feedback, feel free to contact us at [support@slashml.com](mailto:support@slashml.com). +If no config is found, you will see a message showing the expected path. Make sure you have the `.yml` config in the directory indicated. + +## Troubleshooting +- If credentials are wrong/missing, preflight will warn you +- If using Docker, mount `~/.aws` as described in [Installation](../installation) +- On query failure, check that the endpoint config exists at the required path
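The last troubleshooting point can be automated. A sketch that looks up an endpoint's config file and fails with the expected path, mimicking the error behavior described above; the `endpoint-name.yml` naming and the helper itself are assumptions, not Magemaker internals:

```python
from pathlib import Path

def require_endpoint_config(endpoint_name: str, config_dir: str = "/app/configs") -> Path:
    """Return the config path for an endpoint, or fail with the expected location."""
    # The `endpoint-name.yml` naming is an assumption based on the paths
    # shown in the deployment output above.
    expected = Path(config_dir) / f"{endpoint_name}.yml"
    if not expected.is_file():
        raise FileNotFoundError(
            f"No config for endpoint {endpoint_name!r}. Expected it at: {expected}"
        )
    return expected
```

Calling this at the top of a query script surfaces a missing config immediately, with the same "expected path" hint the CLI gives.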