Merged
44 changes: 43 additions & 1 deletion speculative_decoding_trn2_vllm/README.md
@@ -161,6 +161,48 @@ export CLUSTER_NAME=your-cluster-name
aws s3 mb s3://${S3_BUCKET_NAME} --region ${AWS_REGION}
```

Create Least-Privilege S3 IAM Policy

Set the bucket name (typically the bucket created above as `${S3_BUCKET_NAME}`) and a name for the policy:

```bash
export S3_BUCKET=YOUR_BUCKET
export POLICY_NAME=vllm-trn2-s3-policy
```

Create the policy document:

```bash
cat <<EOF > s3-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::${S3_BUCKET}",
        "arn:aws:s3:::${S3_BUCKET}/*"
      ]
    }
  ]
}
EOF
```
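
A possible tightening of the policy document above, offered as a sketch rather than as part of this change: `s3:ListBucket` is a bucket-level action, while `s3:GetObject` and `s3:PutObject` act on objects, so the two can be scoped in separate statements. To my understanding, Mountpoint for Amazon S3 also uses `s3:AbortMultipartUpload` for writes and `s3:DeleteObject` for deletes, and `s3-csi-pv.yaml` mounts with `allow-delete`, so those two actions are included here as an assumption. The file name `s3-policy-split.json` and the `Sid` values are hypothetical.

```bash
# Hypothetical default so the sketch runs on its own; reuse your real bucket name.
S3_BUCKET="${S3_BUCKET:-YOUR_BUCKET}"

# Unquoted EOF lets the shell expand ${S3_BUCKET} inside the heredoc.
cat <<EOF > s3-policy-split.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BucketLevel",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::${S3_BUCKET}"]
    },
    {
      "Sid": "ObjectLevel",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:DeleteObject"
      ],
      "Resource": ["arn:aws:s3:::${S3_BUCKET}/*"]
    }
  ]
}
EOF
```

If deletes are not needed, dropping `s3:DeleteObject` here and `allow-delete` from the mount options keeps the two in step.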

Create the IAM policy:

```bash
POLICY_ARN=$(aws iam create-policy \
  --policy-name $POLICY_NAME \
  --policy-document file://s3-policy.json \
  --query 'Policy.Arn' \
  --output text)
```
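
If `aws iam create-policy` fails (for example, because a policy with that name already exists), `POLICY_ARN` ends up empty and the later `--attach-policy-arn` flag receives no value. A minimal sketch of a guard follows; the helper names `lookup_policy_arn` and `require_var` are hypothetical and not part of the README.

```bash
# Hypothetical helper: print the ARN of a customer-managed policy by name, if one exists.
lookup_policy_arn() {
  # $1: policy name. --scope Local restricts the listing to customer-managed policies.
  aws iam list-policies --scope Local \
    --query "Policies[?PolicyName=='$1'].Arn" \
    --output text
}

# Hypothetical helper: fail fast when a required variable is empty or unset.
require_var() {
  # $1 is the variable NAME; eval reads its current value portably.
  eval "_value=\${$1:-}"
  if [ -z "$_value" ]; then
    echo "error: $1 is empty or unset" >&2
    return 1
  fi
}

# Usage sketch (not executed here):
#   POLICY_ARN=${POLICY_ARN:-$(lookup_policy_arn "$POLICY_NAME")}
#   require_var POLICY_ARN || exit 1
```
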
Create an IAM service account for S3 access:

```bash
@@ -170,7 +212,7 @@ eksctl create iamserviceaccount \
--cluster ${CLUSTER_NAME} \
--role-name s3-csi-driver-sa-role \
--region ${AWS_REGION} \
--attach-policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess \
--attach-policy-arn ${POLICY_ARN} \
--approve
```
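
After the service account is created, it may be worth confirming that the role carries the custom policy rather than `AmazonS3FullAccess`. The sketch below assumes the role name `s3-csi-driver-sa-role` from the command above; the function name `role_has_policy` is hypothetical.

```bash
# Hypothetical helper: succeed only if the given policy ARN is attached to the role.
role_has_policy() {
  # $1: IAM role name, $2: policy ARN expected to be attached.
  # Text output separates list items with tabs, so split them onto lines
  # and look for an exact, whole-line match.
  aws iam list-attached-role-policies \
    --role-name "$1" \
    --query 'AttachedPolicies[].PolicyArn' \
    --output text | tr '\t' '\n' | grep -qxF "$2"
}

# Usage sketch (not executed here):
#   role_has_policy s3-csi-driver-sa-role "$POLICY_ARN" && echo "custom policy attached"
```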

6 changes: 3 additions & 3 deletions speculative_decoding_trn2_vllm/qwen-sd-vllm-deploy.yaml
@@ -53,7 +53,7 @@ spec:
volumeMounts:
- name: dshm
mountPath: /dev/shm
- name: 621547421844-ap-southeast-4-pvc
- name: my-pvc
mountPath: /var/mdl
env:
- name: VLLM_NEURON_FRAMEWORK
@@ -97,6 +97,6 @@ spec:
emptyDir:
medium: Memory
sizeLimit: 128Gi
- name: 621547421844-ap-southeast-4-pvc
- name: my-pvc
persistentVolumeClaim:
claimName: 621547421844-ap-southeast-4-pvc
claimName: my-pvc
6 changes: 3 additions & 3 deletions speculative_decoding_trn2_vllm/qwen-vllm-deploy.yaml
@@ -51,7 +51,7 @@ spec:
volumeMounts:
- name: dshm
mountPath: /dev/shm
- name: 621547421844-ap-southeast-4-pvc
- name: my-pvc
mountPath: /var/mdl
env:
- name: VLLM_NEURON_FRAMEWORK
@@ -93,6 +93,6 @@ spec:
emptyDir:
medium: Memory
sizeLimit: 128Gi
- name: 621547421844-ap-southeast-4-pvc
- name: my-pvc
persistentVolumeClaim:
claimName: 621547421844-ap-southeast-4-pvc
claimName: my-pvc
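
The renames in these manifests replace names that embedded a 12-digit AWS account ID with generic placeholders. One way to catch leftovers is sketched below. The function name `find_account_ids` is hypothetical, and it relies on the `-r` and `--include` options available in GNU and BSD `grep`.

```bash
# Hypothetical helper: list any standalone 12-digit runs in YAML files under a directory.
find_account_ids() {
  # $1: directory to scan. Prints file:line:content for each hit;
  # exits non-zero when nothing is found.
  grep -rnE --include='*.yaml' '(^|[^0-9])[0-9]{12}([^0-9]|$)' "$1"
}

# Usage sketch (not executed here):
#   find_account_ids speculative_decoding_trn2_vllm && echo "review the matches above"
```
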
10 changes: 5 additions & 5 deletions speculative_decoding_trn2_vllm/s3-csi-pv.yaml
@@ -1,7 +1,7 @@
apiVersion: v1
kind: PersistentVolume
metadata:
name: 621547421844-ap-southeast-4-pv
name: my-pv
spec:
capacity:
storage: 1200Gi
@@ -10,7 +10,7 @@ spec:
storageClassName: ""
claimRef:
namespace: default
name: 621547421844-ap-southeast-4-pvc
name: my-pvc
mountOptions:
- region=ap-southeast-4
- allow-delete
@@ -19,17 +19,17 @@ spec:
driver: s3.csi.aws.com
volumeHandle: s3-csi-driver-volume
volumeAttributes:
bucketName: 621547421844-ap-southeast-4
bucketName: my-bucket
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: 621547421844-ap-southeast-4-pvc
name: my-pvc
spec:
accessModes:
- ReadWriteMany # Supported options: ReadWriteMany / ReadOnlyMany
storageClassName: "" # Required for static provisioning
resources:
requests:
storage: 1200Gi
volumeName: 621547421844-ap-southeast-4-pv
volumeName: my-pv
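
With static provisioning, the PersistentVolume's `claimRef` and the claim's `volumeName` must point at each other (`my-pv` and `my-pvc` in namespace `default` here), so a renamed claim that does not bind is usually the first sign of a mismatch. A minimal check, assuming `kubectl` is configured for the cluster; the function name `pvc_is_bound` is hypothetical.

```bash
# Hypothetical helper: succeed only when the claim's phase is "Bound".
pvc_is_bound() {
  # $1: PVC name, $2: namespace.
  phase=$(kubectl get pvc "$1" -n "$2" -o jsonpath='{.status.phase}')
  [ "$phase" = "Bound" ]
}

# Usage sketch (not executed here):
#   pvc_is_bound my-pvc default || kubectl describe pvc my-pvc -n default
```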