OpenShift AI Series, Part 4: Deploying DeepSeek V3 671B

This post adds a custom Serving Runtime in the OpenShift AI Console, deploys the model in a project on a single node with 8 GPUs, and exposes the service externally through a NodePort instead of a Route.

mikerain123


mikerain123 · Published 2025-03-19 10:12:14

Create a custom Serving Runtime

Add a new Serving Runtime in the OpenShift AI Console.

Create the Serving Runtime with the following definition:

# Note: the source listing shows no metadata.name, container name, or
# model-format name; add them as your cluster requires.
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  labels:
    opendatahub.io/dashboard: "true"    # show the runtime in the dashboard
  annotations:
    openshift.io/display-name: sglang 0.4.3
spec:
  builtInAdapter:
    modelLoadingTimeoutMillis: 90000
  containers:
    - image: quay.io/qxu/sglang:v0.4.3.post2-cu125
      command:
        - python3
        - -m
        - sglang.launch_server
      args:
        - --model
        - /mnt/models/                  # model files mounted here
        - --served-model-name
        - DeepSeek-V3
        - --tp                          # tensor parallelism across 8 GPUs
        - "8"
        - --trust-remote-code
        - --port
        - "8080"
        - --quantization
        - fp8
        - --mem-fraction-static
        - "0.90"
        - --context-length
        - "64000"
        - --enable-metrics
        - --enable-torch-compile
        - --log-requests
      ports:
        - containerPort: 8080
          protocol: TCP
  multiModel: false
  supportedModelFormats:
    - autoSelect: true

Create the DeepSeek model service

In the OpenShift AI Console, deploy a model in your project, using a single node with 8 GPUs.
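
The console deployment corresponds roughly to a KServe InferenceService. The following is only a sketch under stated assumptions, not the definition the console actually generates. The name deepseek-v3 is taken from the labels in the Service definition in this post. The runtime name, model format name, PVC path, deployment mode, and GPU resource name are hypothetical placeholders that do not appear in the source.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: deepseek-v3                       # matches the Service labels in this post
  annotations:
    # Assumption: raw (non-Knative) deployment; the source does not state the mode.
    serving.kserve.io/deploymentMode: RawDeployment
spec:
  predictor:
    model:
      modelFormat:
        name: sglang                      # hypothetical; must match the runtime's format name
      runtime: sglang-runtime             # hypothetical ServingRuntime name
      storageUri: pvc://deepseek-v3-models/DeepSeek-V3   # hypothetical PVC and path
      resources:
        requests:
          nvidia.com/gpu: "8"             # single node, 8 GPUs (resource name assumed)
        limits:
          nvidia.com/gpu: "8"
```

Whichever way the model is deployed, the model files end up under /mnt/models/, which is the path the runtime passes to --model.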

Create a NodePort service for external access

Instead of a Route, the service is exposed externally through a NodePort.

kind: Service
apiVersion: v1
metadata:
  labels:
    app: isvc.deepseek-v3-predictor
    component: predictor
    opendatahub.io/dashboard: 'true'
    serving.kserve.io/inferenceservice: deepseek-v3
spec:
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
      nodePort: 30001
  type: NodePort
  selector:
    app: isvc.deepseek-v3-predictor
