config the vLLM engineArgs in config.pbtxt #63
activezhao wants to merge 17 commits into triton-inference-server:main
Conversation
* Initial Commit
* Mount model repo so changes reflect, parameter tweaking, README file
* Image name error
* Incorporating review comments. Separate docker and model repo builds, add README, restructure repo
* Tutorial restructuring. Using static model configurations
* Bump triton container and update README
* Remove client script
* Incorporating review comments
* Modify WIP line in vLLM tutorial
* Remove trust_remote_code parameter from falcon model
* Removing Mistral
* Incorporating Feedback
* Change input/output names
* Pre-commit format
* Different perf_analyzer example, config file format fixes
* Deep dive changes to Triton tools section
* Remove unused variable
Added Llama2 tutorial for TensorRT-LLM backend
Updated vLLM tutorial's README to use vllm container (triton-inference-server#65)
Co-authored-by: dyastremsky <58150256+dyastremsky@users.noreply.github.com>
Hi @activezhao, may I kindly ask you to rebase your PR on top of the main branch and send us a CLA: https://github.com/triton-inference-server/server/blob/main/CONTRIBUTING.md#contributor-license-agreement-cla
Hi @oandreeva-nv OK, I will do it. But I find that the structure of Quick_Deploy/vLLM has changed a lot; will this PR still be OK?
# Conflicts:
#   Quick_Deploy/vLLM/config.pbtxt
Hi @oandreeva-nv Because the structure of Quick_Deploy/vLLM has changed a lot, I have created a new PR. Could you please close this PR and do CR in the new PR? Thanks.
Closing this PR per @activezhao's request
Previously, we got the vLLM EngineArgs from vllm_engine_args.json; now we can get them from config.pbtxt.
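For illustration, here is a minimal sketch of how engine arguments might be declared as `parameters` entries in config.pbtxt. The parameter names (`model`, `gpu_memory_utilization`, `disable_log_requests`) mirror vLLM's engine arguments, but the exact set and defaults used by this PR may differ:

```
backend: "python"

parameters: {
  key: "model"
  value: { string_value: "facebook/opt-125m" }
}
parameters: {
  key: "gpu_memory_utilization"
  value: { string_value: "0.5" }
}
parameters: {
  key: "disable_log_requests"
  value: { string_value: "true" }
}
```

On the Python side, the backend's `model.py` could then build the engine from these parameters instead of reading vllm_engine_args.json. The sketch below assumes each value arrives as a `string_value` and must be converted to its proper type; the conversion logic here is illustrative, not the PR's exact code:

```python
import json

from vllm.engine.arg_utils import AsyncEngineArgs
from vllm.engine.async_llm_engine import AsyncLLMEngine


class TritonPythonModel:
    def initialize(self, args):
        # Triton passes the parsed config.pbtxt to the Python backend
        # as a JSON string under args["model_config"].
        model_config = json.loads(args["model_config"])

        # Each config.pbtxt parameter arrives as {"string_value": "..."},
        # so every value must be converted from its string form.
        raw_params = {
            key: value["string_value"]
            for key, value in model_config.get("parameters", {}).items()
        }

        # Illustrative type conversions; the real PR may handle more args.
        engine_args = AsyncEngineArgs(
            model=raw_params["model"],
            gpu_memory_utilization=float(
                raw_params.get("gpu_memory_utilization", "0.9")
            ),
            disable_log_requests=raw_params.get(
                "disable_log_requests", "false"
            ).lower() == "true",
        )
        self.llm_engine = AsyncLLMEngine.from_engine_args(engine_args)
```

One advantage of this approach is that all model settings live in a single file that Triton already validates and reports, rather than splitting configuration between config.pbtxt and a separate JSON file.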