
File Limit Request: nm-vllm - 400 MiB #4076

Open · 3 tasks done
mgoin opened this issue May 20, 2024 · 0 comments
mgoin commented May 20, 2024

Project URL

https://pypi.org/project/nm-vllm/

Does this project already exist?

  • Yes

New Limit

400 MB

Update issue title

  • I have updated the title.

Which indexes

PyPI

About the project

vLLM is a fast and easy-to-use library for LLM inference and serving that already has a file limit increase to 400 MB (issue #3792). nm-vllm is an enterprise-supported fork of vLLM that requires a similar file size limit because of the number of compiled kernels it ships.

Reasons for the request

Pre-compiling these kernels means that users can deploy quickly and deterministically, rather than needing to set up a compilation environment wherever they deploy. As we extend our optimized inference to more hardware platforms, the binary size will grow, so we would like to follow the standard that vLLM sets.
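
For context, a minimal sketch of what pre-built wheels enable: installation and serving with no local compilation step. This assumes nm-vllm keeps the upstream vLLM Python API (importable as `vllm`); the model name and sampling settings are illustrative only.

```python
# Assumes nm-vllm mirrors the upstream vLLM API (imported as `vllm`).
# Install with: pip install nm-vllm
from vllm import LLM, SamplingParams

# Because the wheel ships pre-compiled kernels, this runs without a
# local CUDA/C++ toolchain or build step.
llm = LLM(model="facebook/opt-125m")  # illustrative model name
outputs = llm.generate(
    ["Hello, my name is"],
    SamplingParams(temperature=0.8, max_tokens=32),
)
print(outputs[0].outputs[0].text)
```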

Code of Conduct

  • I agree to follow the PSF Code of Conduct