
[Feature] Consider adding QLoRA support #1139

Open
freelancerllm opened this issue May 29, 2023 · 0 comments


Comments

@freelancerllm

Is your feature request related to a problem? Please describe.

  1. QLoRA is a fine-tuning method that makes it possible to fine-tune a 65B-parameter model on a single 48 GB GPU while preserving full 16-bit fine-tuning task performance. QLoRA achieves this by backpropagating gradients through a frozen, 4-bit-quantized pretrained language model into trainable low-rank adapters (LoRA).
  2. The currently released model is 6B, so to some extent this optimization can be skipped; but for the larger models to come, further optimization will be needed. A related open question is how models larger than 6B actually perform: is there any benchmark comparison?
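The mechanism described in point 1 (a frozen low-precision base weight plus a trainable low-rank delta) can be sketched in a few lines. The sketch below is illustrative only: it uses plain NumPy and a toy per-block absmax 4-bit quantizer as a stand-in for the paper's NF4 data type, and all names and sizes are made up for the example.

```python
import numpy as np

def quantize_4bit(w, block=64):
    # Toy per-block absmax quantization to 4-bit integers in [-7, 7]
    # (a stand-in for the NF4 data type used by QLoRA).
    flat = w.flatten()
    pad = (-len(flat)) % block
    flat = np.concatenate([flat, np.zeros(pad, dtype=flat.dtype)])
    blocks = flat.reshape(-1, block)
    scales = np.abs(blocks).max(axis=1, keepdims=True) + 1e-8
    q = np.round(blocks / scales * 7).astype(np.int8)
    return q, scales, w.shape, pad

def dequantize_4bit(q, scales, shape, pad):
    flat = ((q.astype(np.float32) / 7) * scales).flatten()
    if pad:
        flat = flat[:-pad]
    return flat.reshape(shape)

rng = np.random.default_rng(0)
d_in, d_out, r = 32, 16, 4

W = rng.normal(size=(d_out, d_in)).astype(np.float32)  # frozen base weight
q, s, shape, pad = quantize_4bit(W)                    # stored quantized

# LoRA adapters: only A and B would receive gradients during training.
A = rng.normal(scale=0.01, size=(r, d_in)).astype(np.float32)
B = np.zeros((d_out, r), dtype=np.float32)  # B = 0 so the initial delta is zero
alpha = 8.0

def forward(x):
    # Dequantize the frozen weight for the matmul, add the low-rank delta.
    W_deq = dequantize_4bit(q, s, shape, pad)
    return x @ (W_deq + (alpha / r) * B @ A).T

x = rng.normal(size=(2, d_in)).astype(np.float32)
y = forward(x)
print(y.shape)  # prints (2, 16)
```

Because `B` starts at zero, the adapted model initially reproduces the (dequantized) base model exactly; training then moves only the small `A`/`B` matrices, which is what keeps the memory footprint low.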

Solutions

  1. https://arxiv.org/pdf/2305.14314.pdf
  2. https://github.com/feihuamantian/qlora
  3. 4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) huggingface/transformers#23479
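Solution 3 above is the bitsandbytes route. As a rough configuration sketch only (the model id, target module names, and hyperparameters are placeholders; running it needs a CUDA GPU plus the `transformers`, `peft`, and `bitsandbytes` packages), a 4-bit QLoRA setup along those lines might look like:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the frozen base model in 4-bit NF4, as described in the QLoRA paper.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",           # NF4 data type from the paper
    bnb_4bit_use_double_quant=True,      # double quantization of the scales
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "THUDM/chatglm-6b",                  # placeholder model id
    quantization_config=bnb_config,
    trust_remote_code=True,
)

# Attach trainable low-rank adapters; only these receive gradients.
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query_key_value"],  # placeholder; depends on the model
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

This is a configuration sketch, not a tested recipe for this repository; the exact `target_modules` and quantization options would have to be chosen to match the model's architecture.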

Additional context

No response
