ImportError: Using bitsandbytes 8-bit quantization requires Accelerate: `pip install accelerate` and the latest version of bitsandbytes: `pip install -i https://pypi.org/simple/ bitsandbytes` #13569
Open
AnandUgale opened this issue on May 18, 2024 · 3 comments
Bug Description

ImportError: Using `bitsandbytes` 8-bit quantization requires Accelerate: `pip install accelerate` and the latest version of bitsandbytes: `pip install -i https://pypi.org/simple/ bitsandbytes`

Environment
Packages installed with CUDA 11.8:

Version
0.10.37

Steps to Reproduce

import torch
from llama_index.llms.huggingface import HuggingFaceLLM

# Optional quantization to 4bit
from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

llm = HuggingFaceLLM(
    model_name="meta-llama/Meta-Llama-3-8B-Instruct",
    model_kwargs={
        "token": hf_token,
        "torch_dtype": torch.bfloat16,  # comment this line and uncomment below to use 4bit
        # "quantization_config": quantization_config
    },
    generate_kwargs={
        "do_sample": True,
        "temperature": 0.6,
        "top_p": 0.9,
    },
    tokenizer_name="meta-llama/Meta-Llama-3-8B-Instruct",
    tokenizer_kwargs={"token": hf_token},
    stopping_ids=stopping_ids,
)
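Note: `hf_token` and `stopping_ids` are defined earlier in the original notebook and are not shown above. A minimal sketch of how they are typically set up for Llama 3 (the placeholder token and the `<|eot_id|>` stop token are assumptions, not part of the report):

```python
from transformers import AutoTokenizer

hf_token = "hf_..."  # hypothetical placeholder; use your own Hugging Face access token

# Stop generation at the end-of-sequence / end-of-turn tokens (assumed Llama 3 special tokens)
tokenizer = AutoTokenizer.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct", token=hf_token
)
stopping_ids = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]
```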
Relevant Logs/Tracebacks
ImportError: Using `bitsandbytes` 8-bit quantization requires Accelerate: `pip install accelerate` and the latest version of bitsandbytes: `pip install -i https://pypi.org/simple/ bitsandbytes`
Hey @AnandUgale, great to run into you again on here! 🚀 It looks like you've stumbled upon an intriguing challenge. I'm diving into the details now and will circle back with a more comprehensive response soon. Stay tuned!
To address the ImportError you're encountering with bitsandbytes and 8-bit quantization, here are targeted steps to resolve the issue:
Reinstall bitsandbytes for CUDA 11.8: Ensure bitsandbytes is correctly installed for your CUDA version by running:
pip install bitsandbytes-cuda118
This command installs the version compiled specifically for CUDA 11.8, which might resolve compatibility issues.
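After reinstalling, a quick way to confirm that both libraries import cleanly in the same interpreter (a minimal check, not part of the original report):

```python
# Verify that bitsandbytes and accelerate are importable and report their versions
import bitsandbytes
import accelerate

print("bitsandbytes:", bitsandbytes.__version__)
print("accelerate:", accelerate.__version__)
```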
Check Environment Variables: Verify that CUDA_HOME and LD_LIBRARY_PATH are correctly set to reflect your CUDA 11.8 installation. Misconfigured environment variables can lead to import errors.
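One way to inspect these from Python (a sketch; the exact paths depend on your CUDA 11.8 install):

```python
# Check the CUDA-related environment variables and what PyTorch was built against
import os
import torch

print("CUDA_HOME:", os.environ.get("CUDA_HOME"))
print("LD_LIBRARY_PATH:", os.environ.get("LD_LIBRARY_PATH"))
print("torch CUDA version:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
```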
Isolate the ImportError: Determine exactly where the ImportError occurs. Is it during the import of a specific module from bitsandbytes or another package? Pinpointing the location can help identify the cause.
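For example, you can check the availability flags that transformers consults before enabling bitsandbytes quantization (assuming a recent transformers release that exposes these helpers):

```python
# transformers checks these helpers before allowing 8-bit/4-bit loading
from transformers.utils import is_accelerate_available, is_bitsandbytes_available

print("accelerate available:", is_accelerate_available())
print("bitsandbytes available:", is_bitsandbytes_available())
```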
Consult Documentation and Issues: Look into the GitHub repositories and documentation for bitsandbytes, llama_index, and HuggingFace's transformers for similar issues. Sometimes, specific version conflicts or bugs are documented.
Environment Clean-Up: If necessary, create a new virtual environment and reinstall your dependencies. This can help eliminate conflicts or issues from previous installations.
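After reinstalling into a fresh environment, you can confirm which versions actually ended up installed (a small sketch using only the standard library):

```python
# Print the installed versions of the packages involved in this stack
from importlib.metadata import version, PackageNotFoundError

for pkg in ["torch", "transformers", "accelerate", "bitsandbytes", "llama-index"]:
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "not installed")
```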
For integrating bitsandbytes with HuggingFace's transformers, ensure your BitsAndBytesConfig and model initialization align with the provided example. This includes setting the quantization_config in model_kwargs when creating the HuggingFaceLLM instance. If these steps don't resolve the issue, please provide more details about the error and the code causing it for further assistance.
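Once accelerate and bitsandbytes import correctly, the 4-bit path from the bug report should work by passing the quantization config through model_kwargs, for example (a sketch based on the code in the report; hf_token and stopping_ids are assumed to be defined as above):

```python
import torch
from transformers import BitsAndBytesConfig
from llama_index.llms.huggingface import HuggingFaceLLM

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

llm = HuggingFaceLLM(
    model_name="meta-llama/Meta-Llama-3-8B-Instruct",
    model_kwargs={
        "token": hf_token,
        # pass the 4-bit config instead of torch_dtype to enable quantized loading
        "quantization_config": quantization_config,
    },
    generate_kwargs={"do_sample": True, "temperature": 0.6, "top_p": 0.9},
    tokenizer_name="meta-llama/Meta-Llama-3-8B-Instruct",
    tokenizer_kwargs={"token": hf_token},
    stopping_ids=stopping_ids,
)
```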
About Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.