Option --self_attn_type scaled-dot-flash is not supported (supported values are: scaled-dot) #1702
Comments
Okay, apparently these are already in pull requests; I will test those.
Okay, I tried the modifications that are not yet committed and got this error: Traceback (most recent call last):
Post your yaml file (of onmt-py training).
Yeah, the distinction between multiquery and num_kv is unclear. You need to force multiquery to false in your checkpoint.
I know it's hard to document everything; thank you for your immense work, Vincent. On the OpenNMT-py side, setting num_kv to half the number of heads seems to be working okay; I tried some short runs.
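For the checkpoint-side fix mentioned above, here is a minimal sketch of forcing multiquery off. It assumes the checkpoint is a dict whose "opt" entry is an argparse Namespace, as OpenNMT-py saves training options; in practice the dict would come from torch.load("model.pt") and go back via torch.save, and the field values below are only illustrative.

```python
# Hedged sketch: force multiquery off in a saved OpenNMT-py checkpoint
# before conversion. The "opt" entry holding an argparse.Namespace is an
# assumption based on how OpenNMT-py stores training options; in practice
# load the .pt file with torch.load and re-save it with torch.save.
import argparse

def force_multiquery_off(checkpoint):
    opt = checkpoint["opt"]
    if getattr(opt, "multiquery", False):
        opt.multiquery = False  # keep num_kv as-is; only drop the flag
    return checkpoint

# Stand-in for a loaded checkpoint (values are hypothetical):
ckpt = {"opt": argparse.Namespace(multiquery=True, num_kv=8, heads=16)}
force_multiquery_off(ckpt)
print(ckpt["opt"].multiquery)  # False
```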
Hello,
I'm trying to convert a model with CTranslate2 4.3 that has been trained with OpenNMT-py 3.5.1 but I get this error:
Converting Saved_Data/Models/fr_en_step_195000.pt to ctranslate2 format...
Traceback (most recent call last):
  File "/home/username/anaconda3/envs/neu/bin/ct2-opennmt-py-converter", line 8, in <module>
    sys.exit(main())
  File "/home/username/anaconda3/envs/neu/lib/python3.10/site-packages/ctranslate2/converters/opennmt_py.py", line 355, in main
    OpenNMTPyConverter(args.model_path).convert_from_args(args)
  File "/home/username/anaconda3/envs/neu/lib/python3.10/site-packages/ctranslate2/converters/converter.py", line 50, in convert_from_args
    return self.convert(
  File "/home/username/anaconda3/envs/neu/lib/python3.10/site-packages/ctranslate2/converters/converter.py", line 89, in convert
    model_spec = self._load()
  File "/home/username/anaconda3/envs/neu/lib/python3.10/site-packages/ctranslate2/converters/opennmt_py.py", line 181, in _load
    check_opt(checkpoint["opt"], num_source_embeddings=len(src_vocabs))
  File "/home/username/anaconda3/envs/neu/lib/python3.10/site-packages/ctranslate2/converters/opennmt_py.py", line 55, in check_opt
    check.validate()
  File "/home/username/anaconda3/envs/neu/lib/python3.10/site-packages/ctranslate2/converters/utils.py", line 106, in validate
    raise_unsupported(self._unsupported_reasons)
  File "/home/username/anaconda3/envs/neu/lib/python3.10/site-packages/ctranslate2/converters/utils.py", line 93, in raise_unsupported
    raise ValueError(message)
ValueError: The model you are trying to convert is not supported by CTranslate2. We identified the following reasons:
- Option --self_attn_type scaled-dot-flash is not supported (supported values are: scaled-dot)
I trained the model using Flash Attention in OpenNMT-py 3.5.1:
self_attn_type: scaled-dot-flash
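An alternative to patching the converter is to relabel the option in the checkpoint itself: flash attention only swaps the attention kernel at run time, so the trained weights should be identical to plain scaled-dot. A hedged sketch (the Namespace stand-in is an assumption for how OpenNMT-py stores options; in practice load and re-save the .pt file with torch):

```python
# Hypothetical workaround: rewrite self_attn_type in the checkpoint's
# saved options so the stock converter's check_opt accepts it. This does
# not touch any weights, only the recorded training option.
import argparse

def relabel_attn(checkpoint):
    opt = checkpoint["opt"]
    if getattr(opt, "self_attn_type", None) == "scaled-dot-flash":
        opt.self_attn_type = "scaled-dot"
    return checkpoint

# Stand-in for a checkpoint loaded with torch.load:
ckpt = {"opt": argparse.Namespace(self_attn_type="scaled-dot-flash")}
relabel_attn(ckpt)
print(ckpt["opt"].self_attn_type)  # scaled-dot
```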
If I modify the opennmt_py.py converter to accept scaled-dot-flash (by replacing scaled-dot with it), I then get this error:
Traceback (most recent call last):
  File "/home/username/anaconda3/envs/neu/bin/ct2-opennmt-py-converter", line 8, in <module>
    sys.exit(main())
  File "/home/username/anaconda3/envs/neu/lib/python3.10/site-packages/ctranslate2/converters/opennmt_py.py", line 355, in main
    OpenNMTPyConverter(args.model_path).convert_from_args(args)
  File "/home/username/anaconda3/envs/neu/lib/python3.10/site-packages/ctranslate2/converters/converter.py", line 50, in convert_from_args
    return self.convert(
  File "/home/username/anaconda3/envs/neu/lib/python3.10/site-packages/ctranslate2/converters/converter.py", line 89, in convert
    model_spec = self._load()
  File "/home/username/anaconda3/envs/neu/lib/python3.10/site-packages/ctranslate2/converters/opennmt_py.py", line 200, in _load
    return _get_model_spec_seq2seq(
  File "/home/username/anaconda3/envs/neu/lib/python3.10/site-packages/ctranslate2/converters/opennmt_py.py", line 90, in _get_model_spec_seq2seq
    set_transformer_spec(model_spec, variables)
  File "/home/username/anaconda3/envs/neu/lib/python3.10/site-packages/ctranslate2/converters/opennmt_py.py", line 210, in set_transformer_spec
    set_transformer_encoder(spec.encoder, variables)
  File "/home/username/anaconda3/envs/neu/lib/python3.10/site-packages/ctranslate2/converters/opennmt_py.py", line 215, in set_transformer_encoder
    set_input_layers(spec, variables, "encoder")
  File "/home/username/anaconda3/envs/neu/lib/python3.10/site-packages/ctranslate2/converters/opennmt_py.py", line 241, in set_input_layers
    set_position_encodings(
  File "/home/username/anaconda3/envs/neu/lib/python3.10/site-packages/ctranslate2/converters/opennmt_py.py", line 341, in set_position_encodings
    spec.encodings = _get_variable(variables, "%s.pe" % scope).squeeze()
  File "/home/username/anaconda3/envs/neu/lib/python3.10/site-packages/ctranslate2/converters/opennmt_py.py", line 345, in _get_variable
    return variables[name]
KeyError: 'encoder.embeddings.make_embedding.pe.pe'
Probably this is because it can't handle RoPE; my settings are:
position_encoding: false
max_relative_positions: -1
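To illustrate the KeyError above: with max_relative_positions: -1 (rotary embeddings), the checkpoint stores no learned or sinusoidal position-encoding table, so the converter's lookup of an "<scope>.pe.pe" tensor has nothing to find. A toy reproduction (not CTranslate2 code; the weight name in the dict is only illustrative):

```python
# Toy illustration: a RoPE-trained checkpoint's variables dict has no
# ".pe.pe" entry, so a plain dict lookup raises KeyError, which is the
# failure seen in set_position_encodings / _get_variable above.
variables = {
    # hypothetical tensor name; a real checkpoint maps names to tensors
    "encoder.embeddings.make_embedding.emb_luts.0.weight": "<tensor>",
}

name = "encoder.embeddings.make_embedding.pe.pe"
try:
    variables[name]
except KeyError:
    print("missing:", name)
```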
The model trains and runs inference without problems in OpenNMT-py.