
convert.py fails importing a new model architecture #7406

Closed
JohnSully opened this issue May 20, 2024 · 3 comments
Labels
question Further information is requested

Comments

@JohnSully

JohnSully commented May 20, 2024

I am trying to port a new model I've created to GGUF, but I'm hitting issues in convert.py. Specifically, it seems to be confused by the fact that my lm_head has two linear layers.

I get the error:

Exception: Unexpected tensor name: lm_head.linear1.bias

If I add lm_head.linear1 and lm_head.linear2 to gguf-py/gguf/tensor_mapping.py, the conversion will run; however, when I then try to actually use the model with llama.cpp, it complains about 2 missing layers.

error loading model: done_getting_tensors: wrong number of tensors; expected 293, got 291
llama_load_model_from_file: failed to load model
main: error: unable to load model

Can you provide some tips on what I need to modify to make this work? Also, if there is any documentation on porting new model architectures, I would appreciate it if you could point me to it.

@jukofyork
Contributor

jukofyork commented May 20, 2024

If I add lm_head.linear1 and lm_head.linear2

Even if this works, it will likely treat these as just two linear .weight-type projections in series, whereas using a .bias requires an affine projection.

I don't know enough about llama.cpp to help more, but IIRC the Qwen models have some affine projections in them and use .bias as well as .weight, so this might be worth a look.
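
To make the linear vs. affine distinction concrete, here is a small self-contained C++ sketch; the shapes and values are made up for illustration and this is not llama.cpp or ggml code:

```cpp
#include <cstdio>
#include <vector>

// y = W*x is a linear projection; y = W*x + b is an affine projection.
// The extra bias vector b is what a tensor like lm_head.linear1.bias carries.
int main() {
    const int n_in = 3, n_out = 2;
    std::vector<float> W = {1.0f, 0.0f,  2.0f,   // row 0
                            0.0f, 1.0f, -1.0f};  // row 1
    std::vector<float> b = {0.5f, -0.5f};
    std::vector<float> x = {1.0f, 2.0f, 3.0f};

    for (int i = 0; i < n_out; ++i) {
        float linear = 0.0f;
        for (int j = 0; j < n_in; ++j) {
            linear += W[i * n_in + j] * x[j];
        }
        float affine = linear + b[i]; // the bias term is the only difference
        std::printf("row %d: linear = %.2f, affine = %.2f\n", i, linear, affine);
    }
    return 0;
}
```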

@mofosyne added the question (Further information is requested) label May 21, 2024
@compilade
Collaborator

Can you provide some tips on what I need to modify to make this work?

If it's a variation of an existing architecture, you might be able to simply specify new optional tensors on model load and then detect their presence in the compute graph to use them when they are present.

This is kind of how StableLM2 1.6B support was added in #5052.
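
As a rough sketch of that optional-tensor idea: the struct and field names below are hypothetical (the real llama.cpp model structs and loader helpers are named differently), and only ggml_mul_mat and ggml_add are actual ggml calls. The loader side would mark the extra lm_head tensors as not required, and the graph builder would only use them when they were actually loaded:

```cpp
#include "ggml.h"

// Hypothetical field names for illustration only.
struct my_output_head {
    struct ggml_tensor * w1 = nullptr; // lm_head.linear1.weight (required)
    struct ggml_tensor * b1 = nullptr; // lm_head.linear1.bias   (optional)
    struct ggml_tensor * w2 = nullptr; // lm_head.linear2.weight (optional)
    struct ggml_tensor * b2 = nullptr; // lm_head.linear2.bias   (optional)
};

// Build the output head, using the optional tensors only when they were loaded.
static struct ggml_tensor * build_lm_head(struct ggml_context * ctx,
                                          const my_output_head & head,
                                          struct ggml_tensor * cur) {
    cur = ggml_mul_mat(ctx, head.w1, cur);       // first projection
    if (head.b1) {
        cur = ggml_add(ctx, cur, head.b1);       // bias -> affine projection
    }
    if (head.w2) {                               // second projection, only if present
        cur = ggml_mul_mat(ctx, head.w2, cur);
        if (head.b2) {
            cur = ggml_add(ctx, cur, head.b2);
        }
    }
    return cur;
}
```

The point of guarding on the tensor pointers is that GGUF files without the extra tensors still load and run, since the graph only includes the additional projection when those tensors are present.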

Also if there is any documentation on porting new model architectures I would appreciate it if you could point me to it.

https://github.com/ggerganov/llama.cpp/blob/master/docs/HOWTO-add-model.md

@JohnSully
Author

JohnSully commented May 21, 2024 via email

@Galunid closed this as completed May 21, 2024