Is there anyone who can't generate images correctly? #122
Comments
@leejet Is it just my impression, or is the CUDA backend experiencing synchronization issues, starting as early as the CLIP model? It happens only sometimes.
build\bin\Release\sd -m models/kotosmix_v10-f16.gguf -p "beautiful anime girl, white hair, blue eyes, realistic, masterpiece, azur lane, 4k, high quality" -n "bad quality, ugly, face malformed, bad anatomy" --sampling-method dpm++2m --steps 20 -s 424354
With the CPU backend (and sometimes the CUDA backend):
Incorrect image, possibly because an incorrect (incomplete) embedding is generated; I don't really know. The negative embedding is invalid.
Investigating this synchronization issue is very challenging: it occurs sporadically and is not easy to reproduce. I tried printing the CLIP output tensor, and after 10 repetitions I saw a change in the embedding values.
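The repeat-and-compare check described above could be automated along these lines. This is only a minimal sketch: `first_mismatch` is a hypothetical helper, not part of stable-diffusion.cpp, and it assumes the CLIP output has already been copied from the backend into a host-side `std::vector<float>` after each run.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not part of stable-diffusion.cpp): returns the index of
// the first element whose bit pattern differs between two runs, or -1 if the
// two buffers are bit-identical.
long first_mismatch(const std::vector<float>& reference,
                    const std::vector<float>& candidate) {
    // Both runs are expected to produce embeddings of the same size.
    assert(reference.size() == candidate.size());
    for (std::size_t i = 0; i < reference.size(); ++i) {
        // Compare raw bits so that NaN values are not silently treated as
        // "different from everything" or skipped.
        if (std::memcmp(&reference[i], &candidate[i], sizeof(float)) != 0) {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

With the same seed and prompt, a deterministic backend should give -1 on every repetition; any other return value points at the first element that changed between runs.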
On Google Colab (T4, CUDA), in img2img mode, the VAE always produces a solid-color image when run without --vae-tiling.
@FSSRepo Please try Colab.
========= COMPUTE-SANITIZER
ggml_init_cublas: GGML_CUDA_FORCE_MMQ: no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
Device 0: Tesla T4, compute capability 7.5
[INFO] stable-diffusion.cpp:5386 - loading model from 'v1-5-pruned-emaonly.safetensors'
[INFO] model.cpp:638 - load v1-5-pruned-emaonly.safetensors using safetensors format
[INFO] stable-diffusion.cpp:5412 - Stable Diffusion 1.x
[INFO] stable-diffusion.cpp:5418 - Stable Diffusion weight type: f32
[INFO] stable-diffusion.cpp:5573 - total memory buffer size = 2731.37MB (clip 470.66MB, unet 2165.24MB, vae 95.47MB)
[INFO] stable-diffusion.cpp:5579 - loading model from 'v1-5-pruned-emaonly.safetensors' completed, taking 2.45s
[INFO] stable-diffusion.cpp:5593 - running in eps-prediction mode
[INFO] stable-diffusion.cpp:6486 - apply_loras completed, taking 0.00s
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [80384 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [79488 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [77952 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [75264 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [81408 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [79360 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [80768 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [80384 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [78976 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [78080 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [79104 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
=========     and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [78080 hazards]
@Cyberhan123 Could you send me the CLI commands to perform this test? Your link is not allowing me to access Colab.
I modified the link and the command is as follows
@leejet To fix the softmax race condition in CUDA, comment out line 6499. This may also resolve the artifacts that appear when using VAE tiling:
while (nth < ncols_x && nth < CUDA_SOFT_MAX_BLOCK_SIZE) nth *= 2; // comment this line
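To see what that loop changes, here is a small host-side sketch of the thread-count computation. It rests on assumptions the thread does not state: that `nth` starts at the warp size (32) and that `CUDA_SOFT_MAX_BLOCK_SIZE` is 1024. The function name `soft_max_block_threads` and the `grow_block` flag are made up for illustration.

```cpp
#include <cassert>

// Assumed values; the thread does not state them.
constexpr int WARP_SIZE = 32;
constexpr int CUDA_SOFT_MAX_BLOCK_SIZE = 1024;

// Hypothetical helper: returns the number of threads per block that the
// softmax launch would use for a row of ncols_x elements.
// grow_block == false models the workaround of commenting out the loop.
int soft_max_block_threads(int ncols_x, bool grow_block) {
    assert(ncols_x > 0);
    int nth = WARP_SIZE;
    if (grow_block) {
        // Double the block size until it covers the row or reaches the cap.
        while (nth < ncols_x && nth < CUDA_SOFT_MAX_BLOCK_SIZE) nth *= 2;
    }
    return nth;
}
```

Under these assumptions, a row of 77 columns gets 128 threads with the loop enabled and 32 with it commented out, so the workaround keeps every block at a single warp. Whether that is the reason the reported race disappears is not confirmed in this thread.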
@leejet I've been testing SDXL rendering and found some issues:
However, when I use the same metadata in SD.app, I get this instead...
SDXL has two text encoders, and I'm not sure whether SD.cpp handles this.
(NOTE: as a test of deterministic image generation, I also ran SD.cpp with SD1.5.) Here is the SD1.5 example:
And this was reproduced in SD.cpp using the same metadata.
I'm getting the same horrible results while using SD-Turbo and SDXL-Turbo.
@leejet I think the parameters @YAY-3M-TA3 Y set are wrong. He may have set CFG Scale to 7.0.
For the SDXL base model, setting the CFG scale to 7 should be fine. In my example above, the CFG scale is also 7 (the default value).
I'm also seeing this; I created #187.
These are existing problems I have seen. Some have been solved, and some are strange. If you run into one, please be patient, open the issue, and check the comments. If you encounter something similar, please leave a comment below.
Related issues: