Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run pre/post processing in threadpool #7327

Merged
merged 4 commits into from
Feb 15, 2024

Conversation

freddyaboulton
Copy link
Collaborator

@freddyaboulton freddyaboulton commented Feb 6, 2024

Description

Closes: #7241

In some cases, the component pre/postprocessing takes a long time (like big galleries or chatbots) and it can block the event loop. We can run the processing in the thread pool to avoid the block. Tested this with the following demo and running two concurrent users:

import gradio as gr
import time

def get(image):
    output = [image] * 30
    return output

gr.Interface(get, gr.Image(type="pil"), gr.Gallery(), concurrency_limit=2).launch()

main - notice how the demo on the right will freeze until the first one finishes

postprocess_main_compressed.mov

pr - both finish at the same time

postprocess_pr_compressed.mov

🎯 PRs Should Target Issues

Before your create a PR, please check to see if there is an existing issue for this change. If not, please create an issue before you create this PR, unless the fix is very small.

Not adhering to this guideline will result in the PR being closed.

Tests

  1. PRs will only be merged if tests pass on CI. To run the tests locally, please set up your Gradio environment locally and run the tests: bash scripts/run_all_tests.sh

  2. You may need to run the linters: bash scripts/format_backend.sh and bash scripts/format_frontend.sh

@gradio-pr-bot
Copy link
Contributor

gradio-pr-bot commented Feb 6, 2024

🪼 branch checks and previews

Name Status URL
Spaces ready! Spaces preview
Website ready! Website preview
🦄 Changes detecting...

Install Gradio from this PR

pip install https://gradio-builds.s3.amazonaws.com/8d851a6c2fea26aa6e08012d06c79abf817712f4/gradio-4.19.0-py3-none-any.whl

Install Gradio Python Client from this PR

pip install "gradio-client @ git+https://github.com/gradio-app/gradio@8d851a6c2fea26aa6e08012d06c79abf817712f4#subdirectory=client/python"

@gradio-pr-bot
Copy link
Contributor

gradio-pr-bot commented Feb 6, 2024

🦄 change detected

This Pull Request includes changes to the following packages.

Package Version
gradio patch
  • Maintainers can select this checkbox to manually select packages to update.

With the following changelog entry.

Run pre/post processing in threadpool

Maintainers or the PR author can modify the PR title to modify this entry.

Something isn't right?

  • Maintainers can change the version label to modify the version bump.
  • If the bot has failed to detect any changes, or if this pull request needs to update multiple packages to different versions or requires a more comprehensive changelog entry, maintainers can update the changelog file directly.

@freddyaboulton freddyaboulton changed the title WIP: Run pre/post processing in threadpool Run pre/post processing in threadpool Feb 6, 2024
@freddyaboulton freddyaboulton marked this pull request as ready for review February 7, 2024 13:39
@abidlabs
Copy link
Member

abidlabs commented Feb 7, 2024

The real issue is that Gallery takes criminally long to postprocess 30 images, which is really not that many. I can't reproduce the 15-second wait, but it does take several seconds on my local computer. Can we do something to improve that?

Otherwise, these changes look good. I tested with gradio/lite as well because IIRC threads don't play too nicely with pyodide but there wasn't any issue running the demos that I tried. So LGTM

@whitphx
Copy link
Member

whitphx commented Feb 8, 2024

Thank you @abidlabs , that's so helpful you tested this with Lite.
anyio.run_sync is mocked in Lite as below to bypass threading so theoretically it shouldn't have any good and bad effect on Lite. So it should be ok as @abidlabs 's test worked well :D

console.debug("Mocking async libraries.");
updateProgress("Mocking async libraries");
// FastAPI uses `anyio.to_thread.run_sync` internally which, however, doesn't work in Wasm environments where the `threading` module is not supported.
// So we mock `anyio.to_thread.run_sync` here not to use threads.
await pyodide.runPythonAsync(`
async def mocked_anyio_to_thread_run_sync(func, *args, cancellable=False, limiter=None):
return func(*args)
import anyio.to_thread
anyio.to_thread.run_sync = mocked_anyio_to_thread_run_sync
`);
console.debug("Async libraries are mocked.");

@freddyaboulton
Copy link
Collaborator Author

freddyaboulton commented Feb 9, 2024

Can we do something to improve that?

I think the only thing we could do is maybe saving them in parallel. I just profiled running gallery postprocess on 30 images and half the time is spent converting to bytes (for the hash) and the other half is calling PIL.save (go by the length of the bar not the time under the function name). Let me see how easy it is to save them in parallel.

image

@freddyaboulton freddyaboulton force-pushed the 724i-run-processing-in-thread-pool branch from 069f225 to 7b11ab3 Compare February 15, 2024 00:07
@freddyaboulton
Copy link
Collaborator Author

@abidlabs I made two speed-ups to the gallery postprocess - 1) only doing one call to PIL.Image.save instead of two and 2) running the images concurrently with a ThreadPoolExecutor.

I stil lhink we should be running pre/postprocess in the threadpool because if for whatever reason we add a new component/custom component whose pre/postprocess takes more than a second the whole page will freeze which is not ideal.

This should be good for another look.

Copy link
Member

@abidlabs abidlabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Beautiful speedup @freddyaboulton!

@freddyaboulton freddyaboulton merged commit fb1f6be into main Feb 15, 2024
7 checks passed
@freddyaboulton freddyaboulton deleted the 724i-run-processing-in-thread-pool branch February 15, 2024 00:51
@freddyaboulton
Copy link
Collaborator Author

Thanks for the review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Gallery Postprocessing Blocks the Demo
4 participants