fix: Release downloader lock before re-attempting fallback download #938

Merged
merged 2 commits into from
Nov 14, 2024


jpcorreia99
Contributor

Description

In #837, locking logic was added to the download reporter to prevent multiple downloads.

This clashed with the logic added in #797 where, if we detect that we're trying to download a zip that has data descriptors, we interrupt the streaming decompression download and fall back to doing a full download.

The issue comes from us not releasing the lock before attempting the second download, so we hit this line:

"on_download_start was called multiple times"

As a fix, we report the previous download as completed before attempting the second one! Have verified in my custom setup that this works.
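
A minimal sketch of the failure mode and the fix, using only the standard library. `LockingReporter`, `download_with_fallback`, and the `streaming_hits_data_descriptor` flag are hypothetical stand-ins for illustration, not rattler's actual types; the real reporter forwards to an inner reporter and is wired into the download code differently.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the reporter wrapper: it remembers the index of
/// the in-flight download and refuses a second start while one is held.
pub struct LockingReporter {
    index: Mutex<Option<usize>>,
    next_index: Mutex<usize>,
    /// Recorded events, kept only so the call order can be inspected.
    pub events: Mutex<Vec<String>>,
}

impl LockingReporter {
    pub fn new() -> Self {
        Self {
            index: Mutex::new(None),
            next_index: Mutex::new(0),
            events: Mutex::new(Vec::new()),
        }
    }

    pub fn on_download_start(&self) {
        let mut slot = self.index.lock().unwrap();
        // This is the check the fallback path used to trip over.
        assert!(slot.is_none(), "on_download_start was called multiple times");
        let mut next = self.next_index.lock().unwrap();
        *slot = Some(*next);
        self.events.lock().unwrap().push(format!("start {}", *next));
        *next += 1;
    }

    pub fn on_download_complete(&self) {
        // Taking the index out of the slot is what "releases the lock".
        let index = self
            .index
            .lock()
            .unwrap()
            .take()
            .expect("on_download_start was not called");
        self.events.lock().unwrap().push(format!("complete {}", index));
    }
}

/// Hypothetical download flow: the flag simulates the streaming attempt
/// running into zip data descriptors.
pub fn download_with_fallback(reporter: &LockingReporter, streaming_hits_data_descriptor: bool) {
    reporter.on_download_start();
    if streaming_hits_data_descriptor {
        // The fix: report the first attempt as completed so the slot is
        // free again, then start the fallback (full) download.
        reporter.on_download_complete();
        reporter.on_download_start();
    }
    reporter.on_download_complete();
}
```

Removing the `on_download_complete()` call inside the `if` branch of this sketch reproduces the "on_download_start was called multiple times" failure.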

@jpcorreia99 jpcorreia99 changed the title Release downloader lock before re-attempting fallback download fix: Release downloader lock before re-attempting fallback download Nov 13, 2024
@jpcorreia99
Contributor Author

@baszalmstra @wolfv @nichmor Hey folks! After updating rattler I saw our setup where the conda packages have zip descriptors start to fail downloads. Put up this quick fix for it :)

Collaborator

@baszalmstra baszalmstra left a comment

Do you think it makes sense to add an additional function to the reporter or an argument to download complete to signal this case? I can imagine this can be important for reporters.

@jpcorreia99
Contributor Author

@baszalmstra what would we want to do with this extra method/argument? We already have the

tracing::warn!("Failed to stream decompress conda package from '{}' due to the presence of zip data descriptors. Falling back to non streaming decompression", url);

message which is very visible.

@baszalmstra
Collaborator

Sure, that's for users, but I can imagine that a Reporter doesn't expect a download to be completed twice. If we add an additional parameter to the function, we can tell the reporter that a download failed, and it could then handle the completion of the second one properly.
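
One way the suggested extra parameter could look, as a rough sketch only. `DownloadOutcome`, `DownloadReporter`, and `CountingReporter` are hypothetical names invented for this illustration; nothing here is part of rattler's reporter API.

```rust
/// Hypothetical outcome passed along with the completion callback.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DownloadOutcome {
    /// The download finished normally.
    Succeeded,
    /// The attempt was abandoned and a fallback download will follow.
    AbandonedForFallback,
}

/// Hypothetical reporter trait with the extra argument on completion.
pub trait DownloadReporter {
    fn on_download_start(&mut self);
    fn on_download_complete(&mut self, outcome: DownloadOutcome);
}

/// Toy reporter that keeps abandoned attempts apart from real completions.
#[derive(Default)]
pub struct CountingReporter {
    pub started: usize,
    pub succeeded: usize,
    pub abandoned: usize,
}

impl DownloadReporter for CountingReporter {
    fn on_download_start(&mut self) {
        self.started += 1;
    }

    fn on_download_complete(&mut self, outcome: DownloadOutcome) {
        match outcome {
            DownloadOutcome::Succeeded => self.succeeded += 1,
            DownloadOutcome::AbandonedForFallback => self.abandoned += 1,
        }
    }
}
```

With a signature like this, a reporter could skip counting the abandoned attempt as a finished package while still freeing whatever state it holds for it.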

@jpcorreia99
Contributor Author

@baszalmstra In this flow, the sequence reporter.on_download_start() and on_download_complete() is called twice, sequentially (so there aren't two completions for a single start). So, logically, one download starts and completes, and then another one starts and completes. Does that still match your concern?

Currently, when calling on_download_complete, the reporter releases the lock and calls on_download_complete for an inner reporter:

    fn on_download_complete(&self) {
        let index = self
            .index
            .lock()
            .take()
            .expect("on_download_start was not called");
        self.reporter.on_download_completed(index);
    }
}

This inner reporter seems to do something for the visual progress bar:

fn on_download_completed(&self, cache_entry: usize) {
    let mut inner = self.inner.lock();
    inner.end_downloading = Some(Instant::now());
    inner.packages_downloading.remove(&cache_entry);
    inner.packages_downloaded.insert(cache_entry);
    if inner.packages_downloading.is_empty() {
        inner
            .download_progress
            .as_ref()
            .expect("progress bar not set")
            .set_style(inner.style(ProgressStyleProperties {
                status: ProgressStatus::Paused,
                determinate: true,
                progress_type: ProgressType::Bytes,
                track: ProgressTrack::DownloadAndExtract,
            }));
    }
    inner.update_download_message();
}
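
The bookkeeping in on_download_completed can be reduced to a small standalone model to see what a start/complete/start/complete sequence does to it. This is a simplified sketch: `ProgressState` and its fields are hypothetical, the progress bar is replaced by a `paused` flag, and it assumes each attempt is tracked under its own index, which the snippets in this thread do not confirm.

```rust
use std::collections::HashSet;

/// Hypothetical, stripped-down model of the inner reporter's state.
#[derive(Default)]
pub struct ProgressState {
    pub downloading: HashSet<usize>,
    pub downloaded: HashSet<usize>,
    /// Stands in for switching the progress bar to its paused style.
    pub paused: bool,
}

impl ProgressState {
    /// Mark an entry as in flight.
    pub fn start(&mut self, entry: usize) {
        self.downloading.insert(entry);
        self.paused = false;
    }

    /// Move an entry from "downloading" to "downloaded" and pause the bar
    /// once nothing is in flight anymore.
    pub fn complete(&mut self, entry: usize) {
        self.downloading.remove(&entry);
        self.downloaded.insert(entry);
        self.paused = self.downloading.is_empty();
    }
}
```

In this model, calling start(0), complete(0), start(1), complete(1) leaves nothing in flight and the bar paused, with both attempts recorded as downloaded.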

I understand the proposal of trying to change from

  • on_download_start
  • on_download_complete
  • on_download_start
  • on_download_complete

to

  • start
  • failure
  • finish

However, it would be pretty hard (and could mess up existing logic) to get rid of the second on_download_start, because we must call it to get a new reader to start the download here:

https://github.com/conda/rattler/b

Also, visually, even with this existing logic it all looked completely fine to me. I didn't notice any strangeness.

@baszalmstra
Collaborator

Ah, I didn't understand correctly then! It was my understanding that download start was only called once! But in that case it should be fine.

@baszalmstra baszalmstra merged commit 9a5d91d into conda:main Nov 14, 2024
15 checks passed
@baszalmstra baszalmstra mentioned this pull request Nov 11, 2024