Small buffer in io.Copy() causes extremely slow S3 media download #516

Closed
PeterCxy opened this issue Nov 29, 2023 · 8 comments · Fixed by #519
Labels
bug, download, performance, s3 (Issues that are S3-specific)

Comments

@PeterCxy

For a while now, I have been experiencing extremely poor download performance from my matrix-media-repo deployment. Downloads are so slow that most larger media files are barely viewable at all in most Matrix clients. A typical download looks like this:

HTTP request sent, awaiting response... 200 OK
Length: 11447735 (11M) [video/mp4]
Saving to: ‘da853e40b7955c825b6b770e01b335d6cdcfeb241729811806945280000.18’

      da853e40b7955c825b6b770e01b3   2%[>                                                                ] 303.14K  32.1KB/s    eta 5m 35s

Originally I blamed my own ad-hoc S3 cluster based on SeaweedFS, but after finally taking a closer look, SeaweedFS itself turns out to be performing just fine. If I download the same file directly from the S3 backend, on the server running matrix-media-repo itself, I get:

Length: 11447735 (11M) [video/mp4]
Saving to: ‘da853e40b7955c825b6b770e01b335d6cdcfeb241729811806945280000.10’

da853e40b7955c825b6b770e01b335d6cdc 100%[================================================================>]  10.92M  4.69MB/s    in 2.3s

After some debugging, I eventually concluded that the problem seems to be io.Copy(). Simply increasing the buffer size significantly improved the download speed. It looks like every Read or Write call incurs some kind of high-latency overhead, which produces this behavior. Note that there is some access latency between my S3 server and the matrix-media-repo server, but I wouldn't expect that latency to matter once the TCP stream has started.

The patch I used to temporarily "fix" the issue is attached below.

diff --git a/api/_routers/98-use-rcontext.go b/api/_routers/98-use-rcontext.go
index 6a3b8e9..344a9d0 100644
--- a/api/_routers/98-use-rcontext.go
+++ b/api/_routers/98-use-rcontext.go
@@ -266,7 +266,11 @@ beforeParseDownload:
        r = writeStatusCode(w, r, proposedStatusCode)

        defer stream.Close()
-       written, err := io.Copy(w, stream)
+       // Force 1MB buffer for io.CopyBuffer
+       buf := make([]byte, 1024 * 1024)
+       // Hide 'ReaderFrom' type so that io.CopyBuffer actually uses the buffer
+       _w := io.MultiWriter(w)
+       written, err := io.CopyBuffer(_w, stream, buf)
        if err != nil {
                panic(err) // blow up this request
        }
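
For context: io.Copy and io.CopyBuffer bypass the supplied buffer entirely whenever the source implements io.WriterTo or the destination implements io.ReaderFrom (which the standard net/http response writer does), delegating the copy to that method instead. That is why the io.MultiWriter wrapper above is needed before the 1MB buffer has any effect. A minimal standalone sketch of the difference (illustration only, not MMR code):

package main

import (
    "fmt"
    "io"
    "strings"
)

// chunkCounter counts how many Read calls the copy performs on its source.
type chunkCounter struct {
    r     io.Reader
    reads int
}

func (c *chunkCounter) Read(p []byte) (int, error) {
    c.reads++
    return c.r.Read(p)
}

func main() {
    buf := make([]byte, 1024*1024)

    // Destination wrapped in io.MultiWriter: no ReaderFrom, so the 1MB buffer
    // is used and a 4MB body takes only a handful of Read calls.
    src := &chunkCounter{r: strings.NewReader(strings.Repeat("x", 4<<20))}
    n, _ := io.CopyBuffer(io.MultiWriter(io.Discard), src, buf)
    fmt.Println(n, "bytes in", src.reads, "reads")

    // Destination used directly: io.Discard implements io.ReaderFrom, so the
    // buffer is ignored and the copy is driven by much smaller internal reads.
    src = &chunkCounter{r: strings.NewReader(strings.Repeat("x", 4<<20))}
    n, _ = io.CopyBuffer(io.Discard, src, buf)
    fmt.Println(n, "bytes in", src.reads, "reads")
}

When each Read on the source is expensive, the number of Read calls ends up dominating the transfer time.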
@turt2live added the bug, s3 (Issues that are S3-specific), performance, and download labels on Nov 29, 2023
@PeterCxy
Author

Even weirder, I cannot reproduce this issue with a simple Go program that just io.Copy()s an HTTP stream from S3 to an HTTP client:

package main

import (
    "io"
    "net/http"
)

// get proxies the (redacted) S3 object to the client via a plain io.Copy.
func get(w http.ResponseWriter, r *http.Request) {
    resp, err := http.Get("<redacted>")
    if err != nil {
        http.Error(w, err.Error(), http.StatusBadGateway)
        return
    }
    defer resp.Body.Close()
    w.Header().Set("Content-Type", "application/octet-stream")
    w.Header().Set("Content-Length", "11447735")
    io.Copy(w, resp.Body)
}

func main() {
    http.HandleFunc("/", get)
    http.ListenAndServe("127.0.0.1:1234", nil)
}

@PeterCxy
Author

After some more digging, this might actually be a minio-go bug: minio-go appears to send a completely new HTTP request every time Read() is called on its objects (see https://github.com/minio/minio-go/blob/master/api-get-object.go#L196), incurring an overhead of at least a few round trips per call.

It might take some refactoring upstream to avoid this behavior, but in the meantime maybe downstream could expose a larger / configurable buffer size as a workaround?
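
As a rough sketch of what such a workaround could look like (hypothetical helper, not an existing MMR option; the buffer size would come from configuration):

package workaround

import "io"

// copyWithBuffer copies stream to w using a caller-configurable buffer size.
// Wrapping w in io.MultiWriter hides any io.ReaderFrom implementation so that
// io.CopyBuffer actually uses the supplied buffer.
func copyWithBuffer(w io.Writer, stream io.Reader, bufferSizeBytes int) (int64, error) {
    if bufferSizeBytes <= 0 {
        bufferSizeBytes = 1024 * 1024 // fall back to 1MB
    }
    buf := make([]byte, bufferSizeBytes)
    return io.CopyBuffer(io.MultiWriter(w), stream, buf)
}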

@PeterCxy changed the title from "io.Copy() causes extremely slow S3 media download" to "Small buffer in io.Copy() causes extremely slow S3 media download" on Nov 30, 2023
@szclsya

szclsya commented Dec 1, 2023

I have had a similar experience with matrix-media-repo on master and MinIO RELEASE.2023-11-20T22-40-07Z.

On master, downloading a relatively large image (1.7 MB) was estimated to take around 40 s:

[REDACTED] 16%[========>                                               ] 295.19K  43.3KB/s    eta 37s

After applying the provided patch, it takes less than a second:

[REDACTED] 100%[=======================================================] 1.74M  1.70MB/s    in 1.0s

@turt2live
Member

It would be good to get a minimal test case set up using the minio library to demonstrate the issue. If that also has problems, then this might be best started upstream before engaging the relatively slower release process on MMR's end.
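
A minimal test case along those lines might look something like the sketch below (endpoint, credentials, bucket, and object name are placeholders); if this is slow while a plain http.Get of the same object is fast, the problem is somewhere in the minio-go read path:

package main

import (
    "context"
    "io"
    "log"
    "net/http"

    "github.com/minio/minio-go/v7"
    "github.com/minio/minio-go/v7/pkg/credentials"
)

func main() {
    // Placeholder endpoint and credentials for the S3 backend under test.
    client, err := minio.New("s3.example.com", &minio.Options{
        Creds:  credentials.NewStaticV4("ACCESS_KEY", "SECRET_KEY", ""),
        Secure: true,
    })
    if err != nil {
        log.Fatal(err)
    }

    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        obj, err := client.GetObject(context.Background(), "media", "test-object", minio.GetObjectOptions{})
        if err != nil {
            http.Error(w, err.Error(), http.StatusInternalServerError)
            return
        }
        defer obj.Close()
        w.Header().Set("Content-Type", "application/octet-stream")
        // Same copy call MMR uses, but with nothing else in between.
        if _, err := io.Copy(w, obj); err != nil {
            log.Println("copy failed:", err)
        }
    })

    log.Fatal(http.ListenAndServe("127.0.0.1:1234", nil))
}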

@PeterCxy
Author

PeterCxy commented Dec 1, 2023

I am currently looking into minio-go, and it looks like Read() calls do not necessarily trigger a new request every time; the behavior depends on a set of flags on the GetObject context. It might be worth finding out what exactly in MMR causes this behavior, as it does not seem to be a common issue among minio-go users (or at least not bad enough to have been reported yet).
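
One way to narrow that down (a debugging sketch, not MMR code): wrap whatever stream ends up in io.Copy with a reader that logs the size and latency of each Read, to see whether every Read is paying round-trip-sized overhead:

package debugstream

import (
    "io"
    "log"
    "time"
)

// timedReader logs how long each Read on the wrapped reader takes.
type timedReader struct {
    r io.Reader
}

func (t *timedReader) Read(p []byte) (int, error) {
    start := time.Now()
    n, err := t.r.Read(p)
    log.Printf("Read returned %d bytes after %s (err=%v)", n, time.Since(start), err)
    return n, err
}

// Usage: io.Copy(w, &timedReader{r: stream})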

@turt2live
Member

@PeterCxy
Author

PeterCxy commented Dec 1, 2023

@turt2live The culprit seems to be go-singleflight-streams:

https://github.com/t2bot/go-singleflight-streams/blob/main/seeker.go#L70

It calls Seek() before every Read(), which invalidates the HTTP connection held by minio-go:

https://github.com/minio/minio-go/blob/master/api-get-object.go#L591

Avoiding the Seek() call, if possible, would fix this issue.
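
The idea behind avoiding the redundant Seek(), roughly (a simplified illustration, not the actual go-singleflight-streams code): record the requested offset and only seek the underlying stream when it actually differs from where the last Read left off:

package sketch

import (
    "fmt"
    "io"
)

// lazySeeker defers Seek calls on the wrapped stream until a Read actually
// needs a different offset, so purely sequential reads never trigger the
// connection-invalidating Seek in minio-go.
type lazySeeker struct {
    src     io.ReadSeeker
    want    int64 // offset requested by callers
    current int64 // offset the underlying stream is actually at
}

func (s *lazySeeker) Seek(offset int64, whence int) (int64, error) {
    switch whence {
    case io.SeekStart:
        s.want = offset
    case io.SeekCurrent:
        s.want += offset
    default:
        return 0, fmt.Errorf("whence %d not supported in this sketch", whence)
    }
    return s.want, nil
}

func (s *lazySeeker) Read(p []byte) (int, error) {
    if s.want != s.current {
        // Only now does the underlying stream (e.g. a minio-go object) have
        // to reposition, potentially dropping its open HTTP connection.
        if _, err := s.src.Seek(s.want, io.SeekStart); err != nil {
            return 0, err
        }
        s.current = s.want
    }
    n, err := s.src.Read(p)
    s.current += int64(n)
    s.want = s.current
    return n, err
}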

@PeterCxy
Author

PeterCxy commented Dec 1, 2023

Upstream fix minio/minio-go#1908 has been merged. Simply upgrading minio-go once the next release is tagged should resolve this issue.

turt2live added a commit that referenced this issue Dec 10, 2023
turt2live added a commit that referenced this issue Dec 10, 2023
* Update compiled libheif version

* Update dependencies

* Update minio to fix stream performance

Fixes #516

* Update Actions dependencies

* Update Docker dependencies

* Pin Alpine version throughout