The library sometimes cannot send files over 800MB with local bot server. [BUG] #4339
Comments
Hi, as asked in the group, please enable debug logging and share the output of a failed send. |
I set the logging level to DEBUG. Like I said, there is no error or exception. In Additional Context, I included a function I used to upload the file successfully. The local server log says it did not receive any data, or it is receiving bits of data instead of megabytes of data. I will include the local server log in a few minutes. |
The log you provided is not from PTB when it sends a file... |
I will check and update the issue. |
You are correct, the log is not from PTB. Within 2 hours (I am away from my PC) I will include the correct log, I will also include the log from local server. |
I have updated the issue with an authentic log from PTB. |
Hi. I see that you made multiple edits to your original issue description. TBH it's rather hard to figure out which log is now generated from where and what code you're using. Please
|
Hello Joshi. Here is the MWE. I have also included the bot log and the local bot server log. In the local bot server log, you will notice that the server is having trouble receiving data from the client. I wanted to include the log generated by the server when I used the custom method 'send_document_raw' for comparison, but I can't, as the file size rose to 150 MB. It would take some time to download it to my device before I can upload it to a gist. 'bot.send_document' did not work; I can upload files of up to 600 MB without the program abruptly stopping. I cannot provide the document I was trying to upload, as it contains sensitive accounting data (we are trying to take advantage of Telegram's unlimited storage as temporary storage). Nonetheless, I did try uploading HD movies I obtained from the internet, which did not work with
I have also included the binary for the bot server; it is compiled on Ubuntu 24.04 (LTS) x64. Thank you! |
Thanks for the updates and elaborated information!
TBH I'm a bit at a loss here, not having worked much with the local API server. IIUC, the relevant section is
but I'm not sure if I understand it correctly. I guess "Close connection while reading request/response" means that the server aborts the connection, but I don't get where you see that the server has trouble reading data from the client … Not sure if seeing functioning logs will help me here, but at least for completeness it would interest me 😅
On PTB side, the relevant networking logs apparently are of the form
2024-07-12 21:40:06,751 - httpcore.http11 - DEBUG - send_request_body.started request=<Request [b'POST']>
2024-07-12 21:40:06,751 - httpcore.http11 - DEBUG - send_request_body.complete
where the second line is missing in your case, i.e. the request never completes.
To double check: You've specified a generous timeout of 1800 seconds = 30mins. How long does your
PTB is using the httpx
mwe

import asyncio
import logging
from pathlib import Path
from typing import Union

import httpx
from telegram import Bot

logging.basicConfig(
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.DEBUG
)


async def send_with_ptb(
    token: str,
    base_url: str,
    chat_id: Union[str, int],
    filename: str,
    file_path: Path,
    kwargs: dict[str, Union[str, bool]],
    timeout: int,
):
    # Upload through PTB; the file is read into memory before sending.
    async with Bot(
        token=token,
        base_url=base_url,
    ) as bot:
        await bot.send_document(
            chat_id=chat_id,
            document=file_path.read_bytes(),
            filename=filename,
            read_timeout=timeout,
            write_timeout=timeout,
            connect_timeout=timeout,
            **kwargs
        )


async def send_with_httpx(
    token: str,
    base_url: str,
    chat_id: Union[str, int],
    filename: str,
    file_path: Path,
    kwargs: dict[str, Union[str, bool]],
    timeout: int,
):
    # Same upload with plain httpx, bypassing PTB, for comparison.
    async with httpx.AsyncClient() as client:
        response = await client.post(
            url=f"{base_url}{token}/sendDocument",
            data={"chat_id": chat_id, **kwargs},
            files={"document": (filename, file_path.read_bytes(), "text/plain")},
            timeout=timeout,
        )
        response.raise_for_status()


async def main():
    token = "token"
    chat_id = 123456
    file_path = Path("tests/data/telegram.png")
    filename = "test_file_name.png"
    kwargs = {
        "caption": "caption",
        "disable_content_type_detection": True,
        "disable_notification": True,
    }
    base_url = "https://api.telegram.org/bot"
    timeout = 1800

    for callback, description in (
        (send_with_ptb, "send_with_ptb"),
        (send_with_httpx, "send_with_httpx"),
    ):
        try:
            await callback(
                timeout=timeout,
                base_url=base_url,
                token=token,
                chat_id=chat_id,
                filename=filename,
                file_path=file_path,
                kwargs=kwargs,
            )
        except Exception as exc:
            print(f"{description} failed: {exc}")


if __name__ == "__main__":
    asyncio.run(main())

One last thing that I noticed is that you set the mime type as
|
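The started/complete pairing described above can also be checked mechanically on a long debug log. The following is only a sketch: the helper name `unfinished_events` is made up for illustration, and it assumes the log uses the `%(asctime)s - %(name)s - %(levelname)s - %(message)s` format from the MWE, with httpcore events ending in `.started`, `.complete` or `.failed`.

```python
import re

# Matches an httpcore trace event such as "send_request_body.started"
# right after the " - DEBUG - " part of the log format used in the MWE.
EVENT_RE = re.compile(
    r" - DEBUG - (?P<event>[a-z_]+)\.(?P<phase>started|complete|failed)\b"
)


def unfinished_events(log_text: str) -> list[str]:
    """Return events that logged `.started` but never `.complete`/`.failed`."""
    open_events: list[str] = []
    for line in log_text.splitlines():
        match = EVENT_RE.search(line)
        if match is None:
            continue  # not an httpcore trace line
        event, phase = match["event"], match["phase"]
        if phase == "started":
            open_events.append(event)
        elif event in open_events:
            # Close the most recent matching `.started` entry.
            last = len(open_events) - 1 - open_events[::-1].index(event)
            del open_events[last]
    return open_events
```

Called on the text of a failed run, a result such as `["send_request_body"]` would indicate that the request body was never fully sent.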
Yes, it is the relevant part.
Sorry, the client is the one having issues sending the file. I don't know much about networking, but I did come to the conclusion that the client is the one having issues; that was why I wrote
Below is what the log should look like. I can't upload the full log, as the server generates a big log per file upload. This is just 4 MB of what is generated when I use
Yes, that was when the program stops.
The server is hosted on a DigitalOcean droplet, so I do have a very good upload and download speed.
When I was stuck, I suspected that PTB or httpx was the issue, so I tried using
I will; please bear with me, as it could take a few hours or a day.
Yes, it does work with
To be clear, I can upload files smaller than 700 MB, or somewhere between 0 MB and 756 MB; I don't actually know where it stops. I do know that I was able to upload a 756 MB file but failed for an 800 MB one. Please don't mind my grammar, I hope you understand me. |
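Since the original document cannot be shared, the threshold between the 756 MB success and the 800 MB failure could be narrowed down with synthetic files. This is a minimal sketch under that assumption; `make_test_file` and `candidate_sizes` are hypothetical helpers, not part of PTB or httpx, and random content may not behave exactly like the real `.mkv` files.

```python
import os
from pathlib import Path


def make_test_file(path: Path, size_bytes: int, chunk_size: int = 1024 * 1024) -> Path:
    """Write `size_bytes` of random data to `path`, one chunk at a time."""
    remaining = size_bytes
    with path.open("wb") as handle:
        while remaining > 0:
            count = min(chunk_size, remaining)
            handle.write(os.urandom(count))
            remaining -= count
    return path


def candidate_sizes(low_mb: int, high_mb: int, steps: int) -> list[int]:
    """Evenly spaced sizes (in MB) strictly between a known-good and a known-bad size."""
    span = high_mb - low_mb
    return [low_mb + span * i // (steps + 1) for i in range(1, steps + 1)]
```

For example, `candidate_sizes(756, 800, 3)` gives three intermediate sizes, and each one could be written with `make_test_file(Path("probe.bin"), size_mb * 1024 * 1024)` before running the MWE against it.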
not to worry
👍
Sorry, the client is the one having issues sending the file. I don't know much about networking, but I did come to the conclusion that the client is the one having issues; that was why I wrote send_document_raw. I got it mixed up when writing the comment.
👍
Thanks also for the pointers to the SO question, which also led me to psf/requests#1584. Again, I'm not sure if I understand the issue with the
TL;DR: I'm interested to see if passing a file handle to httpx works better. I extended the MWE for that:
MWE extended for file handle

import asyncio
import logging
from pathlib import Path
from typing import Union

import httpx
from telegram import Bot

logging.basicConfig(
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.DEBUG
)


async def send_with_ptb(
    token: str,
    base_url: str,
    chat_id: Union[str, int],
    filename: str,
    file_path: Path,
    kwargs: dict[str, Union[str, bool]],
    timeout: int,
):
    async with Bot(
        token=token,
        base_url=base_url,
    ) as bot:
        await bot.send_document(
            chat_id=chat_id,
            document=file_path.read_bytes(),
            filename=filename,
            read_timeout=timeout,
            write_timeout=timeout,
            connect_timeout=timeout,
            **kwargs,
        )


async def send_with_httpx_postponed_loading(
    token: str,
    base_url: str,
    chat_id: Union[str, int],
    filename: str,
    file_path: Path,
    kwargs: dict[str, Union[str, bool]],
    timeout: int,
):
    # Pass an open file handle, so httpx reads the data while sending.
    async with httpx.AsyncClient() as client:
        response = await client.post(
            url=f"{base_url}{token}/sendDocument",
            data={"chat_id": chat_id, **kwargs},
            files={"document": (filename, file_path.open("rb"), "text/plain")},
            timeout=timeout,
        )
        response.raise_for_status()


async def send_with_httpx_preponed_loading(
    token: str,
    base_url: str,
    chat_id: Union[str, int],
    filename: str,
    file_path: Path,
    kwargs: dict[str, Union[str, bool]],
    timeout: int,
):
    # Read the whole file into memory before handing it to httpx.
    async with httpx.AsyncClient() as client:
        response = await client.post(
            url=f"{base_url}{token}/sendDocument",
            data={"chat_id": chat_id, **kwargs},
            files={"document": (filename, file_path.read_bytes(), "text/plain")},
            timeout=timeout,
        )
        response.raise_for_status()


async def main():
    token = "token"
    chat_id = 123456
    file_path = Path("tests/data/telegram.mp4")
    filename = "test_file_name.png"
    kwargs = {
        "caption": "caption",
        "disable_content_type_detection": True,
        "disable_notification": True,
    }
    base_url = "https://api.telegram.org/bot"
    timeout = 1800

    for callback in (
        send_with_httpx_postponed_loading,
        send_with_httpx_preponed_loading,
        send_with_ptb,
    ):
        try:
            await callback(
                timeout=timeout,
                base_url=base_url,
                token=token,
                chat_id=chat_id,
                filename=filename,
                file_path=file_path,
                kwargs=kwargs,
            )
        except Exception as exc:
            print(f"{callback.__name__} failed: {exc}")


if __name__ == "__main__":
    asyncio.run(main())

|
I ran the MWE; the program stops after uploading the file once. So I added an info log to tell which function is causing the program to stop. The log indicated that the program stops after
see log
2024-07-13 16:23:14,874 - asyncio - DEBUG - Using selector: EpollSelector
2024-07-13 16:23:14,876 - httpx - DEBUG - load_ssl_context verify=True cert=None trust_env=True http2=False
2024-07-13 16:23:14,878 - httpx - DEBUG - load_verify_locations cafile='/home/codespace/.cache/pypoetry/virtualenvs/[redacted]/lib/python3.10/site-packages/certifi/cacert.pem'
2024-07-13 16:23:14,908 - root - INFO - send_with_httpx_postponed_loading started
2024-07-13 16:23:14,925 - httpcore.connection - DEBUG - connect_tcp.started host='[redacted]' port=8080 local_address=None timeout=1800 socket_options=None
2024-07-13 16:23:14,943 - httpcore.connection - DEBUG - connect_tcp.complete return_value=<httpcore._backends.anyio.AnyIOStream object at 0x7111f2baf7f0>
2024-07-13 16:23:14,943 - httpcore.http11 - DEBUG - send_request_headers.started request=<Request [b'POST']>
2024-07-13 16:23:14,944 - httpcore.http11 - DEBUG - send_request_headers.complete
2024-07-13 16:23:14,944 - httpcore.http11 - DEBUG - send_request_body.started request=<Request [b'POST']>
2024-07-13 16:23:27,769 - httpcore.http11 - DEBUG - send_request_body.complete
2024-07-13 16:23:27,769 - httpcore.http11 - DEBUG - receive_response_headers.started request=<Request [b'POST']>
2024-07-13 16:25:07,830 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Connection', b'keep-alive'), (b'Content-Type', b'application/json'), (b'Content-Length', b'499')])
2024-07-13 16:25:07,831 - httpx - INFO - HTTP Request: POST http://[redacted]:8080/[redacted]/sendDocument "HTTP/1.1 200 OK"
2024-07-13 16:25:07,831 - httpcore.http11 - DEBUG - receive_response_body.started request=<Request [b'POST']>
2024-07-13 16:25:07,831 - httpcore.http11 - DEBUG - receive_response_body.complete
2024-07-13 16:25:07,831 - httpcore.http11 - DEBUG - response_closed.started
2024-07-13 16:25:07,832 - httpcore.http11 - DEBUG - response_closed.complete
2024-07-13 16:25:07,832 - root - INFO - send_with_httpx_postponed_loading took 112.92 seconds
2024-07-13 16:25:07,832 - httpcore.connection - DEBUG - close.started
2024-07-13 16:25:07,832 - httpcore.connection - DEBUG - close.complete
2024-07-13 16:25:07,833 - httpx - DEBUG - load_ssl_context verify=True cert=None trust_env=True http2=False
2024-07-13 16:25:07,833 - httpx - DEBUG - load_verify_locations cafile='/home/codespace/.cache/pypoetry/virtualenvs/[redacted]/lib/python3.10/site-packages/certifi/cacert.pem'
2024-07-13 16:25:07,838 - root - INFO - send_with_httpx_preponed_loading started
2024-07-13 16:25:08,569 - httpcore.connection - DEBUG - connect_tcp.started host='[redacted]' port=8080 local_address=None timeout=1800 socket_options=None
2024-07-13 16:25:08,588 - httpcore.connection - DEBUG - connect_tcp.complete return_value=<httpcore._backends.anyio.AnyIOStream object at 0x7111f29fb1f0>
2024-07-13 16:25:08,588 - httpcore.http11 - DEBUG - send_request_headers.started request=<Request [b'POST']>
2024-07-13 16:25:08,589 - httpcore.http11 - DEBUG - send_request_headers.complete
2024-07-13 16:25:08,589 - httpcore.http11 - DEBUG - send_request_body.started request=<Request [b'POST']>
so i removed
see log
2024-07-13 16:32:30,326 - asyncio - DEBUG - Using selector: EpollSelector
2024-07-13 16:32:30,326 - httpx - DEBUG - load_ssl_context verify=True cert=None trust_env=True http2=False
2024-07-13 16:32:30,328 - httpx - DEBUG - load_verify_locations cafile='/home/codespace/.cache/pypoetry/virtualenvs/[redacted]/lib/python3.10/site-packages/certifi/cacert.pem'
2024-07-13 16:32:30,333 - root - INFO - send_with_httpx_postponed_loading started
2024-07-13 16:32:30,348 - httpcore.connection - DEBUG - connect_tcp.started host='[redacted]' port=8080 local_address=None timeout=1800 socket_options=None
2024-07-13 16:32:30,366 - httpcore.connection - DEBUG - connect_tcp.complete return_value=<httpcore._backends.anyio.AnyIOStream object at 0x750d28bfb7f0>
2024-07-13 16:32:30,366 - httpcore.http11 - DEBUG - send_request_headers.started request=<Request [b'POST']>
2024-07-13 16:32:30,366 - httpcore.http11 - DEBUG - send_request_headers.complete
2024-07-13 16:32:30,366 - httpcore.http11 - DEBUG - send_request_body.started request=<Request [b'POST']>
2024-07-13 16:32:43,924 - httpcore.http11 - DEBUG - send_request_body.complete
2024-07-13 16:32:43,925 - httpcore.http11 - DEBUG - receive_response_headers.started request=<Request [b'POST']>
2024-07-13 16:34:24,360 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Connection', b'keep-alive'), (b'Content-Type', b'application/json'), (b'Content-Length', b'499')])
2024-07-13 16:34:24,361 - httpx - INFO - HTTP Request: POST http://[redacted]:8080/[redacted]/sendDocument "HTTP/1.1 200 OK"
2024-07-13 16:34:24,362 - httpcore.http11 - DEBUG - receive_response_body.started request=<Request [b'POST']>
2024-07-13 16:34:24,362 - httpcore.http11 - DEBUG - receive_response_body.complete
2024-07-13 16:34:24,362 - httpcore.http11 - DEBUG - response_closed.started
2024-07-13 16:34:24,362 - httpcore.http11 - DEBUG - response_closed.complete
2024-07-13 16:34:24,362 - root - INFO - send_with_httpx_postponed_loading took 114.03 seconds
2024-07-13 16:34:24,362 - httpcore.connection - DEBUG - close.started
2024-07-13 16:34:24,363 - httpcore.connection - DEBUG - close.complete
2024-07-13 16:34:24,363 - httpx - DEBUG - load_ssl_context verify=True cert=None trust_env=True http2=False
2024-07-13 16:34:24,363 - httpx - DEBUG - load_verify_locations cafile='/home/codespace/.cache/pypoetry/virtualenvs/[redacted]/lib/python3.10/site-packages/certifi/cacert.pem'
2024-07-13 16:34:24,368 - httpx - DEBUG - load_ssl_context verify=True cert=None trust_env=True http2=False
2024-07-13 16:34:24,369 - httpx - DEBUG - load_verify_locations cafile='/home/codespace/.cache/pypoetry/virtualenvs/[redacted]/lib/python3.10/site-packages/certifi/cacert.pem'
2024-07-13 16:34:24,374 - telegram.Bot - DEBUG - Calling Bot API endpoint `getMe` with parameters `{}`
2024-07-13 16:34:24,374 - httpcore.connection - DEBUG - connect_tcp.started host='[redacted]' port=8080 local_address=None timeout=5.0 socket_options=None
2024-07-13 16:34:24,392 - httpcore.connection - DEBUG - connect_tcp.complete return_value=<httpcore._backends.anyio.AnyIOStream object at 0x750d28a475b0>
2024-07-13 16:34:24,393 - httpcore.http11 - DEBUG - send_request_headers.started request=<Request [b'POST']>
2024-07-13 16:34:24,393 - httpcore.http11 - DEBUG - send_request_headers.complete
2024-07-13 16:34:24,393 - httpcore.http11 - DEBUG - send_request_body.started request=<Request [b'POST']>
2024-07-13 16:34:24,393 - httpcore.http11 - DEBUG - send_request_body.complete
2024-07-13 16:34:24,393 - httpcore.http11 - DEBUG - receive_response_headers.started request=<Request [b'POST']>
2024-07-13 16:34:24,410 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Connection', b'keep-alive'), (b'Content-Type', b'application/json'), (b'Content-Length', b'231')])
2024-07-13 16:34:24,410 - httpx - INFO - HTTP Request: POST http://[redacted]:8080/[redacted]/getMe "HTTP/1.1 200 OK"
2024-07-13 16:34:24,410 - httpcore.http11 - DEBUG - receive_response_body.started request=<Request [b'POST']>
2024-07-13 16:34:24,410 - httpcore.http11 - DEBUG - receive_response_body.complete
2024-07-13 16:34:24,410 - httpcore.http11 - DEBUG - response_closed.started
2024-07-13 16:34:24,411 - httpcore.http11 - DEBUG - response_closed.complete
2024-07-13 16:34:24,411 - telegram.Bot - DEBUG - Call to Bot API endpoint `getMe` finished with return value `{'id': 6859301078, 'is_bot': True, 'first_name': 'My test Bot', 'username': '[redacted]', 'can_join_groups': True, 'can_read_all_group_messages': False, 'supports_inline_queries': False, 'can_connect_to_business': False}`
2024-07-13 16:34:24,411 - root - INFO - send_with_ptb started
2024-07-13 16:34:25,815 - telegram.Bot - DEBUG - Calling Bot API endpoint `sendDocument` with parameters `{'chat_id': '@[redacted]', 'document': <telegram._files.inputfile.InputFile object at 0x750d28a5bf40>, 'disable_content_type_detection': True, 'disable_notification': True, 'caption': 'caption'}`
2024-07-13 16:34:25,816 - httpcore.http11 - DEBUG - send_request_headers.started request=<Request [b'POST']>
2024-07-13 16:34:25,816 - httpcore.http11 - DEBUG - send_request_headers.complete
2024-07-13 16:34:25,816 - httpcore.http11 - DEBUG - send_request_body.started request=<Request [b'POST']>
only
Unfortunately, I can't provide the server log file, as my supervisor denied me access to the bot server. |
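The "started" and "took … seconds" lines in the logs above come from info logging added around each callback. The exact code used is not shown in the thread, so the wrapper below is an assumed reconstruction; `run_logged` is a hypothetical name.

```python
import asyncio
import logging
import time
from collections.abc import Awaitable, Callable
from typing import Any

logger = logging.getLogger(__name__)


async def run_logged(callback: Callable[..., Awaitable[Any]], **kwargs: Any) -> Any:
    """Await `callback(**kwargs)` and log when it starts and how long it took."""
    name = callback.__name__
    logger.info("%s started", name)
    start = time.perf_counter()
    result = await callback(**kwargs)
    # Only reached if the callback returns; if the process dies during the
    # upload, the "started" line is the last trace of this callback.
    logger.info("%s took %.2f seconds", name, time.perf_counter() - start)
    return result
```

In the MWE loop this would replace the direct `await callback(...)` call, which makes it visible which upload variant was running when the program stopped.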
Nice :) Then I guess we've found the problem and should now look for a way to allow passing file handles directly to the networking backend. I guess this already makes sense from the viewpoint that you don't want to always load large files into memory, especially if several of these requests are running in parallel. We'll have to review how the connection between |
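The memory argument can be illustrated without any networking. The sketch below is not how httpx or PTB implement uploads; it only shows, with hypothetical helper names, that consuming an open file handle chunk by chunk yields the same bytes as `read_bytes()` while holding one chunk in memory at a time.

```python
import hashlib
from collections.abc import Iterator
from typing import BinaryIO


def iter_chunks(handle: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the contents of an open binary file in fixed-size chunks."""
    while chunk := handle.read(chunk_size):
        yield chunk


def sha256_streamed(handle: BinaryIO) -> str:
    """Hash a file from its handle without loading it into memory at once."""
    digest = hashlib.sha256()
    for chunk in iter_chunks(handle):
        digest.update(chunk)
    return digest.hexdigest()
```

Comparing `sha256_streamed(path.open("rb"))` with `hashlib.sha256(path.read_bytes()).hexdigest()` gives identical digests; only the second variant needs the full file in memory, which is what adds up when several large uploads run in parallel.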
I looked into it a bit. Not reading data in
|
Backwards compatibility with a note in the release notes should be plenty. |
I've started implementing this in #4388. Haven't gotten around to unit testing yet, but it's already functional on my end. @daviddanielng could you kindly
Thanks!

import asyncio
import logging
from pathlib import Path
from typing import Union

from telegram import Bot, InputFile

logging.basicConfig(
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.DEBUG
)


async def send_with_ptb(
    token: str,
    base_url: str,
    chat_id: Union[str, int],
    filename: str,
    file_path: Path,
    kwargs: dict[str, Union[str, bool]],
    timeout: int,
):
    async with Bot(
        token=token,
        base_url=base_url,
    ) as bot:
        await bot.send_document(
            chat_id=chat_id,
            # Hand the open file handle to PTB instead of the file's bytes.
            document=InputFile(
                obj=file_path.open("rb"), filename=filename, attach=True, read_file_handle=False
            ),
            filename=filename,
            read_timeout=timeout,
            write_timeout=timeout,
            connect_timeout=timeout,
            **kwargs,
        )


async def main():
    token = "TOKEN"
    chat_id = 123
    file_path = Path("tests/data/telegram.mp4")
    filename = "test_file_name.png"
    kwargs = {
        "caption": "caption",
        "disable_content_type_detection": True,
        "disable_notification": True,
    }
    base_url = "https://api.telegram.org/bot"
    timeout = 1800

    for callback in (
        send_with_ptb,
    ):
        try:
            await callback(
                timeout=timeout,
                base_url=base_url,
                token=token,
                chat_id=chat_id,
                filename=filename,
                file_path=file_path,
                kwargs=kwargs,
            )
        except Exception as exc:
            print(f"{callback.__name__} failed: {exc}")


if __name__ == "__main__":
    asyncio.run(main())

|
I ran it; it worked flawlessly. I noticed that you set the file as
Thanks for the work you put into this library. |
Awesome, then I'll try to get the unit tests up to speed :) Yes, in my implementation |
Steps to Reproduce
.mkv
file, FilesBot.send_docuemnt
(I have put a link to the FilesBot program or class at the end; it is in Additional Context).
Expected behaviour
The library should send the file to the local server, then the local server should process the file and upload it to the group, while the library should return the result.
Actual behaviour
No error or exception; everything just stops.
Operating System
Ubuntu 22 LTS on GitHub Codespaces
Version of Python, python-telegram-bot & dependencies
Relevant log output
Additional Context
The FilesBot program is this.
send_document
is the one uploading the video.
I was able to upload the file after some time; I also used a third-party library, requests-toolbelt.