Max recursion depth RecursionError when max clients exceeded with Sentinel #2866

Open
rad-pat opened this issue Jul 28, 2023 · 8 comments
Labels
bug Bug

Comments

@rad-pat

rad-pat commented Jul 28, 2023

Version: redis-py 4.6.0, redis

Platform: Python 3.11.4 on Debian

Description:
Similar to the bug reported in #2563.
When the maximum number of clients is reached on a Redis instance behind Sentinel, the health check triggers a reconnect loop that ends in a RecursionError (maximum recursion depth exceeded).

Exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 778, in read_response
    response = await self._parser.read_response(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 411, in read_response
    await self.read_from_socket()
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 392, in read_from_socket
    buffer = await self._stream.read(self._read_size)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/streams.py", line 690, in read
    await self._wait_for_data('read')
  File "/usr/local/lib/python3.11/asyncio/streams.py", line 520, in _wait_for_data
    self._waiter = self._loop.create_future()
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "uvloop/loop.pyx", line 1412, in uvloop.loop.Loop.create_future
  File "uvloop/loop.pyx", line 718, in uvloop.loop.Loop._new_future
  File "/usr/local/lib/python3.11/traceback.py", line 231, in extract_stack
    stack = StackSummary.extract(walk_stack(f), limit=limit)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/traceback.py", line 393, in extract
    return klass._extract_from_extended_frame_gen(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/traceback.py", line 416, in _extract_from_extended_frame_gen
    for f, (lineno, end_lineno, colno, end_colno) in frame_gen:
  File "/usr/local/lib/python3.11/traceback.py", line 390, in extended_frame_gen
    for f, lineno in frame_gen:
RecursionError: maximum recursion depth exceeded

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/plaid/src/plaid/plaid/web/endpoints/liveness.py", line 38, in dispatch
    await super().dispatch()
  File "/usr/local/lib/python3.11/site-packages/starlette/endpoints.py", line 42, in dispatch
    response = await handler(request)
               ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/plaid/src/plaid/plaid/web/endpoints/liveness.py", line 72, in get
    await self._check_redis()
  File "/home/plaid/src/plaid/plaid/web/endpoints/liveness.py", line 55, in _check_redis
    await redis_con.setex(
  File "/home/plaid/src/plaid/plaid/core/data/redis_connector.py", line 85, in execute_command
    return await super().execute_command(*args, **options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/client.py", line 513, in execute_command
    conn = self.connection or await pool.get_connection(command_name, **options)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 1375, in get_connection
    await connection.connect()
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/sentinel.py", line 61, in connect
    return await self.retry.call_with_retry(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/retry.py", line 59, in call_with_retry
    return await do()
           ^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/sentinel.py", line 51, in _connect_retry
    await self.connect_to(await self.connection_pool.get_master_address())
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/sentinel.py", line 41, in connect_to
    await super().connect()
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 592, in connect
    await self.on_connect()
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 659, in on_connect
    await self.send_command("SELECT", self.db)
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 752, in send_command
    await self.send_packed_command(
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 715, in send_packed_command
    await self.check_health()
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 703, in check_health
    await self.retry.call_with_retry(self._send_ping, self._ping_failed)
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/retry.py", line 59, in call_with_retry
    return await do()
           ^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 689, in _send_ping
    await self.send_command("PING", check_health=False)
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 752, in send_command
    await self.send_packed_command(
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 713, in send_packed_command
    await self.connect()
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/sentinel.py", line 61, in connect
    return await self.retry.call_with_retry(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/retry.py", line 59, in call_with_retry
    return await do()
           ^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/sentinel.py", line 51, in _connect_retry
    await self.connect_to(await self.connection_pool.get_master_address())
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/sentinel.py", line 41, in connect_to
    await super().connect()
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 592, in connect
    await self.on_connect()
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 659, in on_connect
    await self.send_command("SELECT", self.db)
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 752, in send_command
    await self.send_packed_command(
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 715, in send_packed_command
    await self.check_health()
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 703, in check_health
    await self.retry.call_with_retry(self._send_ping, self._ping_failed)
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/retry.py", line 59, in call_with_retry
    return await do()
           ^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 689, in _send_ping
    await self.send_command("PING", check_health=False)
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 752, in send_command
    await self.send_packed_command(
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 713, in send_packed_command
    await self.connect()
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/sentinel.py", line 61, in connect
    return await self.retry.call_with_retry(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/asyncio/retry.py", line 59, in call_with_retry
    return await do()
           ^^^^^^^^^^
  <snip>
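
The repeating frames above (connect, on_connect, send_command, send_packed_command, check_health, _send_ping, and back to connect) can be modelled without a server. The following is a minimal sketch, not redis-py's code: `ToyConnection`, its attributes, and `run_toy` are hypothetical names, and it assumes that every command is rejected once the client limit is reached and that a failed ping drops the socket before the ping is retried.

```python
class ToyConnection:
    """Hypothetical stand-in for a client connection; not redis-py code."""

    def __init__(self):
        self.connected = False
        self.health_check_due = True  # assumption: the check is always due
        self.connect_calls = 0

    def connect(self):
        self.connect_calls += 1
        self.connected = True  # the socket opens successfully
        self.on_connect()      # handshake, e.g. SELECT <db>

    def on_connect(self):
        self.send_command("SELECT")

    def send_command(self, name, check_health=True):
        if not self.connected:
            self.connect()
        if check_health and self.health_check_due:
            self.check_health()
        # Assumption: a server at its client limit rejects every command.
        raise ConnectionError("max number of clients reached")

    def check_health(self):
        try:
            self.send_command("PING", check_health=False)
        except ConnectionError:
            # Assumed failure handler: drop the socket, then retry the ping.
            self.connected = False
            self.send_command("PING", check_health=False)


def run_toy():
    """Run the toy model and return how many times connect() was entered."""
    conn = ToyConnection()
    try:
        conn.connect()
    except RecursionError:
        pass  # the nested connect/health-check cycle exhausted the stack
    return conn.connect_calls
```

If those assumptions hold, each failed ping reopens the connection, which runs the handshake again, which runs another health check, so the call stack grows until Python raises RecursionError.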

Code to reproduce:
The health check interval is deliberately short so that the error occurs quickly.

import asyncio
import traceback
import uuid
from redis.asyncio import Redis, Sentinel


def get_redis_conn():
    sentinel_connection = Sentinel(
        [('rfs-plaid', 26379)],
        db=6,
        health_check_interval=2,
        retry_on_timeout=True,
    )
    return sentinel_connection.master_for(
        'mymaster',
        redis_class=Redis,
        decode_responses=True
    )

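async def check_connection_limit_bug():
    conn = get_redis_conn()
    # Lower the server limit so it can be exhausted quickly.
    await conn.config_set('maxclients', 1000)
    try:
        await consume_client_connections()
        # Keep issuing commands so the 2-second health check keeps firing.
        while True:
            await conn.get('test')
            await asyncio.sleep(2)
    finally:
        # Raise the limit again once the script stops.
        await conn.config_set('maxclients', 10000)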


async def consume_client_connections():
    # Create more clients (1010) than the maxclients limit (1000) allows.
    for _ in range(1010):
        try:
            conn = get_redis_conn()
            unique_value = str(uuid.uuid4())
            unique_key = f'liveness-check:{unique_value}'

            await conn.setex(
                name=unique_key,
                time=60,
                value=unique_value
            )
            await conn.getdel(
                name=unique_key,
            )

        except Exception:
            # Print the failure and keep going.
            print(traceback.format_exc())


asyncio.run(check_connection_limit_bug())

@rad-pat
Author

rad-pat commented Jul 28, 2023

The sudden increase in connection usage is likely due to #2755.
That will hopefully be resolved by the fix for issue #2831.

I think this is a separate error that occurs as a side effect of using up all the client connections.

@nsteinmetz

@rad-pat I can confirm that #2831 is fixed on master.

@dvora-h
Collaborator

dvora-h commented Aug 29, 2023

@rad-pat Can you confirm that you no longer get this error with the code from master, so that we can close this issue?

dvora-h added the bug label Aug 29, 2023

@rad-pat
Author

rad-pat commented Aug 30, 2023

@dvora-h I can confirm that the bug is not fixed on master. The code above still fails with a maximum recursion depth error. I think this is related to call_with_retry and the exception raised when the maximum number of clients is reached.
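
For readers unfamiliar with the helper named in the traceback, here is a simplified sketch of a do/fail retry helper in the spirit of `call_with_retry(self._send_ping, self._ping_failed)`. It is not redis-py's implementation: the `retries` parameter, the choice of `ConnectionError` as the retried error, and the absence of any backoff are assumptions made for illustration.

```python
def call_with_retry(do, fail, retries=3):
    """Call do(); on ConnectionError call fail(error), then try again.

    Hypothetical simplification: gives up after `retries` retries,
    i.e. after do() has been attempted retries + 1 times.
    """
    failures = 0
    while True:
        try:
            return do()
        except ConnectionError as error:
            failures += 1
            fail(error)  # e.g. a handler that disconnects the socket
            if failures > retries:
                raise
```

A loop like this only bounds how many times `do` is attempted at one level. If `do` or `fail` ends up calling back into code that starts another retry loop, as the traceback suggests, the nesting is not limited by the retry count.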

@rad-pat
Author

rad-pat commented Aug 30, 2023

One important point to note is that I do not get the error with version 4.5.5

@dvora-h
Collaborator

dvora-h commented Aug 30, 2023

Thanks for the response. I will try to find and fix the problem.

@rad-pat
Author

rad-pat commented Aug 30, 2023

I can also confirm that it happens without Sentinel, and without async as well.

import asyncio
import traceback
import uuid
import redis
import redis.asyncio
import time


def get_redis_conn_async():
    return redis.asyncio.Redis(db=6, health_check_interval=2, retry_on_timeout=True)
    # sentinel_connection = redis.asyncio.Sentinel(
    #     [('rfs-plaid.production', 26379)],
    #     db=6,
    #     health_check_interval=2,
    #     retry_on_timeout=True,
    # )
    # return sentinel_connection.master_for(
    #     'mymaster',
    #     redis_class=Redis,
    #     decode_responses=True
    # )

async def check_connection_limit_bug_async():
    conn = get_redis_conn_async()
    # Lower the server limit so it can be exhausted quickly.
    await conn.config_set('maxclients', 20)
    try:
        await consume_client_connections_async()
        # Keep issuing commands so the 2-second health check keeps firing.
        while True:
            await conn.get('test')
            await asyncio.sleep(2)
    finally:
        # Raise the limit again once the script stops.
        await conn.config_set('maxclients', 10000)


async def consume_client_connections_async():
    # Create more clients (25) than the maxclients limit (20) allows.
    for _ in range(25):
        try:
            conn = get_redis_conn_async()
            unique_value = str(uuid.uuid4())
            unique_key = f'liveness-check:{unique_value}'

            await conn.setex(
                name=unique_key,
                time=60,
                value=unique_value
            )
            await conn.getdel(
                name=unique_key,
            )

        except Exception:
            # Print the failure and keep going.
            print(traceback.format_exc())


# asyncio.run(check_connection_limit_bug_async())


def get_redis_conn_sync():
    return redis.Redis(db=6, health_check_interval=2, retry_on_timeout=True)
    # sentinel_connection = Sentinel(
    #     [('rfs-plaid.production', 26379)],
    #     db=6,
    #     health_check_interval=2,
    #     retry_on_timeout=True,
    # )
    # return sentinel_connection.master_for(
    #     'mymaster',
    #     redis_class=Redis,
    #     decode_responses=True
    # )

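def check_connection_limit_bug_sync():
    conn = get_redis_conn_sync()
    # Lower the server limit so it can be exhausted quickly.
    conn.config_set('maxclients', 20)
    try:
        consume_client_connections_sync()
        # Keep issuing commands so the 2-second health check keeps firing.
        while True:
            conn.get('test')
            time.sleep(2)
    finally:
        # Raise the limit again once the script stops.
        conn.config_set('maxclients', 10000)


def consume_client_connections_sync():
    # Create more clients (25) than the maxclients limit (20) allows.
    for _ in range(25):
        try:
            conn = get_redis_conn_sync()
            unique_value = str(uuid.uuid4())
            unique_key = f'liveness-check:{unique_value}'

            conn.setex(
                name=unique_key,
                time=60,
                value=unique_value
            )
            conn.getdel(
                name=unique_key,
            )

        except Exception:
            # Print the failure and keep going.
            print(traceback.format_exc())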


check_connection_limit_bug_sync()
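
The scripts above lower maxclients and restore it to a hard-coded 10000 in a finally block. For anyone running them against a shared server, a small context manager that remembers the previous value may be safer. This is only a sketch: `temporary_maxclients` and `admin` are hypothetical names, and it assumes `admin` is any client object whose `config_get(name)` returns a mapping keyed by the setting name and whose `config_set(name, value)` applies it, which is how I understand the redis-py client to behave. It also assumes `admin` can still reach the server when the block exits.

```python
from contextlib import contextmanager


@contextmanager
def temporary_maxclients(admin, limit):
    """Set maxclients to `limit` for the duration of the block.

    `admin` is assumed to expose config_get/config_set like a redis-py
    client; the previous value is restored even if the block raises.
    """
    previous = admin.config_get("maxclients")["maxclients"]
    admin.config_set("maxclients", limit)
    try:
        yield
    finally:
        admin.config_set("maxclients", previous)
```

In the sync script this would wrap the body of check_connection_limit_bug_sync, for example `with temporary_maxclients(conn, 20): ...`, in place of the two explicit config_set calls.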

@rad-pat
Author

rad-pat commented Aug 30, 2023

> One important point to note is that I do not get the error with version 4.5.5

Apologies, this was incorrect. The problem is still present with 4.5.5.
