Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(integrations): Add integration for asyncpg #2314

Merged
merged 30 commits into from Sep 11, 2023

Conversation

mimre25
Copy link
Contributor

@mimre25 mimre25 commented Aug 19, 2023

Closes #1278
I think this is close to being ready (see #2314 (comment)) for an update.

So far this records every statement that is directly issued, as well as the SQL statements that are used for cursors and prepared statements.

I have a few open questions:

  1. I noticed asyncpg uses the postgres binary protocol to communicate to postgres. I haven't had any encounter with that yet, but it's making it more challenging to record the creation of cursors, transactions, and prepared statements properly. Do you have a suggestion what to do about this? Keep in mind that all of this is done in Cython, ie we would need to tap into that lower level for recording the exact statements and reverse parsing the binary protocol.
  2. For cursors and prepared statements, what is the desired outcome? Should the creation be recorded, and every subsequent use, or just the creation, or just the usages?
  3. I've noticed that tracing_utils:record_sql_queries checks the hub.client.options["_experiments"]["record_sql_params"] and strips out the parameters if it's True. However, I did not want to rely on it to strip out parameters, as there is a TODO right with it, that hints at a future removal of that check. Therefore I strip them out per default, but provide a config option to the integration for recording the parameters.
  4. I've noticed that the test-requirements.txt doesn't contain pytest-asyncio, yet the pytest.ini configures the asyncio_mode to strict. Am I missing something here, or is this on purpose?

@antonpirker please let me know what you think :)

@antonpirker
Copy link
Member

Hey @mimre25 !

Finally I have some time to review. Looks very clean at the first glance! Great job!
Will test it with a local demo project to see what the spans look like.

One thing I noticed, maybe we need to set more data on the span like this: https://github.com/getsentry/sentry-python/blob/antonpirker/2262-traces-sampler-url-instead-of-generic-asgi-request/sentry_sdk/integrations/sqlalchemy.py#L57-L71

@antonpirker
Copy link
Member

To answer some of your questions:

  1. I've noticed that tracing_utils:record_sql_queries checks the hub.client.options["_experiments"]["record_sql_params"] and strips out the parameters if it's True. However, I did not want to rely on it to strip out parameters, as there is a TODO right with it, that hints at a future removal of that check. Therefore I strip them out per default, but provide a config option to the integration for recording the parameters.

Sounds good. Forget the _experiments.

  1. I've noticed that the test-requirements.txt doesn't contain pytest-asyncio, yet the pytest.ini configures the asyncio_mode to strict. Am I missing something here, or is this on purpose?

This is not really on purpose. This is probably because we have not had time to look at our test setup in a long time...

@antonpirker
Copy link
Member

Screenshot 2023-08-30 at 14 45 55

I get db spans from your integration! Yay! \o/

@antonpirker
Copy link
Member

I also added asyncpg to the test matrix so the tests run in CI.

@antonpirker
Copy link
Member

  1. I noticed asyncpg uses the postgres binary protocol to communicate to postgres. I haven't had any encounter with that yet, but it's making it more challenging to record the creation of cursors, transactions, and prepared statements properly. Do you have a suggestion what to do about this? Keep in mind that all of this is done in Cython, ie we would need to tap into that lower level for recording the exact statements and reverse parsing the binary protocol.

For the first iteration of the asyncpg support it is enough if we record spans for execute, executemany (and maybe db connect if it is easy) The rest we can ignore for now. This will already give a lot of value to users.

  1. For cursors and prepared statements, what is the desired outcome? Should the creation be recorded, and every subsequent use, or just the creation, or just the usages?

I think we can omit those for now. See above.

@antonpirker antonpirker self-assigned this Aug 30, 2023
@antonpirker
Copy link
Member

Oh, and for the connection to the test postgres database in CI to work, you need to read these env vars:

SENTRY_PYTHON_TEST_POSTGRES_USER
SENTRY_PYTHON_TEST_POSTGRES_PASSWORD
SENTRY_PYTHON_TEST_POSTGRES_NAME
SENTRY_PYTHON_TEST_POSTGRES_HOST

See here: https://github.com/getsentry/sentry-python/blob/master/tests/integrations/django/myapp/settings.py#L126-L130

@antonpirker
Copy link
Member

All in all really great work @mimre25 !

If you have addressed my comments I will do another round of test/review!

@happyleavesaoc
Copy link

I tried this out too, and it worked great. I did notice that the span duration is effectively zero in my test (and also the screenshot above).

@mimre25
Copy link
Contributor Author

mimre25 commented Aug 30, 2023

Thanks for the review! I'll probably only get to it this weekend, but I'll work in your comments .

@sentrivana sentrivana added the New Integration Integrating with a new framework or library label Aug 31, 2023
…nment

This allows running tests locally and in the CI pipeline.
Previously, the wrapper functions did not await the call to the db and
thus did not record the actual timing.
The spans for the cursor now record every "execute" of the cursor,
both in manual mode, and in iterator mode.
@mimre25
Copy link
Contributor Author

mimre25 commented Sep 2, 2023

Alright, the new commits do the following:

  • Reading test connection params from env
  • Fix the incorrect span durations (see below)
  • Record additional information (executemany, cursor, etc)
  • Recording connect statements
  • Record execute queries without parameters (I noticed this doesn't happen at the moment)
  • Some refactorings as it got a bit messy

The span durations were incorrectly recorded because I returned the coroutines instead of awaiting them and returning them.

Screenshot of span duration of old implementation:

image

Screenshot of span duration of new implementation:

image

Quite happy how it turned out:

image

Corresponding source code:
import asyncio
from asyncpg import connect
import sentry_sdk
from sentry_sdk.integrations.asyncpg import AsyncPGIntegration
import datetime

sentry_sdk.init(
  dsn="http://0db24a8128f5bbaa253a141ecd729c16@127.0.0.1:9000/2",
  traces_sample_rate=1.0,
  integrations=[AsyncPGIntegration(record_params=True)],
  _experiments={"record_sql_params": True}
)

async def main():

    with sentry_sdk.start_transaction(op="test", name="pg_sleep"):
        connection = await connect("postgresql://foo:bar@localhost/")
        await connection.execute("DROP TABLE IF EXISTS users")
        await connection.execute(
        """
            CREATE TABLE users(
                id serial PRIMARY KEY,
                name text,
                password text,
                dob date
            )
        """
        )
 
        await connection.execute("SELECT pg_sleep($1);", 3)
        await connection.executemany(
            "INSERT INTO users(name, password, dob) VALUES($1, $2, $3)",
            [
                ("Bob", "secret_pw", datetime.date(1984, 3, 1)),
                ("Alice", "pw", datetime.date(1990, 12, 25)),
            ],
        )
    
        async with connection.transaction():
            # Postgres requires non-scrollable cursors to be created
            # and used in a transaction.
            async for record in connection.cursor(
                "SELECT * FROM users WHERE dob > $1", datetime.date(1970, 1, 1)
            ):
                print(record)
    
        await connection.close()


asyncio.run(main())

sentry_sdk/integrations/asyncpg.py Outdated Show resolved Hide resolved
sentry_sdk/tracing_utils.py Show resolved Hide resolved
sentry_sdk/tracing_utils.py Outdated Show resolved Hide resolved
@mimre25 mimre25 marked this pull request as ready for review September 3, 2023 15:02
Copy link
Member

@antonpirker antonpirker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey! Did a second round of review. Looks very good already!

I updated the data we store on a span, to make it include all the information Sentry needs for doing some magic stuff with database spans.

There is one instrumentation of the __await__ that we can not release (see comment in the review)

And could you please fix the type problems in the tests that break the tests? Thanks!

sentry_sdk/integrations/asyncpg.py Outdated Show resolved Hide resolved
sentry_sdk/integrations/asyncpg.py Show resolved Hide resolved
sentry_sdk/tracing_utils.py Outdated Show resolved Hide resolved
sentry_sdk/integrations/asyncpg.py Outdated Show resolved Hide resolved
@antonpirker
Copy link
Member

I also added a PR to the docs to add the new integration: getsentry/sentry-docs#7756

@antonpirker antonpirker changed the title feat(integrations): Add integration for asyncpg >= 0.23.0 (WIP) feat(integrations): Add integration for asyncpg Sep 6, 2023
antonpirker and others added 5 commits September 6, 2023 11:09
This adapts the tests to the different method of recording connection
parameters in the span.
This is done to avoid recording of too many "queries" in a single span.
Instead, we now record cursor creation.
@mimre25
Copy link
Contributor Author

mimre25 commented Sep 9, 2023

I've removed the cursor instrumentation, and now only record the creation of a cursor instead:
image

image

All typing and lint fixes are done, and tox passes locally for py3.{7-11}-asyncpg & linters

@antonpirker antonpirker enabled auto-merge (squash) September 11, 2023 06:59
Copy link
Member

@antonpirker antonpirker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are ready! Really great work @mimre25 ! Thanks for being so responsive in bringing this over the finish line! Will be included in the next release of Sentry Python SDK.

@antonpirker
Copy link
Member

Oh, CI tests can not connect to postgres. I will fix that.

@antonpirker antonpirker merged commit 87d582d into getsentry:master Sep 11, 2023
244 of 250 checks passed
sentrivana pushed a commit that referenced this pull request Sep 18, 2023
So far this records every statement that is directly issued, as well as the SQL statements that are used for cursors and prepared statements.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
New Integration Integrating with a new framework or library
Projects
None yet
Development

Successfully merging this pull request may close these issues.

asynpg support?
4 participants