Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revise proxy configuration, add integration testing #325

Merged
merged 32 commits into from
Mar 21, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
32 commits
Select commit Hold shift + click to select a range
0d2ef35
Add integration tests with proxy
jhamon Mar 19, 2024
c36f4b7
Adjust working dir for github actions
jhamon Mar 19, 2024
9662045
Update docs
jhamon Mar 19, 2024
c228cde
Add debug output in tests
jhamon Mar 19, 2024
8ee2695
Simplify compose up
jhamon Mar 19, 2024
8ab5da0
Iterate on test setup
jhamon Mar 19, 2024
1e4555c
More debug output
jhamon Mar 19, 2024
e5f0130
More debug output
jhamon Mar 19, 2024
88a8c62
List running containers
jhamon Mar 19, 2024
9f5f036
Adjust docker volumes
jhamon Mar 19, 2024
8ac4bdf
Revert "Adjust docker volumes"
jhamon Mar 19, 2024
da52763
Adjust volume config
jhamon Mar 19, 2024
4653d03
Remove name from compose file
jhamon Mar 19, 2024
74ee9e7
Consolidate docker run into same step as test run
jhamon Mar 19, 2024
8786d3d
Experiment: file permissions
jhamon Mar 19, 2024
0b97dd3
Docker compose with no daemon option
jhamon Mar 19, 2024
3c24904
Adjust user permissions
jhamon Mar 19, 2024
b9517f2
Remove docker compose
jhamon Mar 20, 2024
677d282
Adjust urls
jhamon Mar 20, 2024
0e51730
Remove files accidentally committed
jhamon Mar 20, 2024
174816c
Upload test log as artifacts
jhamon Mar 20, 2024
a6a1e9e
Log docker status
jhamon Mar 20, 2024
cb3ad73
Adjust sleep
jhamon Mar 20, 2024
87092a5
Bump upload-artifact
jhamon Mar 20, 2024
edfb88c
Temporarily comment other tests
jhamon Mar 20, 2024
72dd0b2
Temporarily keep containers after they exit
jhamon Mar 20, 2024
c3a6566
Adjust log out
jhamon Mar 20, 2024
e05923e
Disable proxy tests in CI
jhamon Mar 21, 2024
9714afd
Reenable other tests
jhamon Mar 21, 2024
a057145
Update README
jhamon Mar 21, 2024
70b2d9a
Update READMe
jhamon Mar 21, 2024
5450689
Add inline comments
jhamon Mar 21, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
16 changes: 16 additions & 0 deletions .github/workflows/testing-dependency.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -158,3 +158,19 @@ jobs:
index_name: '${{ needs.dependency-matrix-setup.outputs.index_name }}'
PINECONE_API_KEY: '${{ secrets.PINECONE_API_KEY }}'
urllib3_version: '${{ matrix.urllib3_version }}'

deps-cleanup:
Copy link
Collaborator Author

@jhamon jhamon Mar 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not really related to the config changes, but keeps us from getting failures due to 20 index per-project limit.

name: Deps cleanup
runs-on: ubuntu-latest
needs:
- dependency-matrix-setup
- dependency-matrix-grpc
- dependency-matrix-grpc-312
- dependency-matrix-rest
- dependency-matrix-rest-312
steps:
- uses: actions/checkout@v4
- uses: ./.github/actions/delete-index
with:
index_name: '${{ needs.dependency-matrix-setup.outputs.index_name }}'
PINECONE_API_KEY: '${{ secrets.PINECONE_API_KEY }}'
49 changes: 48 additions & 1 deletion .github/workflows/testing-integration.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,54 @@ name: "Integration Tests"
workflow_call: {}

jobs:
# setup-index:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving this commented code in as I have plans to bring back CI integration tests in a future PR.

# name: Setup proxyconfig test index
# runs-on: ubuntu-latest
# outputs:
# index_name: ${{ steps.create-index.outputs.index_name }}
# steps:
# - uses: actions/checkout@v4
# - name: Create index
# id: create-index
# uses: ./.github/actions/create-index
# with:
# PINECONE_API_KEY: ${{ secrets.PINECONE_API_KEY }}
# NAME_PREFIX: 'proxyconfig-'
# REGION: 'us-west-2'
# CLOUD: 'aws'
# DIMENSION: 1536
# METRIC: 'cosine'

# proxy-config:
# name: Proxy config tests
# needs: [setup-index]
# runs-on: ubuntu-latest
# steps:
# - uses: actions/checkout@v4
# - uses: actions/setup-python@v5
# with:
# python-version: 3.9
# - name: Setup Poetry
# uses: ./.github/actions/setup-poetry
# - name: 'Run integration tests (proxy config)'
# run: |
# poetry run pytest tests/integration/proxy_config -s -v
# env:
# PINECONE_API_KEY: '${{ secrets.PINECONE_API_KEY }}'
# PINECONE_INDEX_NAME: ${{ needs.setup-index.outputs.index_name }}
# - name: Upload logs
# if: always()
# uses: actions/upload-artifact@v4
# with:
# name: proxy_config_test_logs
# path: tests/integration/proxy_config/logs
# - name: Cleanup index
# if: always()
# uses: ./.github/actions/delete-index
# with:
# PINECONE_API_KEY: ${{ secrets.PINECONE_API_KEY }}
# INDEX_NAME: ${{ needs.setup-index.outputs.index_name }}

data-plane-serverless:
name: Data plane serverless integration tests
runs-on: ubuntu-latest
Expand All @@ -27,7 +75,6 @@ jobs:
spec: '${{ matrix.spec }}'
PINECONE_API_KEY: '${{ secrets.PINECONE_API_KEY }}'
freshness_timeout_seconds: 600

# data-plane-pod:
# name: Data plane pod integration tests
# runs-on: ubuntu-latest
Expand Down
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -154,3 +154,5 @@ dmypy.json
# Datasets
*.hdf5
*~

tests/integration/proxy_config/logs
79 changes: 77 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -71,14 +71,14 @@ from pinecone import Pinecone
pc = Pinecone() # This reads the PINECONE_API_KEY env var
```

#### Using a configuration object
#### Using configuration keyword params

If you prefer to pass configuration in code, for example if you have a complex application that needs to interact with multiple different Pinecone projects, the constructor accepts a keyword argument for `api_key`.

If you pass configuration in this way, you can have full control over what name to use for the environment variable, sidestepping any issues that would result
from two different client instances both needing to read the same `PINECONE_API_KEY` variable that the client implicitly checks for.

Configuration passed with keyword arguments takes precedent over environment variables.
Configuration passed with keyword arguments takes precedence over environment variables.

```python
import os
Expand All @@ -87,6 +87,81 @@ from pinecone import Pinecone
pc = Pinecone(api_key=os.environ.get('CUSTOM_VAR'))
```

### Proxy configuration

If your network setup requires you to interact with Pinecone via a proxy, you will need
to pass additional configuration using optional keyword parameters. These optional parameters are forwarded to `urllib3`, which is the underlying library currently used by the Pinecone client to make HTTP requests. You may find it helpful to refer to the [urllib3 documentation on working with proxies](https://urllib3.readthedocs.io/en/stable/advanced-usage.html#http-and-https-proxies) while troubleshooting these settings.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice docs-touch referring folks out to urllib3 👍


Here is a basic example:

```python
from pinecone import Pinecone

pc = Pinecone(
api_key='YOUR_API_KEY',
proxy_url='https://your-proxy.com'
)

pc.list_indexes()
```

If your proxy requires authentication, you can pass those values in a header dictionary using the `proxy_headers` parameter.

```python
from pinecone import Pinecone
import urllib3 import make_headers

pc = Pinecone(
api_key='YOUR_API_KEY',
proxy_url='https://your-proxy.com',
proxy_headers=make_headers(proxy_basic_auth='username:password')
)

pc.list_indexes()
```

### Using proxies with self-signed certificates

By default the Pinecone Python client will perform SSL certificate verification
using the CA bundle maintained by Mozilla in the [certifi](https://pypi.org/project/certifi/) package.

If your proxy server is using a self-signed certificate, you will need to pass the path to the certificate in PEM format using the `ssl_ca_certs` parameter.

```python
from pinecone import Pinecone
import urllib3 import make_headers

pc = Pinecone(
api_key="YOUR_API_KEY",
proxy_url='https://your-proxy.com',
proxy_headers=make_headers(proxy_basic_auth='username:password'),
ssl_ca_certs='path/to/cert-bundle.pem'
)

pc.list_indexes()
```

### Disabling SSL verification

If you would like to disable SSL verification, you can pass the `ssl_verify`
parameter with a value of `False`. We do not recommend going to production with SSL verification disabled.

```python
from pinecone import Pinecone
import urllib3 import make_headers

pc = Pinecone(
api_key='YOUR_API_KEY',
proxy_url='https://your-proxy.com',
proxy_headers=make_headers(proxy_basic_auth='username:password'),
ssl_ca_certs='path/to/cert-bundle.pem',
ssl_verify=False
)

pc.list_indexes()

```

### Working with GRPC (for improved performance)

If you've followed instructions above to install with optional `grpc` extras, you can unlock some performance improvements by working with an alternative version of the client imported from the `pinecone.grpc` subpackage.
Expand Down
40 changes: 31 additions & 9 deletions pinecone/config/config.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,5 @@
from typing import NamedTuple, Optional, Dict
import os
import copy

from pinecone.exceptions import PineconeConfigurationError
from pinecone.config.openapi import OpenApiConfigFactory
Expand All @@ -10,7 +9,10 @@
class Config(NamedTuple):
api_key: str = ""
host: str = ""
openapi_config: Optional[OpenApiConfiguration] = None
proxy_url: Optional[str] = None
proxy_headers: Optional[Dict[str, str]] = None
ssl_ca_certs: Optional[str] = None
ssl_verify: Optional[bool] = None
additional_headers: Optional[Dict[str, str]] = {}

class ConfigBuilder:
Expand All @@ -34,7 +36,10 @@ class ConfigBuilder:
def build(
api_key: Optional[str] = None,
host: Optional[str] = None,
openapi_config: Optional[OpenApiConfiguration] = None,
proxy_url: Optional[str] = None,
proxy_headers: Optional[Dict[str, str]] = None,
ssl_ca_certs: Optional[str] = None,
ssl_verify: Optional[bool] = None,
additional_headers: Optional[Dict[str, str]] = {},
**kwargs,
) -> Config:
Expand All @@ -47,11 +52,28 @@ def build(
if not host:
raise PineconeConfigurationError("You haven't specified a host.")

return Config(api_key, host, proxy_url, proxy_headers, ssl_ca_certs, ssl_verify, additional_headers)

@staticmethod
def build_openapi_config(
config: Config, openapi_config: Optional[OpenApiConfiguration] = None, **kwargs
) -> OpenApiConfiguration:
if openapi_config:
openapi_config = copy.deepcopy(openapi_config)
openapi_config.host = host
openapi_config.api_key = {"ApiKeyAuth": api_key}
else:
openapi_config = OpenApiConfigFactory.build(api_key=api_key, host=host)
openapi_config = OpenApiConfigFactory.copy(openapi_config=openapi_config, api_key=config.api_key, host=config.host)
elif openapi_config is None:
openapi_config = OpenApiConfigFactory.build(api_key=config.api_key, host=config.host)

return Config(api_key, host, openapi_config, additional_headers)
# Check if value passed before overriding any values present
# in the openapi_config. This means if the user has passed
# an openapi_config object and a kwarg for the same setting,
# the kwarg will take precedence.
Comment on lines +68 to +69
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be reading this incorrectly, but it doesn't look like we're doing anything with kwargs here or in build?

Is the comment meant to be referencing overriding openapi_config with config?

Copy link
Collaborator Author

@jhamon jhamon Mar 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The config object is built using kwarg values (specifically PineconeConfig.build called from the Pinecone constructor). Sorry it's not more clear. Things get confusing when trying to manage and merge these different sources of configuration, which is a big reason I want to deprecate passing openapi_config.

if (config.proxy_url):
openapi_config.proxy = config.proxy_url
if (config.proxy_headers):
openapi_config.proxy_headers = config.proxy_headers
if (config.ssl_ca_certs):
openapi_config.ssl_ca_cert = config.ssl_ca_certs
if (config.ssl_verify != None):
openapi_config.verify_ssl = config.ssl_verify

return openapi_config
28 changes: 27 additions & 1 deletion pinecone/config/openapi.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,8 @@

import certifi
import socket
import copy
import warnings

from urllib3.connection import HTTPConnection

Expand All @@ -17,11 +19,35 @@ class OpenApiConfigFactory:
@classmethod
def build(cls, api_key: str, host: Optional[str] = None, **kwargs):
openapi_config = OpenApiConfiguration()
openapi_config.api_key = {"ApiKeyAuth": api_key}
openapi_config.host = host
openapi_config.ssl_ca_cert = certifi.where()
openapi_config.socket_options = cls._get_socket_options()
openapi_config.api_key = {"ApiKeyAuth": api_key}
return openapi_config

@classmethod
def copy(cls, openapi_config: OpenApiConfiguration, api_key: str, host: str) -> OpenApiConfiguration:
'''
Copy a user-supplied openapi configuration and update it with the user's api key and host.
If they have not specified other socket configuration, we will use the default values.
We expect these objects are being passed mainly a vehicle for proxy configuration, so
we don't modify those settings.
'''
copied = copy.deepcopy(openapi_config)
warnings.warn("Passing openapi_config is deprecated and will be removed in a future release. Please pass settings such as proxy_url, proxy_headers, ssl_ca_certs, and ssl_verify directly to the Pinecone constructor as keyword arguments. See the README at https://github.com/pinecone-io/pinecone-python-client for examples.", DeprecationWarning)

copied.api_key = {"ApiKeyAuth": api_key}
copied.host = host

# Set sensible defaults if the user hasn't set them
if not copied.socket_options:
copied.socket_options = cls._get_socket_options()

# We specifically do not modify the user's ssl_ca_cert or proxy settings, as
# they may have set them intentionally. This is the main reason somebody would
# pass an openapi_config in the first place.

return copied

@classmethod
def _get_socket_options(
Expand Down