- Make sure you've set up and initialized your database.
- You must have the following APIs enabled:

      gcloud services enable run.googleapis.com \
        cloudbuild.googleapis.com \
        artifactregistry.googleapis.com \
        iam.googleapis.com
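To confirm that all four APIs are enabled before continuing, a check along the following lines may help. This is a minimal sketch: `REQUIRED_APIS` and `missing_apis` are hypothetical names introduced here, and the usage comment assumes an authenticated `gcloud` with a default project configured.

```shell
# The four APIs named in the step above.
REQUIRED_APIS="run.googleapis.com cloudbuild.googleapis.com artifactregistry.googleapis.com iam.googleapis.com"

# Hypothetical helper: reads enabled API names on stdin (one per line)
# and prints each required API that is not in that list.
missing_apis() {
  enabled="$(cat)"
  for api in $REQUIRED_APIS; do
    # -x matches whole lines only, -F treats the name as a fixed string.
    printf '%s\n' "$enabled" | grep -qxF "$api" || printf '%s\n' "$api"
  done
}

# Usage (assumption: gcloud is installed and authenticated):
#   gcloud services list --enabled --format="value(config.name)" | missing_apis
```

Empty output means nothing is missing. Any name that is printed can be passed to `gcloud services enable`.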
- To create a service account, you must have the following IAM role:
  - Create Service Accounts role (roles/iam.serviceAccountCreator)
- To deploy from source, you must have one of the following IAM roles or sets of roles:
  - Owner role
  - Editor role
  - The following set of roles:
    - Cloud Build Editor role (roles/cloudbuild.builds.editor)
    - Artifact Registry Admin role (roles/artifactregistry.admin)
    - Storage Admin role (roles/storage.admin)
    - Cloud Run Admin role (roles/run.admin)
    - Service Account User role (roles/iam.serviceAccountUser)
Notes:
- If your project is under a domain restriction organization policy that restricts unauthenticated invocations, you will need to access your deployed service as described under Testing private services.
- If you are using a VPC-based datastore, make sure your Cloud Run service and datastore are in the same VPC network.
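If you choose the set of five roles rather than Owner or Editor, the bindings can be generated in a loop. The sketch below only prints the commands so they can be reviewed first. `DEPLOY_ROLES` and `print_deploy_role_bindings` are hypothetical names, and the project ID and member in the usage comment are placeholders. Running the printed commands assumes the caller is allowed to change the project's IAM policy.

```shell
# The five roles from the "set of roles" option above.
DEPLOY_ROLES="roles/cloudbuild.builds.editor roles/artifactregistry.admin roles/storage.admin roles/run.admin roles/iam.serviceAccountUser"

# Hypothetical helper: prints one add-iam-policy-binding command per role.
# $1: project ID, $2: member, for example user:alice@example.com
print_deploy_role_bindings() {
  for role in $DEPLOY_ROLES; do
    printf 'gcloud projects add-iam-policy-binding %s --member %s --role %s\n' "$1" "$2" "$role"
  done
}

# Usage (placeholder values):
#   print_deploy_role_bindings my-project user:alice@example.com
```

Printing instead of executing keeps the step a dry run. Once the output looks right, it can be piped to `sh`.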
- Create a backend service account if you don't already have one:

      gcloud iam service-accounts create retrieval-identity
- Grant permissions to access your database:
  - For AlloyDB Omni:

        gcloud projects add-iam-policy-binding $PROJECT_ID \
          --member serviceAccount:retrieval-identity@$PROJECT_ID.iam.gserviceaccount.com \
          --role roles/alloydb.client
- Grant permissions to use Vertex AI to generate embeddings for similarity searches:

      gcloud projects add-iam-policy-binding $PROJECT_ID \
        --member serviceAccount:retrieval-identity@$PROJECT_ID.iam.gserviceaccount.com \
        --role roles/aiplatform.user
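The binding commands above rely on `PROJECT_ID` being set in your shell. If it is empty, the member string is malformed. As a minimal sketch, the hypothetical helper `service_account_member` below builds the `--member` value and refuses an empty project ID. The usage comment assumes an authenticated `gcloud` with a default project configured.

```shell
# Hypothetical helper: builds the --member value for the backend service account.
# $1: project ID, $2: service account name (defaults to retrieval-identity)
service_account_member() {
  project_id="$1"
  account_name="${2:-retrieval-identity}"
  if [ -z "$project_id" ]; then
    echo "error: project ID is empty; set PROJECT_ID first" >&2
    return 1
  fi
  printf 'serviceAccount:%s@%s.iam.gserviceaccount.com\n' "$account_name" "$project_id"
}

# Usage (assumption: gcloud has a default project configured):
#   export PROJECT_ID="$(gcloud config get-value project)"
#   gcloud projects add-iam-policy-binding "$PROJECT_ID" \
#     --member "$(service_account_member "$PROJECT_ID")" \
#     --role roles/aiplatform.user
```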
Set up configuration for `retrieval_service/config.yml`:
provider |
---|
AlloyDB |
Cloud SQL for Postgres |
Non-cloud Postgres (e.g. AlloyDB Omni) |
- For AlloyDB Omni, replace `host` with `host: <YOUR ALLOYDB_IP_ADDRESS>`.
- From the root `genai-databases-retrieval-app` directory, deploy the retrieval service to Cloud Run using the following command:

      gcloud run deploy retrieval-service \
        --source=./retrieval_service/ \
        --no-allow-unauthenticated \
        --service-account retrieval-identity \
        --region us-central1
  If you are using a VPC network, use the command below:

      gcloud alpha run deploy retrieval-service \
        --source=./retrieval_service/ \
        --no-allow-unauthenticated \
        --service-account retrieval-identity \
        --region us-central1 \
        --network=default \
        --subnet=default
Next, we will use gcloud to authenticate requests to our Cloud Run instance:
- Run the `run services proxy` command to proxy connections to Cloud Run:

      gcloud run services proxy retrieval-service --port=8080 --region=us-central1

  If you are prompted to install the proxy, reply Y to install.
- Finally, use `curl` to verify the endpoint works:

      curl http://127.0.0.1:8080
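The proxy can take a moment to start, so a single `curl` may fail even when the deployment is healthy. The sketch below retries until the endpoint answers. `wait_for_endpoint` is a hypothetical helper, and treating HTTP 200 on the root path as success is an assumption about the service, not something stated above.

```shell
# Hypothetical helper: polls a URL until it returns HTTP 200 or attempts run out.
# $1: URL, $2: max attempts (default 10), $3: seconds between attempts (default 2)
wait_for_endpoint() {
  url="$1"
  attempts="${2:-10}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s silences progress, -o discards the body, -w prints only the status code.
    status="$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$url" 2>/dev/null || true)"
    if [ "$status" = "200" ]; then
      return 0
    fi
    # Sleep only between attempts, not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage (assumes the proxy from the previous step is running on port 8080):
#   wait_for_endpoint http://127.0.0.1:8080 && echo "retrieval service is reachable"
```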