Deploy GrandLine on GCP GKE
1. Before you start
1.1 What you'll end up with
A production-grade GrandLine install running on GKE (Autopilot-compatible) in one GCP region, reachable at https://grandline.yourdomain.com, backed by Cloud SQL Postgres 16 (regional HA), Memorystore for Redis (Standard tier), and a private GCS bucket. Total baseline cost ~$320/mo in us-central1.
1.2 Accounts, tools, and access
On your laptop:
- gcloud CLI 470+ (gcloud version)
- terraform ≥ 1.6 (terraform version)
- kubectl with the gke-gcloud-auth-plugin component (gcloud components install gke-gcloud-auth-plugin)
- helm ≥ 3.14
- openssl for generating secrets
- git for cloning the self-hosted repo
On GCP:
- A GCP project you can deploy into with Owner on the project (Terraform creates service accounts, IAM bindings, VPC, and enables APIs).
- A billing account linked to the project.
- gcloud auth application-default login and gcloud config set project <PROJECT_ID> have been run.
- gcloud config get-value project returns the intended project.
- Default quota is sufficient on a fresh project: 1 VPC, 1 GKE cluster, 1 Cloud SQL instance, 1 Memorystore Redis instance, 1 GCS bucket.
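The laptop-side requirements above can be checked before you start. The following is a minimal preflight sketch, not part of the repo: the helper names version_ge and missing_tools are hypothetical, and the version comparison assumes GNU sort with -V support.

```shell
#!/usr/bin/env bash
# Preflight sketch (hypothetical helpers, not shipped with GrandLine).

# Return 0 if version $1 >= version $2. Assumes GNU `sort -V` is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Print the name of every tool in the argument list that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

For example, missing_tools gcloud terraform kubectl helm openssl git prints nothing when every tool is installed, and version_ge 1.7.2 1.6 succeeds while version_ge 3.9.4 3.14 fails.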
1.3 Clone the self-hosted repo
git clone https://github.com/GrandLineZoro/grandline-self-hosted.git
cd grandline-self-hosted
2. DNS: pick your hostname
Same two choices as the other clouds:
- Two-subdomain layout (recommended): dashboard at grandline.yourdomain.com, API at api.grandline.yourdomain.com.
- Single-subdomain layout: dashboard at grandline.yourdomain.com, API under /api. Requires a custom dashboard image build.
The rest of this guide assumes the two-subdomain layout with grandline.acme.com and api.grandline.acme.com.
3. Deploy the dependencies (Terraform)
The Terraform module enables the required APIs, provisions a VPC with a private subnet and secondary IP ranges for pods/services, a GKE cluster (Standard or Autopilot) with Workload Identity enabled, a private Cloud SQL Postgres 16 instance (regional HA + IAM auth), a Memorystore for Redis Standard instance, a private GCS bucket with uniform bucket-level access, a Google Service Account (GSA) for the worker with roles/storage.objectAdmin on the bucket and roles/cloudsql.client, and the Workload Identity binding between the GSA and the Kubernetes grandline service account.
3.1 Copy the tfvars template
cd terraform/gcp
cp terraform.tfvars.example terraform.tfvars
3.2 Edit terraform.tfvars
project_id = "acme-grandline-prod"
region = "us-central1"
name = "grandline"
# Pick a parent domain; it is used for the Cloud DNS zone the module creates
# (or skip and bring your own zone with `existing_dns_zone_name`).
domain_name = "grandline.acme.com"
# GKE control-plane access; lock down for production
master_authorized_networks = [
{ cidr_block = "0.0.0.0/0", display_name = "open" }
]
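The example above leaves the control plane open to 0.0.0.0/0. For production you would typically list single addresses instead. The sketch below prints one list entry in the same shape as the tfvars above. The helper name authorized_network_entry is hypothetical, and it only handles plain IPv4 addresses.

```shell
#!/usr/bin/env bash
# Print one master_authorized_networks entry for a single IPv4 address.
# Hypothetical helper; the field names are taken from the tfvars example.
# Usage: authorized_network_entry <ipv4> <display_name>
authorized_network_entry() {
  local ip=$1 name=$2
  # Reject anything that is not a plain dotted-quad address.
  if ! [[ $ip =~ ^([0-9]{1,3}\.){3}[0-9]{1,3}$ ]]; then
    echo "not an IPv4 address: $ip" >&2
    return 1
  fi
  printf '{ cidr_block = "%s/32", display_name = "%s" }\n' "$ip" "$name"
}
```

Running authorized_network_entry 203.0.113.7 laptop prints an entry you can paste between the brackets of master_authorized_networks (203.0.113.7 is a documentation address; substitute your own egress IP).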
3.3 Apply
terraform init
terraform apply
Apply takes ~12 minutes. Cloud SQL is the slowest (~8 min) because regional HA provisions a standby in a second zone.
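A first apply can fail transiently while newly enabled APIs propagate. If you run the apply unattended, a small retry wrapper saves a manual re-run. This is a generic sketch; the function name retry is hypothetical and nothing in the module requires it.

```shell
#!/usr/bin/env bash
# retry <max_attempts> <delay_seconds> <command...>
# Runs the command until it succeeds or the attempts are used up.
retry() {
  local max=$1 delay=$2 attempt=1
  shift 2
  until "$@"; do
    if (( attempt >= max )); then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$(( attempt + 1 ))
  done
}
```

A possible invocation is retry 3 60 terraform apply -auto-approve. Note that -auto-approve skips the interactive plan confirmation, so review the plan separately first.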
The module enables the required APIs through google_project_service: container, sqladmin, redis, compute, dns, cloudresourcemanager, iam, iamcredentials, storage. If you see "API not enabled" errors on first apply, re-run terraform apply; GCP sometimes takes a minute to propagate activation.
3.4 Grab the outputs
terraform output -raw helm_install_hint > /tmp/grandline-helm-install.sh
chmod +x /tmp/grandline-helm-install.sh
cat /tmp/grandline-helm-install.sh
4. Full configuration reference
Identity
- project_id: GCP project. Required.
- name: resource name prefix. Default: "grandline". Used for cluster name, GSA name, bucket name suffix.
- labels: map applied to every resource that supports labels. Default: {}.
Networking
- region: GCP region. Required. Any region with at least 3 zones.
- vpc_cidr: primary subnet CIDR. Default: 10.80.0.0/20.
- pods_cidr: secondary range for Pods. Default: 10.84.0.0/14.
- services_cidr: secondary range for Services. Default: 10.88.0.0/20.
- master_authorized_networks: who can reach the GKE control plane. Default: [] (private endpoint, no public access). Add entries for your laptop/bastion before rollout.
- enable_private_nodes: default true. Nodes get no public IPs.
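If you override the three CIDR defaults, the ranges must not overlap each other. The following sketch checks two IPv4 CIDRs with plain bash integer arithmetic. The function names are hypothetical, and it does not validate malformed input.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking that two IPv4 CIDR ranges do not overlap.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Print "start end" (inclusive, as integers) for a CIDR such as 10.84.0.0/14.
cidr_range() {
  local ip=${1%/*} prefix=${1#*/} base size start
  base=$(ip_to_int "$ip")
  size=$(( 1 << (32 - prefix) ))
  start=$(( base & ~(size - 1) & 0xFFFFFFFF ))
  echo "$start $(( start + size - 1 ))"
}

# Return 0 if the two CIDRs overlap, 1 otherwise.
cidrs_overlap() {
  local s1 e1 s2 e2
  read -r s1 e1 <<< "$(cidr_range "$1")"
  read -r s2 e2 <<< "$(cidr_range "$2")"
  (( s1 <= e2 && s2 <= e1 ))
}
```

With the defaults, cidrs_overlap 10.84.0.0/14 10.88.0.0/20 returns non-zero (no overlap), because the /14 ends at 10.87.255.255. A value such as 10.86.0.0/16 for services_cidr would overlap the Pod range.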
GKE
- kubernetes_version: default "1.30".
- mode: "Standard" or "Autopilot". Default: "Standard". Autopilot removes node-pool config (skip the node_* vars).
- node_machine_type: default "e2-standard-4". Bump to n2-standard-8 for heavier scan loads.
- node_count / min_count / max_count: defaults 3 / 3 / 6, autoscaling on.
- workload_identity_enabled: default true. Cannot be disabled; the chart requires it.
Data services
- postgres_tier: default "db-custom-2-8192". Cloud SQL Postgres 16.
- postgres_availability_type: default "REGIONAL". Multi-zone failover standby.
- postgres_backup_retention_days: default 7. Bump before org rollout.
- redis_tier: default "STANDARD_HA". Use "BASIC" for dev-only.
- redis_memory_size_gb: default 1.
- gcs_storage_class: default "STANDARD". Replication is always regional multi-zone.
DNS / TLS
- domain_name: primary hostname. Setting this creates a Cloud DNS public zone.
- existing_dns_zone_name: pass an existing Cloud DNS zone name instead.
Module outputs (visible via terraform output):
- cluster_name, cluster_location: for gcloud container clusters get-credentials.
- database_url, database_private_ip.
- redis_url, redis_host.
- gcs_bucket_name.
- worker_service_account_email: the GSA for Workload Identity.
- helm_install_hint.
5. Configure DNS to point at the cluster
5.1 Get cluster credentials
CLUSTER=$(terraform output -raw cluster_name)
LOCATION=$(terraform output -raw cluster_location)
PROJECT=$(terraform output -raw project_id)
gcloud container clusters get-credentials "$CLUSTER" \
--region "$LOCATION" --project "$PROJECT"
kubectl get nodes
5.2 Install an ingress controller
Two common options on GKE:
Option A. GKE Ingress + Google-managed certs (recommended):
# No install needed. GKE Ingress is built in.
# Create a BackendConfig for each service (see values later) and set
# ingress.className = "gce" in Helm values.
Option B. ingress-nginx (simpler, uses a Google Network LB):
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
--namespace ingress-nginx --create-namespace \
--set controller.service.type=LoadBalancer \
--set controller.service.externalTrafficPolicy=Local
5.3 Get the LB address
# ingress-nginx:
kubectl get svc -n ingress-nginx ingress-nginx-controller \
-o jsonpath='{.status.loadBalancer.ingress[0].ip}'
# GCLB (after Ingress is created in step 6):
kubectl -n grandline get ingress grandline \
-o jsonpath='{.status.loadBalancer.ingress[0].ip}'
5.4 Create the DNS records
If Terraform created the Cloud DNS zone, add two A records inside it. Otherwise add them at your provider.
grandline.acme.com A → <LB IP>
api.grandline.acme.com A → <LB IP>
Example with Cloud DNS:
gcloud dns record-sets transaction start --zone=acme-com
gcloud dns record-sets transaction add <LB IP> \
--name=grandline.acme.com. --ttl=60 --type=A --zone=acme-com
gcloud dns record-sets transaction add <LB IP> \
--name=api.grandline.acme.com. --ttl=60 --type=A --zone=acme-com
gcloud dns record-sets transaction execute --zone=acme-com
Verify: dig +short grandline.acme.com.
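With a 60-second TTL the records usually appear quickly, but later steps (certificates in particular) depend on them resolving. A polling sketch, assuming dig is installed; the function names and default timings are hypothetical.

```shell
#!/usr/bin/env bash
# Resolve A records for a name. Kept as its own function so it can be
# swapped for another resolver if `dig` is unavailable.
resolve_a() {
  dig +short A "$1"
}

# wait_for_a_record <name> <expected_ip> [attempts] [delay_seconds]
# Returns 0 once the expected IP appears among the name's A records.
wait_for_a_record() {
  local name=$1 want=$2 attempts=${3:-30} delay=${4:-10} i
  for (( i = 1; i <= attempts; i++ )); do
    if resolve_a "$name" | grep -qxF "$want"; then
      echo "$name resolves to $want"
      return 0
    fi
    if (( i < attempts )); then
      sleep "$delay"
    fi
  done
  echo "$name did not resolve to $want after $attempts attempts" >&2
  return 1
}
```

Call it once per hostname, passing the LB IP from 5.3 as the second argument.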
6. Install the app with Helm
6.1 Generate JWT secrets
openssl rand -base64 32 > access.key
openssl rand -base64 32 > refresh.key
chmod 600 access.key refresh.key
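Before passing the key files to Helm, you may want to confirm that each one decodes to the expected 32 bytes. A minimal sketch with hypothetical helper names, assuming a base64 binary that accepts -d (GNU coreutils and recent macOS):

```shell
#!/usr/bin/env bash
# Print the number of raw bytes encoded in a base64 file.
decoded_bytes() {
  base64 -d < "$1" | wc -c | tr -d '[:space:]'
}

# check_secret_file <file> [min_bytes]
# Returns 0 only if the file decodes to at least min_bytes (default 32).
check_secret_file() {
  local file=$1 min=${2:-32} n
  n=$(decoded_bytes "$file")
  if (( n < min )); then
    echo "$file decodes to $n bytes, expected at least $min" >&2
    return 1
  fi
}
```

For example: check_secret_file access.key && check_secret_file refresh.key.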
6.2 Copy and edit values
cd ../..
cp helm/values.example.yaml my-values.yaml
Edit my-values.yaml. GKE-specific pieces:
serviceAccount:
  name: grandline
  annotations:
    iam.gke.io/gcp-service-account: <worker_service_account_email from terraform>
ingress:
  className: nginx # or "gce" for GKE Ingress + managed certs
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: grandline.acme.com
      paths: [{ path: /, service: dashboard }]
    - host: api.grandline.acme.com
      paths: [{ path: /, service: api }]
  tls:
    - secretName: grandline-tls
      hosts: [grandline.acme.com, api.grandline.acme.com]
publicUrls:
  app: https://grandline.acme.com
  api: https://api.grandline.acme.com
cookieDomain: acme.com
cookieSecure: true
auth:
  mfaRequired: true
  webauthnRpId: grandline.acme.com
  webauthnOrigin: https://grandline.acme.com
s3:
  # GCS is accessed via the S3-compatible interoperability path or
  # native GCS SDK. Chart switches when blobStore.provider: gcs.
  bucket: <gcs_bucket_name>
  endpoint: https://storage.googleapis.com
blobStore:
  provider: gcs
6.3 Run the install
helm install grandline ./helm \
--namespace grandline --create-namespace \
-f my-values.yaml \
--set serviceAccount.name=grandline \
--set-file auth.jwtAccessSecret=./access.key \
--set-file auth.jwtRefreshSecret=./refresh.key \
--set postgres.url="$(terraform -chdir=terraform/gcp output -raw database_url)" \
--set redis.url="$(terraform -chdir=terraform/gcp output -raw redis_url)"
Keep --set serviceAccount.name=grandline. The chart default produces <release>-grandline, which breaks the Workload Identity binding between the KSA and the GSA.
6.4 Watch pods come up
kubectl -n grandline get pods -w
7. SSL / TLS setup
7.1 Option A. cert-manager + Let's Encrypt (recommended with ingress-nginx)
helm repo add jetstack https://charts.jetstack.io
helm install cert-manager jetstack/cert-manager \
--namespace cert-manager --create-namespace \
--set installCRDs=true
cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: <your-email>
    privateKeySecretRef:
      name: letsencrypt-prod-key
    solvers:
      - http01:
          ingress:
            class: nginx
EOF
cert-manager provisions a SAN cert covering both hostnames into the grandline-tls Secret.
7.2 Option B. Google-managed certificates (GKE Ingress)
If you're using GKE Ingress (className: gce), Google manages the cert for you. Create a ManagedCertificate:
cat <<EOF | kubectl apply -f -
apiVersion: networking.gke.io/v1
kind: ManagedCertificate
metadata:
  name: grandline-cert
  namespace: grandline
spec:
  domains:
    - grandline.acme.com
    - api.grandline.acme.com
EOF
Then annotate the Ingress: networking.gke.io/managed-certificates: grandline-cert. Provisioning takes ~15-30 minutes (Google validates domain ownership by DNS; the A record must already point at the LB).
7.3 Option C. Bring your own cert
kubectl create secret tls grandline-tls \
--cert=fullchain.pem --key=privkey.pem \
-n grandline
7.4 Verify
curl -sS -o /dev/null -w "%{http_code}\n" https://grandline.acme.com # 200
curl -sS https://api.grandline.acme.com/healthz # {"ok": true}
curl -sS -o /dev/null -w "%{http_code}\n" http://grandline.acme.com # 301/308
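The three checks above can be wrapped so that a script fails loudly on an unexpected status code. This is a sketch with hypothetical helper names; it uses only the curl flags already shown.

```shell
#!/usr/bin/env bash
# Print the HTTP status code for a URL without following redirects.
http_code() {
  curl -sS -o /dev/null -w '%{http_code}' "$1"
}

# expect_code <url> <allowed_code>...
# Returns 0 if the response code matches any of the allowed codes.
expect_code() {
  local url=$1 got code
  shift
  got=$(http_code "$url") || got="000"
  for code in "$@"; do
    if [ "$got" = "$code" ]; then
      echo "ok   $url -> $got"
      return 0
    fi
  done
  echo "FAIL $url -> $got (wanted: $*)" >&2
  return 1
}
```

For example: expect_code https://grandline.acme.com 200 followed by expect_code http://grandline.acme.com 301 308.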
8. First login & creating users
8.1 Get the bootstrap admin credentials
kubectl get secret grandline-bootstrap-credentials -n grandline \
-o jsonpath='{.data.admin-credentials\.txt}' | base64 -d
8.2 Log in and enrol MFA
Open https://grandline.acme.com, sign in, scan the QR code into an authenticator app, and enter the 6-digit code. Once MFA is enrolled, delete the bootstrap secret:
kubectl delete secret grandline-bootstrap-credentials -n grandline
8.3 Configure email transport
Pick one transport, add it to my-values.yaml, and run helm upgrade grandline ./helm --namespace grandline --reuse-values -f my-values.yaml:
SMTP (works with Google Workspace SMTP relay, SendGrid, Postmark, etc.):
email:
  transport: smtp
  from: "GrandLine <your-sender-address>"
  smtp:
    host: smtp-relay.gmail.com
    port: 587
    user: smtp-user
    existingSecret: grandline-smtp
Resend:
email:
  transport: resend
  from: "GrandLine <your-sender-address>"
  resend:
    existingSecret: grandline-resend
GCP has no first-party transactional email service; use an SMTP relay or a third party.
8.4 Invite users
Dashboard → Access → Users → Invite user. Fill in email, role (Admin/Member/Viewer), send invite. Invitee receives an email with a single-use setup link (48h TTL), sets a password (min 12 chars), signs in, and enrols MFA on first login.
8.5 Roles
- Owner: full admin. Promoted from Admin via Access → Users.
- Admin: add users, manage connectors, change settings.
- Member: view dashboards, generate/download reports.
- Viewer: read-only, no report generation.
8.6 Groups (data scoping)
Access → Groups. Create a Read group, add members, grant specific CloudAccounts. Members/Viewers in the group see only those accounts; Admins/Owners bypass group scope.
8.7 CLI fallback for test accounts
kubectl -n grandline exec -it deploy/grandline-api -- sh -c '
TENANT_ID=<cuid> \
EMAIL=<test-user-email> \
PASSWORD="SomeStrongPass!23" \
ROLE=Admin \
node scripts/create-test-user.js'
8.8 SSO status
SAML, OIDC, and OAuth SSO are not wired yet. Current path is password + TOTP. SSO (including Google Workspace federation) is on the roadmap for Enterprise tier.
9. Connect your AWS / Azure / GCP accounts
9.1 GCP (same cloud as the install, via Workload Identity)
- Dashboard → Connectors → Add connector → GCP. Pick "Workload Identity".
- The connector dialog shows the worker's GSA email (matches terraform output worker_service_account_email).
- In the target project, grant that GSA Viewer + Cloud Asset Viewer:
TARGET=<target-project>
WORKER_GSA=$(terraform -chdir=terraform/gcp output -raw worker_service_account_email)
gcloud projects add-iam-policy-binding "$TARGET" \
--member "serviceAccount:${WORKER_GSA}" \
--role roles/viewer
gcloud projects add-iam-policy-binding "$TARGET" \
--member "serviceAccount:${WORKER_GSA}" \
--role roles/cloudasset.viewer
Alternative (works from anywhere): create a discovery GSA in the target project and upload its JSON key into the connector dialog.
gcloud iam service-accounts create grandline-discovery \
--project "$TARGET" --display-name "GrandLine discovery"
gcloud projects add-iam-policy-binding "$TARGET" \
--member "serviceAccount:grandline-discovery@${TARGET}.iam.gserviceaccount.com" \
--role roles/viewer
gcloud projects add-iam-policy-binding "$TARGET" \
--member "serviceAccount:grandline-discovery@${TARGET}.iam.gserviceaccount.com" \
--role roles/cloudasset.viewer
gcloud iam service-accounts keys create grandline-sa.json \
--iam-account "grandline-discovery@${TARGET}.iam.gserviceaccount.com"
9.2 AWS (cross-cloud from GKE)
Create a cross-account role in the target AWS account. From GKE the worker has no AWS credentials by default, so use one of:
- AWS IAM OIDC federation with the GKE workload identity pool: configure the target role to trust your GSA, then paste the role ARN + generated ExternalId into the AWS connector dialog.
- Static access key: create an IAM user with the discovery policy, then paste key + secret into the connector dialog.
aws iam create-role --role-name GrandLineDiscoveryRole \
--assume-role-policy-document file://trust.json
aws iam attach-role-policy --role-name GrandLineDiscoveryRole \
--policy-arn arn:aws:iam::aws:policy/ReadOnlyAccess
9.3 Azure (cross-cloud from GKE)
az ad sp create-for-rbac \
--name grandline-discovery \
--role Reader \
--scopes /subscriptions/<TARGET_SUBSCRIPTION_ID>
Paste appId / tenant / password into the Azure connector dialog.
10. Verify the install
- curl https://grandline.acme.com returns 200.
- curl https://api.grandline.acme.com/healthz returns {"ok":true}.
- Log in, enrol MFA, reach Overview.
- Invite a second user, they set password + MFA, sign in.
- Add a GCP connector; within 5 minutes the Resources page shows scanned resources.
- Generate a PDF report from Reports.
- kubectl -n grandline get pdb,hpa looks healthy.
11. Troubleshooting
Cloud SQL "connection refused" from the cluster
The Terraform module provisions Cloud SQL with a private IP in the same VPC and wires the VPC-peering for servicenetworking.googleapis.com. Check: gcloud compute addresses list --global --filter="purpose=VPC_PEERING" shows an allocated range; gcloud services vpc-peerings list --network=<vpc> shows servicenetworking-googleapis-com. If not, re-run terraform apply.
Workload Identity "unable to generate access token"
Check the binding between the KSA (grandline/grandline) and the GSA:
gcloud iam service-accounts get-iam-policy \
$(terraform output -raw worker_service_account_email)
It must contain a binding of roles/iam.workloadIdentityUser to serviceAccount:<project>.svc.id.goog[grandline/grandline]. If you overrode the ServiceAccount name, update the binding's member to [<ns>/<sa>].
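The member string is easy to mistype. The sketch below builds it from its parts and wraps the standard gcloud command for adding the binding. The function names are hypothetical. The gcloud subcommand and flags are the regular iam service-accounts add-iam-policy-binding ones.

```shell
#!/usr/bin/env bash
# Build the IAM member string for a Kubernetes service account under
# Workload Identity.
# Usage: wi_member <project_id> <namespace> <ksa_name>
wi_member() {
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$1" "$2" "$3"
}

# Grant roles/iam.workloadIdentityUser on the GSA to the given KSA.
# Usage: bind_workload_identity <gsa_email> <project_id> <namespace> <ksa_name>
bind_workload_identity() {
  gcloud iam service-accounts add-iam-policy-binding "$1" \
    --project "$2" \
    --role roles/iam.workloadIdentityUser \
    --member "$(wi_member "$2" "$3" "$4")"
}
```

For the defaults in this guide, wi_member acme-grandline-prod grandline grandline prints serviceAccount:acme-grandline-prod.svc.id.goog[grandline/grandline].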
GKE Ingress ManagedCertificate stuck in "Provisioning"
Google validates the domain via the A record; if DNS has not propagated yet, provisioning can sit for up to 60 minutes. kubectl describe managedcertificate grandline-cert -n grandline shows the status. If it says FailedNotVisible, the hostname does not resolve to the GCLB IP yet; wait for DNS propagation or fix your A record.
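Rather than re-running describe by hand, you can poll the certificate status. This is a sketch under two assumptions: the ManagedCertificate resource exposes its overall state at .status.certificateStatus, and Active is the terminal success value. The function names and the 60 x 60s default are hypothetical.

```shell
#!/usr/bin/env bash
# Print the overall status of a ManagedCertificate.
# Assumes the status is exposed at .status.certificateStatus.
# Usage: managed_cert_status <name> <namespace>
managed_cert_status() {
  kubectl -n "$2" get managedcertificate "$1" \
    -o jsonpath='{.status.certificateStatus}'
}

# wait_for_managed_cert <name> <namespace> [attempts] [delay_seconds]
# Returns 0 once the status is Active, 1 if the attempts run out.
wait_for_managed_cert() {
  local name=$1 ns=$2 attempts=${3:-60} delay=${4:-60} i status
  for (( i = 1; i <= attempts; i++ )); do
    status=$(managed_cert_status "$name" "$ns") || status=""
    echo "attempt $i: ${status:-unknown}"
    if [ "$status" = "Active" ]; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}
```

For the certificate created in 7.2: wait_for_managed_cert grandline-cert grandline.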
Memorystore connection fails on boot
Memorystore Standard listens on port 6379 on a private IP; TLS (port 6378) is available but off by default on legacy instances. The Terraform module enables TLS, so make sure your redis.url starts with rediss:// (two s's) and targets port 6378. If TLS is disabled, use redis:// on 6379.
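A mismatched scheme and port is the usual cause here. The following sketch checks that a URL uses one of the two pairings described above. The function name check_redis_url is hypothetical, and it assumes no "/" or "@" characters appear inside the password.

```shell
#!/usr/bin/env bash
# check_redis_url <url>
# Returns 0 when scheme and port match one of the two valid pairings:
# rediss:// with 6378 (TLS) or redis:// with 6379 (plaintext).
check_redis_url() {
  local url=$1 scheme rest hostport port
  scheme=${url%%://*}
  rest=${url#*://}
  hostport=${rest%%/*}      # drop any /db suffix
  hostport=${hostport##*@}  # drop any user:password@ prefix
  port=${hostport##*:}
  case "$scheme:$port" in
    rediss:6378|redis:6379)
      return 0 ;;
    *)
      echo "unexpected scheme/port pairing: $scheme on port $port" >&2
      return 1 ;;
  esac
}
```

For example: check_redis_url "$(terraform -chdir=terraform/gcp output -raw redis_url)".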
GCS 403 from worker
The GSA needs roles/storage.objectAdmin on the bucket. The Terraform module grants this by default; confirm with gsutil iam get gs://<bucket> and look for the serviceAccount: binding. Also check that the worker pod is using the annotated KSA: kubectl -n grandline get pod <name> -o yaml | grep serviceAccount.
Pod stuck in ImagePullBackOff
GHCR is public but private clusters without Cloud NAT cannot reach it. Confirm: gcloud compute routers list --filter="region=$LOCATION" shows a NAT router. The Terraform module provisions one by default; if you disabled private nodes, it's not needed.
Invitee sees "Token invalid"
The token expired (48h), was already used, or was invalidated by a re-invite. Re-send the invite from Access → Users → [user] → Re-invite.