Deploy GrandLine on Azure AKS
1. Before you start
1.1 What you'll end up with
A production-grade GrandLine install running on AKS in one Azure region, reachable at https://grandline.yourdomain.com, backed by zone-redundant Flexible Server Postgres 16, Azure Cache for Redis (Standard), and a Blob Storage account (ZRS, AAD-auth-only). Total baseline cost ~$380/mo in East US.
1.2 Accounts, tools, and access
On your laptop:
- az CLI v2.60+ (az version)
- terraform ≥ 1.6 (terraform version)
- kubectl
- helm ≥ 3.14
- openssl for generating secrets
- git for cloning the self-hosted repo
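The laptop tooling can be checked in one pass before going further. The following is a minimal sketch, not part of the GrandLine repo: missing_tools is a hypothetical helper name, and it only checks that each CLI is on PATH, not that it meets the minimum version.

```shell
# Hypothetical helper: print every required CLI that is not on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
  _rc=0
  for _tool in "$@"; do
    if ! command -v "$_tool" >/dev/null 2>&1; then
      printf '%s\n' "$_tool"
      _rc=1
    fi
  done
  return "$_rc"
}

# Usage with the tools this guide relies on:
#   missing_tools az terraform kubectl helm openssl git || echo "install the tools printed above"
```

Version checks are left to az version and terraform version, since output formats differ per tool.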
On Azure:
- An Azure subscription you can deploy into with Owner on the subscription (Terraform creates role assignments and Key Vault access policies).
- az login and az account set --subscription <SUBSCRIPTION_ID>. az account show returns the intended subscription.
- Enough quota for one VNet, one AKS cluster (3 system nodes), one Flexible Server Postgres, one Azure Cache for Redis Standard instance, and one Storage Account. All of these fit within the default quotas of a fresh subscription.
1.3 Clone the self-hosted repo
git clone https://github.com/GrandLineZoro/grandline-self-hosted.git
cd grandline-self-hosted
2. DNS: pick your hostname
Same two choices as the other clouds:
- Two-subdomain layout (recommended): dashboard at grandline.yourdomain.com, API at api.grandline.yourdomain.com.
- Single-subdomain layout: dashboard at grandline.yourdomain.com, API under /api. Requires a custom dashboard image build.
The rest of this guide assumes the two-subdomain layout with grandline.acme.com and api.grandline.acme.com.
3. Deploy the dependencies (Terraform)
The Terraform module provisions VNet + AKS subnet + delegated Postgres subnet, AKS 1.30 with a system node pool, Postgres Flexible Server 16 (zone-redundant HA), Azure Cache for Redis Standard (TLS 6380), a Storage Account with Blob container, a User-Assigned Managed Identity (UAMI) for the worker with a federated credential trusting the AKS OIDC issuer, and a Key Vault for secrets.
3.1 Copy the tfvars template
cd terraform/azure
cp terraform.tfvars.example terraform.tfvars
3.2 Edit terraform.tfvars
name = "grandline"
location = "eastus"
# Pick a parent domain; it is used for the DNS zone the module creates
# (or skip and bring your own zone with `existing_dns_zone_id`).
domain_name = "grandline.acme.com"
# AKS API-server access; lock down for production
aks_authorized_ip_ranges = ["0.0.0.0/0"]
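For production, the open 0.0.0.0/0 range would normally be replaced with the addresses your operators connect from. As a rough sketch, assuming you already know those egress IPs, a small helper (authorized_ranges_line is a made-up name, and the addresses below are documentation examples) can print the tfvars line for you:

```shell
# Hypothetical helper: print a terraform.tfvars line that restricts the
# AKS API server to the given IPv4 addresses (each one becomes a /32).
authorized_ranges_line() {
  _list=""
  for _ip in "$@"; do
    _list="${_list:+$_list, }\"$_ip/32\""
  done
  printf 'aks_authorized_ip_ranges = [%s]\n' "$_list"
}

# Example addresses from the documentation ranges; substitute your own.
authorized_ranges_line 203.0.113.10 198.51.100.7
# prints: aks_authorized_ip_ranges = ["203.0.113.10/32", "198.51.100.7/32"]
```

Paste the printed line over the default in terraform.tfvars before running the apply.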
3.3 Apply
terraform init
terraform apply
Apply takes ~18 minutes. Flexible Server is the slowest.
Note: the module requires azurerm ~> 4.0. If you previously ran an older revision on 3.x, run rm .terraform.lock.hcl && terraform init -upgrade before the apply.
3.4 Grab the outputs
terraform output -raw helm_install_hint > /tmp/grandline-helm-install.sh
chmod +x /tmp/grandline-helm-install.sh
cat /tmp/grandline-helm-install.sh
4. Full configuration reference
Identity
- name: resource name prefix. Default: "grandline". Used for resource-group name, cluster name, UAMI name.
- tags: map applied to every resource. Default: {}.
Networking
- location: Azure region. Required. Any region with Availability Zones.
- vnet_cidr: CIDR for the new VNet. Default: 10.70.0.0/16.
- aks_authorized_ip_ranges: who can reach the AKS API server. Default: ["0.0.0.0/0"]. Lock this down before org rollout.
AKS
- kubernetes_version: default "1.30".
- node_vm_size: default "Standard_D4s_v5". Use Standard_D8s_v5 for heavier scan loads.
- node_count / min_count / max_count: defaults 3 / 3 / 6, autoscaling on.
- workload_identity_enabled: default true. Enables the OIDC issuer + federated credential path.
Data services
- postgres_sku: default "GP_Standard_D2s_v3". Flexible Server 16.
- postgres_backup_retention_days: default 7. Bump before org rollout.
- postgres_high_availability: default "ZoneRedundant". Multi-AZ failover target.
- redis_sku_name: default "Standard". Use "Premium" if you need VNet injection.
- redis_capacity: default 1 (1 GB C1 node).
- storage_account_tier: default "Standard". Replication is always ZRS.
DNS / TLS
- domain_name: primary hostname. Setting this creates an Azure DNS zone.
- existing_dns_zone_id: pass an existing zone ID instead of creating one.
Module outputs (visible via terraform output):
- cluster_name, resource_group_name: for az aks get-credentials.
- database_url_template, database_password_secret_id: Key Vault reference.
- redis_url_template, redis_auth_secret_id.
- storage_account_name, storage_container_name.
- worker_identity_client_id, tenant_id.
- helm_install_hint.
5. Configure DNS to point at the cluster
5.1 Get cluster credentials
RG=$(terraform output -raw resource_group_name)
CLUSTER=$(terraform output -raw cluster_name)
az aks get-credentials --resource-group "$RG" --name "$CLUSTER"
kubectl get nodes
5.2 Install an ingress controller
Two common options on AKS:
Option A. Application Gateway Ingress Controller (AGIC) (best if you need Azure-native WAF and TLS offload):
az aks enable-addons -a ingress-appgw \
-g "$RG" -n "$CLUSTER" \
--appgw-name grandline-agw \
--appgw-subnet-cidr 10.70.2.0/24
Option B. ingress-nginx (simpler, uses an Azure Standard LB):
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
--namespace ingress-nginx --create-namespace \
--set controller.service.type=LoadBalancer \
--set controller.service.externalTrafficPolicy=Local
5.3 Get the LB address
# ingress-nginx:
kubectl get svc -n ingress-nginx ingress-nginx-controller \
-o jsonpath='{.status.loadBalancer.ingress[0].ip}'
# AGIC: the Application Gateway's public IP is on the az resource:
az network public-ip show -g "$RG" -n grandline-agw-pip \
--query ipAddress -o tsv
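Azure can take a minute or two to attach a public IP to a new LoadBalancer Service, so the jsonpath query may come back empty at first. A minimal polling sketch for the ingress-nginx case follows; wait_for_lb_ip is a hypothetical helper, and the attempt count and delay are arbitrary defaults.

```shell
# Hypothetical helper: poll the ingress-nginx Service until a public IP
# is assigned, then print it.
# Arguments: max attempts (default 30), seconds between attempts (default 10).
wait_for_lb_ip() {
  _tries="${1:-30}"
  _delay="${2:-10}"
  _i=0
  while [ "$_i" -lt "$_tries" ]; do
    _ip=$(kubectl get svc -n ingress-nginx ingress-nginx-controller \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null) || _ip=""
    if [ -n "$_ip" ]; then
      printf '%s\n' "$_ip"
      return 0
    fi
    _i=$((_i + 1))
    sleep "$_delay"
  done
  echo "no load balancer IP after $_tries attempts" >&2
  return 1
}

# Usage: LB_IP=$(wait_for_lb_ip) || exit 1
```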
5.4 Create the DNS records
If Terraform created the Azure DNS zone, add two A records inside it. Otherwise add them at your provider.
grandline.acme.com A → <LB IP>
api.grandline.acme.com A → <LB IP>
Example with Azure DNS:
az network dns record-set a add-record \
-g "$RG" -z acme.com \
-n grandline --ipv4-address <LB IP>
az network dns record-set a add-record \
-g "$RG" -z acme.com \
-n api.grandline --ipv4-address <LB IP>
Verify: dig +short grandline.acme.com.
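The dig check can be extended to cover both hostnames and compare the answer against the load balancer IP. This is only a sketch: dns_points_to is a hypothetical helper, and it assumes each record resolves directly to a single A record.

```shell
# Hypothetical helper: return 0 when every hostname resolves to the
# expected IPv4 address. First argument is the IP, the rest are hostnames.
dns_points_to() {
  _want="$1"
  shift
  for _host in "$@"; do
    if ! dig +short "$_host" | grep -qxF "$_want"; then
      echo "$_host does not resolve to $_want yet" >&2
      return 1
    fi
  done
}

# Usage: dns_points_to "<LB IP>" grandline.acme.com api.grandline.acme.com
```

Propagation can lag behind the record change, so a failure here right after creating the records is not necessarily a misconfiguration.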
6. Install the app with Helm
6.1 Generate JWT secrets
openssl rand -base64 32 > access.key
openssl rand -base64 32 > refresh.key
chmod 600 access.key refresh.key
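If you want a quick sanity check before passing the files to Helm, the sketch below confirms each file holds the 44 base64 characters that openssl rand -base64 32 emits for 32 random bytes. check_jwt_key is a hypothetical helper; it checks length only, not randomness.

```shell
# Hypothetical helper: succeed only if the file contains exactly 44
# characters once newlines are stripped (base64 of 32 bytes).
check_jwt_key() {
  _chars=$(tr -d '\n' < "$1" | wc -c)
  [ "$((_chars))" -eq 44 ]
}

# Usage: check_jwt_key access.key && check_jwt_key refresh.key
```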
6.2 Copy and edit values
cd ../..
cp helm/values.example.yaml my-values.yaml
Edit my-values.yaml. AKS-specific pieces:
serviceAccount:
  name: grandline
  annotations:
    azure.workload.identity/client-id: <worker_identity_client_id from terraform>
ingress:
  className: nginx # or "azure-application-gateway" for AGIC
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: grandline.acme.com
      paths: [{ path: /, service: dashboard }]
    - host: api.grandline.acme.com
      paths: [{ path: /, service: api }]
  tls:
    - secretName: grandline-tls
      hosts: [grandline.acme.com, api.grandline.acme.com]
publicUrls:
  app: https://grandline.acme.com
  api: https://api.grandline.acme.com
cookieDomain: acme.com
cookieSecure: true
auth:
  mfaRequired: true
  webauthnRpId: grandline.acme.com
  webauthnOrigin: https://grandline.acme.com
s3:
  # Blob storage is accessed via the S3-compatible gateway path.
  # The chart maps the s3.* block to Blob when blobStore.provider: azure.
  bucket: <storage_container_name>
  endpoint: https://<storage_account_name>.blob.core.windows.net
blobStore:
  provider: azure
6.3 Run the install
helm install grandline ./helm \
--namespace grandline --create-namespace \
-f my-values.yaml \
--set serviceAccount.name=grandline \
--set-file auth.jwtAccessSecret=./access.key \
--set-file auth.jwtRefreshSecret=./refresh.key \
--set postgres.url="$(terraform -chdir=terraform/azure output -raw database_url)" \
--set redis.url="$(terraform -chdir=terraform/azure output -raw redis_url)"
Keep --set serviceAccount.name=grandline. The chart default produces <release>-grandline, which breaks Workload Identity federation.
6.4 Watch pods come up
kubectl -n grandline get pods -w
7. SSL / TLS setup
7.1 Option A. cert-manager + Let's Encrypt (recommended with ingress-nginx)
helm repo add jetstack https://charts.jetstack.io
helm install cert-manager jetstack/cert-manager \
--namespace cert-manager --create-namespace \
--set installCRDs=true
cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: <YOUR_EMAIL>
    privateKeySecretRef:
      name: letsencrypt-prod-key
    solvers:
      - http01:
          ingress:
            class: nginx
EOF
cert-manager provisions a SAN cert covering both hostnames into the grandline-tls Secret.
7.2 Option B. App Gateway with Key Vault cert (AGIC)
If you're using AGIC, store a PFX/PEM cert in Key Vault and reference it:
appgw.ingress.kubernetes.io/appgw-ssl-certificate: <key-vault-cert-name>
AGIC then fetches the cert from Key Vault at runtime. No cert-manager needed.
7.3 Option C. Bring your own cert
kubectl create secret tls grandline-tls \
--cert=fullchain.pem --key=privkey.pem \
-n grandline
7.4 Verify
curl -sS -o /dev/null -w "%{http_code}\n" https://grandline.acme.com # 200
curl -sS https://api.grandline.acme.com/healthz # {"ok": true}
curl -sS -o /dev/null -w "%{http_code}\n" http://grandline.acme.com # 301/308
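The three checks above can be wrapped so that a failure is reported explicitly instead of being read off by eye. A minimal sketch, where expect_status is a hypothetical helper name:

```shell
# Hypothetical helper: fetch a URL and succeed if the HTTP status is one
# of the expected codes. First argument is the URL, the rest are codes.
expect_status() {
  _url="$1"
  shift
  _code=$(curl -sS -o /dev/null -w '%{http_code}' "$_url") || _code="000"
  for _ok in "$@"; do
    if [ "$_code" = "$_ok" ]; then
      return 0
    fi
  done
  echo "$_url returned $_code, expected one of: $*" >&2
  return 1
}

# Usage:
#   expect_status https://grandline.acme.com 200
#   expect_status http://grandline.acme.com 301 308
```

A code of 000 means curl could not complete the request at all, which usually points at DNS or the certificate rather than the app.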
8. First login & creating users
8.1 Get the bootstrap admin credentials
kubectl get secret grandline-bootstrap-credentials -n grandline \
-o jsonpath='{.data.admin-credentials\.txt}' | base64 -d
8.2 Log in and enrol MFA
Open https://grandline.acme.com, sign in, scan the QR code into an authenticator app, and enter the 6-digit code. Once MFA is enrolled, delete the bootstrap secret:
kubectl delete secret grandline-bootstrap-credentials -n grandline
8.3 Configure email transport
Pick one, add it to my-values.yaml, and apply it with helm upgrade grandline ./helm -n grandline --reuse-values -f my-values.yaml:
SMTP (works with Office 365, SendGrid, Postmark, etc.):
email:
  transport: smtp
  from: "GrandLine <your-sender-address>"
  smtp:
    host: smtp.office365.com
    port: 587
    user: smtp-user
    existingSecret: grandline-smtp
Azure Communication Services: use the SMTP relay endpoint with the connection-string credentials.
Resend:
email:
  transport: resend
  from: "GrandLine <your-sender-address>"
  resend:
    existingSecret: grandline-resend
8.4 Invite users
Dashboard → Access → Users → Invite user. Fill in email, role (Admin/Member/Viewer), send invite. Invitee receives an email with a single-use setup link (48h TTL), sets a password (min 12 chars), signs in, and enrols MFA on first login.
8.5 Roles
- Owner: full admin. Promoted from Admin via Access → Users.
- Admin: add users, manage connectors, change settings.
- Member: view dashboards, generate/download reports.
- Viewer: read-only, no report generation.
8.6 Groups (data scoping)
Access → Groups. Create a Read group, add members, grant specific CloudAccounts. Members/Viewers in the group see only those accounts; Admins/Owners bypass group scope.
8.7 CLI fallback for test accounts
kubectl -n grandline exec -it deploy/grandline-api -- sh -c '
TENANT_ID=<cuid> \
[email protected] \
PASSWORD="SomeStrongPass!23" \
ROLE=Admin \
node scripts/create-test-user.js'
8.8 SSO status
SAML, OIDC, and OAuth SSO are not wired yet. Current path is password + TOTP. SSO is on the roadmap for Enterprise tier.
9. Connect your AWS / Azure / GCP accounts
9.1 Azure (same cloud as the install, Workload Identity)
- Dashboard → Connectors → Add connector → Azure. Pick "Workload Identity".
- The connector dialog shows the worker's UAMI client ID and tenant ID (matches terraform output worker_identity_client_id).
- In the target subscription, grant that UAMI Reader:
az role assignment create \
--assignee <worker_identity_client_id> \
--role Reader \
--scope /subscriptions/<TARGET_SUBSCRIPTION_ID>
Alternative (works from anywhere): Service Principal path.
az ad sp create-for-rbac \
--name grandline-discovery \
--role Reader \
--scopes /subscriptions/<TARGET_SUBSCRIPTION_ID>
Paste appId / tenant / password into the connector dialog.
9.2 AWS (cross-cloud from AKS)
Create a cross-account role trusted by an AWS principal you control. From AKS the worker does not have AWS credentials, so use static-key or OIDC federation:
aws iam create-role --role-name GrandLineDiscoveryRole \
--assume-role-policy-document file://trust.json
aws iam attach-role-policy --role-name GrandLineDiscoveryRole \
--policy-arn arn:aws:iam::aws:policy/ReadOnlyAccess
Paste the role ARN + generated ExternalId into the AWS connector dialog.
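The trust.json passed to create-role is not shown in this guide. A minimal sketch of what it might contain follows, assuming the role is assumed by an IAM principal you control and gated on the ExternalId from the connector dialog. write_trust_policy is a hypothetical helper, and the principal ARN in the usage line is a placeholder, not a value GrandLine provides.

```shell
# Hypothetical helper: print an IAM trust policy that lets one principal
# assume the role, but only when it presents the given ExternalId.
# Arguments: principal ARN, ExternalId.
write_trust_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "$1" },
      "Action": "sts:AssumeRole",
      "Condition": { "StringEquals": { "sts:ExternalId": "$2" } }
    }
  ]
}
EOF
}

# Usage (placeholder ARN and ExternalId):
#   write_trust_policy "arn:aws:iam::111122223333:user/grandline-worker" "<ExternalId>" > trust.json
```

If you use OIDC federation instead of a static-key principal, the Principal and Action differ (a Federated principal with sts:AssumeRoleWithWebIdentity), so treat this as the static-key variant only.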
9.3 GCP (cross-cloud from AKS)
PROJECT_ID=<your-project>
gcloud iam service-accounts create grandline-discovery \
--project "$PROJECT_ID" --display-name "GrandLine discovery"
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
--member "serviceAccount:grandline-discovery@${PROJECT_ID}.iam.gserviceaccount.com" \
--role roles/viewer
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
--member "serviceAccount:grandline-discovery@${PROJECT_ID}.iam.gserviceaccount.com" \
--role roles/cloudasset.viewer
gcloud iam service-accounts keys create grandline-sa.json \
--iam-account "grandline-discovery@${PROJECT_ID}.iam.gserviceaccount.com"
10. Verify the install
- curl https://grandline.acme.com returns 200.
- curl https://api.grandline.acme.com/healthz returns {"ok":true}.
- Log in, enrol MFA, reach Overview.
- Invite a second user, they set password + MFA, sign in.
- Add an Azure connector; within 5 minutes the Resources page shows scanned resources.
- Generate a PDF report from Reports.
- kubectl -n grandline get pdb,hpa reports healthy.
11. Troubleshooting
Flexible Server connection refused
Confirm the delegated subnet is correctly attached. az postgres flexible-server show --resource-group <rg> --name <name> --query network should show the subnet ID from the Terraform output. Also check that the NSG on the delegated Postgres subnet allows 5432 from the AKS subnet.
Workload Identity federation fails with "AADSTS700213"
The federated credential's subject must be system:serviceaccount:grandline:grandline. If you overrode the ServiceAccount name, update the federated credential: az identity federated-credential update --identity-name ... --subject system:serviceaccount:<ns>:<sa>.
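The subject string is easy to get subtly wrong, so it can help to build it from the namespace and ServiceAccount name rather than typing it. A small sketch, where wi_subject is a hypothetical helper:

```shell
# Hypothetical helper: build the Workload Identity subject for a
# namespace and ServiceAccount name.
wi_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

wi_subject grandline grandline
# prints: system:serviceaccount:grandline:grandline
```

Compare the result with the subject stored on the UAMI's federated credential; any difference in namespace or ServiceAccount name is enough to make the token exchange fail.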
AGIC throws 502
AGIC's health probe requests "/" by default. Set the Ingress annotation appgw.ingress.kubernetes.io/health-probe-path: /healthz on the API Ingress, or add an explicit readinessProbe endpoint.
Redis connection fails on boot
Azure Cache for Redis enforces TLS on port 6380, and plaintext access is disabled by default (we don't enable it). Ensure redis.url starts with rediss:// (two s's). If Terraform emitted redis://, that's a bug; please file it.
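The scheme check can be done before the install rather than after a failed boot. A minimal sketch, with redis_url_uses_tls as a hypothetical helper that inspects only the URL scheme:

```shell
# Hypothetical helper: succeed only if the URL uses the TLS scheme
# (rediss://, two s's) that Azure Cache for Redis requires.
redis_url_uses_tls() {
  case "$1" in
    rediss://*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: redis_url_uses_tls "$REDIS_URL" || echo "redis.url must start with rediss://"
```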
Storage Account 403 from worker
The UAMI needs Storage Blob Data Contributor on the container. The Terraform module grants this by default. If you see 403s, confirm the role assignment landed (az role assignment list --assignee <client_id> --scope /subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<sa>).
Invitee sees "Token invalid"
The token expired (48h), was already used, or was invalidated by a re-invite. Re-send the invite from Access → Users → [user] → Re-invite.