Deploy GrandLine on AWS EKS
1. Before you start
1.1 What you'll end up with
A production-grade GrandLine install running on EKS in one AWS region, reachable at https://grandline.yourdomain.com, backed by multi-AZ Aurora Postgres, multi-AZ ElastiCache Redis, and an encrypted S3 bucket. Total baseline cost ~$450/mo in us-east-1.
1.2 Accounts, tools, and access
On your laptop:
- aws CLI v2 (aws --version prints 2.x)
- terraform ≥ 1.6 (terraform version)
- kubectl (kubectl version --client)
- helm ≥ 3.14 (helm version)
- openssl for generating secrets
- git for cloning the self-hosted repo
On AWS:
- An AWS account you can deploy into with admin-equivalent permissions (the Terraform module creates VPC, EKS, RDS, ElastiCache, S3, IAM, KMS, Route53, ACM).
- An AWS CLI profile authenticated to that account: aws sts get-caller-identity should return it.
- Enough service quota for one VPC, one EKS cluster (3 nodes), one RDS Aurora cluster (2 instances), one ElastiCache Redis replication group (2 nodes). These are all default quotas on a fresh account.
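The tool list above can be checked in one pass. The following is a minimal sketch, not part of the GrandLine repo: missing_tools and version_ge are hypothetical helper names, and version_ge assumes a sort that supports -V (GNU coreutils and recent macOS do).

```shell
#!/bin/sh
# Preflight sketch: report which required CLI tools are not on PATH.

# Print the name of every argument that is not an executable on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Succeed if version $1 is at least version $2 (assumes `sort -V` exists).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

missing="$(missing_tools aws terraform kubectl helm openssl git)"
if [ -n "$missing" ]; then
  printf 'Missing tools:\n%s\n' "$missing"
else
  echo "All required tools are on PATH."
fi
```

version_ge can then gate on the minimums listed above, for example version_ge "$installed_terraform" 1.6, where installed_terraform is a hypothetical variable holding the version string you parsed from terraform version.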
1.3 Clone the self-hosted repo
git clone https://github.com/GrandLineZoro/grandline-self-hosted.git
cd grandline-self-hosted
You'll spend the rest of this guide inside that repo. The Terraform module lives in terraform/aws/ and the Helm chart lives in helm/.
2. DNS: pick your hostname
GrandLine needs a hostname you control. The chart supports two layouts:
2.1 Two-subdomain layout (recommended)
Dashboard at grandline.yourdomain.com, API at api.grandline.yourdomain.com. Cleanest session-cookie and CORS setup, shares one TLS SAN cert. This is what the rest of this guide assumes.
2.2 Single-subdomain layout
Dashboard at grandline.yourdomain.com, API served under /api. One DNS record, one TLS SAN. Requires the dashboard image to be built with NEXT_PUBLIC_API_URL=/api. Use this only if you're comfortable building custom images.
For the rest of this guide we'll use grandline.acme.com and api.grandline.acme.com as placeholders; substitute your own domain.
3. Deploy the dependencies (Terraform)
The Terraform module provisions everything the chart expects: VPC + 3-AZ subnets, EKS 1.30 cluster with a managed node group, Aurora Postgres 16 (multi-AZ), ElastiCache Redis 7 (multi-AZ, TLS), S3 bucket (KMS + Object Lock), IRSA role for the worker, and, optionally, a Route53 zone with an ACM wildcard cert.
3.1 Copy the tfvars template
cd terraform/aws
cp terraform.tfvars.example terraform.tfvars
3.2 Edit terraform.tfvars
The minimum fields to fill in:
name = "grandline"
region = "us-east-1"
domain_name = "grandline.acme.com" # triggers Route53 + ACM wildcard
# Any extra SAN you want on the cert (api host):
extra_certificate_sans = ["api.grandline.acme.com"]
# EKS endpoint public access: lock this down before org rollout.
# For the tester-phase install, leaving it open is acceptable.
cluster_endpoint_public_access_cidrs = ["0.0.0.0/0"]
Full configuration reference is in section 4.
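Before applying, it can help to confirm that none of the minimum fields were left commented out. This is a rough sketch under the assumption that each setting sits on its own line as key = value; tfvars_unset_keys is a hypothetical helper, not something the repo ships.

```shell
#!/bin/sh
# Print each key that has no uncommented `key = ...` line in the given file.
tfvars_unset_keys() {
  file="$1"; shift
  for key in "$@"; do
    grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file" || printf '%s\n' "$key"
  done
}

# Only runs when a terraform.tfvars exists in the current directory.
if [ -f terraform.tfvars ]; then
  tfvars_unset_keys terraform.tfvars \
    name region domain_name extra_certificate_sans \
    cluster_endpoint_public_access_cidrs
fi
```

Any key it prints is one to revisit in terraform.tfvars; no output means all five are set.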
3.3 Apply
terraform init
terraform apply # review the plan, type "yes" to confirm
Apply takes ~20 minutes. Aurora and ACM are the slowest.
3.4 Grab the outputs
terraform output -raw helm_install_hint > /tmp/grandline-helm-install.sh
chmod +x /tmp/grandline-helm-install.sh
cat /tmp/grandline-helm-install.sh # review before running
This is a ready-to-paste helm install command with every --set already filled in: database URL, Redis URL, S3 bucket, IRSA role ARN, service account name. You'll run it in section 6.
Note: by default Terraform keeps its state in a local file (terraform.tfstate). Move it to an S3 backend before rotating anyone else into the install: terraform init -migrate-state -backend-config="bucket=…".
4. Full configuration reference
Every variable the Terraform module accepts, with defaults and when you'd change them. You set these in terraform.tfvars.
Identity
- name: resource name prefix. Default: "grandline". Used in VPC name, cluster name, IAM role names.
- tags: map of tags applied to every resource. Default: {}. Add cost-allocation tags here.
Networking
- region: AWS region. Required. Any region with at least 3 AZs.
- vpc_cidr: CIDR for the new VPC. Default: 10.64.0.0/16. Change if it overlaps with peered VPCs.
- cluster_endpoint_public_access_cidrs: who can reach the EKS API. Default: ["0.0.0.0/0"]. Lock this down for production (see section 7).
EKS
- eks_version: Kubernetes version. Default: "1.30".
- node_instance_types: default ["m6i.large"]. Use m6i.xlarge if you're planning to scan hundreds of accounts.
- node_desired_size / node_min_size / node_max_size: defaults 3 / 3 / 6.
Data services
- postgres_instance_class: default "db.r6g.large". Aurora Postgres 16.
- postgres_backup_retention_days: default 7. Bump to 30 before org rollout.
- redis_node_type: default "cache.m7g.large". Redis 7 with TLS enforced.
- s3_object_lock_days: default 30. Report-object retention floor.
DNS / TLS (optional)
- domain_name: primary hostname. Setting this triggers Route53 zone + ACM wildcard provisioning.
- extra_certificate_sans: additional SANs on the ACM cert. Include your API hostname here.
- existing_route53_zone_id: if you already have a Route53 hosted zone for the parent domain, pass its ID instead of letting the module create one.
The module emits the following outputs (shown by terraform output):
- cluster_name, cluster_endpoint: for aws eks update-kubeconfig.
- database_url: full Postgres URL including TLS requirement. Passed to Helm.
- redis_url: rediss:// URL with AUTH token.
- s3_bucket, s3_kms_key_arn.
- worker_iam_role_arn: the IRSA role the worker assumes.
- tls_certificate_arn: ACM cert ARN (if domain_name was set).
- helm_install_hint: a ready-to-paste helm install.
5. Configure DNS to point at the cluster
Two DNS records, both pointing at your ingress load balancer.
5.1 Get the cluster kubeconfig
aws eks update-kubeconfig \
--region us-east-1 \
--name $(terraform output -raw cluster_name)
kubectl get nodes # sanity check: should show 3 Ready nodes
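If you want the sanity check to be scriptable rather than eyeballed, one option is to count the Ready nodes from the plain-text listing. This is a small sketch; count_ready_nodes is a hypothetical helper, and it only counts nodes whose STATUS column is exactly Ready (a cordoned node showing Ready,SchedulingDisabled is not counted).

```shell
#!/bin/sh
# Read `kubectl get nodes --no-headers` output on stdin and print how many
# rows have a STATUS column (second field) equal to "Ready".
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Usage (needs cluster access):
#   kubectl get nodes --no-headers | count_ready_nodes
```

With the default node group sizes in this guide the expected count is 3. kubectl wait --for=condition=Ready nodes --all --timeout=5m is an alternative that blocks until every node reports Ready.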
5.2 Install an ingress controller
The Terraform module does NOT install an ingress controller; you pick. For EKS the two common choices are the AWS Load Balancer Controller (creates ALBs) or ingress-nginx (creates an NLB).
Option A: AWS Load Balancer Controller (recommended on EKS, uses the ACM cert directly):
helm repo add eks https://aws.github.io/eks-charts
helm repo update
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
--namespace kube-system \
--set clusterName=$(terraform output -raw cluster_name) \
--set serviceAccount.create=true \
--set serviceAccount.name=aws-load-balancer-controller
Option B: ingress-nginx (if you prefer L4 + cert-manager):
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
--namespace ingress-nginx --create-namespace \
--set controller.service.type=LoadBalancer
5.3 Get the load balancer DNS name
# For ingress-nginx:
kubectl get svc -n ingress-nginx ingress-nginx-controller \
-o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
# For AWS LB Controller you won't see an LB until Helm installs the app (section 6)
# and creates the Ingress. Come back here after that step.
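The load balancer hostname usually takes a little while to appear after the Service or Ingress is created, so the lookup above may print nothing at first. A generic polling helper is one way to handle that; retry_until_output is a hypothetical name and the attempt count and delay are arbitrary choices.

```shell
#!/bin/sh
# Run a command repeatedly until it prints something, or give up.
# $1 = max attempts, $2 = seconds to sleep between attempts, rest = command.
retry_until_output() {
  attempts="$1"; delay="$2"; shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if [ "$i" -gt 0 ]; then sleep "$delay"; fi
    out="$("$@" 2>/dev/null)"
    if [ -n "$out" ]; then
      printf '%s\n' "$out"
      return 0
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage (ingress-nginx case, needs cluster access):
#   retry_until_output 30 10 kubectl get svc -n ingress-nginx \
#     ingress-nginx-controller \
#     -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
```

The function returns non-zero if the command never produces output, which makes it easy to stop a script before creating DNS records that point at nothing.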
5.4 Create the DNS records
In Route53 (or your DNS provider), create two ALIAS records (or CNAME if your provider doesn't support ALIAS):
grandline.acme.com ALIAS → <your LB hostname>
api.grandline.acme.com ALIAS → <your LB hostname>
If you set domain_name in the Terraform module and let it create the Route53 zone, the zone already exists; just add these two records inside it.
Verify with dig:
dig +short grandline.acme.com
dig +short api.grandline.acme.com
6. Install the app with Helm
6.1 Generate JWT secrets
The API signs access and refresh tokens with symmetric keys you provide. Generate them on your laptop and never commit them:
openssl rand -base64 32 > access.key
openssl rand -base64 32 > refresh.key
chmod 600 access.key refresh.key
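A quick check that the two key files are what you expect (32 random bytes each, and not copies of each other) can catch copy-paste mistakes before the install. This is a sketch with hypothetical helper names; it assumes a base64 command that accepts -d (GNU coreutils; older macOS releases use -D instead).

```shell
#!/bin/sh
# Print how many raw bytes a base64-encoded key file decodes to.
key_byte_count() {
  base64 -d < "$1" | wc -c | tr -d ' '
}

# Succeed only if both files decode to 32 bytes and differ from each other.
check_jwt_keys() {
  for f in "$1" "$2"; do
    if [ "$(key_byte_count "$f")" != "32" ]; then
      echo "$f: expected 32 decoded bytes" >&2
      return 1
    fi
  done
  if cmp -s "$1" "$2"; then
    echo "access and refresh keys are identical" >&2
    return 1
  fi
  return 0
}

# Usage:
#   check_jwt_keys access.key refresh.key && echo "keys look fine"
```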
6.2 Copy and edit the Helm values
cd ../../ # back to repo root
cp helm/values.example.yaml my-values.yaml
Open my-values.yaml and replace every yourco.example / grandline.yourco.example with your real hostname. Specifically:
ingress:
  className: alb # or "nginx" if you installed ingress-nginx
  annotations:
    # For AWS Load Balancer Controller:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS": 443}]'
    alb.ingress.kubernetes.io/certificate-arn: <ACM-ARN-from-terraform-output>
    alb.ingress.kubernetes.io/ssl-redirect: '443'
    # Drop the cert-manager annotation when using ALB + ACM.
  hosts:
    - host: grandline.acme.com
      paths: [{ path: /, service: dashboard }]
    - host: api.grandline.acme.com
      paths: [{ path: /, service: api }]
  tls:
    - secretName: grandline-tls
      hosts: [grandline.acme.com, api.grandline.acme.com]
publicUrls:
  app: https://grandline.acme.com
  api: https://api.grandline.acme.com
cookieDomain: acme.com
cookieSecure: true
auth:
  mfaRequired: true
  webauthnRpId: grandline.acme.com
  webauthnOrigin: https://grandline.acme.com
bootstrap:
  enabled: true
  adminEmail: <your-admin-email>
  tenantName: "Acme"
6.3 Run the install
Use the generated hint as the base, plus the JWT secrets you just created:
cat /tmp/grandline-helm-install.sh \
| sed 's|-f helm/values.example.yaml|-f my-values.yaml|' \
| sed 's|helm install grandline|helm install grandline \\\n --set-file auth.jwtAccessSecret=./access.key \\\n --set-file auth.jwtRefreshSecret=./refresh.key|' \
> /tmp/grandline-install-final.sh
chmod +x /tmp/grandline-install-final.sh
bash /tmp/grandline-install-final.sh
Or type it out explicitly:
helm install grandline ./helm \
--namespace grandline --create-namespace \
-f my-values.yaml \
--set serviceAccount.name=grandline \
--set-file auth.jwtAccessSecret=./access.key \
--set-file auth.jwtRefreshSecret=./refresh.key \
--set postgres.url="$(terraform -chdir=terraform/aws output -raw database_url)" \
--set redis.url="$(terraform -chdir=terraform/aws output -raw redis_url)" \
--set s3.bucket="$(terraform -chdir=terraform/aws output -raw s3_bucket)" \
--set s3.region=us-east-1 \
--set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"="$(terraform -chdir=terraform/aws output -raw worker_iam_role_arn)"
Note: keep --set serviceAccount.name=grandline. The chart default produces <release>-grandline, which breaks IRSA role annotations that hard-code grandline.
6.4 Wait for pods to come up
kubectl -n grandline get pods -w
# Expected steady state (takes ~3 minutes):
# NAME READY STATUS
# grandline-api-xxx 1/1 Running
# grandline-worker-xxx 1/1 Running
# grandline-dashboard-xxx 1/1 Running
# grandline-bootstrap-xxx 0/1 Completed <-- one-shot Job, exits cleanly
If bootstrap shows Error instead of Completed, check the logs: kubectl -n grandline logs job/grandline-bootstrap. Most common cause is a wrong postgres.url or missing TLS parameters.
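Instead of watching the pod list, you can block until the rollout finishes. This is a sketch under an assumption: deploy/grandline-api and job/grandline-bootstrap are named elsewhere in this guide, but deploy/grandline-worker and deploy/grandline-dashboard are inferred from the pod names above and may differ in your chart version. wait_for_grandline is a hypothetical helper.

```shell
#!/bin/sh
# Block until the three Deployments are rolled out and the bootstrap Job
# has completed. $1 = namespace (defaults to "grandline").
wait_for_grandline() {
  ns="${1:-grandline}"
  for name in api worker dashboard; do
    # Deployment names for worker/dashboard are assumed from the pod names.
    kubectl -n "$ns" rollout status "deploy/grandline-$name" --timeout=5m || return 1
  done
  kubectl -n "$ns" wait --for=condition=complete job/grandline-bootstrap --timeout=5m
}

# Usage (needs cluster access):
#   wait_for_grandline && echo "GrandLine is up"
```

A non-zero exit tells you which step to inspect; for the Job, the logs command above is the next stop.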
7. SSL / TLS setup
Three common setups, in descending order of convenience for EKS.
7.1 Option A: ACM wildcard on the ALB (recommended)
If you set domain_name in Terraform, the module provisioned an ACM wildcard cert covering *.grandline.acme.com. The cert ARN is in terraform output -raw tls_certificate_arn.
To use it, annotate the Ingress (already shown in section 6.2):
alb.ingress.kubernetes.io/certificate-arn: <ACM-ARN>
alb.ingress.kubernetes.io/ssl-redirect: '443'
The ALB terminates TLS with the ACM cert and forwards requests to the dashboard / API services. No cert-manager needed. No plaintext port 80: the listen-ports annotation specifies HTTPS only.
7.2 Option B: cert-manager + Let's Encrypt
If you're not using ACM (or you installed ingress-nginx instead of the AWS LB Controller):
# Install cert-manager once per cluster:
helm repo add jetstack https://charts.jetstack.io
helm install cert-manager jetstack/cert-manager \
--namespace cert-manager --create-namespace \
--set installCRDs=true
# Create a ClusterIssuer (one-time):
cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: <your-email>
    privateKeySecretRef:
      name: letsencrypt-prod-key
    solvers:
      - http01:
          ingress:
            class: nginx
EOF
Then in my-values.yaml:
ingress:
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
cert-manager will request a SAN cert covering both hostnames and store it in the grandline-tls Secret automatically.
7.3 Option C: Bring your own cert
kubectl create secret tls grandline-tls \
--cert=fullchain.pem --key=privkey.pem \
-n grandline
Then reference it in my-values.yaml:
ingress:
  tls:
    - secretName: grandline-tls
      hosts: [grandline.acme.com, api.grandline.acme.com]
7.4 Verify TLS is working
curl -sS -o /dev/null -w "%{http_code}\n" https://grandline.acme.com
# Should print 200
curl -sS https://api.grandline.acme.com/healthz
# Should return { "ok": true,... }
# Confirm HTTP redirect (no plaintext access):
curl -sS -o /dev/null -w "%{http_code}\n" http://grandline.acme.com
# Should print 301 or 308 (redirect to HTTPS)
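The plaintext probe has more than one acceptable outcome, so a small classifier can make the result easier to read in a script. This is a sketch; classify_plaintext_probe is a hypothetical helper. It relies on the fact that curl prints 000 for %{http_code} when it received no HTTP response at all.

```shell
#!/bin/sh
# Map the status code printed by `curl -w "%{http_code}"` for the http://
# URL to a human-readable verdict.
classify_plaintext_probe() {
  case "$1" in
    301|308) echo "redirects to HTTPS" ;;
    000)     echo "no HTTP response (port 80 closed or unreachable)" ;;
    *)       echo "unexpected status $1" ;;
  esac
}

# Usage (needs network access):
#   classify_plaintext_probe \
#     "$(curl -s -o /dev/null -w '%{http_code}' http://grandline.acme.com)"
```

Either of the first two verdicts means plaintext access is not being served; anything else, a 200 in particular, deserves a closer look at the ingress configuration.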
If you see Connection refused on port 80 when you curl http://…, that's the ALB/ingress rejecting plaintext: expected, not broken.
8. First login & creating users
8.1 Get the bootstrap admin credentials
The bootstrap Job created the first Owner account and published the password to a Kubernetes Secret (not logs).
kubectl get secret grandline-bootstrap-credentials -n grandline \
-o jsonpath='{.data.admin-credentials\.txt}' | base64 -d
Save the output (email + random 20-char password) in your password manager.
8.2 Log in and enrol MFA
- Open https://grandline.acme.com.
- Enter the email and password from the Secret.
- On first sign-in you'll be sent to /mfa/setup. Scan the QR code with an authenticator (Google Authenticator, 1Password, Authy, Bitwarden; any TOTP app works).
- Enter the 6-digit code to confirm enrolment. You'll land on the Overview page.
Delete the bootstrap Secret once you've saved the password:
kubectl delete secret grandline-bootstrap-credentials -n grandline
8.3 Configure email transport (required to invite more users)
The bootstrap Owner can add users one of two ways: invite-by-email (requires working email transport) or a direct script (for test accounts only).
Supported email transports: set one in my-values.yaml, then run helm upgrade --reuse-values.
SMTP (any provider):
email:
  transport: smtp
  from: "GrandLine <your-sender-address>"
  smtp:
    host: smtp.acme.com
    port: 587
    user: smtp-user
    # Pass password via existingSecret or --set-file:
    existingSecret: grandline-smtp
AWS SES (recommended if you're already on AWS; uses the install's IRSA role, no stored secret):
email:
  transport: ses
  from: "GrandLine <your-sender-address>"
  ses:
    region: us-east-1
The IRSA role also needs ses:SendEmail; add that to the worker role's policy if it isn't already there.
Resend:
email:
  transport: resend
  from: "GrandLine <your-sender-address>"
  resend:
    existingSecret: grandline-resend # key: api_key
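The Resend snippet refers to a Secret named grandline-resend with a key called api_key. One way to create it is sketched below; create_resend_secret is a hypothetical helper and the key file path is whatever you choose. Note that --from-file stores the file's bytes verbatim, so write the file without a trailing newline (for example with printf '%s').

```shell
#!/bin/sh
# Create the Secret the Resend transport reads its API key from.
# $1 = path to a file containing only the Resend API key.
create_resend_secret() {
  kubectl -n grandline create secret generic grandline-resend \
    --from-file=api_key="$1"
}

# Usage (needs cluster access; ./resend.key is a hypothetical path):
#   printf '%s' "$RESEND_API_KEY" > ./resend.key
#   create_resend_secret ./resend.key
```

The SMTP Secret can be created the same way, but this guide does not state which key name the chart expects for the password, so check the chart's values file first.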
Console (development only; writes invite URLs to API logs):
email:
  transport: console
8.4 Invite additional users
- In the dashboard, go to Access → Users.
- Click Invite user.
- Enter email, optional display name, and role: Admin, Member, or Viewer (Owner can only be assigned by another Owner, not through the invite flow).
- Click Send invite. The API generates a single-use signed token (48h TTL) and sends an email with a /accept-invite?token=… link.
- The invitee opens the link, sets a password (min 12 chars), gets redirected to /login, signs in with the new password, and enrols MFA on first sign-in.
8.5 Role model
Four system roles. Admin, Member, and Viewer are assigned at invite time; Owner is granted only by promotion.
- Owner: full admin. Only the bootstrap Owner by default; additional Owners are promoted from Admins via Access → Users → Promote.
- Admin: can add users, manage connectors, change settings. Cannot reassign tenant ownership or read the audit log.
- Member: can view dashboards and generate and download reports. Cannot add cloud accounts or change settings.
- Viewer: read-only. Can view dashboards but cannot generate reports.
8.6 Data scoping with Groups
Groups are a separate layer from roles. Use them when a user should only see some of the cloud accounts, not all of them.
- Access → Groups → New group, pick "Read" kind.
- Add members.
- Grant the group access to specific CloudAccounts.
Members/Viewers in a Read group see only the accounts granted to their groups. Admins and Owners bypass group scope.
8.7 CLI path: test accounts without email
If you're handing this to other testers and don't want to stand up SMTP yet, you can shell into the API pod and seed additional test accounts directly (bypasses the invite email):
kubectl -n grandline exec -it deploy/grandline-api -- sh -c '
TENANT_ID=<cuid> \
EMAIL=<tester-email> \
PASSWORD="SomeStrongPass!23" \
ROLE=Admin \
node scripts/create-test-user.js'
Output includes the TOTP secret; the tester enters it manually into their authenticator (no QR). Use this only for eval; real users should go through the invite flow.
8.8 What's not supported yet
SAML, OIDC, OAuth, and LDAP SSO are not wired in this release. The invite flow + password + TOTP MFA is the only authentication path. SSO is on the roadmap for Enterprise tier.
9. Connect your AWS / Azure / GCP accounts
GrandLine runs on EKS but can discover resources in any AWS account, any Azure subscription, and any GCP project. Connectors are added from the dashboard at runtime, not at install time.
9.1 AWS (same cloud as the install)
- Dashboard → Connectors → Add connector → AWS.
- Copy the generated ExternalId from the dialog.
- In the AWS account you want to scan, create a cross-account role that trusts your install's IRSA role ARN:
TENANT_EXTERNAL_ID=<from-the-dialog>
TRUST_PRINCIPAL=$(terraform -chdir=terraform/aws output -raw worker_iam_role_arn)
aws iam create-role \
--role-name GrandLineDiscoveryRole \
--assume-role-policy-document "$(cat <<JSON
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": { "AWS": "${TRUST_PRINCIPAL}" },
"Action": "sts:AssumeRole",
"Condition": {
"StringEquals": { "sts:ExternalId": "${TENANT_EXTERNAL_ID}" }
}
}]
}
JSON
)"
aws iam attach-role-policy \
--role-name GrandLineDiscoveryRole \
--policy-arn arn:aws:iam::aws:policy/ReadOnlyAccess
Paste the new role ARN into the connector dialog and click Save. Discovery starts within 60 seconds.
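If you did not keep the output of aws iam create-role, the ARN to paste can be looked up afterwards. This is a minimal sketch; discovery_role_arn is a hypothetical helper, and it must run with credentials for the account that holds the role.

```shell
#!/bin/sh
# Print the ARN of the discovery role.
# $1 = role name (defaults to the name used in this guide).
discovery_role_arn() {
  aws iam get-role \
    --role-name "${1:-GrandLineDiscoveryRole}" \
    --query 'Role.Arn' --output text
}

# Usage (needs AWS credentials for the scanned account):
#   discovery_role_arn
```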
9.2 Azure (cross-cloud from EKS)
Create an Azure AD Service Principal with Reader on the target subscription:
az ad sp create-for-rbac \
--name grandline-discovery \
--role Reader \
--scopes /subscriptions/<SUBSCRIPTION_ID>
Paste appId, tenant, and password into the Azure connector dialog. Reader on the subscription is sufficient; do not grant Contributor.
9.3 GCP (cross-cloud from EKS)
PROJECT_ID=<your-project>
gcloud iam service-accounts create grandline-discovery \
--project "$PROJECT_ID" --display-name "GrandLine discovery"
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
--member "serviceAccount:grandline-discovery@${PROJECT_ID}.iam.gserviceaccount.com" \
--role roles/viewer
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
--member "serviceAccount:grandline-discovery@${PROJECT_ID}.iam.gserviceaccount.com" \
--role roles/cloudasset.viewer
gcloud iam service-accounts keys create grandline-sa.json \
--iam-account "grandline-discovery@${PROJECT_ID}.iam.gserviceaccount.com"
Paste the JSON contents into the GCP connector dialog.
10. Verify the install
Quick smoke-test checklist after everything is up:
- curl https://grandline.acme.com returns 200 (dashboard HTML).
- curl https://api.grandline.acme.com/healthz returns {"ok": true,...}.
- You can log in, enrol MFA, and reach the Overview page.
- You can invite a second user, they receive the email, set a password, and sign in.
- You can add at least one cloud connector, and within 5 minutes the Resources page shows scanned resources.
- You can generate a report from Reports → Generate report, and the PDF downloads.
- kubectl -n grandline get pdb,hpa shows PDBs and HPAs are Healthy.
- helm upgrade grandline ./helm -f my-values.yaml --reuse-values completes cleanly (dry-run the upgrade path).
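The two curl checks at the top of the list are easy to script. The sketch below assumes the two-subdomain layout used throughout this guide (API at api.<dashboard host>); smoke_test is a hypothetical helper, and the remaining checklist items still need a human.

```shell
#!/bin/sh
# Check that the dashboard answers 200 and the API health endpoint reports ok.
# $1 = dashboard hostname, e.g. grandline.acme.com
smoke_test() {
  host="$1"
  code="$(curl -sS -o /dev/null -w '%{http_code}' "https://$host")" || return 1
  if [ "$code" != "200" ]; then
    echo "dashboard returned $code" >&2
    return 1
  fi
  if ! curl -sS "https://api.$host/healthz" \
      | grep -Eq '"ok"[[:space:]]*:[[:space:]]*true'; then
    echo "API health check failed" >&2
    return 1
  fi
  echo "smoke test passed for $host"
}

# Usage (needs network access):
#   smoke_test grandline.acme.com
```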
11. Troubleshooting
Bootstrap Job shows Error status
Run kubectl -n grandline logs job/grandline-bootstrap. Usually one of: wrong postgres.url, missing ?sslmode=require, or a security group that doesn't allow the pod subnet to reach the RDS endpoint. The Terraform module wires the SG correctly by default; if you've customised the VPC layout, double-check.
API pod CrashLoopBackOff on boot
Almost always an env-var mismatch. Check kubectl -n grandline logs deploy/grandline-api. Common causes: JWT secret not set (--set-file auth.jwtAccessSecret=...), redis.url missing the rediss:// scheme, publicUrls.api not set.
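Two of the causes named in this section are plain string problems in the connection URLs and can be checked before digging through logs. This is a sketch with a hypothetical helper name; it only looks for the sslmode=require parameter and the rediss:// scheme mentioned above, nothing more.

```shell
#!/bin/sh
# Succeed only if the Postgres URL carries sslmode=require and the Redis URL
# uses the rediss:// scheme. $1 = postgres.url, $2 = redis.url
check_connection_urls() {
  pg="$1"; redis="$2"; rc=0
  case "$pg" in
    *sslmode=require*) ;;
    *) echo "postgres.url is missing sslmode=require" >&2; rc=1 ;;
  esac
  case "$redis" in
    rediss://*) ;;
    *) echo "redis.url does not start with rediss://" >&2; rc=1 ;;
  esac
  return "$rc"
}

# Usage (from the repo root, needs Terraform state):
#   check_connection_urls \
#     "$(terraform -chdir=terraform/aws output -raw database_url)" \
#     "$(terraform -chdir=terraform/aws output -raw redis_url)"
```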
ALB returns 502 / 503
The target group health check needs /healthz on the API. If you're using target-type: ip, ensure the pod's readiness probe is passing: kubectl -n grandline describe pod -l app=api.
Login succeeds but dashboard shows "Failed to fetch"
CORS or cookie-domain misconfiguration. Confirm publicUrls.app is https://grandline.acme.com (not http, not IP), publicUrls.api is https://api.grandline.acme.com, and cookieDomain is the parent (acme.com) so the cookie is valid on both hosts. Then helm upgrade --reuse-values.
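The cookie-domain rule above amounts to a suffix check: the cookie domain must equal the host or be a parent of it. A small sketch of that check follows; cookie_domain_covers is a hypothetical helper and ignores browser-specific rules such as public-suffix restrictions.

```shell
#!/bin/sh
# Succeed if a cookie set for domain $1 would be valid on host $2, i.e. the
# host equals the domain or ends with ".<domain>".
cookie_domain_covers() {
  domain="${1#.}"   # tolerate a leading dot, as in ".acme.com"
  host="$2"
  case "$host" in
    "$domain"|*."$domain") return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   cookie_domain_covers acme.com grandline.acme.com     && echo "dashboard ok"
#   cookie_domain_covers acme.com api.grandline.acme.com && echo "api ok"
```

Both calls need to succeed for the session cookie to reach the dashboard and the API.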
Invitee clicks email link and gets "Token invalid"
Three causes: token already used, token > 48h old, or the user was re-invited (which invalidates the older token). Re-send the invite from Access → Users → [user] → Re-invite.
MFA QR doesn't scan
Use the manual entry option under the QR. The secret is also printed in the dialog. Most likely the page is being served from an HTTP proxy that's blocking the QR image; confirm you're loading the dashboard over HTTPS.
cert-manager issuance fails: "Order failed"
Run kubectl describe certificate -n grandline grandline-tls. Usually the HTTP-01 challenge can't reach the cluster from Let's Encrypt's validators: confirm DNS resolves before applying, confirm the ingress controller is listening on port 80 (HTTP-01 validation requests arrive there), and confirm the cluster is actually reachable from the public internet. If you're on a private cluster, switch to DNS-01.