Using Federated Identities in Azure AKS

Steve Dillon
10 min read · Sep 28, 2023

This blog is a fairly deep dive into how to access resources outside of Kubernetes from code running inside Kubernetes. To do that, your code needs to identify itself. In the old world you would run as an AD service account, and your code would use that identity to access resources.

I’m not going to do a full explanation of Federated Identities. I leaned on a couple of blogs that I could almost make work, but I had issues getting them working in my setup.

This blog by Ermetic is helpful. This blog by Alexis Plantin is what I am expanding on here. The key differences are that I do all of the setup I can in Terraform, and my demonstration elements are in Python and bash.

The Alexis blog describes in detail the federation process, but I will give you the TL;DR version.

TL;DR: you get a JWT from Kubernetes, then pass that JWT to Azure AD, which gives you a new JWT that is backed by an Azure AD identity.
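To make the TL;DR a little more concrete, here is a minimal sketch of peeking inside a JWT to see the claims that the federation relies on (issuer, subject, audience). It only decodes the payload and does not verify the signature, so treat it as a debugging aid. The demo token and its issuer URL are made up for illustration; inside a pod you would presumably read the real token from the file named by AZURE_FEDERATED_TOKEN_FILE.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the (unverified) claims from a JWT's payload segment."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(data: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# A fake, unsigned token so the example runs anywhere (values are hypothetical).
demo_token = ".".join([
    _b64url({"alg": "none"}),
    _b64url({
        "iss": "https://example.invalid/oidc/",
        "sub": "system:serviceaccount:default:app1",
        "aud": ["api://AzureADTokenExchange"],
    }),
    "signature",
])

claims = decode_jwt_claims(demo_token)
print(claims["iss"])
print(claims["sub"])
print(claims["aud"])
```

The same function can be pointed at the token Azure AD hands back, which is a handy way to confirm which identity you ended up with.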

The code for this blog is in my GitHub. Most of the code there is terraform for creating an AKS Cluster, an Azure KeyVault and an Azure Postgres flexible server. In the demonstrations, a pod in the AKS cluster accesses the Azure KeyVault and the Postgres db via an Azure AD identity.

I’m sorry, but this blog is also pretty long. I feel it is valuable to create the terraform, as that makes it possible to follow along. You might be able to learn what you need without executing the code. As of this writing all of the scripts work, and I have taken time to make sure I haven’t left out any ‘tweaks’ that need to be done.

Prerequisites:

  • macOS, Linux, or Windows Subsystem for Linux, and you are comfortable operating on the CLI.
  • jq, helm, kubectl, Terraform (> 1.2) installed
  • Az CLI installed, and ‘az login’ performed against your target subscription
  • Azure Portal access

Stage 1: Azure Deployment

git clone https://github.com/stvdilln/aks-workload-identity.git
cd aks-workload-identity/infrastructure-terraform
# Note, it is better to login with an AD user instead of a ServicePrincipal
# here, but either can be used.
az login
# This retrieves the AD name of the user logged in and sets it in env
source ./set-env.sh
terraform init
terraform apply

This will create an AKS Cluster, an Azure KeyVault and an Azure Postgres flexible server. A few of the special pieces deserve some explanation.

federated-ids.tf holds the link between k8s and AzureAD. In this code:

  1. AzureAD and the Kubernetes Identity Provider are introduced and trust is established.
  2. The namespace:user that is allowed to assume the Azure Identity is identified.
  3. The Azure Identity is named. When the Federation process is complete, the JWT access_token will be for this identity.

# This is the glue between the AKS cluster identity and the
# Azure AD identity. The OIDC issuer URL is the identity
# provider running in the Kubernetes cluster. Azure AD will
# work with this identity provider to validate the Kubernetes
# user's identity.
# This is the Rosetta Stone that pulls it all together.
# The oidc_issuer_url is the link for AzureAD to verify with the K8s IdP,
# the subject is the user in Kubernetes that is allowed access,
# the parent_id is the target account with the identity to be assumed.
resource "azurerm_federated_identity_credential" "app1" {
  depends_on = [
    # These should be automatic, but the blog example shows them as explicit
    module.aks_cluster_pod_ident,
    module.coreinfra,
    azurerm_user_assigned_identity.app1
  ]
  name                = "demo-app1-identity-credentials"
  resource_group_name = module.coreinfra.context.resource_group_name
  audience            = ["api://AzureADTokenExchange"]
  # This is the URL of the Kubernetes Identity Provider
  issuer = module.aks_cluster_pod_ident.oidc_issuer_url
  # This is the namespace and service account name that
  # can use this federated identity. (If you don't lock
  # down access to service account creation, then anyone
  # can assume the Azure AD identity.)
  subject = "system:serviceaccount:default:app1"
  # Points to the Azure AD identity that will be 'assumed'
  parent_id = azurerm_user_assigned_identity.app1.id
}
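As a mental model for what this credential pins down, here is a small sketch of the comparison between a Kubernetes token's claims and the issuer, subject and audience configured above. This is only an illustration of the matching idea, not Azure AD's actual implementation; a real validation also checks the token signature against the issuer's published keys. The issuer URL and the app2 service account are hypothetical.

```python
def matches_federated_credential(claims, issuer, subject, audience):
    """Return True when the token claims line up with the federated credential."""
    aud = claims.get("aud", [])
    # The aud claim may be a single string or a list of strings.
    if isinstance(aud, str):
        aud = [aud]
    return (
        claims.get("iss") == issuer
        and claims.get("sub") == subject
        and audience in aud
    )


ISSUER = "https://example.invalid/oidc/"        # hypothetical OIDC issuer URL
SUBJECT = "system:serviceaccount:default:app1"  # from the terraform above
AUDIENCE = "api://AzureADTokenExchange"         # from the terraform above

app1_claims = {"iss": ISSUER, "sub": SUBJECT, "aud": [AUDIENCE]}
app2_claims = {
    "iss": ISSUER,
    "sub": "system:serviceaccount:default:app2",  # a different service account
    "aud": [AUDIENCE],
}

print(matches_federated_credential(app1_claims, ISSUER, SUBJECT, AUDIENCE))
print(matches_federated_credential(app2_claims, ISSUER, SUBJECT, AUDIENCE))
```

The second call comes back False because the subject differs, which is why control over who can create the default:app1 service account matters.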

Setup for KeyVault:

KeyVault.tf grants the AzureAD identity access to the KeyVault:

# object_id specifies the AzureAD account that is allowed to access
# the KeyVault
resource "azurerm_key_vault_access_policy" "pol1" {
  key_vault_id            = azurerm_key_vault.vault1.id
  tenant_id               = data.azurerm_client_config.kvconfig.tenant_id
  object_id               = azurerm_user_assigned_identity.app1.principal_id
  storage_permissions     = []
  key_permissions         = ["Get", "List"]
  secret_permissions      = ["Get", "List"]
  certificate_permissions = ["Get", "List"]
  depends_on              = [azurerm_key_vault.vault1]
}

Stage 2: Postgres and AKS configuration

After running terraform, AzureAD and KeyVault are ready to go, but Kubernetes and Postgres need some setup that is easier to do outside of terraform (especially when keeping it simple for a blog).

In the root directory of the GitHub repo is a set of files named 10-, 20-, etc. You need to run these files in numerical order; I annotate each one here:

10-create-env-sh

The terraform code has a bunch of outputs: AzureAD GUIDs, the KeyVault URL, etc. This script simply takes all of the terraform outputs and saves them as environment variables:

source ./10-create-env-sh
➜ aks-workload-identity git:(master) cat env.sh
export AZURE_KUBERNETES_CLUSTER_NAME=wl-ident-pod-ident-sbx-xxx-wus3
export AZURE_KUBERNETES_OIDC_ISSUER_URL=https://westus3.oic.prod-aks.azure.com/78e388c5-913a-4820-a123-9e8cb22baa12/cba48351-1c5e-4c23-a4c8-a8502dd37e04/
export AZURE_RESOURCE_GROUP=rg-wl-ident-wus3-sbx-xxxx
export AZURE_TENANT_ID=78e3****
export aks_workload_app1_client_id=f10e8de7-5f56-4d3b-95cf-xxxx
export aks_workload_app1_user_name=demo-app1-identity
export key_vault_uri=https://kv-xx-ident-sbx-wus3-xxx.vault.azure.net/
export kubernetes_authorized_ip_address=69.7.xxx.xxxx
export kv_name=kv-wl-ident-sbx-wus3-xxx
export pg_admin=pgadmintiger
export pg_database=test-db
export pg_host=pg-wl-ident-sbx-wus3-xxx.postgres.database.azure.com
export pg_password=
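The contents of 10-create-env-sh are not reproduced in this post, so the following is only a rough Python sketch of the same idea, not the repo's script. It assumes the input is the JSON produced by `terraform output -json`, where each output is an object with a "value" field, and turns it into shell export lines. The sample values are placeholders.

```python
import json
import shlex
from typing import List


def exports_from_terraform_output(output_json: str) -> List[str]:
    """Turn `terraform output -json` text into shell `export` lines."""
    outputs = json.loads(output_json)
    lines = []
    for name in sorted(outputs):
        value = outputs[name]["value"]
        # Non-string outputs (lists, maps, numbers) are kept as JSON text.
        if not isinstance(value, str):
            value = json.dumps(value)
        # shlex.quote keeps the line safe to `source` from a shell.
        lines.append("export {}={}".format(name, shlex.quote(value)))
    return lines


# Placeholder data in the shape terraform emits (values are hypothetical).
sample = json.dumps({
    "pg_database": {"sensitive": False, "type": "string", "value": "test-db"},
    "key_vault_uri": {
        "sensitive": False,
        "type": "string",
        "value": "https://kv-example.vault.azure.net/",
    },
    "pg_password": {"sensitive": True, "type": "string", "value": ""},
})

env_lines = exports_from_terraform_output(sample)
print("\n".join(env_lines))
```

Writing the lines to a file and sourcing it would give roughly the env.sh shown above.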

20-update-kubeconfig

This adds the AKS cluster just created to your kubectl config file and sets it as the active context.

./20-update-kubeconfig
Warning, the terraform locks the administration of the kubernetes cluster to the
IP address that created it. If you are running this script from a different IP
you will need to terraform apply from the new IP address
Merged "wl-ident-pod-ident-sbx-9j8-wus3" as current context in /Users/steve.dillon/.kube/config
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system ama-logs-nfftp 3/3 Running 0 6m13s
kube-system ama-logs-rs-bc7c46fc4-ltb5p 2/2 Running 0 6m13s
kube-system azure-ip-masq-agent-cnpb6 1/1 Running 0 7m3s
kube-system cloud-node-manager-c5ttd 1/1 Running 0 7m3s
kube-system coredns-autoscaler-569f6ff56-xfgxf 1/1 Running 0 7m33s
kube-system coredns-fb6b9d95f-qgb46 1/1 Running 0 5m58s
kube-system coredns-fb6b9d95f-znmwv 1/1 Running 0 7m33s
kube-system csi-azuredisk-node-gjbct 3/3 Running 0 7m3s
kube-system csi-azurefile-node-2vkg7 3/3 Running 0 7m3s
kube-system konnectivity-agent-56b595d5cd-dfnsj 1/1 Running 0 7m33s
kube-system konnectivity-agent-56b595d5cd-ppnnz 1/1 Running 0 7m33s
kube-system kube-proxy-fxl5z 1/1 Running 0 7m3s
kube-system metrics-server-6f6fd7c64b-2xgvn 2/2 Running 0 5m54s
kube-system metrics-server-6f6fd7c64b-sqjrl 2/2 Running 0 5m54s

30-helm-install-webhook

This installs the workload identity webhook into the k8s cluster. After I initially wrote this blog, I saw the workload_identity_enabled option in the terraform azure AKS options. This step accomplishes the same thing as that flag. If you are adapting this to, say, a GCP Kubernetes cluster, you will need this step, so I’m leaving the instructions here.

The webhook looks for pods with the workload identity label and mutates them when they are created. The label we are adding to our pods is:

  labels:
    azure.workload.identity/use: "true"

To run this:

./30-helm-install-webhook
...
Update Complete. ⎈Happy Helming!⎈
NAME: workload-identity-webhook

40-create-k8s-service-account

This script creates a kubernetes service account that is key to federating the kubernetes ID to the Azure AD ID.

In the terraform code to create the Federated user we have:

resource "azurerm_user_assigned_identity" "app1" {
  name                = "demo-app1-identity"
  location            = module.coreinfra.context.location
  resource_group_name = module.coreinfra.context.resource_group_name
}
output "aks_workload_app1_client_id" {
  value = azurerm_user_assigned_identity.app1.client_id
}
output "aks_workload_app1_user_name" {
  value = azurerm_user_assigned_identity.app1.name
}

and

resource "azurerm_federated_identity_credential" "app1" {
  ....
  subject = "system:serviceaccount:default:app1"
  ...
}

The Kubernetes YAML we are about to apply links to those definitions. The name and namespace need to match the values in “system:serviceaccount:default:app1”.

And the ClientID in the kubernetes file needs to match the client-id of the user_assigned identity.

#!/bin/bash
#
# This creates the service account that is key to the workload identity.
# The client-id is registered in Azure Active Directory. If you have
# multiple identities running in k8s, control of the service account
# details is key to preventing application 'A' from accessing another
# application's services.
kubectl apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app1
  namespace: default
  annotations:
    azure.workload.identity/client-id: "$aks_workload_app1_client_id"
  labels:
    azure.workload.identity/use: "true"
EOF

Run this file:

./40-create-k8s-service-account
serviceaccount/app1 created
➜ aks-workload-identity git:(master) kubectl describe serviceaccounts app1
Name: app1
Namespace: default
Labels: azure.workload.identity/use=true
Annotations: azure.workload.identity/client-id: f10e8de7-xxx-yyyy-zzzz-d70e8949fa68
Image pull secrets: <none>
Mountable secrets: <none>
Tokens: <none>
Events: <none>
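If you would rather generate the manifest than hand-edit it, here is a minimal sketch that builds the same ServiceAccount as JSON (which kubectl apply also accepts) and derives the subject string that must match the terraform. The function names are my own for illustration, and the client id is a placeholder rather than a real identity.

```python
import json


def service_account_manifest(name: str, namespace: str, client_id: str) -> dict:
    """Build a ServiceAccount manifest carrying the workload identity settings."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {"azure.workload.identity/client-id": client_id},
            "labels": {"azure.workload.identity/use": "true"},
        },
    }


def subject_from_manifest(manifest: dict) -> str:
    """Derive the subject string the federated credential must match."""
    meta = manifest["metadata"]
    return "system:serviceaccount:{}:{}".format(meta["namespace"], meta["name"])


# Placeholder client id; the real one comes from aks_workload_app1_client_id.
manifest = service_account_manifest(
    "app1", "default", "00000000-0000-0000-0000-000000000000"
)
subject = subject_from_manifest(manifest)

print(json.dumps(manifest, indent=2))
print(subject)
```

Comparing the derived subject against the one in federated-ids.tf is a quick way to catch a namespace or name mismatch before the token exchange fails.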

50-add-aad-users-to-pqsql

I don’t know if people have psql on their machines, so I walk you through doing this in the Azure Portal. You need to run pgaadauth_create_principal on the ‘postgres’ database of the server. (If you run this file, it will print the commands with the variables substituted.) You need to run the command as a Postgres AD administrator. The terraform that created the server set the AD administrator to the same user that ran the terraform code. You can modify the database configuration as needed to get this command to run.

Azure Portal to launch Connections

Second, you need to run the ‘CREATE TABLE’ and ‘GRANT INSERT’ statements on the test-db database.

# To save you from needing psql on your machine and
# logging in with it, I give these instructions for doing this
# in the Azure portal.

# In the Azure Portal search bar, search for 'pg-wl-ident' and select it.
# Then Databases->Postgres->Connect

# You can run this script to have it format the commands for you

cat <<EOF
# run this on the 'postgres' database
select * from pgaadauth_create_principal('$aks_workload_app1_user_name', false, false);

# run these on the 'test-db' database
CREATE TABLE my_table (
  test text
);
GRANT INSERT,SELECT ON my_table TO "$aks_workload_app1_user_name";
EOF

60-run-demo-service

I have a demonstration pod that does nothing but wait for you to run commands in it. The pod is configured with the workload identity label and uses the default:app1 service account that ‘grants’ access to identity federation. I also pass in a lot of config for the things we want to demo.

#!/bin/bash
#cat <<EOF
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: my-debug-container
  namespace: default
  labels:
    azure.workload.identity/use: "true"
spec:
  serviceAccountName: app1
  containers:
  - name: static-client
    image: ghcr.io/stvdilln/managed-ident-test:0.0.3
    env:
    - name: "key_vault_uri"
      value: "$key_vault_uri"
    - name: "key_vault_secret_name"
      value: "big-secret"
    - name: "aks_workload_app1_user_name"
      value: "$aks_workload_app1_user_name"
    - name: "pg_database"
      value: "$pg_database"
    - name: "pg_host"
      value: "$pg_host"

    command: [ "/bin/sh", "-c", "--" ]
    args: [ "while true; do sleep 30; done;" ]

    resources:
      requests:
        memory: "128Mi"
        cpu: "100m"
      limits:
        memory: "256Mi"
        cpu: "200m"
EOF

Run this with

./60-run-demo-service
pod/my-debug-container created

70-shell-into-container

This gives you a command prompt inside the pod so you can play with the managed identity. The code for the scripts is in the k8s-container/dockerdir directory.

In the test shell you can:

az-login-federated:

The AZURE_* environment variables are injected automatically by the mutating webhook. Running this command logs you into the Azure CLI with the federated identity.

#!/bin/bash 
az login --federated-token "$(cat $AZURE_FEDERATED_TOKEN_FILE)" --service-principal -u $AZURE_CLIENT_ID -t $AZURE_TENANT_ID --allow-no-subscriptions
az account get-access-token --query accessToken --output tsv



root@my-debug-container:/# ./az-login-federated
[
{
"cloudName": "AzureCloud",
"id": "78e388c5-913a-4820-a123-9e8cb22baa12",
"isDefault": true,
"name": "N/A(tenant level account)",
"state": "Enabled",
"tenantId": "78e388c5-xxx-xxx-a123-9e8cb22baa12",
"user": {
"name": "f10e8de7-5f56-4d3b-95cf-d70e8949fa68",
"type": "servicePrincipal"
}
}
]
eyJ0eXAiOiJKV1QiLCJhbGciOiJU1

workload-ident-demo.py

This sample code shows how to take the Kubernetes JWT and assume the AzureAD identity in Python, using a small ‘token_credential.py’ that I found on the web. It reads a value from KeyVault and inserts ‘Hello World at <timestamp>’ into the test database.

import os
import psycopg2 as pg
from datetime import datetime
from azure.keyvault.secrets import SecretClient
from token_credential import MyClientAssertionCredential


def main():
    # get environment variables to authenticate to the key vault
    azure_client_id = os.getenv('AZURE_CLIENT_ID', '')
    if not azure_client_id:
        raise Exception('AZURE_CLIENT_ID environment variable is not set: add label azure.workload.identity/use: "true" to your pod spec')

    azure_tenant_id = os.getenv('AZURE_TENANT_ID', '')
    if not azure_tenant_id:
        raise Exception('AZURE_TENANT_ID environment variable is not set')

    azure_authority_host = os.getenv('AZURE_AUTHORITY_HOST', '')
    if not azure_authority_host:
        raise Exception('AZURE_AUTHORITY_HOST environment variable is not set')

    azure_federated_token_file = os.getenv('AZURE_FEDERATED_TOKEN_FILE', '')
    if not azure_federated_token_file:
        raise Exception('AZURE_FEDERATED_TOKEN_FILE environment variable is not set')

    database = os.getenv('pg_database', 'test-db')
    host = os.getenv('pg_host', '')
    if not host:
        raise Exception('pg_host environment variable is not set')

    db_user = os.getenv('aks_workload_app1_user_name', '')
    if not db_user:
        raise Exception('aks_workload_app1_user_name environment variable is not set (foo@domain.com)')

    keyvault_url = os.getenv('key_vault_uri', '')
    if not keyvault_url:
        raise Exception('key_vault_uri environment variable is not set (https://{vault-name}.vault.azure.net/)')
    secret_name = os.getenv('key_vault_secret_name', 'big-secret')

    # create a token credential object, which has a get_token method that returns a token
    token_credential = MyClientAssertionCredential(azure_client_id, azure_tenant_id, azure_authority_host, azure_federated_token_file)
    access_token = token_credential.get_token('https://graph.microsoft.com/.default').token
    #print('TOKEN {}'.format(access_token))
    # create a secret client with the token credential
    keyvault = SecretClient(vault_url=keyvault_url, credential=token_credential)
    secret = keyvault.get_secret(secret_name)
    print('successfully got secret, secret={}'.format(secret.value))

    # Now connect to the Postgres DB; the AzureAD access token is the password
    password = token_credential.get_token('https://ossrdbms-aad.database.windows.net/.default').token
    print('database={}'.format(database))
    print('host={}'.format(host))
    print('db_user={}'.format(db_user))

    conn_string = "host={0} user={1} sslmode=prefer dbname={2} password={3}".format(host, db_user, database, password)
    #print(conn_string)
    conn = pg.connect(conn_string)
    #print(conn)
    cursor = conn.cursor()
    test_data = 'Hello World at {0}'.format(datetime.now().strftime('%Y-%m-%d %H:%M:%S'))
    print(test_data)
    cursor.execute("INSERT INTO my_table (test) VALUES (%s);", (test_data,))
    conn.commit()
    cursor.close()
    conn.close()


if __name__ == '__main__':
    main()

root@my-debug-container:/# python3 workload-ident-demo.py 
successfully got secret, secret=Whopper
database=test-db
host=pg-wl-ident-sbx-wus3-9j8.postgres.database.azure.com
db_user=demo-app1-identity
Hello World at 2023-09-28 17:51:42

start-pqsql

This script uses the various environment variables to log you in and start psql with the federated id.

The gem in here is the ‘curl’ command: it takes the JWT issued by Kubernetes, calls AzureAD, and gets back the managed identity access_token (a JWT).

IDENTITY_TOKEN=$(cat $AZURE_FEDERATED_TOKEN_FILE)

output=$(curl -s --location --request GET "$AZURE_AUTHORITY_HOST/$AZURE_TENANT_ID/oauth2/v2.0/token" \
--form 'grant_type="client_credentials"' \
--form 'client_id="'$AZURE_CLIENT_ID'"' \
--form 'scope="https://ossrdbms-aad.database.windows.net/.default"' \
--form 'client_assertion_type="urn:ietf:params:oauth:client-assertion-type:jwt-bearer"' \
--form 'client_assertion="'$IDENTITY_TOKEN'"' )

export PGPASSWORD=$(echo $output | jq -r '.access_token')


psql -h $pg_host --user $aks_workload_app1_user_name $pg_database

# Running the script

root@my-debug-container:/# ./start-pqsql
psql (14.9 (Ubuntu 14.9-0ubuntu0.22.04.1), server 12.15)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.

test-db=> select * from my_table;
test
------------------------------------
Hello World at 2023-09-28 17:51:42
(1 row)

test-db=>
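For readers who prefer Python to curl, the token exchange in start-pqsql can be sketched with only the standard library. Note one deliberate difference: this sketch sends a POST with a URL-encoded form, which is the usual shape of an OAuth2 client-credentials request, whereas the script above uses curl's --form with GET. The function names are my own, and the demo values are placeholders; in the pod they would come from the AZURE_* environment variables and the federated token file.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(authority_host, tenant_id, client_id, federated_token, scope):
    """Build (but do not send) the request that trades the k8s JWT for an AzureAD token."""
    url = "{}/{}/oauth2/v2.0/token".format(authority_host.rstrip("/"), tenant_id)
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "scope": scope,
        "client_assertion_type": "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
        "client_assertion": federated_token,
    }
    body = urllib.parse.urlencode(form).encode()
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def fetch_access_token(request):
    """Send the request and return the access_token field of the JSON reply."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)["access_token"]


# Placeholder values so the example runs without a cluster or network access.
req = build_token_request(
    "https://login.microsoftonline.com/",
    "tenant-id",
    "client-id",
    "kubernetes-issued-jwt",
    "https://ossrdbms-aad.database.windows.net/.default",
)
print(req.get_method(), req.full_url)
```

Inside the pod, passing the real values and calling fetch_access_token(req) would return the string that start-pqsql exports as PGPASSWORD.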


Steve Dillon

Cloud Architect and Automation specialist. Specializing in AWS, Hashicorp and DevOps.