# Envoy Proxy

## Overview

Envoy Proxy is described as an edge and service proxy. This means that
Envoy can manage inbound and outbound network requests
to and from your application. This frees your application from
having to manage key material like OAuth client secrets,
JSON Web Tokens (JWTs), and other sensitive information.

Envoy provides a plugin system that allows application developers to use
built-in plugins (HTTP filters) to handle things like:

* Redirecting to an Identity Provider
* Doing an OAuth handshake with an OAuth Authorization Server
  * Performing an Authorization Code Grant Exchange
  * Exchanging a refresh token for a new access token
* Validating incoming JSON Web Tokens
* Connecting to a policy decision point to authorize requests before forwarding
  them to your application

Envoy can be run in multiple ways and seems to work best as a
sidecar process to your application. The idea is that you
expose Envoy externally and use it to reverse proxy requests to your
application, which is only accessible via Envoy. This is typically configured
using a loopback address for TCP connections. Envoy speaks gRPC and HTTP
fluently and the Envoy documentation is fairly extensive.

You can configure Envoy to receive its configuration from a static YAML file or
dynamically by giving it the location of a control plane for it to connect to
and receive its configuration from. Envoy Gateway and Istio are popular control
planes that allow you to manage a fleet of Envoy proxies through a central
management point.

In this document I'm going to go over how to configure Envoy in a standalone
mode using static configuration. This configuration is written in YAML and is
provided to the Envoy program as a command line option during startup.

In order to adequately understand what Envoy is providing I will start by
going over the following primitives:

1. Authentication
    * Public Key Cryptography
    * Public Key Infrastructure
    * Digital Signing
1. Authorization
    * Access Control Models
      * DAC
      * RBAC
      * ABAC

After this brief overview I will dive into how to configure Envoy to provide
the bare necessities for booting up a new service with authentication
and authorization delegated to Envoy.

1. Authentication
    * OpenID Connect Provider using `envoy.filters.http.oauth2`
    * JSON Web Token Validation using `envoy.filters.http.jwt_authn`
1. Authorization
    * External policy decision point (PDP) using `envoy.filters.http.ext_authz`

## Pre-requisite Concepts

Authentication is the act of proving you are who you claim to be.
Authorization is the act of proving that you are allowed to do what
you're trying to do. The distinction between the two is important because the
context determines which elements are necessary.

An example of this is the difference between commuting via municipal transit
and commuting via airplane. The security context of the two modes of
transportation is different, so the level of rigor applied to
authenticating versus authorizing access to the resource differs.

To board a bus you must present a bus token/ticket to the bus driver. The bus
driver does not require you to verify who you are.
Instead, they are only interested in verifying that you have a valid bus ticket
that has not expired, is for the bus that they operate, and was issued by a
legitimate authority (the transit authority).

To ride an airplane you must
provide both your passport and your boarding pass in order to board the plane.
The passport is used to verify that you are who you say you are and the boarding
pass is used to ensure that you have a valid seat on the plane. The passport
authenticates the passenger; the bus ticket or boarding pass
authorizes the passenger.

The bus and plane are protected resources, like an API,
and the operator of the API understands the security context best. They
understand whether a rigorous authentication and authorization check is
warranted or not. The passenger is responsible for obtaining a passport,
boarding pass, or bus ticket from trusted and reputable authorities.

```mermaid
sequenceDiagram
    participant P as Passenger
    participant BD as Bus Driver
    participant B as Bus

    P->>BD: request access
    BD->>P: request ticket
    P->>BD: present ticket
    Note over BD: authorize (bus #, expiration, fake/legit?)

    alt Valid ticket
        BD->>P: grant access
        P->>B: board bus
    else Invalid ticket
        BD->>P: deny access
    end
```

The bus number is the canonical identifier for the resource, which
is similar to accessing a resource exposed via a REST/GraphQL
API. The expiration check ensures that the same token cannot be reused
indefinitely and that the access granted by the ticket is limited in
scope to prevent abuse of the resource. This is similar to ensuring
that a JWT cannot be used indefinitely. The check to make sure that the
ticket is legitimate and issued from a trusted authority is similar to
a digital signature check. In this example, the bus driver does not need to
authenticate the passenger by verifying that they are who they say they are. The
bus driver does not care. The bus driver only cares about whether or not they
carry a token that grants them access to the resource. In this scenario the
passenger could give the token to someone else (for example a child) so that
they can access the resource. The security context of this resource does not
warrant authentication and only requires authorization.
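The three checks the bus driver performs can be sketched in Ruby. This is only an illustration of the analogy: the `TransitAuthority` and `BusDriver` classes and the ticket layout are hypothetical names invented for this sketch. Note that the driver never learns who is holding the ticket.

```ruby
#!/usr/bin/env ruby
require 'openssl'
require 'json'

# Hypothetical issuer: signs tickets with its private key.
class TransitAuthority
  attr_reader :public_key

  def initialize
    @private_key = OpenSSL::PKey::RSA.new(2048)
    @public_key = @private_key.public_key
  end

  def issue_ticket(bus_number:, expires_at:)
    payload = JSON.generate(bus: bus_number, exp: expires_at.to_i)
    signature = @private_key.sign(OpenSSL::Digest.new("SHA256"), payload)
    { payload: payload, signature: signature }
  end
end

# Hypothetical verifier: authorizes the ticket, not the person.
class BusDriver
  def initialize(bus_number:, authority_public_key:)
    @bus_number = bus_number
    @authority_public_key = authority_public_key
  end

  def allow_boarding?(ticket, now: Time.now)
    # 1. was the ticket issued by the trusted authority? (signature check)
    digest = OpenSSL::Digest.new("SHA256")
    return false unless @authority_public_key.verify(digest, ticket[:signature], ticket[:payload])

    # 2. is it for this bus, and 3. has it not expired?
    claims = JSON.parse(ticket[:payload])
    claims["bus"] == @bus_number && now.to_i < claims["exp"]
  end
end

authority = TransitAuthority.new
driver = BusDriver.new(bus_number: 42, authority_public_key: authority.public_key)

ticket = authority.issue_ticket(bus_number: 42, expires_at: Time.now + 3600)
expired = authority.issue_ticket(bus_number: 42, expires_at: Time.now - 60)
# a forged ticket reuses a real signature with an altered payload
forged = ticket.merge(payload: JSON.generate(bus: 42, exp: (Time.now + 86_400).to_i))

puts driver.allow_boarding?(ticket)
puts driver.allow_boarding?(expired)
puts driver.allow_boarding?(forged)
```

Only the first ticket should be accepted. The expired ticket fails the expiration check and the forged ticket fails the signature check.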

```mermaid
sequenceDiagram
    participant P as Passenger
    participant SA as Security Agent
    participant BA as Boarding Agent
    participant Plane as Plane

    P->>SA: request access to gate
    SA->>P: request boarding pass
    P->>SA: present boarding pass
    SA->>SA: validate boarding pass
    SA->>P: allow access to gate

    P->>BA: request access to board plane
    BA->>P: request passport
    P->>BA: present passport
    BA->>P: request boarding pass
    P->>BA: present boarding pass
    BA->>P: allow access to board plane

    P->>Plane: board plane
```

To board a plane you must pass through more security checks before you can
access the airplane. That is because flying in an airplane is a high security
context that requires additional checks to ensure the safety of everyone and the
risk of allowing access to a bad actor has more severe consequences. To board
the airplane you must pass through the security checkpoint by presenting a valid
boarding pass for a flight. This check ensures that we do not allow people into
the gate that do not have a valid pass. A valid pass is one that hasn't already
been used, is for a flight that is set to take off in the future and is for a
known and registered airline. Depending on whether the flight is a domestic or
international flight the gate may require other forms of proof of access. Once
the passenger has made it to the gate they are required to provide a passport
and boarding pass to an airline agent before they are allowed to board the
aircraft. This ensures that everyone who is aboard the airplane is known ahead
of time and that known bad actors are not allowed to board the aircraft. The
airline agent performs an authentication AND authorization check. The airplane
is a metaphor for a high security context that the operators of the airplane
understand. The credit card company and each intermediate authority that was
used to ensure entry do not determine the access controls for gaining entry into
the plane.

### Authentication

Authentication is the act of verifying that an entity is who they say they are.

How do we do this on the internet? To accomplish this we depend on public key
cryptography which is a form of asymmetric crypto. In this style of crypto each
party has a public and private key. Entities distribute their public keys while
keeping their private keys private. The interesting property of the
public/private key relationship is that messages that are encrypted by either
the public or private key can only be decrypted by the other corresponding key.

#### Confidentiality

So if I give you my public key then you can encrypt a message with my public key
and send that message to me. Only I can decrypt that message using my private
key. This ensures confidentiality: anyone can snoop on the ciphertext
but only the recipient can convert it back into plaintext.

The following example shows an exchange between two parties. Each party
encrypts a plaintext message with the other party's public key. When that party
receives the ciphertext message they are able to decrypt the message using their
own private key.

```ruby
#!/usr/bin/env ruby
require 'openssl'

class Person
  attr_reader :name, :public_key

  def initialize(name, private_key = OpenSSL::PKey::RSA.new(2048))
    @name = name
    @private_key = private_key
    @public_key = private_key.public_key
  end

  def send_to(person, plaintext)
    ciphertext = person.public_key.public_encrypt(plaintext)
    person.receive_from(self, ciphertext)
  end

  def receive_from(person, ciphertext)
    plaintext = @private_key.private_decrypt(ciphertext)
    puts "#{person.name}: #{plaintext}\n"
  end
end

clifford = Person.new("clifford")
reginald = Person.new("reginald")

clifford.send_to(reginald, "What time is it?")
reginald.send_to(clifford, "Time to go live! Who sent this?")
```

#### Authenticity

To ensure that a message originated from the entity that claims to have sent it,
a signature can be appended to the message. The signature
can contain any arbitrary text but is usually a hash (e.g. SHA256) of the
original plaintext message, encrypted using the private key of the sender.
I'll explain why a hash is used below. If the recipient has the public key of
the sender then they can decrypt the signature using that key. If the
signature can be decrypted without an error then we can trust that the message
did in fact originate from the sender. This authenticates the message.

In the previous code example each party was able to ensure that the message that
they delivered to the intended recipient could only be read by that recipient.
However, the recipient could not guarantee that the message that they received
actually came from the party that claims to have sent it. If an attacker could
eavesdrop on the conversation, they could intercept the message and rewrite it
before delivering it. This might cause confusion between the two parties and an
attacker could then coerce one of the parties into a specific action.

#### Integrity

When a recipient receives a message from a sender the recipient also needs to
verify that the message wasn't altered. If the signature of the message includes
an encrypted hash then the recipient can compute a hash of the plaintext message
and compare it with the hash in the encrypted signature. This ensures that the
message hasn't been tampered with.

In the following code example the two actors perform a public key exchange with
each other before they start to communicate with each other. This allows them to
verify that the message that they receive did in fact originate from the person
that they think it originated from. It also allows them to ensure that the
message hasn't been altered in transit by appending a signature. The choice of
SHA1 is meant for demonstration purposes only and is not considered a strong
enough hashing algorithm due to the opportunity for collisions to occur.

```ruby
#!/usr/bin/env ruby
require 'openssl'
require 'digest' # provides Digest::SHA1, used for the demo signature below

class Person
  attr_reader :name, :public_key

  def initialize(name, private_key = OpenSSL::PKey::RSA.new(2048))
    @name = name
    @private_key = private_key
    @public_key = private_key.public_key
    @friends = {}
  end

  def add_friend(friend)
    @friends[friend.name] = friend.public_key
  end

  def send_to(person, plaintext)
    signature = @private_key.private_encrypt(Digest::SHA1.hexdigest(plaintext))
    person.receive([self.name, plaintext, signature])
  end

  def receive(message)
    raise "This message cannot be trusted" unless valid?(message)

    name, plaintext, _ = message
    puts "#{name}: #{plaintext}\n"
  end

  private

  def valid?(message)
    header, body, signature = message
    public_key = @friends[header]

    # verify that we know the sender
    return false if public_key.nil?

    # verify that the message hasn't been altered in transit
    Digest::SHA1.hexdigest(body) == public_key.public_decrypt(signature)
  end
end

clifford = Person.new("clifford")
reginald = Person.new("reginald")

# public key exchange
clifford.add_friend(reginald)
reginald.add_friend(clifford)

clifford.send_to(reginald, "What time is it?")
reginald.send_to(clifford, "It's still time to go live!")
```
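The example above builds the signature by hand with `private_encrypt` and SHA1 to make each step visible. As a hedged alternative, here is a minimal sketch of the same authenticity and integrity check using the `sign` and `verify` methods on the key with SHA256. The variable names are made up for illustration.

```ruby
#!/usr/bin/env ruby
require 'openssl'

sender_key = OpenSSL::PKey::RSA.new(2048)
sender_public_key = sender_key.public_key # what the recipient holds
digest = OpenSSL::Digest.new("SHA256")

message = "What time is it?"
signature = sender_key.sign(digest, message)

# authentic and unaltered: the signature matches the message
puts sender_public_key.verify(digest, signature, message)

# integrity: the message was changed in transit
puts sender_public_key.verify(digest, signature, "What time is it now?")

# authenticity: the message was signed by someone else's private key
impostor_key = OpenSSL::PKey::RSA.new(2048)
forged_signature = impostor_key.sign(digest, message)
puts sender_public_key.verify(digest, forged_signature, message)
```

Only the first check should succeed. The second fails because the hash of the altered message no longer matches, and the third fails because the signature was not produced by the sender's private key.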

In order for us to be able to trust JSON Web Tokens we need public/private key
pairs that we can use to validate the authenticity and integrity of the token.
It is also possible to encrypt the JWT body but this isn't required, so anyone
who holds a signed token can read its claims. This is why storing sensitive
information like personally identifiable information in a JWT claim is not
recommended.

In the Ruby code example above the message that was sent from one person to
another took the form of:

```plaintext
  [name, plaintext, signature]
```

This shape is similar to how a JSON Web Token is structured. A JWT takes the
form of:

```plaintext
  header.body.signature
```

Each segment is base64url encoded. The header and body are JSON documents and
the signature is a sequence of raw bytes. The header provides information
such as the type of signature algorithm that was used and the key id of the
public key that can be used to verify the signature. This key id typically
corresponds to one of the keys that are published through the JSON Web Key Set
(JWKS) URI. For example, the GitLab JWKS can be discovered through the OIDC
Discovery Endpoint.

Here's an example of a JWT:

```plaintext
eyJ0eXAiOiJKV1QiLCJraWQiOiJ0ZDBTbWRKUTRxUGg1cU5Lek0yNjBDWHgyVWgtd2hHLU1Eam9PS1dmdDhFIiwiYWxnIjoiUlMyNTYifQ.eyJpc3MiOiJodHRwOi8vZ2RrLnRlc3Q6MzAwMCIsInN1YiI6IjEiLCJhdWQiOiJlMzFlMWRhMGI4ZjZiNmUzNWNhNzBjNzkwYjEzYzA0MDZlNDRhY2E2YjJiZjY3ZjU1ZGU3MzU1YTk3OWEyMjRmIiwiZXhwIjoxNzQ3OTM3OTgzLCJpYXQiOjE3NDc5Mzc4NjMsImF1dGhfdGltZSI6MTc0Nzc3NDA2Nywic3ViX2xlZ2FjeSI6IjI0NzRjZjBiMjIxMTY4OGE1NzI5N2FjZTBlMjYwYTE1OTQ0NzU0ZDE2YjFiZDQyYzlkNjc3OWM5MDAzNjc4MDciLCJuYW1lIjoiQWRtaW5pc3RyYXRvciIsIm5pY2tuYW1lIjoicm9vdCIsInByZWZlcnJlZF91c2VybmFtZSI6InJvb3QiLCJlbWFpbCI6ImFkbWluQGV4YW1wbGUuY29tIiwiZW1haWxfdmVyaWZpZWQiOnRydWUsInByb2ZpbGUiOiJodHRwOi8vZ2RrLnRlc3Q6MzAwMC9yb290IiwicGljdHVyZSI6Imh0dHBzOi8vd3d3LmdyYXZhdGFyLmNvbS9hdmF0YXIvMjU4ZDhkYzkxNmRiOGNlYTJjYWZiNmMzY2QwY2IwMjQ2ZWZlMDYxNDIxZGJkODNlYzNhMzUwNDI4Y2FiZGE0Zj9zPTgwJmQ9aWRlbnRpY29uIiwiZ3JvdXBzX2RpcmVjdCI6WyJnaXRsYWItb3JnIiwidG9vbGJveCIsIm1hc3NfaW5zZXJ0X2dyb3VwX18wXzEwMCIsImN1c3RvbS1yb2xlcy1yb290LWdyb3VwL2FhIiwiY3VzdG9tLXJvbGVzLXJvb3QtZ3JvdXAvYWEvYWFhIiwiZ251d2dldCIsIkNvbW1pdDQ1MSIsImphc2hrZW5hcyIsImZsaWdodGpzIiwidHdpdHRlciIsImdpdGxhYi1leGFtcGxlcyIsImdpdGxhYi1leGFtcGxlcy9zZWN1cml0eSIsIjQxMjcwOCIsImdpdGxhYi1leGFtcGxlcy9kZW1vLWdyb3VwIiwiY3VzdG9tLXJvbGVzLXJvb3QtZ3JvdXAiLCI0MzQwNDQtZ3JvdXAtMSIsIjQzNDA0NC1ncm91cC0yIiwiZ2l0bGFiLW9yZzEiLCJnaXRsYWItb3JnL3NlY3VyZSIsImdpdGxhYi1vcmcvc2VjdXJlL21hbmFnZXJzIiwiZ2l0bGFiLW9yZy9zZWN1cml0eS1wcm9kdWN0cyIsImdpdGxhYi1vcmcvc2VjdXJpdHktcHJvZHVjdHMvYW5hbHl6ZXJzIl19.TjTrGS5FjfPoY0HWkSLvgjogBxB27jX2beosOZAkwXi_gO3q9DTnL0csOgxjoF1UR8baPNfMFBqL1ipLxBdY9vvDxZve-sOhoSptjzLGkCi7uQKeu7r8wNyFWNWhcLwmbinZyENGSZqIDSkHy0lGdo9oj7qqnH6sYqU46jtWACDGSHTFjNNuo1s_P2SZgkaq4c4v4jdlVV_C_Qlvtl7-eaWV1LzTpB4Mz0VWGsRx1pk3-KnS24crhBjxSE383z4Nar4ZhrsrTK-bOj33l6U32gRKNb4g6GxrPXaRQ268n37spQmbQn0aDwmUOABv-aBRy203bCCZca8BJ0XBur8t6w
```

If we break the JWT apart using the `.` delimiter it will look like the following:

```plaintext
header:

  eyJ0eXAiOiJKV1QiLCJraWQiOiJ0ZDBTbWRKUTRxUGg1cU5Lek0yNjBDWHgyVWgtd2hHLU1Eam9PS1dmdDhFIiwiYWxnIjoiUlMyNTYifQ

body:

  eyJpc3MiOiJodHRwOi8vZ2RrLnRlc3Q6MzAwMCIsInN1YiI6IjEiLCJhdWQiOiJlMzFlMWRhMGI4ZjZiNmUzNWNhNzBjNzkwYjEzYzA0MDZlNDRhY2E2YjJiZjY3ZjU1ZGU3MzU1YTk3OWEyMjRmIiwiZXhwIjoxNzQ3OTM3OTgzLCJpYXQiOjE3NDc5Mzc4NjMsImF1dGhfdGltZSI6MTc0Nzc3NDA2Nywic3ViX2xlZ2FjeSI6IjI0NzRjZjBiMjIxMTY4OGE1NzI5N2FjZTBlMjYwYTE1OTQ0NzU0ZDE2YjFiZDQyYzlkNjc3OWM5MDAzNjc4MDciLCJuYW1lIjoiQWRtaW5pc3RyYXRvciIsIm5pY2tuYW1lIjoicm9vdCIsInByZWZlcnJlZF91c2VybmFtZSI6InJvb3QiLCJlbWFpbCI6ImFkbWluQGV4YW1wbGUuY29tIiwiZW1haWxfdmVyaWZpZWQiOnRydWUsInByb2ZpbGUiOiJodHRwOi8vZ2RrLnRlc3Q6MzAwMC9yb290IiwicGljdHVyZSI6Imh0dHBzOi8vd3d3LmdyYXZhdGFyLmNvbS9hdmF0YXIvMjU4ZDhkYzkxNmRiOGNlYTJjYWZiNmMzY2QwY2IwMjQ2ZWZlMDYxNDIxZGJkODNlYzNhMzUwNDI4Y2FiZGE0Zj9zPTgwJmQ9aWRlbnRpY29uIiwiZ3JvdXBzX2RpcmVjdCI6WyJnaXRsYWItb3JnIiwidG9vbGJveCIsIm1hc3NfaW5zZXJ0X2dyb3VwX18wXzEwMCIsImN1c3RvbS1yb2xlcy1yb290LWdyb3VwL2FhIiwiY3VzdG9tLXJvbGVzLXJvb3QtZ3JvdXAvYWEvYWFhIiwiZ251d2dldCIsIkNvbW1pdDQ1MSIsImphc2hrZW5hcyIsImZsaWdodGpzIiwidHdpdHRlciIsImdpdGxhYi1leGFtcGxlcyIsImdpdGxhYi1leGFtcGxlcy9zZWN1cml0eSIsIjQxMjcwOCIsImdpdGxhYi1leGFtcGxlcy9kZW1vLWdyb3VwIiwiY3VzdG9tLXJvbGVzLXJvb3QtZ3JvdXAiLCI0MzQwNDQtZ3JvdXAtMSIsIjQzNDA0NC1ncm91cC0yIiwiZ2l0bGFiLW9yZzEiLCJnaXRsYWItb3JnL3NlY3VyZSIsImdpdGxhYi1vcmcvc2VjdXJlL21hbmFnZXJzIiwiZ2l0bGFiLW9yZy9zZWN1cml0eS1wcm9kdWN0cyIsImdpdGxhYi1vcmcvc2VjdXJpdHktcHJvZHVjdHMvYW5hbHl6ZXJzIl19

signature:

  TjTrGS5FjfPoY0HWkSLvgjogBxB27jX2beosOZAkwXi_gO3q9DTnL0csOgxjoF1UR8baPNfMFBqL1ipLxBdY9vvDxZve-sOhoSptjzLGkCi7uQKeu7r8wNyFWNWhcLwmbinZyENGSZqIDSkHy0lGdo9oj7qqnH6sYqU46jtWACDGSHTFjNNuo1s_P2SZgkaq4c4v4jdlVV_C_Qlvtl7-eaWV1LzTpB4Mz0VWGsRx1pk3-KnS24crhBjxSE383z4Nar4ZhrsrTK-bOj33l6U32gRKNb4g6GxrPXaRQ268n37spQmbQn0aDwmUOABv-aBRy203bCCZca8BJ0XBur8t6w
```

When we base64url decode the header it takes the following form. This tells us that
the signature was produced using the "RS256" algorithm, which is shorthand for
RSA public key cryptography with a SHA256 hash. The identifier for the public
key that can be used to verify the signature is given by the `kid` name. This
`kid` will correspond to an identifier that can be discovered at the JWKS
metadata endpoint.

```plaintext
{
  "typ": "JWT",
  "kid": "td0SmdJQ4qPh5qNKzM260CXx2Uh-whG-MDjoOKWft8E",
  "alg": "RS256"
}
```
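The decoding step can be reproduced with a few lines of Ruby. This is a minimal sketch: `decode_segment` is a helper name invented here, and the segment is the header copied from the example token.

```ruby
#!/usr/bin/env ruby
require 'base64'
require 'json'

# Decode one base64url-encoded JSON segment (header or body) of a JWT.
def decode_segment(segment)
  JSON.parse(Base64.urlsafe_decode64(segment))
end

# header segment of the example token
segment = "eyJ0eXAiOiJKV1QiLCJraWQiOiJ0ZDBTbWRKUTRxUGg1cU5Lek0yNjBDWHgyVWgtd2hHLU1Eam9PS1dmdDhFIiwiYWxnIjoiUlMyNTYifQ"

header = decode_segment(segment)
puts header["typ"]
puts header["alg"]
puts header["kid"]
```

The same helper works for the body segment. It does not apply to the signature, which is raw bytes rather than JSON.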

```bash
$ curl https://gitlab.com/.well-known/openid-configuration | jq '.' | grep jwks_uri
  "jwks_uri": "https://gitlab.com/oauth/discovery/keys",
```

The following keys imply that GitLab uses RSA for public key cryptography and
SHA256 as the hash algorithm for verifying digital signatures.

```bash
# note that I have omitted some data to keep the example brief
$ curl https://gitlab.com/oauth/discovery/keys | jq '.'
{
  "keys": [
    {
      "kty": "RSA",
      "kid": "kewiQq9jiC84CvSsJYOB-N6A8WFLSV20Mb-y7IlWDSQ",
      "use": "sig",
      "alg": "RS256"
    },
    {
      "kty": "RSA",
      "kid": "4i3sFE7sxqNPOT7FdvcGA1ZVGGI_r-tsDXnEuYT4ZqE",
      "use": "sig",
      "alg": "RS256"
    },
    {
      "kty": "RSA",
      "kid": "UEtnUohTq58JiJzxHhBLSU0yTpsmW-9EY1Wykha6VIg",
      "use": "sig",
      "alg": "RS256"
    }
  ]
}
```
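Matching a token's `kid` against the published key set is a simple lookup. Here is a minimal sketch that uses an inline, abbreviated copy of the response above instead of a network call. `find_signing_key` is a hypothetical helper name, and the `n` and `e` values are omitted just as they are in the listing.

```ruby
#!/usr/bin/env ruby
require 'json'

# Abbreviated JWKS document, copied from the listing above.
jwks = JSON.parse(<<~JWKS)
  {
    "keys": [
      { "kty": "RSA", "kid": "kewiQq9jiC84CvSsJYOB-N6A8WFLSV20Mb-y7IlWDSQ", "use": "sig", "alg": "RS256" },
      { "kty": "RSA", "kid": "4i3sFE7sxqNPOT7FdvcGA1ZVGGI_r-tsDXnEuYT4ZqE", "use": "sig", "alg": "RS256" },
      { "kty": "RSA", "kid": "UEtnUohTq58JiJzxHhBLSU0yTpsmW-9EY1Wykha6VIg", "use": "sig", "alg": "RS256" }
    ]
  }
JWKS

# Return the signing key whose `kid` matches the one in the JWT header,
# or nil when the issuer has not published such a key.
def find_signing_key(jwks, kid)
  jwks.fetch("keys").find { |key| key["kid"] == kid && key["use"] == "sig" }
end

key = find_signing_key(jwks, "4i3sFE7sxqNPOT7FdvcGA1ZVGGI_r-tsDXnEuYT4ZqE")
puts key["alg"]
puts find_signing_key(jwks, "unknown-kid").inspect
```

A token whose `kid` is not in the set should be rejected, because there is no published key to verify its signature with.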

When the body of the JWT is decoded it takes the following form:

```json
{
  "iss": "http://gdk.test:3000",
  "sub": "1",
  "aud": "e31e1da0b8f6b6e35ca70c790b13c0406e44aca6b2bf67f55de7355a979a224f",
  "exp": 1747937983,
  "iat": 1747937863,
  "auth_time": 1747774067,
  "sub_legacy": "2474cf0b2211688a57297ace0e260a15944754d16b1bd42c9d6779c900367807",
  "name": "Administrator",
  "nickname": "root",
  "preferred_username": "root",
  "email": "admin@example.com",
  "email_verified": true,
  "profile": "http://gdk.test:3000/root",
  "picture": "https://www.gravatar.com/avatar/258d8dc916db8cea2cafb6c3cd0cb0246efe061421dbd83ec3a350428cabda4f?s=80&d=identicon",
  "groups_direct": [
    "gitlab-org"
  ]
}
```

Beyond the registered JWT claims (`iss`, `sub`, `aud`, `exp`, `iat`) this token
carries several additional claims. The `auth_time`, `name`, `nickname`,
`preferred_username`, `email`, `email_verified`, `profile`, and `picture` claims
are defined by OpenID Connect, while `sub_legacy` and `groups_direct` are not
part of either specification. I think that `email` is
problematic because this is considered personally identifiable information and
this is something that I would like us to consider removing.
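Whatever extra claims a token carries, a consumer would normally check the registered claims before trusting it. Here is a minimal sketch of that check. `claim_errors` is a hypothetical helper, and the expected issuer and audience are taken from the decoded body above.

```ruby
#!/usr/bin/env ruby

# Return a list of problems found in the registered claims; empty means OK.
def claim_errors(claims, issuer:, audience:, now: Time.now)
  errors = []
  errors << "unexpected issuer" unless claims["iss"] == issuer
  # `aud` may be a single string or an array of strings
  errors << "unexpected audience" unless Array(claims["aud"]).include?(audience)
  errors << "token expired" unless claims["exp"].is_a?(Numeric) && now.to_i < claims["exp"]
  errors
end

claims = {
  "iss" => "http://gdk.test:3000",
  "aud" => "e31e1da0b8f6b6e35ca70c790b13c0406e44aca6b2bf67f55de7355a979a224f",
  "exp" => 1747937983
}
expected = { issuer: "http://gdk.test:3000", audience: claims["aud"] }

# one minute before expiry
puts claim_errors(claims, **expected, now: Time.at(1747937923)).inspect
# one second after expiry
puts claim_errors(claims, **expected, now: Time.at(1747937984)).inspect
```

These checks complement the signature check. A token with a valid signature can still be expired or intended for a different audience.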

Finally, we can validate the integrity of the JWT by decrypting the signature
and recomputing a hash of the header + "." + body. The following Ruby code
demonstrates this:

```ruby
#!/usr/bin/env ruby
require 'openssl'
require 'base64'

# This key is fetched from the `jwks_uri`
metadata = {
  kty: "RSA",
  kid: "td0SmdJQ4qPh5qNKzM260CXx2Uh-whG-MDjoOKWft8E",
  e: "AQAB",
  n: "z4JrfdkUjeCPcMQEB1ai9OJbZ8xMrtdNI9K80XUYTcyfkQDlFnZNgRvwnkLkZJ0XjtLbc6Y0RMEyo32DivIfWb31US_1FRRJm0oS2mSFV4iHsfTXjVnlmExYW0ke2_BZ4Vu_rRIVxD1eJYNLjn8Uqb7ZllnUJFZDzTk5qQCVX9F5idQgWFh9DxtY3pGutz1-BxaQmTDts_p4cDu8HPnmJEiTCsx7opIfvqpaumfuiLlPZvozERnsnC8BDS1EQja3nJhOnaBFV6vrk57VH_IwmybVACk2w3uW8n0o63roDHfnpo5hQuSm2M-5mEcyXH0PA5YsDuYRi1uxF58Vob6NSw",
  use: "sig",
  alg: "RS256"
}

jwt = "eyJ0eXAiOiJKV1QiLCJraWQiOiJ0ZDBTbWRKUTRxUGg1cU5Lek0yNjBDWHgyVWgtd2hHLU1Eam9PS1dmdDhFIiwiYWxnIjoiUlMyNTYifQ.eyJpc3MiOiJodHRwOi8vZ2RrLnRlc3Q6MzAwMCIsInN1YiI6IjEiLCJhdWQiOiJlMzFlMWRhMGI4ZjZiNmUzNWNhNzBjNzkwYjEzYzA0MDZlNDRhY2E2YjJiZjY3ZjU1ZGU3MzU1YTk3OWEyMjRmIiwiZXhwIjoxNzQ3OTM3OTgzLCJpYXQiOjE3NDc5Mzc4NjMsImF1dGhfdGltZSI6MTc0Nzc3NDA2Nywic3ViX2xlZ2FjeSI6IjI0NzRjZjBiMjIxMTY4OGE1NzI5N2FjZTBlMjYwYTE1OTQ0NzU0ZDE2YjFiZDQyYzlkNjc3OWM5MDAzNjc4MDciLCJuYW1lIjoiQWRtaW5pc3RyYXRvciIsIm5pY2tuYW1lIjoicm9vdCIsInByZWZlcnJlZF91c2VybmFtZSI6InJvb3QiLCJlbWFpbCI6ImFkbWluQGV4YW1wbGUuY29tIiwiZW1haWxfdmVyaWZpZWQiOnRydWUsInByb2ZpbGUiOiJodHRwOi8vZ2RrLnRlc3Q6MzAwMC9yb290IiwicGljdHVyZSI6Imh0dHBzOi8vd3d3LmdyYXZhdGFyLmNvbS9hdmF0YXIvMjU4ZDhkYzkxNmRiOGNlYTJjYWZiNmMzY2QwY2IwMjQ2ZWZlMDYxNDIxZGJkODNlYzNhMzUwNDI4Y2FiZGE0Zj9zPTgwJmQ9aWRlbnRpY29uIiwiZ3JvdXBzX2RpcmVjdCI6WyJnaXRsYWItb3JnIiwidG9vbGJveCIsIm1hc3NfaW5zZXJ0X2dyb3VwX18wXzEwMCIsImN1c3RvbS1yb2xlcy1yb290LWdyb3VwL2FhIiwiY3VzdG9tLXJvbGVzLXJvb3QtZ3JvdXAvYWEvYWFhIiwiZ251d2dldCIsIkNvbW1pdDQ1MSIsImphc2hrZW5hcyIsImZsaWdodGpzIiwidHdpdHRlciIsImdpdGxhYi1leGFtcGxlcyIsImdpdGxhYi1leGFtcGxlcy9zZWN1cml0eSIsIjQxMjcwOCIsImdpdGxhYi1leGFtcGxlcy9kZW1vLWdyb3VwIiwiY3VzdG9tLXJvbGVzLXJvb3QtZ3JvdXAiLCI0MzQwNDQtZ3JvdXAtMSIsIjQzNDA0NC1ncm91cC0yIiwiZ2l0bGFiLW9yZzEiLCJnaXRsYWItb3JnL3NlY3VyZSIsImdpdGxhYi1vcmcvc2VjdXJlL21hbmFnZXJzIiwiZ2l0bGFiLW9yZy9zZWN1cml0eS1wcm9kdWN0cyIsImdpdGxhYi1vcmcvc2VjdXJpdHktcHJvZHVjdHMvYW5hbHl6ZXJzIl19.TjTrGS5FjfPoY0HWkSLvgjogBxB27jX2beosOZAkwXi_gO3q9DTnL0csOgxjoF1UR8baPNfMFBqL1ipLxBdY9vvDxZve-sOhoSptjzLGkCi7uQKeu7r8wNyFWNWhcLwmbinZyENGSZqIDSkHy0lGdo9oj7qqnH6sYqU46jtWACDGSHTFjNNuo1s_P2SZgkaq4c4v4jdlVV_C_Qlvtl7-eaWV1LzTpB4Mz0VWGsRx1pk3-KnS24crhBjxSE383z4Nar4ZhrsrTK-bOj33l6U32gRKNb4g6GxrPXaRQ268n37spQmbQn0aDwmUOABv-aBRy203bCCZca8BJ0XBur8t6w"
header, body, signature = jwt.split(".")

n = OpenSSL::BN.new(Base64.urlsafe_decode64(metadata[:n]), 2)
e = OpenSSL::BN.new(Base64.urlsafe_decode64(metadata[:e]), 2)
public_key = OpenSSL::PKey::RSA.new(OpenSSL::ASN1::Sequence.new([
  OpenSSL::ASN1::Integer.new(n),
  OpenSSL::ASN1::Integer.new(e)
]).to_der)

puts public_key.verify(
  OpenSSL::Digest::SHA256.new,
  Base64.urlsafe_decode64(signature),
  [header, body].join(".")
)
```


The problem of sharing and distributing public keys is solved using Public Key
Infrastructure (PKI). PKI provides a mechanism for distributing X.509
certificates that include metadata and the public keys for different entities.
An X.509 certificate carries a digital signature from the authority that issued
it, which provides a chain of trust. Each root certificate stored in the CA
trust store on your computer is a self-signed certificate that is considered
trusted. Intermediate certificates can be traced back to a root certificate,
which completes the chain of trust, i.e. I trust this JWT because its signature
verifies against a public key that I found in this intermediate certificate,
whose signature verifies against a public key found in this root certificate
that is in my operating system's root certificate authority trust store.
Typically, organizations will operate an internal certificate authority that can
sign intermediate certificates that can be installed in the trust store for
internal services. This makes it easier to issue internal certificates without
needing direct access to private keys that can be abused by bad actors. This is
only a brief summary but a cursory knowledge of how this works is important for
understanding authentication.
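The chain of trust can be sketched with Ruby's OpenSSL bindings. This is a toy example under stated assumptions: the `build_cert` helper, the subject names, and the one hour validity window are all made up for illustration, and a real certificate authority would set more extensions than this.

```ruby
#!/usr/bin/env ruby
require 'openssl'

# Build a certificate for `subject`. It is self-signed when no issuer is
# given, otherwise it is signed with the issuer's private key.
def build_cert(subject:, key:, serial:, ca:, issuer_cert: nil, issuer_key: nil)
  cert = OpenSSL::X509::Certificate.new
  cert.version = 2 # X.509 v3
  cert.serial = serial
  cert.subject = OpenSSL::X509::Name.parse(subject)
  cert.issuer = issuer_cert ? issuer_cert.subject : cert.subject
  cert.public_key = key.public_key
  cert.not_before = Time.now - 60
  cert.not_after = Time.now + 3600

  # mark whether this certificate is allowed to sign other certificates
  factory = OpenSSL::X509::ExtensionFactory.new
  cert.add_extension(factory.create_extension("basicConstraints", ca ? "CA:TRUE" : "CA:FALSE", true))

  cert.sign(issuer_key || key, OpenSSL::Digest.new("SHA256"))
  cert
end

root_key = OpenSSL::PKey::RSA.new(2048)
intermediate_key = OpenSSL::PKey::RSA.new(2048)
leaf_key = OpenSSL::PKey::RSA.new(2048)

root = build_cert(subject: "/CN=Example Root CA", key: root_key, serial: 1, ca: true)
intermediate = build_cert(subject: "/CN=Example Intermediate CA", key: intermediate_key,
                          serial: 2, ca: true, issuer_cert: root, issuer_key: root_key)
leaf = build_cert(subject: "/CN=service.internal.example", key: leaf_key,
                  serial: 3, ca: false, issuer_cert: intermediate, issuer_key: intermediate_key)

# the trust store only contains the self-signed root
trust_store = OpenSSL::X509::Store.new
trust_store.add_cert(root)

# leaf -> intermediate -> root can be traced back to the trust store
puts trust_store.verify(leaf, [intermediate])
# without the intermediate the chain cannot be completed
puts trust_store.verify(leaf)
```

The first verification should succeed because every link can be traced back to the trusted root. The second should fail because the leaf's issuer is unknown to the store.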

When a service federates user authentication to an Identity Provider (e.g. SAML
IdP, OIDC Provider) the transaction between the service (i.e. SAML Service
Provider, OIDC Relying Party) and the Identity Provider depends on an exchange
of public key information ahead of time. Without this prerequisite, none of the
downstream assumptions about user authentication are valid.

The OpenID Connect Core specification describes the `id_token` as a JWT and the
JWT specification describes a set of standard claims that are found in the
JWT body. The `id_token` in the OpenID Connect (OIDC) workflow represents the
authentication context. This _DOES NOT_ represent an authorization context.

### Authorization

Authorization is the act of verifying that a party is allowed to perform a
specific action against a resource. This is separate from Authentication because
in many cases the Resource Server providing access to the Resource does not need
to know who the party is. This creates a decoupling that allows an API to
determine just how much information it actually needs to know about the party
making the request. See the bus versus airplane example above for an explanation.

OAuth was designed as a protocol for delegating authorization to an intermediate
entity so that this entity could access resources on behalf of a user without
needing full access to everything that the user has access to. By adhering to
the OAuth2 protocol flow we can ensure that requests made on behalf of end users
do not operate at the highest level of privilege available to them. We can
ensure that requests that are made on behalf of end users use the lowest level
of privilege necessary for the service (OAuth client) to perform their desired
function.

The OAuth2 `access_token` represents the authorization context and this is
distinct from the `id_token` which represents the authentication context. The
`id_token` tells us who the currently logged in user is but it _should not_ be
used to make authorization decisions. Authorization decisions should be made
using the `access_token` because this represents the delegated authorization
access granted to the service that the token was minted for. In general, most
APIs that receive a request should make authorization decisions based on the
privileges granted to the `access_token`. OAuth does not specify the format of
the `access_token` so this is up to the OAuth Authorization Server to decide the
schema. It is possible for us to adopt the JWT standard for `access_token`
representation.

The authorization server that generates the `access_token` should grant
the token only the privileges required by the service (OAuth
client) that the token is intended for.
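A resource server can enforce this by checking the scopes granted to the token rather than the identity of the user. Here is a minimal sketch, assuming the token's privileges are expressed as the space-delimited `scope` string that OAuth 2.0 uses. The `permitted?` helper and the scope names are hypothetical.

```ruby
#!/usr/bin/env ruby

# True when the space-delimited `granted_scope` includes `required_scope`.
def permitted?(granted_scope, required_scope)
  granted_scope.to_s.split(" ").include?(required_scope)
end

# Hypothetical decoded access token, minted for a read-only service.
access_token = { "sub" => "1", "scope" => "read_api read_user" }

puts permitted?(access_token["scope"], "read_api")
puts permitted?(access_token["scope"], "write_repository")
```

The user behind `sub` may well be allowed to write to repositories. The service holding this token is not, because that privilege was never delegated to it.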

The separation of the authentication context from the authorization context is
incredibly important. This ensures that services cannot access resources based
on the full scope of access that a user has but rather the delegated authorized
access that is granted to an access token. The access token represents the
low-privilege session for a specific service. A single `id_token` can be used
across services to allow the service to know who is logged in but each service
should have its own access token for each user based on the permissions that the
service declares that it needs and the permissions that the user agrees to give
it. I need to say this again because understanding this is crucial!

## Envoy Architecture

Given all the concerns listed above this is where Envoy shines. It can be used
to take care of Authentication via an OpenID Connect transaction by slightly
abusing the built-in `envoy.filters.http.oauth2` HTTP filter. It can also be
used to validate any incoming JWTs via the `envoy.filters.http.jwt_authn` HTTP
filter. Finally, we can use the `envoy.filters.http.ext_authz` HTTP filter to
delegate authorization decisions to an external policy decision point (PDP).

I wrote Sparkle as a proof-of-concept to model these ideas using Envoy. Before
we dive into the configuration I want to quickly go over the high level
architecture of how these pieces work together.

The proposed architecture ensures that authorization decisions are made consistently at the edge before requests reach the application.

Envoy can be configured to host multiple listeners and each listener can be
configured to have its own pipeline of middleware to execute in the order that
the middleware is declared. Sparkle uses a single listener on all interfaces
listening for TCP traffic on port 10000 to accept all incoming HTTP traffic.
The last HTTP filter to execute is the `envoy.filters.http.router` filter that
will reverse proxy the incoming request to Sparkle.

Below is a snippet of the configuration required to set up the reverse proxy.

```yaml
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                # stat_prefix is required by the HTTP connection manager
                stat_prefix: ingress_http
                http_filters:
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
                route_config:
                  virtual_hosts:
                    - name: local
                      domains: ["*"]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: sparkle
  clusters:
    - name: sparkle
      load_assignment:
        cluster_name: sparkle
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: 127.0.0.1
                      port_value: 8080
```
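To make the ordering of the filter pipeline concrete, here is a toy model in Ruby. It is not how Envoy is implemented. It only illustrates the idea that filters run in the order they are declared, that any filter can stop the chain by answering the request itself, and that the router comes last. All names in it are invented for this sketch.

```ruby
#!/usr/bin/env ruby

Request = Struct.new(:path, :headers)

# Each filter returns a response to stop the chain, or nil to continue.
filters = [
  ["token_check", ->(request) { [401, "missing token"] unless request.headers.key?("authorization") }],
  ["router", ->(request) { [200, "proxied #{request.path} to 127.0.0.1:8080"] }]
]

# Run the filters in declaration order and return the first response.
def handle(filters, request)
  filters.each do |_name, filter|
    response = filter.call(request)
    return response if response
  end
  [404, "no filter produced a response"]
end

puts handle(filters, Request.new("/dashboard", {})).inspect
puts handle(filters, Request.new("/dashboard", { "authorization" => "Bearer token" })).inspect
```

The first request never reaches the router because the earlier filter rejects it. The second request passes through and is proxied to the upstream.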

### Authentication Flow

```mermaid
sequenceDiagram
    participant User
    box grey Docker Image
      participant Envoy
      participant authzd
      participant sparkled
    end
    participant OIDC Provider

    User->>Envoy: GET /dashboard (no auth)
    Envoy->>Envoy: OAuth2 filter detects no auth
    Envoy->>User: Redirect to OIDC Provider
    User->>OIDC Provider: Login
    OIDC Provider->>User: Redirect to /callback with code
    User->>Envoy: GET /callback?code=...
    Envoy->>OIDC Provider: Exchange code for tokens
    OIDC Provider->>Envoy: Return tokens (ID, access, refresh)
    Envoy->>User: Set cookies & redirect to /dashboard
```

The `envoy.filters.http.oauth2` HTTP filter can be configured to detect an
unauthenticated request and intercept all inbound requests by redirecting the
user-agent to the hard-coded OAuth Authorization Server endpoints. This filter
does not support the OIDC Discovery endpoint but an Envoy Gateway
[plugin](https://gateway.envoyproxy.io/docs/tasks/security/oidc/) does.
Envoy Gateway is a control plane that is outside the scope of this document.

In the configuration below, the `envoy.filters.http.oauth2` HTTP filter is used
to manage an OAuth handshake with an OAuth Authorization server. By adding the
`openid` scope to the handshake we have implicitly upgraded the transaction from
a generic OAuth2 handshake to an OpenID Connect transaction. This upgrade allows
the OAuth handshake to receive an additional `id_token` from the Security Token
Service (STS) described by the `token_endpoint` configuration.

The `authorization_endpoint` is the location of the Identity Provider (IdP), in
this case the OIDC Provider, that this filter redirects the user-agent to in
order to begin a transaction. The `token_endpoint` is the location that the
OAuth client forwards an OAuth Grant to in order to retrieve an `access_token`,
`id_token`, and `refresh_token`. This HTTP filter takes care of generating a
nonce to guard against abusive clients that try to hit the callback endpoint
directly. It also takes care of calling the `token_endpoint` with a Refresh
Token Grant when it needs to generate a new `access_token` and `id_token`. This
entire exchange is complex and error-prone, and by using this filter we reduce
the number of errors that can be introduced by incorrectly negotiating a new
session with the IdP. It also ensures that we use a standards-based approach
for interoperating with our IdP, in the same manner as any external integration.
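
To make the redirect step concrete, here is a minimal sketch of the kind of
authorization request URL a client builds to start an Authorization Code Grant,
using the parameter names from RFC 6749. The helper name `authorizationURL` and
the `sparkle.example` host in the usage note are hypothetical. This is not
Envoy's implementation, and the way Envoy encodes its nonce into `state` is not
modeled here.

```go
package main

import (
	"net/url"
	"strings"
)

// authorizationURL builds the URL that the user-agent is redirected to in
// order to begin an Authorization Code Grant. The query parameter names come
// from RFC 6749; the function itself is a hypothetical helper.
func authorizationURL(endpoint, clientID, redirectURI, state string, scopes []string) (string, error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("response_type", "code") // ask for an authorization code
	q.Set("client_id", clientID)
	q.Set("redirect_uri", redirectURI)
	q.Set("scope", strings.Join(scopes, " ")) // scopes are space-delimited
	q.Set("state", state)                     // opaque value echoed back on the callback
	u.RawQuery = q.Encode()
	return u.String(), nil
}
```

Calling `authorizationURL("https://gitlab.com/oauth/authorize", "OAUTH_CLIENT_ID", "https://sparkle.example/callback", "abc123", []string{"email", "openid", "profile"})`
returns a URL whose query string carries `response_type=code`, the client ID,
the callback URI, the space-separated scopes and the `state` value.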

```yaml
                # ...
                http_filters:
                  - name: envoy.filters.http.oauth2
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.oauth2.v3.OAuth2
                      config:
                        auth_scopes:
                          - email
                          - openid
                          - profile
                        authorization_endpoint: "https://gitlab.com/oauth/authorize"
                        credentials:
                          client_id: "OAUTH_CLIENT_ID"
                          cookie_names:
                            id_token: id_token
                        redirect_path_matcher:
                          path:
                            exact: /callback
                        redirect_uri: "%REQ(x-forwarded-proto)%://%REQ(:authority)%/callback"
                        signout_path:
                          path:
                            exact: /signout
                        token_endpoint:
                          uri: "https://gitlab.com/oauth/token"
                        use_refresh_token: true
                  - name: envoy.filters.http.router
                  # ...
```

The `signout_path` is a virtual path that Envoy manages itself: a request to it
clears the session cookies and terminates the session. The `token_endpoint`
configuration is something we can use to extract the STS code from the
`gitlab-org/gitlab` codebase into a separate, isolated service.
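
Whatever service ends up behind `token_endpoint`, it has to accept the standard
form-encoded grant requests. The sketch below shows the request bodies for the
two grants mentioned above, following RFC 6749. The function names are
hypothetical, and sending the client credentials in the body is only one of
the client authentication options the RFC allows, chosen here as an assumption.

```go
package main

import "net/url"

// codeExchangeForm returns the form body for exchanging an authorization
// code for tokens (RFC 6749, section 4.1.3).
func codeExchangeForm(clientID, clientSecret, code, redirectURI string) url.Values {
	return url.Values{
		"grant_type":    {"authorization_code"},
		"code":          {code},
		"redirect_uri":  {redirectURI},
		"client_id":     {clientID},
		"client_secret": {clientSecret},
	}
}

// refreshForm returns the form body for a Refresh Token Grant
// (RFC 6749, section 6).
func refreshForm(clientID, clientSecret, refreshToken string) url.Values {
	return url.Values{
		"grant_type":    {"refresh_token"},
		"refresh_token": {refreshToken},
		"client_id":     {clientID},
		"client_secret": {clientSecret},
	}
}
```

Either value can be serialized with `Encode()` and sent as the body of a
`POST` with the `application/x-www-form-urlencoded` content type.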

### Authorization Flow

TODO: model these examples on https://gitlab.com/gitlab-org/architecture/auth-architecture/design-doc/-/merge_requests/12#note_2516950269

Example 1: Session cookie

1. Request with a Cookie arrives at Envoy.
1. Envoy sends the request context to a separate service.
1. Separate auth service responds with HTTP OK and a token from STS representing the authenticated principal.
1. Envoy forwards the request to GitLab with the identity token injected into a header.

Example 2: Authorization header

1. Request with an `Authorization: Bearer` token arrives at Envoy.
1. Envoy sends the token to a separate service.
1. Separate service responds with an identity token from STS.
1. Envoy forwards the request to Rails.

Example 3: Unauthenticated

1. Unauthenticated request arrives.
1. Envoy forwards the request to Rails without an identity token.

Example 4: Workload Identity Federation

1. OAuth authorization request arrives for 3rd-party integration.
1. Envoy forwards the request to the authorization server.

Example 5: ?

1. OAuth authorization request arrives for internal service integration.
1. Envoy forwards the request to the authorization service.
1. Envoy captures authorization grant and exchanges it for the token (current solution).
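
Examples 1 to 3 differ only in which credential the request carries. As a
rough sketch of that first branching step, the function below classifies a
request from its raw `Authorization` and `Cookie` header values. The function
name, the returned labels and the cookie name passed in are all hypothetical,
and giving the bearer token precedence over the cookie is an assumption rather
than a decision taken in this document.

```go
package main

import (
	"net/http"
	"strings"
)

// credentialKind reports which kind of credential a request carries, given
// the raw Authorization and Cookie header values. sessionCookie is the name
// of the cookie to look for (an assumption supplied by the caller).
func credentialKind(authorization, cookieHeader, sessionCookie string) string {
	const prefix = "Bearer "
	// The auth scheme is case-insensitive, and a token must follow it.
	if len(authorization) > len(prefix) && strings.EqualFold(authorization[:len(prefix)], prefix) {
		return "bearer"
	}
	// Reuse net/http's cookie parser by wrapping the header in a request.
	req := http.Request{Header: http.Header{"Cookie": {cookieHeader}}}
	if _, err := req.Cookie(sessionCookie); err == nil {
		return "cookie"
	}
	return "anonymous"
}
```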

```mermaid
sequenceDiagram
    participant User
    box grey Docker Image
      participant Envoy
      participant authzd
      participant sparkled
    end
    participant OIDC Provider

    User->>Envoy: GET /dashboard (with cookies)
    Envoy->>Envoy: Extract ID token from cookie
    Envoy->>Envoy: JWT filter validates & extracts claims
    Note right of Envoy: Sets headers:<br/>x-jwt-payload<br/>x-jwt-claim-sub

    Envoy->>authzd: Check authorization (gRPC)
    Note right of authzd: Request includes:<br/>- Method & Path<br/>- Headers (inc. cookies)<br/>- JWT claims
    authzd->>authzd: Evaluate authorization rules
    authzd->>Envoy: Return OK/Denied decision

    alt Authorization OK
        Envoy->>sparkled: Forward request with JWT headers
        sparkled->>sparkled: Extract user from x-jwt-claim-sub
        sparkled->>User: Return dashboard content
    else Authorization Denied
        Envoy->>User: Return 401 Unauthorized
    end
```

The ID token can be validated using the `envoy.filters.http.jwt_authn` HTTP
filter. The following configuration looks for an `id_token` cookie, parses the
value and validates it against the list of keys published at the `remote_jwks`
uri. For a valid token it injects a header called `x-jwt-payload` that carries
the base64url-encoded payload (body section) of the JWT, as well as an
`x-jwt-claim-sub` header that carries the value of the `sub` claim. This filter
ensures the integrity and authenticity of the detected JWT and immediately
rejects tokens that are invalid. Because the rule also lists `allow_missing`,
requests that carry no token at all are still passed on to the next filter.

```yaml
                  # ...
                  - name: envoy.filters.http.oauth2
                    # ...
                  - name: envoy.filters.http.jwt_authn
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.jwt_authn.v3.JwtAuthentication
                      providers:
                        gitlab_provider:
                          audiences:
                            - OAUTH_CLIENT_ID
                          claim_to_headers:
                            - claim_name: sub
                              header_name: x-jwt-claim-sub
                          forward_payload_header: x-jwt-payload
                          from_cookies:
                            - id_token
                          issuer: https://gitlab.com
                          remote_jwks:
                            http_uri:
                              uri: https://gitlab.com/oauth/discovery/keys
                      rules:
                        - match:
                            prefix: /
                          requires:
                            requires_any:
                              requirements:
                                - provider_name: gitlab_provider
                                - allow_missing: {}
                  - name: envoy.filters.http.router
                  # ...
```
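
An upstream service that needs more than the `sub` claim can read the
forwarded payload header instead. The sketch below decodes such a header value
in Go. The `claims` type and the `decodePayloadHeader` name are hypothetical,
and it assumes the service is only reachable through Envoy, so the header can
be trusted without re-verifying a signature.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
)

// claims holds the subset of ID token claims this sketch reads.
type claims struct {
	Subject string `json:"sub"`
	Issuer  string `json:"iss"`
}

// decodePayloadHeader decodes the value of the x-jwt-payload header, which
// is the base64url-encoded JSON payload of the token Envoy validated.
func decodePayloadHeader(value string) (claims, error) {
	var c claims
	// Strip any padding so that padded and unpadded values both decode.
	raw, err := base64.RawURLEncoding.DecodeString(strings.TrimRight(value, "="))
	if err != nil {
		return c, fmt.Errorf("decode payload: %w", err)
	}
	if err := json.Unmarshal(raw, &c); err != nil {
		return c, fmt.Errorf("parse payload: %w", err)
	}
	return c, nil
}
```

For example, the header value `eyJzdWIiOiI0MiJ9` is the encoding of
`{"sub":"42"}`, so decoding it produces a `claims` value whose `Subject` is
`42`.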

The `envoy.filters.http.ext_authz` filter can be used to forward the incoming
HTTP request to an external policy decision point (PDP) that makes the
authorization decision. For Sparkle the PDP is hosted as a sidecar process
called `authzd` that makes the authorization decision based on the contents of
the HTTP request.

```yaml
                  # ...
                  - name: envoy.filters.http.oauth2
                  # ...
                  - name: envoy.filters.http.jwt_authn
                  # ...
                  - name: envoy.filters.http.ext_authz
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthz
                      grpc_service:
                        envoy_grpc:
                          cluster_name: authzd
                      failure_mode_allow: false
                  - name: envoy.filters.http.router
                  # ...
```

The external authorization service must implement the gRPC `Authorization`
service, whose `Check` method receives a
[`CheckRequest`](https://github.com/envoyproxy/envoy/blob/04378898516847d1107c5b15c22ac602ff06372c/api/envoy/service/auth/v3/external_auth.proto#L35)
and returns a `CheckResponse`. An example of this can be found in the Sparkle
repo. Below is an example snippet:

```golang
package authz

import (
	"context"

	core "github.com/envoyproxy/go-control-plane/envoy/config/core/v3"
	auth "github.com/envoyproxy/go-control-plane/envoy/service/auth/v3"
	types "github.com/envoyproxy/go-control-plane/envoy/type/v3"
	status "google.golang.org/genproto/googleapis/rpc/status"
	"google.golang.org/grpc/codes"
)

// CheckService implements Envoy's external authorization gRPC service.
type CheckService struct {
	auth.UnimplementedAuthorizationServer
}

// Check is the RPC that the ext_authz filter calls to authorize a request.
func (svc *CheckService) Check(ctx context.Context, request *auth.CheckRequest) (*auth.CheckResponse, error) {
	if svc.isAllowed(ctx, request) {
		return svc.OK(ctx), nil
	}
	return svc.Denied(ctx), nil
}

// OK builds the response that tells Envoy to forward the request upstream.
func (svc *CheckService) OK(ctx context.Context) *auth.CheckResponse {
	return &auth.CheckResponse{
		Status: &status.Status{
			Code: int32(codes.OK),
		},
		HttpResponse: &auth.CheckResponse_OkResponse{
			OkResponse: &auth.OkHttpResponse{
				Headers:              []*core.HeaderValueOption{},
				HeadersToRemove:      []string{},
				ResponseHeadersToAdd: []*core.HeaderValueOption{},
			},
		},
	}
}

// Denied builds the response that tells Envoy to reject the request with an
// HTTP 401.
func (svc *CheckService) Denied(ctx context.Context) *auth.CheckResponse {
	return &auth.CheckResponse{
		Status: &status.Status{
			Code: int32(codes.PermissionDenied),
		},
		HttpResponse: &auth.CheckResponse_DeniedResponse{
			DeniedResponse: &auth.DeniedHttpResponse{
				Status: &types.HttpStatus{
					Code: types.StatusCode_Unauthorized,
				},
				Headers: []*core.HeaderValueOption{},
			},
		},
	}
}

// isAllowed and any remaining helpers are not shown here.
// ...
```
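
The decision itself can be an ordinary function over the request attributes
that Envoy forwards (method, path and headers). The sketch below is a purely
hypothetical rule set, not Sparkle's actual policy: the `checkInput` type, the
`allow` function and the `/public/` prefix are invented for illustration. It
also assumes that `x-jwt-claim-sub` can only be set by the `jwt_authn` filter.

```go
package main

import "strings"

// checkInput is a simplified, hypothetical view of the attributes the
// ext_authz filter sends: the HTTP method, the path and the request headers
// keyed by lower-cased name.
type checkInput struct {
	Method  string
	Path    string
	Headers map[string]string
}

// allow applies two illustrative rules: anyone may GET the landing page and
// public assets, and everything else requires an authenticated subject.
func allow(in checkInput) bool {
	if in.Method == "GET" && (in.Path == "/" || strings.HasPrefix(in.Path, "/public/")) {
		return true
	}
	// Assumption: this header is only present when jwt_authn validated a token.
	return in.Headers["x-jwt-claim-sub"] != ""
}
```

Keeping the rules in a plain function like this makes them easy to unit test
without starting Envoy or a gRPC server.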

## Distribution

To deploy Sparkle I bundled envoy, sparkled and authzd inside a single
docker image. This docker image uses dumb-init to run these three services
simultaneously so that they can coordinate with one another to form a logical
service. Sparkle is currently distributed via Runway, and all secrets and
configuration are handled through environment variables that are exported into
the docker container when it is booted up by Runway and OpenBao.

Below is the Dockerfile that is used to build and distribute the Sparkle docker
image. It uses a temporary stage to build the `sparkled` and `authzd` binaries
and then copies the compiled artifacts into the envoy base image. The final
image bundles dumb-init, sparkled, authzd and envoy.

```Dockerfile
# syntax=docker/dockerfile:1
FROM golang:1.24.3 AS build
ENV CGO_ENABLED=0
WORKDIR /app
COPY . ./
RUN go build -o /bin/sparkled ./cmd/sparkled/main.go
RUN go build -o /bin/authzd ./cmd/authzd/main.go

FROM envoyproxy/envoy:v1.34-latest
EXPOSE 8080 9901 10000 10003
RUN apt-get update && apt-get install -y dumb-init && rm -rf /var/lib/apt/lists/*
WORKDIR /opt/sparkle/
RUN mkdir -p bin etc public
COPY --from=build /bin/authzd bin/authzd
COPY --from=build /bin/sparkled bin/sparkled
COPY --from=build /app/public public
COPY etc/ etc
COPY bin/*.sh bin/
RUN chmod +x bin/*.sh
ENTRYPOINT ["/usr/bin/dumb-init", "--"]
CMD ["/bin/minit"]
```

The entrypoint uses dumb-init as PID 1 to forward signals to child
processes. minit searches for a Procfile and starts a process for each row in
the file.

```Procfile
envoy: ./bin/envoy-shim
authzd: ./bin/authzd
sparkled: ./bin/sparkled
```
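
To illustrate the "one process per row" idea, here is a minimal sketch of
parsing a Procfile into name and command pairs. This is not minit's actual
implementation. The `procEntry` type and `parseProcfile` function are
hypothetical, and skipping blank lines and `#` comments is an assumption about
the file format.

```go
package main

import (
	"bufio"
	"strings"
)

// procEntry is one row of a Procfile: a process name and the command to run.
type procEntry struct {
	Name    string
	Command string
}

// parseProcfile splits each "name: command" row into a procEntry. Blank
// lines, comment lines and rows without a colon are skipped.
func parseProcfile(content string) []procEntry {
	var entries []procEntry
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// Split on the first colon only, so commands may contain colons.
		name, command, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		entries = append(entries, procEntry{
			Name:    strings.TrimSpace(name),
			Command: strings.TrimSpace(command),
		})
	}
	return entries
}
```

A supervisor would then start one child process per entry, for example with
`os/exec`, and forward signals to each of them.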

## Summary

Envoy provides a lot of features out of the box, making it possible for
application developers to focus on their core domain. Complex and error-prone
duties, such as interacting with an OIDC Provider and managing key material
like an OAuth Client Secret, can be offloaded to Envoy and become a non-event
for the application. By moving these responsibilities into Envoy we reduce the
opportunity for tokens to get leaked and we ensure that we adhere to open
standards while also creating safe extension points for authorization
decisions. Envoy's ability to modify incoming and outgoing requests before
delivery makes it possible to remove sensitive headers and/or convert them to a
canonical representation in a single consistent way. Envoy can map
Authorization headers, session cookies, and query string parameters onto a
single consistent interface, which reduces the need for each application to
handle every authentication/authorization strategy that GitLab as a whole
supports.

## References

* [Envoy Proxy](https://www.envoyproxy.io/)
* [OpenID Core Specification](https://openid.net/specs/openid-connect-core-1_0-final.html#IDToken)
* [RFC-7519: JSON Web Token (JWT)](https://datatracker.ietf.org/doc/html/rfc7519)
* [`envoy.filters.http.oauth2`](https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/oauth2_filter.html)
* [`envoy.filters.http.jwt_authn`](https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/http/jwt_authn/v3/config.proto)
* [`envoy.filters.http.ext_authz`](https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/http/ext_authz/v3/ext_authz.proto)
* [`envoy.filters.http.router`](https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/http/router/v3/router.proto)
* [Sparkle](https://gitlab.com/gitlab-org/software-supply-chain-security/authorization/sparkled)
  * https://gitlab.com/gitlab-org/software-supply-chain-security/authorization/sparkled/-/merge_requests/3
  * https://gitlab.com/gitlab-org/software-supply-chain-security/authorization/sparkled/-/merge_requests/4
  * https://gitlab.com/gitlab-org/software-supply-chain-security/authorization/sparkled/-/merge_requests/6
  * https://gitlab.com/gitlab-org/software-supply-chain-security/authorization/sparkled/-/merge_requests/7
  * https://gitlab.com/gitlab-org/software-supply-chain-security/authorization/sparkled/-/merge_requests/8
  * https://gitlab.com/gitlab-org/software-supply-chain-security/authorization/sparkled/-/merge_requests/9