Achieving Zero Trust Security on Amazon EKS with Istio

September 3, 2024 By Mark Otto

This is the fourth blog post of our “Istio on EKS” series. In this blog post, we’ll explore how Istio, a powerful service mesh, enables organizations to implement a zero trust security model on Amazon Elastic Kubernetes Service (Amazon EKS). We will start by understanding how Istio implements peer authentication between microservices using mutual Transport Layer Security (mTLS). Next, we will learn how request authentication works in Istio, and how we can use AWS Certificate Manager (ACM) for Istio’s ingress gateway certificates. Finally, we’ll dive deep into leveraging Open Policy Agent (OPA) for external authorization.

In our first blog, Getting started with Istio on EKS, we explained how to set up Istio on Amazon EKS. We covered core aspects such as Istio Gateway, Istio VirtualService, and observability with open source Kiali and Grafana. In the second blog, Using Istio Traffic Management on Amazon EKS to Enhance User Experience, we explained traffic management strategies that enable sophisticated testing and deployment, reduce downtime, and improve the user experience for communication among microservices. And in our third blog, Enhancing Network Resilience with Istio on Amazon EKS, we explored Istio’s network resilience capabilities and demonstrated how to set up and configure these features on Amazon EKS. Now we are continuing this journey by exploring the security features of Istio with a focus on achieving zero trust security.

The zero trust security model operates on the principle of “never trust, always verify.” It advocates for decoupling security policies from application code, enabling a more consistent and centralized approach to security enforcement. By using a service mesh like Istio, organizations can externalize security configurations from their microservices. Istio acts as a dedicated security proxy, handling tasks such as mTLS, access control, and traffic monitoring. This separation of concerns allows platform teams to own and maintain security policies centrally, while developers can focus solely on writing business logic without embedding security rules within their application code.

In Istio, zero trust security starts with mTLS authentication between services. By default, Istio automatically uses mTLS for service-to-service communication between workloads in the mesh, while still accepting plaintext traffic (permissive mode). Additionally, Istio provides fine-grained access control policies and end-user authentication mechanisms, enabling organizations to enforce zero trust principles at different layers of the service mesh.

Implementing zero trust security

In this blog post, we will explore Istio features such as request authentication, ingress gateway security, peer authentication, and extensible external authorization. We will also cover integrating additional tools like Keycloak, ACM, and OPA to complete the portfolio of tools required for a comprehensive zero trust security implementation on Amazon EKS.

  • Peer authentication: By enabling peer authentication in Istio, you ensure that all communication between services in the mesh is encrypted and mutually authenticated. This provides a strong security posture against unauthorized access and potential attacks, such as man-in-the-middle attacks. In this blog, we will learn how to enable and configure mTLS in Istio.
  • Request authentication: In Istio, request authentication refers to the process of verifying the identity and credentials of a client or service that is attempting to access a resource within the service mesh. Istio enables request-level authentication by validating JSON Web Tokens (JWT) either with a custom authentication provider or an OpenID Connect (OIDC) provider. In this blog, we will use the open source Keycloak OIDC provider.
  • Ingress gateway certificate management: The ingress gateway handles external requests, and in this module we will use the HTTPS protocol for secure communications. We will learn to simplify certificate management by leveraging ACM and AWS Private Certificate Authority (CA).
  • OPA external authorization: The OPA external authorization feature enables Envoy (Istio sidecar proxy) to check with an external OPA server for authorization decisions before allowing requests to reach the target service. In this module, we will use OPA as an external authorizer for Envoy that can act as Istio’s external authorization policy evaluation engine.

Deployment Architecture

We’ll revisit the same microservices-based product catalog application that we used in our first blog, Getting Started with Istio on EKS, to implement Istio’s zero trust security model. The application is composed of three types of microservices: Frontend, Product Catalog, and Catalog Detail as shown in the Istio Data Plane in this diagram.

Fig: Deployment Architecture

Code sample

We have created a Security module code sample in our GitHub repository that demonstrates how to set up and deploy the Istio based security solutions described in this post. The code sample is intended for demonstration purposes only and should not be used in production environments.

Prerequisites and Initial Setup

Before proceeding with the setup, ensure that the prerequisites are met. Upon completing the setup, you will have an Amazon EKS cluster with Istio and the sample application configured. Please refer to the Prerequisites and Setup sections of the code sample for detailed instructions.

Zero Trust Security Use cases

Peer Authentication

In a zero trust security environment, we have to assume that bad actors are able to access services and networks inside of our security perimeter. Traffic between microservices that is sent in plain text is vulnerable to an eavesdropping attack. We can use Istio’s Peer Authentication API to define how traffic will be tunneled (or not) between microservice sidecars (Istio Proxies). Istio’s control plane (Istiod) maintains a CA and generates certificates to enable secure mTLS communication in the data plane. Additionally, Istiod automates key and certificate rotation at scale.

To begin, it is helpful to understand what mTLS is. As described in this diagram, mTLS is a security mechanism that provides secure two-way peer authentication with certificates between services, including clients and servers.

Fig: image source from AWS blog
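To make the "mutual" part concrete, here is a minimal sketch using Python's standard ssl module. It is not how Istio is implemented (Envoy performs the handshake on behalf of the application); it only illustrates the one setting that turns ordinary TLS into mTLS. The function names are ours, and certificate loading is left out so the snippet runs without any files.

```python
import ssl


def make_server_context() -> ssl.SSLContext:
    """TLS context for a server that also demands a client certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # With plain TLS the server context stays at CERT_NONE: only the server
    # proves its identity. CERT_REQUIRED makes the handshake fail unless the
    # client presents a certificate signed by a CA the server trusts.
    ctx.verify_mode = ssl.CERT_REQUIRED
    # A real server would also call ctx.load_cert_chain(...) for its own
    # certificate and ctx.load_verify_locations(...) for the trusted CA.
    return ctx


def make_client_context() -> ssl.SSLContext:
    """TLS context for a client; it verifies the server by default."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # A real mTLS client would call ctx.load_cert_chain(...) here so that it
    # has a certificate to present when the server asks for one.
    return ctx


server_ctx = make_server_context()
client_ctx = make_client_context()
print("server verifies client:", server_ctx.verify_mode.name)
print("client verifies server:", client_ctx.verify_mode.name)
```

In the mesh, the sidecar proxies play both roles, so the application code never needs to touch these settings.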

Peer authentication with Istio is fairly straightforward, as shown in the next diagram. Services communicate via their Istio Envoy proxies over an encrypted mTLS channel:

Fig: Peer Authentication in Istio

By default, Istio is configured in Permissive mode, which means it will allow both plaintext and encrypted traffic. In a zero trust environment, we want to force all communications to be encrypted, and disallow plaintext. In addition, we should first understand how Istio handles certificates for mTLS, both with its built-in CA and with AWS Private CA. As you follow along in the GitHub peer-authentication Readme, you will learn:

  1. cert-manager Integration: Instead of using Istio’s built-in CA, here we will learn a simpler and more cost-effective pattern for short-lived certificate management. This module integrates Istio with AWS Private CA for issuing certificates required for mTLS between workloads. This integration is achieved through the cert-manager toolchain along with custom components like istio-csr and aws-privateca-issuer.
  2. Mutual TLS Enforcement Modes in Istio: We want to transition from the default permissive mTLS behavior in Istio to a strict mTLS mode where mutual TLS is required for all service-to-service communication within the mesh. When that is complete, we will then learn several methods to validate the mTLS setup.
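The switch from permissive to strict mode comes down to a single PeerAuthentication resource. The sketch below builds one as a Python dictionary and prints it as JSON, which kubectl accepts just like YAML. We assume Istio's root namespace is the default istio-system, where a policy named default applies mesh-wide; the helper name is ours, and the Readme may structure its manifests differently.

```python
import json

# mTLS modes defined by Istio's PeerAuthentication API.
MTLS_MODES = {"UNSET", "DISABLE", "PERMISSIVE", "STRICT"}


def peer_authentication(namespace: str, mode: str = "STRICT") -> dict:
    """Build a PeerAuthentication manifest for the given namespace."""
    if mode not in MTLS_MODES:
        raise ValueError(f"unknown mTLS mode: {mode}")
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "PeerAuthentication",
        # A policy named "default" in the root namespace applies to the
        # whole mesh; in any other namespace it covers that namespace only.
        "metadata": {"name": "default", "namespace": namespace},
        "spec": {"mtls": {"mode": mode}},
    }


# Assumption: the root namespace is the default "istio-system".
mesh_wide_strict = peer_authentication("istio-system")
print(json.dumps(mesh_wide_strict, indent=2))
```

Piping the output to kubectl apply -f - would apply the policy. Once it is in place, a plaintext request from a workload without a sidecar should be refused, which is one simple way to validate the setup.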

Request Authentication

Request authentication is the mechanism for verifying the identity of the user or service making a request. Istio enables request-level authentication with JSON Web Tokens (JWTs) through its RequestAuthentication policy and the Envoy sidecar proxy. The policy names JWT as the authentication mechanism and supplies the details needed to validate tokens, which can be issued by a custom authentication provider or by any compliant OpenID Connect (OIDC) identity provider such as Keycloak. The proxy intercepts each request and validates any JWT it carries against the policy. Requests with a valid JWT pass through to the destination application services, and requests with an invalid JWT are rejected automatically. By default, requests without a JWT are also allowed to pass through. To block them, you need to combine the request authentication policy with an authorization policy that requires authenticated claims.
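The three outcomes can be summarized in a few lines. This is only a model of the behavior described above, not Istio code: the function and parameter names are ours, and 200 stands in for "the request is forwarded to the application".

```python
from typing import Optional


def gateway_decision(token: Optional[str], token_is_valid: bool,
                     jwt_required: bool) -> int:
    """Model the HTTP status a caller sees at the proxy.

    token          -- the bearer token sent with the request, or None
    token_is_valid -- whether the token passes the JWT rules
    jwt_required   -- whether an authorization policy demands an
                      authenticated request principal
    """
    if token is not None and not token_is_valid:
        # Request authentication rejects a token that fails validation.
        return 401
    if token is None and jwt_required:
        # Request authentication alone lets this through; the authorization
        # policy is what denies a request with no authenticated principal.
        return 403
    return 200  # forwarded to the application


print(gateway_decision("valid-token", True, False))   # valid JWT
print(gateway_decision("bad-token", False, False))    # invalid JWT
print(gateway_decision(None, False, False))           # missing JWT, default
print(gateway_decision(None, False, True))            # missing JWT, policy added
```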

Fig: Request Authentication with valid JWT

This diagram shows the following flow when a user is attempting to access the frontend service:

  1. A user requests an access token (JWT) from Keycloak.
  2. Keycloak authenticates the user and returns a JWT.
  3. The user then sends a request to the frontend service via the Ingress Gateway along with the valid token.
  4. Istio validates the JWT to ensure that the request is allowed.
  5. As the request had a valid JWT, it is allowed to access the frontend service. This is the “good path” that we expect when an incoming request is valid. Other negative scenarios are covered in the GitHub Readme.

As you follow along in the GitHub request-authentication Readme, you will learn:

  1. Enable Request Authentication: By applying RequestAuthentication policy, the ingress gateway is configured to validate incoming requests based on the provided JWT rules.
  2. Generate access tokens: Demonstrates how to use the provided helper script to generate access tokens (JWTs) from the Keycloak OIDC provider for a specific user.
  3. Validation: Demonstrates different scenarios:
    a. Requests with valid JWTs are allowed
    b. Requests with invalid JWTs are rejected
    c. Requests with missing JWTs are allowed (by default)
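As a rough sketch of the resources involved, the snippet below builds a RequestAuthentication for the ingress gateway, plus the optional AuthorizationPolicy that closes the "missing JWT" gap from scenario (c). Several values are assumptions for illustration only: the Keycloak host and realm, the istio-ingress namespace, the istio: ingressgateway label, and the resource names. The JWKS path is the one Keycloak publishes under each realm.

```python
import json

# Hypothetical issuer; replace with your own Keycloak host and realm.
KEYCLOAK_ISSUER = "https://keycloak.example.com/realms/workshop"
# Assumed location and label of the ingress gateway pods.
GATEWAY_NAMESPACE = "istio-ingress"
GATEWAY_SELECTOR = {"matchLabels": {"istio": "ingressgateway"}}


def request_authentication(issuer: str) -> dict:
    """Tell the gateway how to validate JWTs from the given issuer."""
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "RequestAuthentication",
        "metadata": {"name": "ingress-jwt", "namespace": GATEWAY_NAMESPACE},
        "spec": {
            "selector": GATEWAY_SELECTOR,
            "jwtRules": [{
                "issuer": issuer,
                # Keycloak serves the realm's signing keys at this path.
                "jwksUri": f"{issuer}/protocol/openid-connect/certs",
            }],
        },
    }


def require_jwt() -> dict:
    """Allow only requests that carry an authenticated request principal."""
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "AuthorizationPolicy",
        "metadata": {"name": "require-jwt", "namespace": GATEWAY_NAMESPACE},
        "spec": {
            "selector": GATEWAY_SELECTOR,
            "action": "ALLOW",
            # "*" matches any principal derived from a validated JWT, so a
            # request with no token matches no rule and is denied.
            "rules": [{"from": [{"source": {"requestPrincipals": ["*"]}}]}],
        },
    }


manifest_list = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [request_authentication(KEYCLOAK_ISSUER), require_jwt()],
}
print(json.dumps(manifest_list, indent=2))
```

Applying only the first resource reproduces the default behavior listed above. Adding the second one turns scenario (c) into a denial.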

Ingress Gateway Certificate Management

Ingress Gateway is a load balancer operating at the edge of the mesh that receives incoming HTTP/TCP connections. With Istio, we can configure gateways to use HTTPS, but we still need to manage certificates on the load balancer. While the rest of this section is not required for zero trust security, certificate management can add complexity and administrative overhead. This is an optional recommendation for simplifying your management.

In this example, we are using an AWS Application Load Balancer (ALB), with AWS Private CA creating the certificates and ACM managing them, which reduces this burden. ACM makes it very simple to use certificates with ALBs: simply annotate the istio-ingress service with the Amazon Resource Name (ARN) of the certificate, and you can then leverage ACM features such as certificate management and renewal.

In this diagram, requests come in to the Istio Ingress Gateway on port 443 (HTTPS). The Istio Ingress Gateway uses an ALB for load balancing ingress traffic between pods. The certificates are managed by ACM for simplified management.

Fig: Ingress Gateway Certificate Management

As you follow along in the GitHub ingress-security Readme, you will learn:

  1. Certificate Management setup: Here a self-signed certificate is used for the Ingress Gateway load balancer to avoid creating a new Private CA resource. The self-signed certificate is generated, imported into ACM, and associated with the Ingress Gateway load balancer’s HTTPS listener using service annotations.
  2. Verify with load test: Here we verify the Ingress Gateway configuration by testing HTTPS communication through the load balancer, simulating load with the siege tool, and visualizing the mTLS-encrypted traffic flow in the Kiali observability tool.
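The annotation step from the first item can be sketched as a small merge patch. The two annotation keys below are ones that AWS load balancer integrations read from a Kubernetes Service of type LoadBalancer. Whether the code sample sets exactly these keys is an assumption on our part, and the certificate ARN is a placeholder, so treat this as an illustration and check the Readme for the authoritative values.

```python
import json

# Placeholder ARN; use the ARN of the certificate you imported into ACM.
EXAMPLE_CERT_ARN = "arn:aws:acm:us-west-2:111122223333:certificate/EXAMPLE"


def acm_certificate_patch(certificate_arn: str, https_port: int = 443) -> dict:
    """Build a merge patch that attaches an ACM certificate to a Service."""
    return {
        "metadata": {
            "annotations": {
                # Which ACM certificate the load balancer listener presents.
                "service.beta.kubernetes.io/aws-load-balancer-ssl-cert":
                    certificate_arn,
                # Which Service port(s) should terminate TLS with it.
                # Annotation values must be strings.
                "service.beta.kubernetes.io/aws-load-balancer-ssl-ports":
                    str(https_port),
            }
        }
    }


patch = acm_certificate_patch(EXAMPLE_CERT_ARN)
print(json.dumps(patch))
```

The printed JSON could be passed to kubectl patch service with --type merge against the ingress gateway Service. The Service name and namespace depend on how Istio was installed.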

OPA External Authorization

In a zero trust security environment, the principle is to trust no one and nothing by default. Every request or communication must be authorized and validated before granting access, even within private networks. OPA acts as an external authorization policy evaluation engine, enabling fine-grained and flexible access control policies across the Istio service mesh. Istio’s proxies (based on Envoy) use the External Authorization filter architecture to delegate authorization decisions to an external service. This allows application teams to integrate with external policy services, such as OPA using the OPA Envoy external authorizer plugin, and extend the authorization semantics beyond Istio’s native AuthorizationPolicy capabilities.

Fig: OPA External Authorization

The diagram describes the overall flow:

  1. Istio intercepts incoming requests to the application.
  2. Istio forwards the request to the OPA sidecar for authorization based on the Authorization Policy.
  3. OPA evaluates the request against the defined policy.rego rules.
  4. OPA returns an allow/deny decision to Istio.
  5. Istio enforces the decision by either allowing the request to reach the application (if allowed) or blocking the request (if denied).

As you follow along in the GitHub OPA-external-authorization Readme, you will learn:

  1. Deploying OPA: The OPA server is deployed as a sidecar container alongside the application containers using Gatekeeper’s mutation feature. The OPA sidecar listens on port 9191 for authorization requests.
  2. Configuring Istio for External Authorization: A ServiceEntry is created to allow Istio proxies to resolve the OPA sidecar endpoint. The OPA extension is registered with Istio by updating the Istio ConfigMap in the istio-system namespace.
  3. Defining Authorization Policies: An AuthorizationPolicy with the CUSTOM action is applied to the application namespace workshop, which forwards authorization decisions to the OPA sidecar. The OPA policy rules are defined in the policy.rego file, specifying access controls based on user roles and HTTP request attributes.
  4. Policy rules: The policy.rego file contains the OPA policy rules that define the authorization logic. In this example, the policy allows requests to the /hello endpoint but denies all other requests.
  5. Testing Authorization: Various curl requests are sent with different user roles and HTTP methods/paths to validate the authorization behavior. The OPA decision logs are inspected to confirm the allow/deny decisions based on the policy rules.
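The three pieces of Istio configuration from steps 2 and 3 fit together as sketched below. Port 9191 and the workshop namespace come from the steps above. The provider name, the .local host name, and the resource names are hypothetical and only need to be consistent with each other. The loopback address reflects the assumption that OPA runs as a sidecar in the same pod.

```python
import json

OPA_PORT = 9191                         # port the OPA sidecar listens on
PROVIDER_NAME = "opa-ext-authz-grpc"    # hypothetical provider name
OPA_HOST = "opa-ext-authz-grpc.local"   # hypothetical mesh-internal host
APP_NAMESPACE = "workshop"

# Fragment to merge into the mesh config held in the Istio ConfigMap.
# It registers OPA as an external authorizer under PROVIDER_NAME.
mesh_config_fragment = {
    "extensionProviders": [{
        "name": PROVIDER_NAME,
        "envoyExtAuthzGrpc": {"service": OPA_HOST, "port": OPA_PORT},
    }]
}

# Lets each proxy resolve OPA_HOST to the OPA container in its own pod.
service_entry = {
    "apiVersion": "networking.istio.io/v1beta1",
    "kind": "ServiceEntry",
    "metadata": {"name": "opa-ext-authz", "namespace": APP_NAMESPACE},
    "spec": {
        "hosts": [OPA_HOST],
        "endpoints": [{"address": "127.0.0.1"}],
        "ports": [{"name": "grpc", "number": OPA_PORT, "protocol": "GRPC"}],
        "resolution": "STATIC",
    },
}

# Hands the allow/deny decision for matching requests to the provider.
authorization_policy = {
    "apiVersion": "security.istio.io/v1beta1",
    "kind": "AuthorizationPolicy",
    "metadata": {"name": "ext-authz-opa", "namespace": APP_NAMESPACE},
    "spec": {
        "action": "CUSTOM",
        "provider": {"name": PROVIDER_NAME},
        # An empty rule matches every request to workloads in the namespace.
        "rules": [{}],
    },
}

print(json.dumps(mesh_config_fragment, indent=2))
print(json.dumps({"apiVersion": "v1", "kind": "List",
                  "items": [service_entry, authorization_policy]}, indent=2))
```

The provider name is the link between the pieces: the AuthorizationPolicy refers to it, the mesh config maps it to a host and port, and the ServiceEntry makes that host resolvable.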

Clean up

To clean up the Amazon EKS environment and remove the services we deployed, please run the following command:

terraform destroy -auto-approve

Conclusion

In this post, we covered Istio’s security mechanisms, which allow us to implement a true zero trust security architecture on Amazon EKS. In addition to the built-in security features for Request Authentication, Peer Authentication, and Ingress security, we learned to leverage add-on tools such as Keycloak and OPA. These features enable comprehensive hardening and protection of microservices in the service mesh. Overall, Istio provides developers with powerful tools to build security and robustness into microservices architectures.