Appendix F. Open Policy Agent

In a typical microservices deployment, we can enforce access-control policies at either of the following two locations, or at both:

  • The edge of the deployment--Typically, with an API gateway (which we discuss in chapter 5)

  • The edge of the service--Typically, with a service mesh or with a set of embedded libraries (which we discuss in chapter 7 and chapter 12)

Authorization at the service level enables each service to enforce access-control policies in the way it wants. Typically, you apply coarse-grained access-control policies at the API gateway (at the edge) and more fine-grained access-control policies at the service level. Also, it’s common to do data-level entitlements at the service level. For example, at the edge of the deployment, we can check whether a given user is eligible to perform an HTTP GET on the Order Processing microservice. But data entitlement checks, such as allowing only an order admin to view orders with a transaction amount greater than $10,000, are enforced at the service level.
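To make the data-level entitlement idea concrete, the following is a minimal sketch of such a rule written in Rego, the policy language of Open Policy Agent, which we cover in detail in section F.6. The field names (role, order.amount) and the threshold are hypothetical; they simply mirror the example above.

package authz.orders.entitlement

default allow = false

# Only an order admin can view orders with a transaction amount
# greater than $10,000 (hypothetical input fields).
allow {
  input.method = "GET"
  input.role = "order_admin"
  input.order.amount > 10000
}

# Anyone can view orders at or below the $10,000 threshold.
allow {
  input.method = "GET"
  input.order.amount <= 10000
}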

In this appendix, we discuss key components of an access-control system, access-control patterns, and how to define and enforce access-control policies by using Open Policy Agent (OPA). OPA (www.openpolicyagent.org) is an open source, lightweight, general-purpose policy engine with no dependency on microservices. You can use OPA to define fine-grained access-control policies and enforce those policies at different locations across your infrastructure as well as within a microservices deployment. We discussed OPA briefly in chapter 5. In this appendix, we delve deep into the details. We also assume that you’ve already gone through chapters 5, 7, 10, 11, and 12, and have a good understanding of containers, Kubernetes, Istio, and JWT.

F.1 Key components in an access-control system

In a typical access-control system, we find five key components (figure F.1): the policy administration point (PAP), policy enforcement point (PEP), policy decision point (PDP), policy information point (PIP), and policy store. The PAP is the component that lets policy administrators and developers define access-control policies.

 


Figure F.1 Components of a typical access-control system. The PAP defines access-control policies and then stores those in the policy store. At runtime, the PEP intercepts all the requests, builds an authorization request, and talks to the PDP. The PDP loads the policies from the policy store and any other missing information from the PIP, evaluates the policies, and passes the decision back to the PEP.

Most of the time, PAP implementations come with their own user interface or expose the functionality via an API. Some access-control systems don’t have a specific PAP; rather, they read policies directly from the filesystem, so you need to use third-party tools to author these policies. Once you define the policies via a PAP, the PAP writes the policies to a policy store. The policy store can be a database, a filesystem, or even a service that’s exposed via HTTP.

The PEP sits between the service or API that you want to protect and the client application. At runtime, the PEP intercepts all the communications between the client application and the service. As we discussed in chapter 3, the PEP can be an API gateway, or, as we discussed in chapters 7 and 8, it can be some kind of interceptor embedded in your application itself. And in chapter 12, we discussed how, in a service mesh deployment, a proxy can act as a PEP that intercepts all the requests coming to your microservice.

When the PEP intercepts a request, it extracts certain parameters from the request--such as the user, resource, and action--and creates an authorization request. Then it talks to the PDP to check whether the request is authorized. If it’s authorized, the PEP dispatches the request to the corresponding service or API; otherwise, it returns an error to the client application. Before the request hits the PEP, we assume it’s properly authenticated.

When the PEP talks to the PDP to check authorization, the PDP loads all the corresponding policies from the policy store. And while evaluating an authorization request against the applicable policies, if there is any required but missing information, the PDP will talk to a PIP. For example, let’s say we have an access-control policy that says you can buy a beer only if your age is greater than 21, but the authorization request carries only your name as the subject, buy as the action, and beer as the resource. The age is the missing information here, and the PDP will talk to a PIP to find the corresponding subject’s age. We can connect multiple PIPs to a PDP, and each PIP can connect to different data sources.
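To make this concrete, here’s a minimal sketch of such a policy written in OPA’s Rego language (covered later in this appendix). The data.users document stands in for the data a PIP would supply; its structure is hypothetical.

package authz.purchase

default allow = false

# You can buy a beer only if your age is greater than 21.
# data.users is a hypothetical external data document (playing the PIP's
# role) that maps a subject's name to the attributes the policy needs.
allow {
  input.action = "buy"
  input.resource = "beer"
  data.users[input.subject].age > 21
}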

F.2 What is an Open Policy Agent?

As we discussed in the introduction to this appendix, OPA is an open source, lightweight, general-purpose policy engine that has no dependency on microservices. You can use OPA to define fine-grained access-control policies and enforce those policies at different locations throughout your infrastructure as well as within a microservices deployment. To define access-control policies, OPA introduces a new declarative language called Rego (www.openpolicyagent.org/docs/latest/policy-language).

OPA started as an open source project in 2016, with a goal to unify policy enforcement across multiple heterogeneous technology stacks. Netflix, one of the early adopters of OPA, uses it to enforce access-control policies in its microservices deployment. Apart from Netflix, Cloudflare, Pinterest, Intuit, Capital One, State Street, and many more use OPA. At the time of this writing, OPA is an incubating project under the Cloud Native Computing Foundation (CNCF).

F.3 OPA high-level architecture

In this section, we discuss OPA’s high-level architecture. As you can see in figure F.2, the OPA engine can run on its own as a standalone deployment or as an embedded library within an application.


Figure F.2 An application or a PEP can integrate with the OPA policy engine via its HTTP REST API or via the Go API.

When you run the OPA server as a standalone deployment, it exposes a set of REST APIs that PEPs can connect to and check authorization. In figure F.2, the OPA engine acts as the PDP.

The open source distribution of the OPA server doesn’t come with a policy authoring tool or a user interface to create and publish policies to the OPA server. But you can use a tool like Visual Studio (VS) Code to create OPA policies; OPA has a plugin for VS Code. If you decide to embed OPA as a library in your application (instead of running it as a standalone server), you can use the Go API provided by OPA to interact with it.

Once you have the policies, you can use the OPA API to publish them to the OPA server. When you publish those policies via the API, the OPA engine keeps them in memory only. You’ll need to build a mechanism to publish policies every time the server boots up. The other option is to copy the policy files to the filesystem behind OPA, and the OPA server will pick them up when it boots up. If any policy changes occur, you’ll need to restart the OPA server. However, there is an option to ask the OPA server to load policies dynamically from the filesystem, but that’s not recommended in a production deployment. Also, you can load policies to the OPA server by using a bundle server; we discuss that in detail in section F.7.
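For example, assuming an OPA server is listening on http://localhost:8181 (as in section F.4), you could publish a policy file over the v1 policies API with a command roughly like the following; the policy ID (policy1) and the file path are illustrative:

> curl -X PUT -H "Content-Type: text/plain" 
--data-binary @policies/policy_1.rego 
http://localhost:8181/v1/policies/policy1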

OPA follows a PIP-like design to bring external data into the PDP (the OPA engine). This model is quite similar to the model we discussed in the previous paragraph with respect to policies. In section F.7, we detail how OPA brings in external data.

F.4 Deploying OPA as a Docker container

In this section, we discuss how to deploy an OPA server as a Docker container. In OPA, there are multiple ways of loading policies. Importantly, OPA stores those policies in memory (there is no persistence), so on a restart or redeployment, OPA needs a way to reload the policies. For example, when we use OPA for Kubernetes admission control, policies are persisted in the Kubernetes API server, and OPA has its own sidecar that loads policies via OPA’s REST API. That’s roughly the approach we followed in section 5.3. When using OPA in a microservices deployment, the most common approaches are to either configure OPA to download policies via the bundle API (for example, using AWS S3 as the bundle server) or use volume/bind mounts to mount policies into the container running OPA.

With bind mounts, we keep all the policies in a directory in the host filesystem and then mount it to the OPA Docker container filesystem. If you look at the appendix-f/sample01/run_opa.sh file, you’ll find the following Docker command (don’t run it on its own; the script that follows runs it for you). Here, we mount the policies directory from the current location of the host filesystem to the policies directory under the root of the container filesystem:

> docker run --mount type=bind,source="$(pwd)"/policies,target=/policies 
-p 8181:8181 openpolicyagent/opa:0.15.0 run /policies --server

To start the OPA server, run the following command from the appendix-f/sample01 directory. This loads the OPA policies from the appendix-f/sample01/policies directory (in section F.6, we discuss OPA policies in detail):

> sh run_opa.sh

{
  "addrs":[
    ":8181"
  ],
  "insecure_addr":"",
  "level":"info",
  "msg":"Initializing server.",
  "time":"2019-11-05T07:19:34Z"
}

You can run the following command from the appendix-f/sample01 directory to test the OPA server. The appendix-f/sample01/policy_1_input_1.json file carries the input data for the authorization request in JSON format (in section F.6, we discuss authorization requests in detail):

> curl -v -X POST --data-binary @policy_1_input_1.json 
http://localhost:8181/v1/data/authz/orders/policy1

{"result":{"allow":true}}

The process of deploying OPA in Kubernetes is similar to deploying any other service on Kubernetes, as we discuss in appendix J. You can check the OPA documentation available at http://mng.bz/MdDD for details.

F.5 Protecting an OPA server with mTLS

OPA was designed to run on the same server as the microservice that needs authorization decisions. As such, the first layer of defense for microservice-to-OPA communication is the fact that the communication is limited to localhost. OPA is a host-local cache of the relevant policies authored in the PAP and recorded in the policy store. To make a decision, OPA is often self-contained and can make the decision all on its own without reaching out to other servers.

This means that decisions are highly available and highly performant, for the simple reason that OPA shares a fate with the microservice that needs authorization decisions and requires no network hop for those decisions. Nevertheless, OPA recommends defense in depth and ensuring that communication between it and its microservice or other clients is secured via mTLS.

In this section, we discuss how to protect the OPA server with mTLS. This ensures that all communications between the OPA server and its client applications are encrypted. Also, only legitimate clients with the proper keys can talk to the OPA server. To protect the OPA server with mTLS, we need to accomplish the following tasks:

  • Generate a public/private key pair for the OPA server

  • Generate a public/private key pair for the OPA client

  • Generate a public/private key pair for the CA

  • Sign the public key of the OPA server with the CA’s private key to generate the OPA server’s public certificate

  • Sign the public key of the OPA client with the CA’s private key to generate the OPA client’s public certificate

To perform all these tasks, we can use the appendix-f/sample01/keys/gen-key.sh script with OpenSSL. Let’s run the following Docker command from the appendix-f/sample01/keys directory to spin up an OpenSSL Docker container. You’ll see that we mount the current location (which is appendix-f/sample01/keys) from the host filesystem to the /export directory on the container filesystem:

> docker run -it -v $(pwd):/export prabath/openssl
#

Once the container boots up successfully, you’ll find a command prompt where you can type OpenSSL commands. Let’s run the following command to execute the gen-key.sh file that runs a set of OpenSSL commands:

# sh /export/gen-key.sh

Once this command executes successfully, you’ll find the keys corresponding to the CA in the appendix-f/sample01/keys/ca directory, the keys corresponding to the OPA server in the appendix-f/sample01/keys/opa directory, and the keys corresponding to the OPA client in the appendix-f/sample01/keys/client directory. If you want to understand the exact OpenSSL commands we ran during key generation, check appendix G.
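As a rough sketch of the kind of OpenSSL commands involved (the actual gen-key.sh script may differ in key sizes, subject names, and validity periods), generating the CA and signing the OPA server’s certificate looks something like this:

# generate the CA's private key and a self-signed CA certificate
openssl genrsa -out ca/ca.key 4096
openssl req -new -x509 -key ca/ca.key -subj "/CN=ca" -days 365 -out ca/ca.cert

# generate the OPA server's private key and a certificate signing request (CSR)
openssl genrsa -out opa/opa.key 2048
openssl req -new -key opa/opa.key -subj "/CN=opa" -out opa/opa.csr

# sign the OPA server's CSR with the CA's private key to produce opa.cert
openssl x509 -req -in opa/opa.csr -CA ca/ca.cert -CAkey ca/ca.key -CAcreateserial -days 365 -out opa/opa.cert

The client key and certificate under the client directory are generated the same way, signed by the same CA.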

In case you’re already running the OPA server, stop it by pressing Ctrl-C on the corresponding command console. To start the OPA server with TLS support, use the following command from the appendix-f/sample01 directory:

> sh run_opa_tls.sh

{
  "addrs":[
    ":8181"
  ],
  "insecure_addr":"",
  "level":"info",
  "msg":"Initializing server.",
  "time":"2019-11-05T19:03:11Z"
}

You can run the following command from the appendix-f/sample01 directory to test the OPA server. The appendix-f/sample01/policy_1_input_1.json file carries the input data for the authorization request in JSON format. Here we use HTTPS to talk to the OPA server:

> curl -v -k -X POST --data-binary @policy_1_input_1.json 
https://localhost:8181/v1/data/authz/orders/policy1

{"result":{"allow":true}}

Let’s check what’s in the run_opa_tls.sh script, shown in the following listing. The code annotations in the listing explain what each argument means.

Listing F.1 Protecting an OPA server endpoint with TLS

> docker run 
       -v "$(pwd)"/policies:/policies               
       -v "$(pwd)"/keys:/keys                       
       -p 8181:8181                                 
       openpolicyagent/opa:0.15.0                   
       run /policies                                
       --tls-cert-file /keys/opa/opa.cert           
       --tls-private-key-file /keys/opa/opa.key     
       --server                                      

Instructs the OPA server to load policies from policies directory, which is mounted to the OPA container

The OPA server finds key/certificate for the TLS communication from the keys directory, which is mounted to the OPA container.

Port mapping maps the container port to the host port.

Name of the OPA Docker image

Runs the OPA server by loading policies and data from the policies directory, which is mounted to the OPA container

Certificate used for the TLS communication

Private key used for the TLS communication

Starts the OPA engine under the server mode

Now the communication between the OPA server and the OPA client (curl) is protected with TLS. But anyone who has access to the OPA server’s IP address can still access it over TLS. OPA supports two ways of authenticating clients to protect the endpoint: token authentication and mTLS.

With token-based authentication, the client has to pass an OAuth 2.0 token in the HTTP Authorization header as a bearer token, and you also need to write an authorization policy.1
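As a rough sketch of that option (the details are in the OPA security documentation referenced in the footnote), you’d start the server with the --authentication=token and --authorization=basic flags and deploy an authorization policy under the system.authz package; OPA exposes the bearer token presented by the client as input.identity. The token value here is only a placeholder:

package system.authz

default allow = false

# input.identity carries the bearer token the client sent in the
# Authorization header; allow only a known (placeholder) token.
allow {
  input.identity == "secret-token-value"
}

In this section, though, we focus on securing the OPA endpoint with mTLS.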

If you’re already running the OPA server, stop it by pressing Ctrl-C on the corresponding command console. To start the OPA server enabling mTLS, run the following command from the appendix-f/sample01 directory:

> sh run_opa_mtls.sh

Let’s check what’s in the run_opa_mtls.sh script, shown in the following listing. The code annotations explain what each argument means.

Listing F.2 Protecting an OPA server endpoint with mTLS

> docker run 
       -v "$(pwd)"/policies:/policies     
       -v "$(pwd)"/keys:/keys     
       -p 8181:8181     
       openpolicyagent/opa:0.15.0     
       run /policies     
       --tls-cert-file /keys/opa/opa.cert     
       --tls-private-key-file /keys/opa/opa.key  
       --tls-ca-cert-file /keys/ca/ca.cert        
       --authentication=tls                       
       --server    

The public certificate of the CA. All the OPA clients must carry a certificate signed by this CA.

Enables mTLS authentication

You can use the following command from the appendix-f/sample01 directory to test the OPA server, which is now secured with mTLS:

> curl -k -v --key keys/client/client.key 
--cert keys/client/client.cert -X POST 
--data-binary @policy_1_input_1.json 
https://localhost:8181/v1/data/authz/orders/policy1 

Here, we use HTTPS to talk to the OPA server, along with the certificate and the key generated for the OPA client at the start of this section. The key and the certificate of the OPA client are available in the appendix-f/sample01/keys/client directory.

F.6 OPA policies

To define access-control policies, OPA introduces a new declarative language called Rego.2 In this section, we go through a set of OPA policies (listing F.3) to understand the strength of the Rego language. All the policies we discuss here are available in the appendix-f/sample01/policies directory and are already loaded into the OPA server we booted up in section F.5, which is protected with mTLS.

Listing F.3 OPA policy written in Rego

package authz.orders.policy1        
  
default allow = false               

allow {                             
  input.method = "POST"             
  input.path = "orders"    
  input.role = "manager"    
}

allow {        
  input.method = "POST" 
  input.path = ["orders",dept_id]   
  input.deptid = dept_id
  input.role = "dept_manager"
}

The package name of the policy. Packages let you organize your policies into modules, just as with programming languages.

By default, all requests are disallowed. If this isn’t set and no allow rules match, OPA returns an undefined decision.

Declares the conditions to allow access to the resource

The input document is an arbitrary JSON object handed to OPA that includes use-case-specific information. In this example, the input document includes a method, path, role, and deptid. This condition requires the method parameter in the input document to be POST.

The value of the path parameter in the input document must match this pattern, where dept_id is bound to the deptid parameter from the input document.

The policy defined in listing F.3, which you’ll find in the policy_1.rego file, has two allow rules. For an allow rule to return true, every statement within the allow block must return true. The first allow rule returns true only if a user with the manager role is the one doing an HTTP POST on the orders resource. The second allow rule returns true if a user with the dept_manager role is the one doing an HTTP POST on the orders resource under their own department.

Let’s evaluate this policy with two different input documents. The first is the input document in listing F.4, which you’ll find in the policy_1_input_1.json file. Run the following curl command from the appendix-f/sample01 directory and it returns true, because the inputs in the request match with the first allow rule in the policy (listing F.3):

> curl -k -v --key keys/client/client.key 
--cert keys/client/client.cert -X POST 
--data-binary @policy_1_input_1.json 
https://localhost:8181/v1/data/authz/orders/policy1

{"result":{"allow":true}}

Listing F.4 Rego input document with manager role

{
    "input":{
      "path":"orders",
      "method":"POST",
      "role":"manager"
    }
}

Let’s try with another input document, as shown in listing F.5, which you’ll find in the policy_1_input_2.json file. Run the following curl command from the appendix-f/sample01 directory and it returns true, because the inputs in the request match the second allow rule in the policy (listing F.3). You can see how the response from the OPA server changes as you change the values of the inputs:

> curl -k -v --key keys/client/client.key 
--cert keys/client/client.cert -X POST 
--data-binary @policy_1_input_2.json 
https://localhost:8181/v1/data/authz/orders/policy1

{"result":{"allow":true}}

Listing F.5 Rego input document with dept_manager role

{
    "input":{
      "path":["orders",1000],
      "method":"POST",
      "deptid":1000,
      "role":"dept_manager"
    }
}

Now let’s have a look at a slightly improved version of the policy in listing F.3. You can find this new policy in listing F.6, and it’s already deployed to the OPA server you’re running. Here, our expectation is that if a user has the manager role, they will be able to do HTTP PUTs, POSTs, or DELETEs on any orders resource, and if a user has the dept_manager role, they will be able to do HTTP PUTs, POSTs, or DELETEs only on the orders resource in their own department. Also any user, regardless of the role, should be able to do HTTP GETs to any orders resource under their own account. The annotations in the following listing explain how the policy is constructed.

Listing F.6 Improved OPA policy written in Rego

package authz.orders.policy2

default allow = false
allow {
  allowed_methods_for_manager[input.method]                  
  input.path = "orders"
  input.role = "manager"
}

allow {
  allowed_methods_for_dept_manager[input.method]             
  input.deptid = dept_id
  input.path = ["orders",dept_id]
  input.role = "dept_manager"
}
 
allow {                                                      
  input.method = "GET"
  input.empid = emp_id
  input.path = ["orders",emp_id]
}

allowed_methods_for_manager = {"POST","PUT","DELETE"}        
allowed_methods_for_dept_manager = {"POST","PUT","DELETE"}   

Checks whether the value of the method parameter from the input document is in the allowed_methods_for_manager set

Checks whether the value of the method parameter from the input document is in the allowed_methods_for_dept_manager set

Allows anyone to access the orders resource under their own employee ID

The definition of the allowed_methods_for_manager set

The definition of the allowed_methods_for_dept_manager set

Let’s evaluate this policy with the input document in listing F.7, which you’ll find in the policy_2_input_1.json file. Run the following curl command from the appendix-f/sample01 directory and it returns true, because the inputs in the request match with the first allow rule in the policy (listing F.6):

> curl -k -v --key keys/client/client.key 
--cert keys/client/client.cert -X POST 
--data-binary @policy_2_input_1.json 
https://localhost:8181/v1/data/authz/orders/policy2

{
  "result":{
    "allow":true,
    "allowed_methods_for_dept_manager":["POST","PUT","DELETE"],
    "allowed_methods_for_manager":["POST","PUT","DELETE"]
  }
}
 

Listing F.7 Rego input document with manager role

{
    "input":{
      "path":"orders",
      "method":"PUT",
      "role":"manager"
    }
}

You can also try out the same curl command as shown here with two other input documents: policy_2_input_2.json and policy_2_input_3.json. You can find these files inside the appendix-f/sample01 directory.

F.7 External data

During policy evaluation, sometimes the OPA engine needs access to external data. As we discussed in section F.1, while evaluating an authorization request against the applicable policies, if any required information is missing, the OPA server talks to a PIP (or an external data source). For example, let’s say we have an access-control policy that says you can buy a beer only if your age is greater than 21, but the authorization request carries only your name as the subject, buy as the action, and beer as the resource. The age is the missing information here, and the OPA server will talk to an external data source to find the corresponding subject’s age. In this section, we discuss multiple approaches OPA provides to bring in external data for policy evaluation.3

F.7.1 Push data

The push data approach brings external data into the OPA server through the data API that the OPA server provides. Let’s look at a simple example. This is the same example we used in section 5.3. The policy in listing F.8 returns true if the method, path, and set of scopes in the input message match data read from an external data source that’s loaded under the package named data.order_policy_data.

Listing F.8 OPA policy using pushed external data

package authz.orders.policy3                 

import data.order_policy_data as policies    

default allow = false                        

allow {                                      
  policy = policies[_]                       
  policy.method = input.method               
  policy.path = input.path    
  policy.scopes[_] = input.scopes[_]
}

The package name of the policy

Declares the set of statically registered data identified as policies

By default, all requests are disallowed. If this isn’t set and no allow rules match, OPA returns an undefined decision.

Iterates over values in the policies array

Declares the conditions to allow access to the resource

For an element in the policies array, checks whether the value of the method parameter in the input matches the method element of the policy

This policy consumes all the external data from the JSON file appendix-f/sample01/order_policy_data.json (listing F.9), which we need to push to the OPA server using the OPA data API. Assuming your OPA server is running on port 8181, you can run the following curl command from the appendix-f/sample01 directory to publish the data to the OPA server. Keep in mind that here we’re pushing only external data, not the policy. The policy that consumes the data is already on the OPA server, which you can find in the appendix-f/sample01/policies/policy_3.rego file:

> curl -k -v --key keys/client/client.key 
--cert keys/client/client.cert -H "Content-Type: application/json" 
-X PUT --data-binary @order_policy_data.json 
https://localhost:8181/v1/data/order_policy_data

Listing F.9 Order Processing resources defined as OPA data

[
{
  "id": "r1",                  
  "path": "orders",            
  "method": "POST",            
  "scopes": ["create_order"]   
},
{   
  "id": "r2",
  "path": "orders",
  "method": "GET",
  "scopes": ["retrieve_orders"]
},
{   
  "id": "r3",
  "path": "orders/{order_id}",
  "method": "PUT",
  "scopes": ["update_order"]
}
] 

An identifier for the resource path

The resource path

The HTTP method

To do an HTTP POST to the orders resource, you must have this scope.

Now you can run the following curl command from the appendix-f/sample01 directory with the input message, which you’ll find in the JSON file appendix-f/sample01/policy_3_input_1.json (in listing F.10) to check if the request is authorized:

> curl -k -v --key keys/client/client.key 
--cert keys/client/client.cert -X POST 
--data-binary @policy_3_input_1.json 
https://localhost:8181/v1/data/authz/orders/policy3

{"result":{"allow":true}}

Listing F.10 OPA input document

{
  "input":{
    "path":"orders",
    "method":"GET",
    "scopes":["retrieve_orders"]
  }
}

With the push data approach, you control when you want to push the data to the OPA server. For example, when the external data gets updated, you can push the updated data to the OPA server. This approach, however, has its own limitations. When you use the data API to push external data into the OPA server, the OPA server keeps the data in cache (in memory), and when you restart the server, you need to push the data again. Nevertheless, this is the approach used within the Kubernetes admission control use case, where there is a sidecar running next to OPA that synchronizes the state of OPA with external data.

F.7.2 Loading data from the filesystem

In this section, we discuss how to load external data from the filesystem. When we start the OPA server, we need to specify from which directory on the filesystem the OPA server should load data files and policies. Let’s have a look at the appendix-f/sample01/run_opa_mtls.sh shell script, shown in the following listing. The code annotations explain how OPA loads policies from the filesystem at startup.

Listing F.11 Loading policies at startup

docker run 
       -v "$(pwd)"/policies:/policies             
       -v "$(pwd)"/keys:/keys     
       -p 8181:8181     
       openpolicyagent/opa:0.15.0     
       run /policies                              
       --tls-cert-file /keys/opa/opa.cert     
       --tls-private-key-file /keys/opa/opa.key  
       --tls-ca-cert-file /keys/ca/ca.cert     
       --authentication=tls    
       --server    

A Docker bind mount, which mounts the policies directory under the current path of the host machine to the policies directory of the container filesystem

Runs the OPA server by loading policies and data from the policies directory

The OPA server you already have running has the policy and the data we’re going to discuss in this section. Let’s first check the external data file (order_policy_data_from_file.json), which is available in the appendix-f/sample01/policies directory. This is the same file you saw in listing F.9 except for a slight change to the file’s structure. You can find the updated data file in the following listing.

Listing F.12 Order Processing resources defined as data with a root element

{"order_policy_data_from_file" :[
    {
      "id": "p1",
      "path": "orders",
      "method": "POST",
      "scopes": ["create_order"]
    },
    {
      "id": "p2",
      "path": "orders",
      "method": "GET",
      "scopes": ["retrieve_orders"]
    },
    {
      "id": "p3",
      "path": "orders/{order_id}",
      "method": "PUT",
      "scopes": ["update_order"]
    }
  ]
}

You can see in the JSON payload that we have a root element called order_policy_data_from_file. The OPA server derives the package name corresponding to this data set as data.order_policy_data_from_file, which is used in the policy in the following listing. This policy is exactly the same as in listing F.8 except that the package name has changed.

Listing F.13 OPA policy using external data loaded from the filesystem

package authz.orders.policy4

import data.order_policy_data_from_file as policies

default allow = false

allow {
  policy = policies[_]    
  policy.method = input.method
  policy.path = input.path
  policy.scopes[_] = input.scopes[_]
}

Now you can run the following curl command from the appendix-f/sample01 directory with the input message (appendix-f/sample01/policy_4_input_1.json) from listing F.10 to check whether the request is authorized:

> curl -k -v --key keys/client/client.key 
--cert keys/client/client.cert -X POST 
--data-binary @policy_4_input_1.json 
https://localhost:8181/v1/data/authz/orders/policy4

{"result":{"allow":true}}

One issue with loading data from the filesystem is that when there’s any update, you need to restart the OPA server. There is, however, a configuration option (see appendix-f/sample01/run_opa_mtls_watch.sh) to ask the OPA server to load policies dynamically (without a restart), but that option isn’t recommended for production deployments. In practice, if you deploy an OPA server in a Kubernetes environment, you can keep all your policies and data in a Git repository and use an init container along with the OPA server in the same Pod to pull all the policies and data from Git when you boot up the corresponding Pod. This process is the same as the approach we discussed in section 11.2.7 to load keystores. And when there’s an update to the policies or data, we need to restart the Pods.
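A rough sketch of that pattern follows; the Git repository URL, image names, and mount paths are illustrative, not taken from the book’s samples:

apiVersion: v1
kind: Pod
metadata:
  name: opa
spec:
  volumes:
  - name: policies
    emptyDir: {}                       # shared between the init container and OPA
  initContainers:
  - name: policy-puller                # pulls policies and data from Git before OPA starts
    image: alpine/git
    args: ["clone", "https://github.com/example/opa-policies.git", "/policies"]
    volumeMounts:
    - name: policies
      mountPath: /policies
  containers:
  - name: opa
    image: openpolicyagent/opa:0.15.0
    args: ["run", "/policies", "--server"]
    volumeMounts:
    - name: policies
      mountPath: /policies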

F.7.3 Overload

The overload approach to bringing in external data to the OPA server uses the input document itself. When the PEP builds the authorization request, it can embed external data into the request. Say, for example, the orders API knows that anyone wanting to do an HTTP POST to it needs to have the create_order scope. Rather than pre-provisioning all the scope data into the OPA server, the PEP can send it along with the authorization request. Let’s have a look at a slightly modified version of the policy in listing F.8. You can find the updated policy in the following listing.

Listing F.14 OPA policy using external data that comes with the request

package authz.orders.policy5

import input.external as policy

default allow = false

allow {
  policy.method = input.method
  policy.path = input.path
  policy.scopes[_] = input.scopes[_]
}

You can see that we use the input.external reference to load the external data from the input document. Let’s look at the input document in the following listing, which carries the external data with it.

Listing F.15 OPA request carrying external data

{
    "input":{
      "path":"orders",
      "method":"GET",
      "scopes":["retrieve_orders"],
      "external" : {
            "id": "r2",
            "path": "orders",
            "method": "GET",
            "scopes": ["retrieve_orders"]
      }
    }
}

Now you can run the following curl command from the appendix-f/sample01 directory with the input message from listing F.15 (appendix-f/sample01/policy_5_input_1.json) to check whether the request is authorized:

> curl -k -v --key keys/client/client.key 
--cert keys/client/client.cert -X POST 
--data-binary @policy_5_input_1.json 
https://localhost:8181/v1/data/authz/orders/policy5

{"result":{"allow":true}}

Reading external data from the input document doesn’t work all the time. For one thing, it requires a trust relationship between the OPA client (or the policy enforcement point) and the OPA server. Next we discuss an alternative way of sending data in the input document that requires less trust and is especially applicable to end-user external data.

F.7.4 JSON Web Token

JSON Web Token (JWT) provides a way of transferring data over the wire between multiple parties in a cryptographically secure manner. (If you’re new to JWT, check out appendix B.) OPA provides a way to pass a JWT in the input document. The OPA server can verify the JWT and then read data from it. Let’s go through an example.

First, we need to have an STS that issues a JWT. You can spin up an STS by using the following command. This is the same STS we discussed in chapter 10:

> docker run -p 8443:8443 prabath/insecure-sts-ch10:v1

Here, the STS starts on port 8443. Once it starts, run the following command to get a JWT:

> curl -v -X POST --basic -u applicationid:applicationsecret 
-H "Content-Type: application/x-www-form-urlencoded;charset=UTF-8" 
-k -d "grant_type=password&username=peter&password=peter123&scope=foo" 
https://localhost:8443/oauth/token

In this command, applicationid is the client ID of the web application, and applicationsecret is the client secret (which are hardcoded in the STS). If everything works fine, the STS returns an OAuth 2.0 access token, which is a JWT (or a JWS, to be precise):

{
"access_token":"eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE1NTEzMTIzNz
YsInVzZXJfbmFtZSI6InBldGVyIiwiYXV0aG9yaXRpZXMiOlsiUk9MRV9VU0VSIl0sImp0aSI6I
jRkMmJiNjQ4LTQ2MWQtNGVlYy1hZTljLTVlYWUxZjA4ZTJhMiIsImNsaWVudF9pZCI6ImFwcGxp
Y2F0aW9uaWQiLCJzY29wZSI6WyJmb28iXX0.tr4yUmGLtsH7q9Ge2i7gxyTsOOa0RS0Yoc2uBuA
W5OVIKZcVsIITWV3bDN0FVHBzimpAPy33tvicFROhBFoVThqKXzzG00SkURN5bnQ4uFLAP0NpZ6
BuDjvVmwXNXrQp2lVXl4lQ4eTvuyZozjUSCXzCI1LNw5EFFi22J73g1_mRm2jdEhBp1TvMaRKLB
Dk2hzIDVKzu5oj_gODBFm3a1S-IJjYoCimIm2igcesXkhipRJtjNcrJSegBbGgyXHVak2gB7I07
ryVwl_Re5yX4sV9x6xNwCxc_DgP9hHLzPM8yz_K97jlT6Rr1XZBlveyjfKs_XIXgU5qizRm9mt5
xg",
"token_type":"bearer",
"refresh_token":"",
"expires_in":5999,
"scope":"foo",
"jti":"4d2bb648-461d-4eec-ae9c-5eae1f08e2a2"
}

Now you can extract the JWT from the output, which is the value of the access_token parameter. It’s a bit lengthy, so make sure that you copy the complete string. In listing F.16, you’ll find the input document. There we use the copied value of the JWT as the value of the token parameter. The listing shows only a part of the JWT, but you can find the complete input document in the appendix-f/sample01/policy_6_input_1.json file.

Listing F.16 Input document, which carries data in a JWT

{
    "input":{
      "path": ["orders",101],
      "method":"GET",
      "empid" : 101,
      "token" : "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9... "
    }
}

The following listing shows the policy corresponding to the input document in listing F.16. The code annotations here explain all key instructions.

Listing F.17 OPA policy using external data that comes with the request in a JWT

package authz.orders.policy6

default allow = false

certificate = `-----BEGIN CERTIFICATE-----                     
MIICxzCCAa+gAwIBAgIEHP9VkjAN...
-----END CERTIFICATE-----`

allow {
  input.method = "GET"
  input.empid = emp_id
  input.path = ["orders",emp_id]
  token.payload.authorities[_] = "ROLE_USER"
}

token = {"payload": payload} {
  io.jwt.verify_rs256(input.token, certificate)                
  [header, payload, signature] := io.jwt.decode(input.token)   
  payload.exp >= now_in_seconds                                
}

now_in_seconds = time.now_ns() / 1000000000                    

The PEM-encoded certificate of the STS to validate the JWT, which corresponds to the private key that signs the JWT

Verifies the signature of the JWT following the RSA SHA256 algorithm

Decodes the JWT

Checks whether the JWT is expired

Finds the current time in seconds; now_ns() returns time in nanoseconds.

Now you can run the following curl command from the appendix-f/sample01 directory with the input message from listing F.16 (appendix-f/sample01/policy_6_input_1.json) to check whether the request is authorized:

> curl -k -v --key keys/client/client.key 
--cert keys/client/client.cert -X POST 
--data-binary @policy_6_input_1.json 
https://localhost:8181/v1/data/authz/orders/policy6

{"result":{"allow":true}}

In listing F.17, to do the JWT validation, we first needed to verify the signature and then check the expiration. OPA has a built-in function called io.jwt.decode_verify(string, constraints) that validates all of this in one go.4 For example, you can use this function to validate the signature, expiration (exp), not before use (nbf), audience, issuer, and so on.
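For example, the allow rule in listing F.17 could be collapsed into something like the following sketch, which relies on the same certificate variable from listing F.17; io.jwt.decode_verify verifies the signature and checks the exp and nbf claims against the current time in one call:

allow {
  input.method = "GET"
  input.empid = emp_id
  input.path = ["orders", emp_id]
  [valid, _, payload] := io.jwt.decode_verify(input.token, {"cert": certificate})
  valid == true
  payload.authorities[_] = "ROLE_USER"
}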

F.7.5 Bundle API

To bring in external data to an OPA server under the bundle API approach, first you need to have a bundle server. A bundle server is an endpoint that hosts a bundle. For example, the bundle server can be an AWS S3 bucket or a GitHub repository. A bundle is a gzipped tarball, which carries OPA policies and data files under a well-defined directory structure.5

Once the bundle endpoint is available, you need to update the OPA configuration file with the bundle endpoint, the credentials to access the bundle endpoint (if it’s secured), the polling interval, and so on, and then pass the configuration file as a parameter when you spin up the OPA server.6 Once the OPA server is up, it continuously polls the bundle API to get the latest bundle after each predefined time interval.
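The exact configuration keys depend on the OPA version (see the configuration documentation referenced in the footnote), but with a recent OPA release, a minimal configuration file looks roughly like the following; the service URL and bundle path are placeholders:

services:
  - name: bundle_server
    url: https://example-bundles.s3.amazonaws.com    # placeholder bundle endpoint

bundles:
  authz:
    service: bundle_server
    resource: bundles/orders.tar.gz                  # the gzipped tarball to download
    polling:
      min_delay_seconds: 60                          # how often OPA polls for a new bundle
      max_delay_seconds: 120

You then point the server at this file when you start it, for example with opa run --server --config-file config.yaml.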

If your data changes frequently, you’ll find some drawbacks in using the bundle API. The OPA server polls the bundle API at a predefined time interval, so if you frequently update the policies or data, OPA could end up making authorization decisions based on stale data. To fix that, you can reduce the polling interval, but then again, that will increase the load on the bundle API.

F.7.6 Pull data during evaluation

At the time of this writing, the pull data during evaluation approach is an experimental feature. With this approach, you don’t need to load all the external data into the OPA server’s memory; rather, you pull data as and when needed during policy evaluation, using the OPA built-in function http.send. To do that, you need to host an API (or a microservice) over HTTP (accessible to the OPA server) that accepts data requests from the OPA server and responds with the corresponding data.7
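A rough sketch of what such a policy could look like follows; the user service URL and the fields in its response are hypothetical:

package authz.orders.pulldata

default allow = false

allow {
  input.method = "GET"
  input.path = ["orders", input.empid]

  # pull the subject's record from a (hypothetical) user service at evaluation time
  response := http.send({
    "method": "get",
    "url": sprintf("https://user-service:8443/users/%v", [input.empid])
  })
  response.status_code == 200
  response.body.active == true     # body holds the parsed JSON response
}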

F.8 OPA integrations

As we discussed earlier in this appendix, OPA is a general-purpose policy engine, and as such, it can address a large variety of access-control use cases. For example, you can use OPA with Kubernetes and Docker for admission control; with Envoy, Kong, and other popular API gateways for API authorization; with Spinnaker, Boomerang, and Terraform in CI/CD pipelines; and with SQLite for data filtering. In this section, we briefly discuss three use cases that are related to a microservices deployment.8

F.8.1 Istio

Istio is a service mesh implementation developed by Google, Lyft, and IBM. It’s open source, and the most popular service mesh implementation at the time of this writing. If you’re new to Istio or service mesh architecture, see appendix K.

Istio introduces a component called Mixer that runs on the Istio control plane (figure F.3). Mixer takes care of precondition checking, quota management, and telemetry reporting. For example, when a request hits the Envoy proxy in the data plane, it talks to the Mixer API to see if it’s OK to proceed with that request. Mixer has a rich plugin architecture, so you can chain multiple plugins in the precondition check phase. For example, you can have a Mixer plugin that connects to an external PDP to evaluate a set of access-control policies against the incoming request.


Figure F.3 Istio high-level architecture with a control plane and a data plane

Istio integrates with OPA in two ways: via the OPA Mixer adapter (plugin) and directly with Envoy’s check API. You pick one or the other; there is no need for both. For the Mixer integration, when a request hits the Envoy proxy in the data plane, it does a check API call to Mixer. This API call carries certain attributes with respect to the request (for example, path, headers, and so on). Then Mixer hands over control to the OPA Mixer adapter. The OPA Mixer adapter, which embeds the OPA engine as a library, does the authorization check against the defined policies and returns the decision to Mixer and then to the Envoy proxy.9

For the second style of integration with Istio, OPA runs as a sidecar next to each instance of Envoy. Mixer is not involved at all. When a request hits the Envoy proxy, it asks OPA directly for an authorization decision, providing the same information it would provide to Mixer. OPA makes a decision, and Envoy enforces it. The benefit to this approach is that all decisions are made locally on the same server as the microservice and require no network hops, yielding better availability and performance.

F.8.2 Kubernetes admission controller

The Kubernetes admission controller is a component that runs in the Kubernetes API server. (In section J.18, we discuss how Kubernetes internal communication works and the role of an admission controller.) When an API request arrives at the Kubernetes API server, it goes through a set of authentication and authorization plugins and then, finally, the admission controller plugins (figure F.4).


Figure F.4 A request generated by kubectl passes through authentication, authorization, and admission controller plugins of the API server; is validated; and then is stored in etcd. The scheduler and kubelet respond to events generated by the API server.

OPA Gatekeeper integrates OPA with the Kubernetes API server’s admission control mechanism and lets you write policies that are enforced via admission control. It lets you control which Pods, Ingresses, Services, and so on are allowed on the Kubernetes cluster and how they are individually configured. Common policies include ensuring that all images come from a trusted image registry, prohibiting multiple Ingresses from using the same host, and requiring that storage be encrypted.10
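For example, the trusted-registry rule could look roughly like the following, written here as a plain OPA admission control policy (Gatekeeper itself wraps this kind of logic in constraint templates); the registry name is a placeholder:

package kubernetes.admission

deny[msg] {
  input.request.kind.kind == "Pod"
  image := input.request.object.spec.containers[_].image
  not startswith(image, "registry.example.com/")
  msg := sprintf("image %v does not come from the trusted registry", [image])
}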

F.8.3 Apache Kafka

In chapter 9, we discuss Kafka in the context of securing reactive microservices. Apache Kafka is the most popular message broker implementation used in microservices deployments. To use OPA for Kafka authorization, you need to engage the OPA Authorizer plugin with Kafka. To authorize a request, the OPA Authorizer plugin talks to a remote OPA server over HTTP.11 In a Kubernetes deployment, you would deploy the OPA server as a sidecar along with Kafka in the same Pod.

F.9 OPA alternatives

Since its introduction in 2016, OPA has become the de facto choice for fine-grained access control, mostly in the Kubernetes and microservices domains. A couple of alternatives to OPA exist, but at the time of this writing, none of them are as popular as OPA.

One alternative, eXtensible Access Control Markup Language (XACML), is an open standard developed by the Organization for the Advancement of Structured Information Standards (OASIS). The XACML standard defines an XML-based policy language and an XML-based schema for authorization requests and responses. OASIS released the XACML 1.0 specification in 2003, and at the time of this writing, the latest version is XACML 3.0. XACML was popular many years back, but as the popularity of XML-based standards declined over time, XACML adoption declined rapidly as well. Also, XACML as a policy language is quite complex, though very powerful. If you’re looking for an open source implementation of XACML 3.0, check the Balana project, which is available at https://github.com/wso2/balana.

Speedle, another open source alternative to OPA, is also a general-purpose authorization engine. Speedle was developed by Oracle and is relatively new. It’s too early to comment on how Speedle competes with OPA, and at the time of this writing, only Oracle Cloud uses Speedle internally. You can find more details on Speedle at https://speedle.io/.


1. This policy is explained at www.openpolicyagent.org/docs/latest/security/.

2. You can find more details about Rego at www.openpolicyagent.org/docs/latest/policy-language/.

3. A detailed discussion of these approaches is documented at www.openpolicyagent.org/docs/latest/external-data/.

4. You can find all the OPA functions to verify JWTs at http://mng.bz/aRv9.

5. Details on how to create a bundle are at www.openpolicyagent.org/docs/latest/management/#bundles.

6. Details on these configuration options are documented at www.openpolicyagent.org/docs/latest/configuration/.

7. Details on how to use http.send and some examples are documented at www.openpolicyagent.org/docs/latest/policy-reference/#http.

8. You can find more OPA integration use cases at www.openpolicyagent.org/docs/latest/ecosystem/.

9. Details on the OPA Mixer plugin are at https://github.com/istio/istio/tree/master/mixer/adapter/opa.

10. You can find more details on OPA Gatekeeper at https://github.com/open-policy-agent/gatekeeper. How to deploy OPA Gatekeeper on Kubernetes for Kubernetes ingress validation is documented at www.openpolicyagent.org/docs/latest/kubernetes-tutorial/.

11. You can find more details on the OPA Kafka Authorizer at https://github.com/open-policy-agent/contrib/tree/master/kafka_authorizer.
