In a typical microservices deployment, we can enforce access-control policies in either of the following two locations, or in both:
The edge of the deployment--Typically, with an API gateway (which we discuss in chapter 5)
The edge of the service--Typically, with a service mesh or with a set of embedded libraries (which we discuss in chapters 7 and 12)
Authorization at the service level enables each service to enforce access-control policies in the way it wants. Typically, you apply coarse-grained access-control policies at the API gateway (at the edge) and more fine-grained access-control policies at the service level. Also, it’s common to do data-level entitlements at the service level. For example, at the edge of the deployment, we can check whether a given user is eligible to perform an HTTP GET for the Order Processing microservice. But the data entitlement checks, such as only an order admin can view orders having a transaction amount greater than $10,000, are enforced at the service level.
In this appendix, we discuss key components of an access-control system, access-control patterns, and how to define and enforce access-control policies by using Open Policy Agent (OPA). OPA (www.openpolicyagent.org) is an open source, lightweight, general-purpose policy engine with no dependency on microservices. You can use OPA to define fine-grained access-control policies and enforce those policies at different locations across your infrastructure as well as within a microservices deployment. We discussed OPA briefly in chapter 5. In this appendix, we delve deep into the details. We also assume that you’ve already gone through chapters 5, 7, 10, 11, and 12, and have a good understanding of containers, Kubernetes, Istio, and JWT.
In a typical access-control system, we find five key components (figure F.1): the policy administration point (PAP), policy enforcement point (PEP), policy decision point (PDP), policy information point (PIP), and policy store. The PAP is the component that lets policy administrators and developers define access-control policies.
Most of the time, PAP implementations come with their own user interface or expose the functionality via an API. Some access-control systems don’t have a specific PAP; rather, they read policies directly from the filesystem, so you need to use third-party tools to author these policies. Once you define the policies via a PAP, the PAP writes the policies to a policy store. The policy store can be a database, a filesystem, or even a service that’s exposed via HTTP.
The PEP sits between the service/API, which you want to protect, and the client application. At runtime, the PEP intercepts all the communications between the client application and the service. As we discussed in chapter 3, the PEP can be an API gateway, or as we discussed in chapters 7 and 8, it can be some kind of an interceptor embedded into your application itself. And in chapter 12, we discussed how in a service mesh deployment, a proxy can be used as a PEP that intercepts all the requests coming to your microservice.
When the PEP intercepts a request, it extracts certain parameters from the request--such as the user, resource, and action--and creates an authorization request. Then it talks to the PDP to check whether the request is authorized. If it’s authorized, the PEP dispatches the request to the corresponding service or API; otherwise, it returns an error to the client application. Before the request hits the PEP, we assume it’s properly authenticated.
When the PEP talks to the PDP to check authorization, the PDP loads all the corresponding policies from the policy store. And while evaluating an authorization request against the applicable policies, if there is any required but missing information, the PDP will talk to a PIP. For example, let’s say we have an access-control policy that says you can buy a beer only if your age is greater than 21, but the authorization request carries only your name as the subject, buy as the action, and beer as the resource. The age is the missing information here, and the PDP will talk to a PIP to find the corresponding subject’s age. We can connect multiple PIPs to a PDP, and each PIP can connect to different data sources.
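The PDP-to-PIP interaction can be sketched in a few lines of Python. This is a minimal, hypothetical illustration of the flow only; the PDP and PIP class names and the in-memory user store are our own inventions, not part of OPA or any specific product:

```python
# A minimal, hypothetical sketch of a PDP consulting a PIP for missing data.
# None of these names come from OPA; they only illustrate the flow.

class PIP:
    """Wraps an external data source that can resolve subject attributes."""
    def __init__(self, user_store):
        self.user_store = user_store  # e.g., backed by LDAP or a database

    def get_age(self, subject):
        return self.user_store.get(subject, {}).get("age")

class PDP:
    """Evaluates the policy: you can buy a beer only if your age > 21."""
    def __init__(self, pip):
        self.pip = pip

    def authorize(self, request):
        if request["action"] == "buy" and request["resource"] == "beer":
            # The age is missing from the authorization request,
            # so the PDP asks the PIP for it.
            age = request.get("age") or self.pip.get_age(request["subject"])
            return age is not None and age > 21
        return False

pdp = PDP(PIP({"peter": {"age": 25}, "jane": {"age": 18}}))
print(pdp.authorize({"subject": "peter", "action": "buy", "resource": "beer"}))  # True
print(pdp.authorize({"subject": "jane", "action": "buy", "resource": "beer"}))   # False
```

In a real deployment, the PIP call would be a lookup against an LDAP server, a database, or an HTTP service rather than an in-memory dictionary.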
As we discussed in the introduction to this appendix, OPA is an open source, lightweight, general-purpose policy engine that has no dependency on microservices. You can use OPA to define fine-grained access-control policies and enforce those policies at different locations throughout your infrastructure as well as within a microservices deployment. To define access-control policies, OPA introduces a new declarative language called Rego (www.openpolicyagent.org/docs/latest/policy-language).
OPA started as an open source project in 2016, with a goal to unify policy enforcement across multiple heterogeneous technology stacks. Netflix, one of the early adopters of OPA, uses it to enforce access-control policies in its microservices deployment. Apart from Netflix, Cloudflare, Pinterest, Intuit, Capital One, State Street, and many more use OPA. At the time of this writing, OPA is an incubating project under the Cloud Native Computing Foundation (CNCF).
In this section, we discuss how OPA’s high-level architecture fits into our discussion. As you can see in figure F.2, the OPA engine can run on its own as a standalone deployment or as an embedded library along with an application.
When you run the OPA server as a standalone deployment, it exposes a set of REST APIs that PEPs can connect to and check authorization. In figure F.2, the OPA engine acts as the PDP.
The open source distribution of the OPA server doesn’t come with a policy authoring tool or a user interface to create and publish policies to the OPA server. But you can use a tool like Visual Studio (VS) Code to create OPA policies; OPA has a plugin for VS Code. If you decide to embed OPA as a library in your application (instead of running it as a standalone server), you can use the Go API (provided by OPA) to interact with it.
Once you have the policies, you can use the OPA API to publish them to the OPA server. When you publish those policies via the API, the OPA engine keeps them in memory only. You’ll need to build a mechanism to publish policies every time the server boots up. The other option is to copy the policy files to the filesystem behind OPA, and the OPA server will pick them up when it boots up. If any policy changes occur, you’ll need to restart the OPA server. However, there is an option to ask the OPA server to load policies dynamically from the filesystem, but that’s not recommended in a production deployment. Also, you can load policies to the OPA server by using a bundle server; we discuss that in detail in section F.7.
OPA has a PIP design to bring in external data to the PDP or to the OPA engine. This model is quite similar to the model we discussed in the previous paragraph with respect to policies. In section F.7, we detail how OPA brings in external data.
In this section, we discuss how to deploy an OPA server as a Docker container. In OPA, there are multiple ways of loading policies. Importantly, OPA stores those policies in memory (there is no persistence), so that on a restart or redeployment, OPA needs a way to reload the policies. For example, when we use OPA for the Kubernetes admission control, policies are persisted in the Kubernetes API server, and OPA has its own sidecar that loads policies via OPA’s REST API. That’s roughly the approach we followed in section 5.3. In using OPA in a microservices deployment, the most common approaches are to either configure OPA to download policies via the bundle API (for example, using AWS’s S3 as the bundle server) or use volume/bind mounts to mount policies into the container running OPA.
With bind mounts, we keep all the policies in a directory in the host filesystem and then mount it to the OPA Docker container filesystem. If you look at the appendix-f/sample01/run_opa.sh file, you’ll find the following Docker command (do not try it as it is). Here, we mount the policies directory from the current location of the host filesystem to the policies directory of the container filesystem under the root:
> docker run --mount type=bind,source="$(pwd)"/policies,target=/policies \
-p 8181:8181 openpolicyagent/opa:0.15.0 run /policies --server
To start the OPA server, run the following command from the appendix-f/sample01 directory. This loads the OPA policies from the appendix-f/sample01/policies directory (in section F.6, we discuss OPA policies in detail):
> sh run_opa.sh

{
  "addrs":[":8181"],
  "insecure_addr":"",
  "level":"info",
  "msg":"Initializing server.",
  "time":"2019-11-05T07:19:34Z"
}
You can run the following command from the appendix-f/sample01 directory to test the OPA server. The appendix-f/sample01/policy_1_input_1.json file carries the input data for the authorization request in JSON format (in section F.6, we discuss authorization requests in detail):
> curl -v -X POST --data-binary @policy_1_input_1.json \
http://localhost:8181/v1/data/authz/orders/policy1

{"result":{"allow":true}}
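Note how the request path relates to the policy: OPA’s data API exposes a policy under a URL whose segments after /v1/data mirror the policy’s package name (here, authz.orders.policy1, which we look at in section F.6). A tiny sketch of that mapping (the helper function is our own, not part of OPA):

```python
def opa_query_url(base, package):
    """Map a Rego package name to the corresponding OPA data API path."""
    return base + "/v1/data/" + package.replace(".", "/")

print(opa_query_url("http://localhost:8181", "authz.orders.policy1"))
# http://localhost:8181/v1/data/authz/orders/policy1
```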
The process of deploying OPA in Kubernetes is similar to deploying any other service on Kubernetes, as we discuss in appendix J. You can check the OPA documentation available at http://mng.bz/MdDD for details.
OPA was designed to run on the same server as the microservice that needs authorization decisions. As such, the first layer of defense for microservice-to-OPA communication is the fact that the communication is limited to localhost. OPA is a host-local cache of the relevant policies authored in the PAP and recorded in the policy store. OPA is often self-contained: it can make a decision all on its own without reaching out to other servers.
This means that decisions are highly available and highly performant, for the simple reason that OPA shares a fate with the microservice that needs authorization decisions and requires no network hop for those decisions. Nevertheless, OPA recommends defense in depth and ensuring that communication between it and its microservice or other clients is secured via mTLS.
In this section, we discuss how to protect the OPA server with mTLS. This ensures that all communications between the OPA server and its client applications are encrypted, and that only legitimate clients with the proper keys can talk to the OPA server. To protect the OPA server with mTLS, we need to accomplish the following tasks:
Sign the public key of the OPA server with the CA’s private key to generate the OPA server’s public certificate
Sign the public key of the OPA client with the CA’s private key to generate the OPA client’s public certificate
To perform all these tasks, we can use the appendix-f/sample01/keys/gen-key.sh script with OpenSSL. Let’s run the following Docker command from the appendix-f/sample01/keys directory to spin up an OpenSSL Docker container. You’ll see that we mount the current location (which is appendix-f/sample01/keys) from the host filesystem to the /export directory on the container filesystem:
> docker run -it -v $(pwd):/export prabath/openssl #
Once the container boots up successfully, you’ll find a command prompt where you can type OpenSSL commands. Let’s run the following command to execute the gen-key.sh file that runs a set of OpenSSL commands:
# sh /export/gen-key.sh
Once this command executes successfully, you’ll find the keys corresponding to the CA in the appendix-f/sample01/keys/ca directory, the keys corresponding to the OPA server in the appendix-f/sample01/keys/opa directory, and the keys corresponding to the OPA client in the appendix-f/sample01/keys/client directory. If you want to understand the exact OpenSSL commands we ran during key generation, check appendix G.
In case you’re already running the OPA server, stop it by pressing Ctrl-C on the corresponding command console. To start the OPA server with TLS support, use the following command from the appendix-f/sample01 directory:
> sh run_opa_tls.sh

{
  "addrs":[":8181"],
  "insecure_addr":"",
  "level":"info",
  "msg":"Initializing server.",
  "time":"2019-11-05T19:03:11Z"
}
You can run the following command from the appendix-f/sample01 directory to test the OPA server. The appendix-f/sample01/policy_1_input_1.json file carries the input data for the authorization request in JSON format. Here we use HTTPS to talk to the OPA server:
> curl -v -k -X POST --data-binary @policy_1_input_1.json \
https://localhost:8181/v1/data/authz/orders/policy1

{"result":{"allow":true}}
Let’s check what’s in the run_opa_tls.sh script, shown in the following listing. The code annotations in the listing explain what each argument means.
> docker run -v "$(pwd)"/policies:/policies ❶
   -v "$(pwd)"/keys:/keys ❷
   -p 8181:8181 ❸
   openpolicyagent/opa:0.15.0 ❹
   run /policies ❺
   --tls-cert-file /keys/opa/opa.cert ❻
   --tls-private-key-file /keys/opa/opa.key ❼
   --server ❽
❶ Instructs the OPA server to load policies from the policies directory, which is mounted to the OPA container
❷ The OPA server finds the key/certificate for the TLS communication in the keys directory, which is mounted to the OPA container.
❸ Port mapping maps the container port to the host port.
❹ Name of the OPA Docker image
❺ Runs the OPA server by loading policies and data from the policies directory, which is mounted to the OPA container
❻ Certificate used for the TLS communication
❼ Private key used for the TLS communication
❽ Starts the OPA engine under the server mode
Now the communication between the OPA server and the OPA client (curl) is protected with TLS. Still, anyone with access to the OPA server’s IP address can access it over TLS. There are two ways to authenticate clients to the OPA endpoint: token authentication and mTLS.
With token-based authentication, the client has to pass an OAuth 2.0 token in the HTTP Authorization header as a bearer token, and you also need to write an authorization policy.1 In this section, we focus on securing the OPA endpoint with mTLS.
If you’re already running the OPA server, stop it by pressing Ctrl-C on the corresponding command console. To start the OPA server enabling mTLS, run the following command from the appendix-f/sample01 directory:
> sh run_opa_mtls.sh
Let’s check what’s in the run_opa_mtls.sh script, shown in the following listing. The code annotations explain what each argument means.
> docker run -v "$(pwd)"/policies:/policies
   -v "$(pwd)"/keys:/keys
   -p 8181:8181 openpolicyagent/opa:0.15.0 run /policies
   --tls-cert-file /keys/opa/opa.cert
   --tls-private-key-file /keys/opa/opa.key
   --tls-ca-cert-file /keys/ca/ca.cert ❶
   --authentication=tls ❷
   --server
❶ The public certificate of the CA. All the OPA clients must carry a certificate signed by this CA.
You can use the following command from the appendix-f/sample01 directory to test the OPA server, which is now secured with mTLS:
> curl -k -v --key keys/client/client.key --cert keys/client/client.cert -X POST --data-binary @policy_1_input_1.json https://localhost:8181/v1/data/authz/orders/policy1
Here, we use HTTPS to talk to the OPA server, along with the certificate and the key generated for the OPA client at the start of this section. The key and the certificate of the OPA client are available in the appendix-f/sample01/keys/client directory.
To define access-control policies, OPA introduces a new declarative language called Rego.2 In this section, we go through a set of OPA policies (listing F.3) to understand the strength of the Rego language. All the policies we discuss here are available in the appendix-f/sample01/policies directory and are already loaded into the OPA server we booted up in section F.5, which is protected with mTLS.
package authz.orders.policy1 ❶

default allow = false ❷

allow { ❸
    input.method = "POST" ❹
    input.path = "orders"
    input.role = "manager"
}

allow {
    input.method = "POST"
    input.path = ["orders",dept_id] ❺
    input.deptid = dept_id
    input.role = "dept_manager"
}
❶ The package name of the policy. Packages let you organize your policies into modules, just as with programming languages.
❷ By default, all requests are disallowed. If this isn’t set and no allow rules match, OPA returns an undefined decision.
❸ Declares the conditions to allow access to the resource
❹ The input document is an arbitrary JSON object handed to OPA that includes use-case-specific information. In this example, the input document includes a method, path, role, and deptid. This condition requires the method parameter in the input document to be POST.
❺ The value of the path parameter in the input document must match this value, where the value of the dept_id is the deptid parameter from the input document.
The policy defined in listing F.3, which you’ll find in the policy_1.rego file, has two allow rules. For an allow rule to return true, every statement within the allow block must return true. The first allow rule returns true only if a user with the manager role is the one doing an HTTP POST on the orders resource. The second allow rule returns true if a user with the dept_manager role is the one doing an HTTP POST on the orders resource under their own department.
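To make the rule semantics concrete, here is a rough Python equivalent of the two allow rules. This is only an illustration of the logic; OPA evaluates the real policy in Rego:

```python
# A rough Python equivalent of the two allow rules in listing F.3.
# Illustrative only; OPA evaluates the real policy in Rego.
def allow(inp):
    # Rule 1: a user with the manager role can POST to the orders resource.
    if (inp.get("method") == "POST" and inp.get("path") == "orders"
            and inp.get("role") == "manager"):
        return True
    # Rule 2: a dept_manager can POST to the orders resource
    # of their own department.
    path = inp.get("path")
    if (inp.get("method") == "POST" and isinstance(path, list)
            and path == ["orders", inp.get("deptid")]
            and inp.get("role") == "dept_manager"):
        return True
    return False  # default allow = false

print(allow({"path": "orders", "method": "POST", "role": "manager"}))  # True
print(allow({"path": ["orders", 1000], "method": "POST",
             "deptid": 1000, "role": "dept_manager"}))  # True
print(allow({"path": "orders", "method": "GET", "role": "manager"}))  # False
```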
Let’s evaluate this policy with two different input documents. The first is the input document in listing F.4, which you’ll find in the policy_1_input_1.json file. Run the following curl command from the appendix-f/sample01 directory and it returns true, because the inputs in the request match the first allow rule in the policy (listing F.3):
> curl -k -v --key keys/client/client.key --cert keys/client/client.cert \
-X POST --data-binary @policy_1_input_1.json \
https://localhost:8181/v1/data/authz/orders/policy1

{"result":{"allow":true}}
{
  "input":{
    "path":"orders",
    "method":"POST",
    "role":"manager"
  }
}
Let’s try with another input document, as shown in listing F.5, which you’ll find in the policy_1_input_2.json file. Run the following curl command from the appendix-f/sample01 directory and it returns true, because the inputs in the request match the second allow rule in the policy (listing F.3). You can see how the response from the OPA server changes as you change the values of the inputs:
> curl -k -v --key keys/client/client.key --cert keys/client/client.cert \
-X POST --data-binary @policy_1_input_2.json \
https://localhost:8181/v1/data/authz/orders/policy1

{"result":{"allow":true}}
{
  "input":{
    "path":["orders",1000],
    "method":"POST",
    "deptid":1000,
    "role":"dept_manager"
  }
}
Now let’s have a look at a slightly improved version of the policy in listing F.3. You can find this new policy in listing F.6, and it’s already deployed to the OPA server you’re running. Here, our expectation is that if a user has the manager role, they can do HTTP PUTs, POSTs, or DELETEs on any orders resource, and if a user has the dept_manager role, they can do HTTP PUTs, POSTs, or DELETEs only on the orders resource in their own department. Also, any user, regardless of role, should be able to do HTTP GETs to any orders resource under their own account. The annotations in the following listing explain how the policy is constructed.
package authz.orders.policy2

default allow = false

allow {
    allowed_methods_for_manager[input.method] ❶
    input.path = "orders"
    input.role = "manager"
}

allow {
    allowed_methods_for_dept_manager[input.method] ❷
    input.deptid = dept_id
    input.path = ["orders",dept_id]
    input.role = "dept_manager"
}

allow { ❸
    input.method = "GET"
    input.empid = emp_id
    input.path = ["orders",emp_id]
}

allowed_methods_for_manager = {"POST","PUT","DELETE"} ❹
allowed_methods_for_dept_manager = {"POST","PUT","DELETE"} ❺
❶ Checks whether the value of the method parameter from the input document is in the allowed_methods_for_manager set
❷ Checks whether the value of the method parameter from the input document is in the allowed_methods_for_dept_manager set
❸ Allows anyone to access the orders resource under their own employee ID
❹ The definition of the allowed_methods_for_manager set
❺ The definition of the allowed_methods_for_dept_manager set
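As before, a rough Python rendering of listing F.6 may help clarify the set-membership checks. This is illustrative only; the two sets mirror allowed_methods_for_manager and allowed_methods_for_dept_manager in the Rego policy:

```python
# A rough Python equivalent of the three allow rules in listing F.6.
ALLOWED_METHODS_FOR_MANAGER = {"POST", "PUT", "DELETE"}
ALLOWED_METHODS_FOR_DEPT_MANAGER = {"POST", "PUT", "DELETE"}

def allow(inp):
    path, method = inp.get("path"), inp.get("method")
    # Managers can PUT/POST/DELETE on any orders resource.
    if (method in ALLOWED_METHODS_FOR_MANAGER and path == "orders"
            and inp.get("role") == "manager"):
        return True
    # Department managers can PUT/POST/DELETE only in their own department.
    if (method in ALLOWED_METHODS_FOR_DEPT_MANAGER and isinstance(path, list)
            and path == ["orders", inp.get("deptid")]
            and inp.get("role") == "dept_manager"):
        return True
    # Anyone can GET the orders resource under their own employee ID.
    if (method == "GET" and isinstance(path, list)
            and path == ["orders", inp.get("empid")]):
        return True
    return False

print(allow({"path": "orders", "method": "PUT", "role": "manager"}))    # True
print(allow({"path": ["orders", 101], "method": "GET", "empid": 101}))  # True
print(allow({"path": "orders", "method": "PUT", "role": "employee"}))   # False
```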
Let’s evaluate this policy with the input document in listing F.7, which you’ll find in the policy_2_input_1.json file. Run the following curl command from the appendix-f/sample01 directory and it returns true, because the inputs in the request match the first allow rule in the policy (listing F.6):
> curl -k -v --key keys/client/client.key --cert keys/client/client.cert \
-X POST --data-binary @policy_2_input_1.json \
https://localhost:8181/v1/data/authz/orders/policy2

{
  "result":{
    "allow":true,
    "allowed_methods_for_dept_manager":["POST","PUT","DELETE"],
    "allowed_methods_for_manager":["POST","PUT","DELETE"]
  }
}
{
  "input":{
    "path":"orders",
    "method":"PUT",
    "role":"manager"
  }
}
You can also try out the same curl command shown here with two other input documents: policy_2_input_2.json and policy_2_input_3.json. You can find these files inside the appendix-f/sample01 directory.
During policy evaluation, the OPA engine sometimes needs access to external data. As we discussed in section F.1, while evaluating an authorization request against the applicable policies, if any required information is missing, the OPA server talks to a PIP (or external data sources). For example, let’s say we have an access-control policy that says you can buy a beer only if your age is greater than 21, but the authorization request carries only your name as the subject, buy as the action, and beer as the resource. The age is the missing information here, and the OPA server talks to an external data source to find the corresponding subject’s age. In this section, we discuss multiple approaches OPA provides to bring in external data for policy evaluation.3
The push data approach to bringing in external data to the OPA server uses the data API provided by the OPA server. Let’s look at a simple example. This is the same example we used in section 5.3. The policy in listing F.8 returns true if the method, path, and the set of scopes in the input message match some data read from an external data source that’s loaded under the package named data.order_policy_data.
package authz.orders.policy3 ❶

import data.order_policy_data as policies ❷

default allow = false ❸

allow { ❹
    policy = policies[_] ❺
    policy.method = input.method ❻
    policy.path = input.path
    policy.scopes[_] = input.scopes[_]
}
❶ The package name of the policy
❷ Declares the set of statically registered data identified as policies
❸ By default, all requests are disallowed. If this isn’t set and no allowed rules matched, OPA returns an undefined decision.
❹ Declares the conditions to allow access to the resource
❺ Iterates over values in the policies array
❻ For an element in the policies array, checks whether the value of the method parameter in the input matches the method element of the policy
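The iteration in listing F.8 works like the following Python sketch: for each entry in the external data set, the rule matches the method and path exactly and requires at least one scope in common. This is illustrative only; the data mirrors listing F.9:

```python
# A Python sketch of the policy in listing F.8; illustrative only.
# POLICIES mirrors the external data in listing F.9.
POLICIES = [
    {"id": "r1", "path": "orders", "method": "POST",
     "scopes": ["create_order"]},
    {"id": "r2", "path": "orders", "method": "GET",
     "scopes": ["retrieve_orders"]},
    {"id": "r3", "path": "orders/{order_id}", "method": "PUT",
     "scopes": ["update_order"]},
]

def allow(inp):
    for policy in POLICIES:  # policies[_] iterates over the array
        if (policy["method"] == inp["method"]
                and policy["path"] == inp["path"]
                # at least one scope in the input must match
                and set(policy["scopes"]) & set(inp["scopes"])):
            return True
    return False

print(allow({"path": "orders", "method": "GET",
             "scopes": ["retrieve_orders"]}))  # True
print(allow({"path": "orders", "method": "DELETE",
             "scopes": ["retrieve_orders"]}))  # False
```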
This policy consumes all the external data from the JSON file appendix-f/sample01/order_policy_data.json (listing F.9), which we need to push to the OPA server using the OPA data API. Assuming your OPA server is running on port 8181, you can run the following curl command from the appendix-f/sample01 directory to publish the data to the OPA server. Keep in mind that here we’re pushing only external data, not the policy. The policy that consumes the data is already on the OPA server; you can find it in the appendix-f/sample01/policies/policy_3.rego file:
> curl -k -v --key keys/client/client.key --cert keys/client/client.cert -H "Content-Type: application/json" -X PUT --data-binary @order_policy_data.json https://localhost:8181/v1/data/order_policy_data
[
  {
    "id": "r1", ❶
    "path": "orders", ❷
    "method": "POST", ❸
    "scopes": ["create_order"] ❹
  },
  {
    "id": "r2",
    "path": "orders",
    "method": "GET",
    "scopes": ["retrieve_orders"]
  },
  {
    "id": "r3",
    "path": "orders/{order_id}",
    "method": "PUT",
    "scopes": ["update_order"]
  }
]
❶ An identifier for the resource path
❷ The resource path
❸ The HTTP method
❹ To do an HTTP POST to the orders resource, you must have this scope.
Now you can run the following curl command from the appendix-f/sample01 directory with the input message, which you’ll find in the JSON file appendix-f/sample01/policy_3_input_1.json (listing F.10), to check whether the request is authorized:
> curl -k -v --key keys/client/client.key --cert keys/client/client.cert \
-X POST --data-binary @policy_3_input_1.json \
https://localhost:8181/v1/data/authz/orders/policy3

{"result":{"allow":true}}
{ "input":{ "path":"orders", "method":"GET", "scopes":["retrieve_orders"] } }
With the push data approach, you control when you want to push the data to the OPA server. For example, when the external data gets updated, you can push the updated data to the OPA server. This approach, however, has its own limitations. When you use the data API to push external data into the OPA server, the OPA server keeps the data in cache (in memory), and when you restart the server, you need to push the data again. Nevertheless, this is the approach used within the Kubernetes admission control use case, where there is a sidecar running next to OPA that synchronizes the state of OPA with external data.
In this section, we discuss how to load external data from the filesystem. When we start the OPA server, we need to specify the directory on the filesystem from which the OPA server should load data files and policies. Let’s have a look at the appendix-f/sample01/run_opa_mtls.sh shell script, shown in the following listing. The code annotations explain how OPA loads policies from the filesystem at startup.
docker run -v "$(pwd)"/policies:/policies ❶
   -v "$(pwd)"/keys:/keys
   -p 8181:8181 openpolicyagent/opa:0.15.0 run /policies ❷
   --tls-cert-file /keys/opa/opa.cert
   --tls-private-key-file /keys/opa/opa.key
   --tls-ca-cert-file /keys/ca/ca.cert
   --authentication=tls
   --server
❶ A Docker bind mount, which mounts the policies directory under the current path of the host machine to the policies directory of the container filesystem
❷ Runs the OPA server by loading policies and data from the policies directory
The OPA server you already have running has the policy and the data we’re going to discuss in this section. Let’s first check the external data file (order_policy_data_from_file.json), which is available in the appendix-f/sample01/policies directory. This is the same file you saw in listing F.9 except for a slight change to the file’s structure. You can find the updated data file in the following listing.
{
  "order_policy_data_from_file" : [
    {
      "id": "p1",
      "path": "orders",
      "method": "POST",
      "scopes": ["create_order"]
    },
    {
      "id": "p2",
      "path": "orders",
      "method": "GET",
      "scopes": ["retrieve_orders"]
    },
    {
      "id": "p3",
      "path": "orders/{order_id}",
      "method": "PUT",
      "scopes": ["update_order"]
    }
  ]
}
You can see in the JSON payload that we have a root element called order_policy_data_from_file. The OPA server derives the package name corresponding to this data set as data.order_policy_data_from_file, which is used in the policy in the following listing. This policy is exactly the same as the one in listing F.8 except that the package name has changed.
package authz.orders.policy4

import data.order_policy_data_from_file as policies

default allow = false

allow {
    policy = policies[_]
    policy.method = input.method
    policy.path = input.path
    policy.scopes[_] = input.scopes[_]
}
Now you can run the following curl command from the appendix-f/sample01 directory with the input message (appendix-f/sample01/policy_4_input_1.json) from listing F.10 to check whether the request is authorized:
> curl -k -v --key keys/client/client.key --cert keys/client/client.cert \
-X POST --data-binary @policy_4_input_1.json \
https://localhost:8181/v1/data/authz/orders/policy4

{"result":{"allow":true}}
One issue with loading data from the filesystem is that when there’s any update, you need to restart the OPA server. There is, however, a configuration option (see appendix-f/sample01/run_opa_mtls_watch.sh) to ask the OPA server to load policies dynamically (without a restart), but that option isn’t recommended for production deployments. In practice, if you deploy an OPA server in a Kubernetes environment, you can keep all your policies and data in a Git repository and use an init container along with the OPA server in the same Pod to pull all the policies and data from Git when you boot up the corresponding Pod. This process is the same as the approach we discussed in section 11.2.7 to load keystores. And when there’s an update to the policies or data, we need to restart the Pods.
The overload approach to bringing in external data to the OPA server uses the input document itself. When the PEP builds the authorization request, it can embed external data into the request. Say, for example, the orders API knows that anyone wanting to do an HTTP POST to it must have the create_order scope. Rather than pre-provisioning all the scope data into the OPA server, the PEP can send it along with the authorization request. Let’s have a look at a slightly modified version of the policy in listing F.8. You can find the updated policy in the following listing.
package authz.orders.policy5

import input.external as policy

default allow = false

allow {
    policy.method = input.method
    policy.path = input.path
    policy.scopes[_] = input.scopes[_]
}
You can see that we used the input.external package name to load the external data from the input document. Let’s look at the input document in the following listing, which carries the external data with it.
{
  "input":{
    "path":"orders",
    "method":"GET",
    "scopes":["retrieve_orders"],
    "external" : {
      "id": "r2",
      "path": "orders",
      "method": "GET",
      "scopes": ["retrieve_orders"]
    }
  }
}
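A PEP might assemble such a document by merging the request attributes with data it already holds. The following hypothetical Python sketch shows the idea; the function name is our own, not part of any OPA API:

```python
# Hypothetical sketch: a PEP embedding externally known data
# into the OPA input document (the overload approach).
def build_input(request_attrs, external_data):
    doc = {"input": dict(request_attrs)}
    doc["input"]["external"] = external_data
    return doc

doc = build_input(
    {"path": "orders", "method": "GET", "scopes": ["retrieve_orders"]},
    {"id": "r2", "path": "orders", "method": "GET",
     "scopes": ["retrieve_orders"]},
)
print(doc["input"]["external"]["scopes"])  # ['retrieve_orders']
```

The PEP would then POST this document to the OPA data API, exactly as the curl command that follows does.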
Now you can run the following curl command from the appendix-f/sample01 directory with the input message from listing F.15 (appendix-f/sample01/policy_5_input_1.json) to check whether the request is authorized:
> curl -k -v --key keys/client/client.key --cert keys/client/client.cert \
-X POST --data-binary @policy_5_input_1.json \
https://localhost:8181/v1/data/authz/orders/policy5

{"result":{"allow":true}}
Reading external data from the input document doesn’t always work. For one thing, it requires a trust relationship between the OPA client (or the policy enforcement point) and the OPA server. Next, we discuss an alternative to sending data in the input document that requires less trust and is applicable especially for end-user external data.
JSON Web Token (JWT) provides a reliable way of transferring data over the wire between multiple parties in a cryptographically secure way. (If you’re new to JWT, check out appendix B.) OPA provides a way to pass a JWT in the input document. The OPA server can verify the JWT and then read data from it. Let’s go through an example.
First, we need an STS (security token service) that issues JWTs. You can spin one up by using the following command. This is the same STS we discussed in chapter 10:
> docker run -p 8443:8443 prabath/insecure-sts-ch10:v1
Here, the STS starts on port 8443. Once it starts, run the following command to get a JWT:
> curl -v -X POST --basic -u applicationid:applicationsecret -H "Content-Type: application/x-www-form-urlencoded;charset=UTF-8" -k -d "grant_type=password&username=peter&password=peter123&scope=foo" https://localhost:8443/oauth/token
In this command, applicationid is the client ID of the web application, and applicationsecret is the client secret (both are hardcoded in the STS). If everything works fine, the STS returns an OAuth 2.0 access token, which is a JWT (or a JWS, to be precise):
{ "access_token":"eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE1NTEzMTIzNz YsInVzZXJfbmFtZSI6InBldGVyIiwiYXV0aG9yaXRpZXMiOlsiUk9MRV9VU0VSIl0sImp0aSI6I jRkMmJiNjQ4LTQ2MWQtNGVlYy1hZTljLTVlYWUxZjA4ZTJhMiIsImNsaWVudF9pZCI6ImFwcGxp Y2F0aW9uaWQiLCJzY29wZSI6WyJmb28iXX0.tr4yUmGLtsH7q9Ge2i7gxyTsOOa0RS0Yoc2uBuA W5OVIKZcVsIITWV3bDN0FVHBzimpAPy33tvicFROhBFoVThqKXzzG00SkURN5bnQ4uFLAP0NpZ6 BuDjvVmwXNXrQp2lVXl4lQ4eTvuyZozjUSCXzCI1LNw5EFFi22J73g1_mRm2jdEhBp1TvMaRKLB Dk2hzIDVKzu5oj_gODBFm3a1S-IJjYoCimIm2igcesXkhipRJtjNcrJSegBbGgyXHVak2gB7I07 ryVwl_Re5yX4sV9x6xNwCxc_DgP9hHLzPM8yz_K97jlT6Rr1XZBlveyjfKs_XIXgU5qizRm9mt5 xg", "token_type":"bearer", "refresh_token":"", "expires_in":5999, "scope":"foo", "jti":"4d2bb648-461d-4eec-ae9c-5eae1f08e2a2" }
Now you can extract the JWT from the output, which is the value of the access_token parameter. It's a bit lengthy, so make sure that you copy the complete string. Listing F.16 shows the input document, which uses the copied JWT as the value of the token parameter. The listing shows only part of the JWT, but you can find the complete input document in the appendix-f/sample01/policy_6_input_1.json file.
{
  "input": {
    "path": ["orders", 101],
    "method": "GET",
    "empid": 101,
    "token": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9..."
  }
}
The following listing shows the policy corresponding to the input document in listing F.16. The code annotations here explain all key instructions.
package authz.orders.policy6

default allow = false

certificate = `-----BEGIN CERTIFICATE-----   ❶
MIICxzCCAa+gAwIBAgIEHP9VkjAN...
-----END CERTIFICATE-----`

allow {
    input.method = "GET"
    input.empid = emp_id
    input.path = ["orders", emp_id]
    token.payload.authorities[_] = "ROLE_USER"
}

token = {"payload": payload} {
    io.jwt.verify_rs256(input.token, certificate)                ❷
    [header, payload, signature] := io.jwt.decode(input.token)   ❸
    payload.exp >= now_in_seconds                                ❹
}

now_in_seconds = time.now_ns() / 1000000000                      ❺
❶ The PEM-encoded certificate of the STS to validate the JWT, which corresponds to the private key that signs the JWT
❷ Verifies the signature of the JWT using the RSA SHA256 algorithm
❸ Decodes the JWT into its header, payload, and signature
❹ Checks whether the JWT has expired
❺ Finds the current time in seconds; now_ns() returns time in nanoseconds.
Now you can run the following curl command from the appendix-f/sample01 directory with the input message from listing F.16 (appendix-f/sample01/policy_6_input_1.json) to check whether the request is authorized:
> curl -k -v --key keys/client/client.key --cert keys/client/client.cert -X POST --data-binary @policy_6_input_1.json https://localhost:8181/v1/data/authz/orders/policy6

{"result":{"allow":true}}
In listing F.17, to validate the JWT, we first had to verify the signature and then check the expiration. OPA has a built-in function, called io.jwt.decode_verify(string, constraints), that does all of this in one go.4 For example, you can use this function to validate the signature, expiration (exp), not-before time (nbf), audience, issuer, and so on.
To bring in external data to an OPA server under the bundle API approach, first you need to have a bundle server. A bundle server is an endpoint that hosts a bundle. For example, the bundle server can be an AWS S3 bucket or a GitHub repository. A bundle is a gzipped tarball, which carries OPA policies and data files under a well-defined directory structure.5
Once the bundle endpoint is available, you need to update the OPA configuration file with the bundle endpoint, the credentials to access the bundle endpoint (if it’s secured), the polling interval, and so on, and then pass the configuration file as a parameter when you spin up the OPA server.6 Once the OPA server is up, it continuously polls the bundle API to get the latest bundle after each predefined time interval.
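For reference, an OPA configuration file for the bundle API looks roughly like the following. This is a sketch only: the service name, URL, bearer token, bundle name, resource path, and polling intervals are all placeholders you'd replace with your own values.

```yaml
services:
  - name: bundle-registry                    # placeholder service name
    url: https://bundles.example.com         # your bundle server
    credentials:
      bearer:
        token: "bundle-server-access-token"  # only if the endpoint is secured

bundles:
  authz:
    service: bundle-registry
    resource: bundles/orders.tar.gz          # the gzipped tarball
    polling:
      min_delay_seconds: 60                  # lower bound between polls
      max_delay_seconds: 120                 # upper bound between polls
```

You would then start the server with something like opa run --server --config-file config.yaml (or mount the file into the OPA Docker container and point to it the same way).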
If your data changes frequently, you’ll find some drawbacks in using the bundle API. The OPA server polls the bundle API after a predefined time interval, so if you frequently update the policies or data, you could make authorization decisions based on stale data. To fix that, you can reduce the polling time interval, but then again, that will increase the load on the bundle API.
At the time of this writing, the pull data during evaluation approach is an experimental feature. With this approach, you don't need to load all the external data into the OPA server's memory; rather, you pull data as and when needed during the policy evaluation. To implement pull data during evaluation, you use the OPA built-in function http.send. To do that, you need to host an API (or a microservice) over HTTP (which is accessible to the OPA server) to accept data requests from the OPA server and respond with the corresponding data.7
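As a sketch of this approach, a policy that pulls an employee record during evaluation might look like the following. The endpoint URL and the department attribute are hypothetical; http.send simply issues the HTTP request and exposes the parsed response.

```rego
package authz.orders.pull

default allow = false

allow {
    input.method = "GET"

    # Pull the employee record on demand from a hypothetical internal
    # data API that the OPA server can reach over the network.
    response := http.send({
        "method": "GET",
        "url": sprintf("https://employee-api.local/employees/%v", [input.empid])
    })
    response.status_code == 200

    # response.body carries the parsed JSON payload from the data API.
    response.body.department == "sales"
}
```

Keep in mind that every evaluation now depends on the data API being reachable, so latency and availability of that endpoint directly affect authorization decisions.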
As we discussed earlier in this appendix, OPA is a general-purpose policy engine, so it can address a large variety of access-control use cases. For example, you can use OPA with Kubernetes and Docker for admission control; with Envoy, Kong, and other popular API gateways for API authorization; with Spinnaker, Boomerang, and Terraform in CI/CD pipelines; and with SQLite for data filtering. In this section, we briefly discuss three use cases that are related to a microservices deployment.8
Istio is a service mesh implementation developed by Google, Lyft, and IBM. It’s open source, and the most popular service mesh implementation at the time of this writing. If you’re new to Istio or service mesh architecture, see appendix K.
Istio introduces a component called Mixer that runs on an Istio control plane (figure F.3). Mixer takes care of precondition checking, quota management, and telemetry reporting. For example, when a request hits the Envoy proxy at the data plane, it talks to the Mixer API to see if it’s OK to proceed with that request. Mixer has a rich plugin architecture, so you can chain multiple plugins in the precondition check phase. For example, you can have a mixer plugin that connects to an external PDP to evaluate a set of access-control policies against the incoming request.
Istio integrates with OPA in two ways: via the OPA Mixer adapter (plugin) and directly with Envoy’s check API. You pick one or the other; there is no need for both. For the Mixer integration, when a request hits the Envoy proxy in the data plane, it does a check
API call to Mixer. This API call carries certain attributes with respect to the request (for example, path, headers, and so on). Then Mixer hands over control to the OPA mixer adapter. The OPA mixer adapter, which embeds the OPA engine as an embedded library, does the authorization check against defined policies and returns the decision to Mixer and then to the Envoy proxy.9
For the second style of integration with Istio, OPA runs as a sidecar next to each instance of Envoy. Mixer is not involved at all. When a request hits the Envoy proxy, it asks OPA directly for an authorization decision, providing the same information it would provide to Mixer. OPA makes a decision, and Envoy enforces it. The benefit to this approach is that all decisions are made locally on the same server as the microservice and require no network hops, yielding better availability and performance.
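With this second style of integration, the policy operates on the request attributes Envoy sends in its check call. The following is a minimal sketch: the package name must match whatever your plugin configuration specifies, and the path prefix is purely illustrative.

```rego
package envoy.authz

default allow = false

allow {
    # Envoy's external authorization check supplies the request
    # details under input.attributes.request.http.
    input.attributes.request.http.method == "GET"
    startswith(input.attributes.request.http.path, "/orders")
}
```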
The Kubernetes admission controller is a component that’s run in the Kubernetes API server. (In section J.18, we discuss how the Kubernetes internal communication works and the role of an admission controller.) When an API request arrives at the Kubernetes API server, it goes through a set of authentication and authorization plugins and then, finally, the admission controller plugins (figure F.4).
OPA Gatekeeper is a native integration of OPA into the Kubernetes API server that lets you write policies that are enforced via admission control. It lets you control which Pods, Ingresses, Services, and so on, are allowed on the Kubernetes cluster and how they are individually configured. Common policies include ensuring that all images come from a trusted image registry, prohibiting multiple Ingresses from using the same host, and requiring encryption be used on storage.10
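For example, a trusted-image-registry rule of the kind mentioned above, written in the style of OPA's Kubernetes admission control tutorial, might look like the following sketch. The registry name is a placeholder, and Gatekeeper itself wraps logic like this inside a ConstraintTemplate resource.

```rego
package kubernetes.admission

# Deny any Pod that pulls an image from outside the trusted registry.
deny[msg] {
    input.request.kind.kind == "Pod"
    image := input.request.object.spec.containers[_].image
    not startswith(image, "registry.example.com/")  # placeholder registry
    msg := sprintf("image %v does not come from the trusted registry", [image])
}
```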
In chapter 9, we discuss Kafka under the context of securing reactive microservices. Apache Kafka is the most popular message broker implementation used in microservices deployments. To use OPA for Kafka authorization, you need to engage the OPA Authorizer plugin with Kafka. To authorize a request, the OPA Authorizer plugin talks to a remote OPA server over HTTP.11 In a Kubernetes deployment, you would deploy the OPA server as a sidecar along with Kafka on the same Pod.
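The shape of the input document the OPA Authorizer plugin sends depends on the plugin version, so treat the following as a rough sketch; the field names and the principal are assumptions you should verify against the plugin's documentation.

```rego
package kafka.authz

default allow = false

# Allow a hypothetical order-processing principal to read
# from topics whose names start with "orders".
allow {
    input.operation.name == "Read"
    input.resource.resourceType.name == "Topic"
    startswith(input.resource.name, "orders")
    input.session.principal.name == "order-processing"
}
```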
Since its introduction in 2016, OPA has become the de facto implementation of fine-grained access control, mostly in the Kubernetes and microservices domains. A couple of alternatives to OPA exist, but at the time of this writing, none of them is as popular as OPA.
One alternative, eXtensible Access Control Markup Language (XACML), is an open standard developed by the Organization for the Advancement of Structured Information Standards (OASIS). The XACML standard introduces a policy language based on XML and a schema based on XML for authorization requests and responses. OASIS released the XACML 1.0 specification in 2003, and at the time of this writing, the latest is XACML 3.0. XACML was popular many years back, but over time, as the popularity of XML-based standards declined, XACML adoption lessened rapidly as well. Also, XACML as a policy language is quite complex, though very powerful. If you’re looking for an open source implementation of XACML 3.0, check the Balana project, which is available at https://github.com/wso2/balana.
Speedle, another open source alternative to OPA, is also a general-purpose authorization engine. Speedle was developed by Oracle and is relatively new. It’s too early to comment on how Speedle competes with OPA, and at the time of this writing, only Oracle Cloud uses Speedle internally. You can find more details on Speedle at https://speedle.io/.
1.This policy is explained at www.openpolicyagent.org/docs/latest/security/.
2.You can find more details about Rego at www.openpolicyagent.org/docs/latest/policy-language/.
3.A detailed discussion of these approaches is documented at www.openpolicyagent.org/docs/latest/external-data/.
4.You can find all the OPA functions to verify JWT at http://mng.bz/aRv9.
5.Details on how to create a bundle are at www.openpolicyagent.org/docs/latest/management/#bundles.
6.Details on these configuration options are documented at www.openpolicyagent.org/docs/latest/configuration/.
7.Details on how to use http.send and some examples are documented at www.openpolicyagent.org/docs/latest/policy-reference/#http.
8. You can find more OPA integration use cases at www.openpolicyagent.org/docs/latest/ecosystem/.
9.Details on the OPA Mixer plugin are at https://github.com/istio/istio/tree/master/mixer/adapter/opa.
10.You can find more details on OPA Gatekeeper at https://github.com/open-policy-agent/gatekeeper. How to deploy an OPA Gatekeeper on Kubernetes for a Kubernetes ingress validation is documented at www.openpolicyagent.org/docs/latest/kubernetes-tutorial/.
11.You can find more details on OPA Kafka Authorizer at https://github.com/open-policy-agent/contrib/tree/master/kafka_authorizer.