PuppetDB uses a Command/Query Responsibility Separation (CQRS) pattern: data is read through queries and written through explicit commands.
APIs are versioned (v1, v2, v3, and so on). The most recent versions add functionality and try to keep backward compatibility.
The URL for queries is structured like this:
http[s]://<server>:<port>/pdb/query/<version>/<endpoint>?query=<query>
Available endpoints for queries are: nodes, environments, factsets, facts, fact-names, fact-paths, fact-contents, catalogs, edges, resources, reports, events, event-counts, aggregate-event-counts, metrics, server-time, and version.
Query strings are URL-encoded JSON arrays in prefix notation, which makes them look a bit unusual. The general format is as follows:
[ "<operator>" , "<field>" , "<value>" ]
The comparison operators are: =, >=, >, <, <=, and ~ (regexp matching). Some examples are as follows:
["=", "type", "Service"]
[">=", "timestamp", "2013-12-18T14:00:00"]
["~", "certname", "www\\d+\\.example\\.com"]
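To see how such an expression ends up in the query URL described earlier, here is a minimal Python sketch. The helper name query_url and its default arguments (localhost, port 8080, v4) are assumptions made for illustration; only the URL layout and the query parameter come from the text above.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs


def query_url(endpoint, query, server="localhost", port=8080,
              version="v4", scheme="http"):
    """Return a query URL with the expression URL-encoded as a JSON array.

    Hypothetical helper: the defaults are illustrative assumptions.
    """
    base = f"{scheme}://{server}:{port}/pdb/query/{version}/{endpoint}"
    # json.dumps turns the Python list into a JSON array (escaping any
    # backslashes); urlencode percent-encodes brackets, quotes, and commas.
    return base + "?" + urlencode({"query": json.dumps(query)})


service_url = query_url("resources", ["=", "type", "Service"])
regex_url = query_url("nodes", ["~", "certname", r"www\d+\.example\.com"])

# Decoding the query string again gives back the original expression.
decoded = json.loads(parse_qs(urlsplit(regex_url).query)["query"][0])
print(service_url)
print(decoded)
```

Building the expression as a native list and serializing it at the last moment avoids hand-escaping quotes and backslashes inside the JSON string.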
The expressions can be combined with and, not, and or. An example (here split over multiple lines for clarity) is as follows:
[ "and",
  ["=", "type", "File"],
  ["=", "title", "/etc/hosts"] ]
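Because the expressions are plain nested arrays, they can be composed programmatically. The following Python sketch uses small helper functions (eq, and_, or_, and not_ are hypothetical names, not part of any PuppetDB client) to rebuild the example above:

```python
import json


def eq(field, value):
    """Equality comparison: ["=", field, value]."""
    return ["=", field, value]


def and_(*expressions):
    """True when all the sub-expressions match."""
    return ["and", *expressions]


def or_(*expressions):
    """True when at least one sub-expression matches."""
    return ["or", *expressions]


def not_(expression):
    """Negates a single sub-expression."""
    return ["not", expression]


hosts_file = and_(eq("type", "File"), eq("title", "/etc/hosts"))
not_a_service = not_(eq("type", "Service"))

print(json.dumps(hosts_file))
# prints: ["and", ["=", "type", "File"], ["=", "title", "/etc/hosts"]]
```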
It's possible to build complex subqueries using the in operator, the extract statement, and subqueries such as select-resources or select-facts. An example usable on the /facts endpoint to return the IP addresses of all the nodes that have an Apache service is as follows:
["and",
  ["=", "name", "ipaddress"],
  ["in", "certname",
    ["extract", "certname",
      ["select-resources",
        ["and",
          ["=", "type", "Service"],
          ["=", "title", "apache"]]]]]]
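The nesting is easier to follow when the subquery is built in steps. This Python sketch reproduces the expression above; the function name fact_for_nodes_with_resource is hypothetical, and the operator names are copied from the text, so verify their spelling against the documentation for the API version in use.

```python
import json


def fact_for_nodes_with_resource(fact_name, resource_type, resource_title):
    """Select a fact only on nodes whose catalog holds a given resource."""
    # Innermost step: the resources we are interested in.
    resources = ["select-resources",
                 ["and",
                  ["=", "type", resource_type],
                  ["=", "title", resource_title]]]
    # Middle step: keep only the certname column of those resources.
    certnames = ["extract", "certname", resources]
    # Outer step: facts with the right name whose certname is in that set.
    return ["and",
            ["=", "name", fact_name],
            ["in", "certname", certnames]]


apache_ips = fact_for_nodes_with_resource("ipaddress", "Service", "apache")
print(json.dumps(apache_ips, indent=2))
```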
Since version 3 of the API, it has been possible to paginate and sort the results of queries. Each endpoint may support one or more query parameters: order-by, limit, include-total, offset, and so on.
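As a rough illustration, the following Python sketch assembles these parameters into a query string. The helper paging_params is hypothetical. It assumes that the sort parameter takes a JSON array of field/order maps, and that the parameter names are hyphenated as in the text above; as far as we know, the v4 documentation spells them with underscores (order_by, include_total), which the sep argument accommodates. Check both assumptions against the documentation for your API version.

```python
import json
from urllib.parse import urlencode


def paging_params(order_by=None, limit=None, offset=None,
                  include_total=False, sep="-"):
    """Build paging and sorting parameters for a query.

    sep selects the spelling: "-" gives order-by, "_" gives order_by.
    """
    params = {}
    if order_by:
        # Assumed format: a JSON array of {"field": ..., "order": ...} maps.
        params[f"order{sep}by"] = json.dumps(
            [{"field": field, "order": order} for field, order in order_by]
        )
    if limit is not None:
        params["limit"] = str(limit)
    if offset is not None:
        params["offset"] = str(offset)
    if include_total:
        params[f"include{sep}total"] = "true"
    return params


# Third page of ten nodes, sorted by certname.
params = paging_params(order_by=[("certname", "asc")], limit=10,
                       offset=20, include_total=True)
query_string = urlencode(params)
print(query_string)
```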
It's quite easy to query PuppetDB directly with curl. The following is the simplest example, with curl executed over HTTP on the PuppetDB host itself:
curl http://localhost:8080/pdb/query/v4/nodes/web01.example.com
Note the URL to a specific endpoint (nodes), the API version (v4), and the specific client certname.
When we have to use queries, we must URL-encode characters such as [ and ], and for this we can use curl's --data-urlencode option. When we use it, we have to specify the -X GET option (otherwise a POST would be done):
curl -X GET 'http://localhost:8080/pdb/query/v4/events' --data-urlencode 'query=["=", "certname", "puppet.example.com"]'
The response, in JSON array format (note the starting and ending square brackets [ ]), contains one or more entries like this:
[ {
  "new_value" : "{md5}be99db88f4c07058843ea356eb3469bf",
  "report" : "2331579061f83db1a35e7579a83a671f011e07fa",
  "run_start_time" : "2016-03-19T21:17:26.790Z",
  "property" : "content",
  "file" : "/etc/puppetlabs/code/environments/production/modules/puppetdb/manifests/master/routes.pp",
  "old_value" : "{md5}d13e1f5c099082afbe8a5ed9d4695beb",
  "containing_class" : "Puppetdb::Master::Routes",
  "line" : 38,
  "resource_type" : "File",
  "status" : "success",
  "configuration_version" : "1458422249",
  "resource_title" : "/etc/puppetlabs/puppet/routes.yaml",
  "environment" : "production",
  "timestamp" : "2016-03-19T21:17:39.138Z",
  "run_end_time" : "2016-03-19T21:17:36.705Z",
  "report_receive_time" : "2016-03-19T21:18:38.350Z",
  "containment_path" : [ "Stage[main]", "Puppetdb::Master::Routes", "File[/etc/puppetlabs/puppet/routes.yaml]" ],
  "certname" : "puppet.example.com",
  "message" : "content changed '{md5}d13e1f5c099082afbe8a5ed9d4695beb' to '{md5}be99db88f4c07058843ea356eb3469bf'"
} ]
Have a look at some of the most interesting fields: timestamp, certname, resource_title, resource_type, property, file, and line. Note that the names and kinds of the fields may vary according to the endpoint used (for example, other endpoints have title and type instead of resource_title and resource_type).
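A response like this is easy to post-process. The following Python sketch, using an abridged copy of the event shown above as sample data, keeps only the fields just mentioned; the names FIELDS and summarize are hypothetical.

```python
import json

# Abridged copy of the event entry shown above.
SAMPLE = """
[ {
  "property" : "content",
  "file" : "/etc/puppetlabs/code/environments/production/modules/puppetdb/manifests/master/routes.pp",
  "line" : 38,
  "resource_type" : "File",
  "status" : "success",
  "resource_title" : "/etc/puppetlabs/puppet/routes.yaml",
  "timestamp" : "2016-03-19T21:17:39.138Z",
  "certname" : "puppet.example.com"
} ]
"""

FIELDS = ("timestamp", "certname", "resource_type", "resource_title",
          "property", "file", "line")


def summarize(events):
    """Keep only the selected fields of each event (None when absent)."""
    return [{name: event.get(name) for name in FIELDS} for event in events]


rows = summarize(json.loads(SAMPLE))
for row in rows:
    print(f'{row["timestamp"]} {row["certname"]} '
          f'{row["resource_type"]}[{row["resource_title"]}] '
          f'{row["property"]} ({row["file"]}:{row["line"]})')
```

Using event.get rather than direct indexing keeps the sketch usable on endpoints where some of these fields have different names or are missing.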
It's recommended to experiment with test queries on various endpoints, such as the ones listed later in this chapter, to get a better idea of the names and kinds of the fields returned.
When we make requests over HTTPS, we have to reference the certificate files to use:
$ curl 'https://puppetdb:8081/pdb/query/v4/facts/web01.example.com' \
  --cacert /var/lib/puppet/ssl/certs/ca.pem \
  --cert /var/lib/puppet/ssl/certs/<node>.pem \
  --key /var/lib/puppet/ssl/private_keys/<node>.pem
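The same request can be sketched in Python with the standard library only. The function names cert_paths and https_query are hypothetical, the SSL directory is the one used in the curl command above, and the final call is left commented out because it needs a reachable PuppetDB and readable certificate files.

```python
import json
import ssl
from urllib.request import HTTPSHandler, build_opener


def cert_paths(node, ssldir="/var/lib/puppet/ssl"):
    """Return the CA, certificate, and key paths for a node's certname."""
    return {
        "cacert": f"{ssldir}/certs/ca.pem",
        "cert": f"{ssldir}/certs/{node}.pem",
        "key": f"{ssldir}/private_keys/{node}.pem",
    }


def https_query(url, cacert, cert, key, timeout=10):
    """GET a PuppetDB URL with client certificate authentication."""
    # Trust only the Puppet CA, and present the node's own certificate.
    context = ssl.create_default_context(cafile=cacert)
    context.load_cert_chain(certfile=cert, keyfile=key)
    opener = build_opener(HTTPSHandler(context=context))
    with opener.open(url, timeout=timeout) as response:
        return json.load(response)


paths = cert_paths("web01.example.com")
# facts = https_query(
#     "https://puppetdb:8081/pdb/query/v4/facts/web01.example.com", **paths)
```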
Explicit commands are used (via HTTP URL-encoded POST to the /commands URL) to populate and modify data.
The available commands on PuppetDB are:
replace catalog: Replaces the stored catalog of a node. Currently, PuppetDB stores only the last catalog compiled by the Puppet Master for each node.
replace facts: Replaces the stored facts of a node. In this case too, only the facts received from the latest Puppet run are kept.
store report: Saves the report of a node's Puppet run (if reporting to PuppetDB is enabled). The configuration parameter report-ttl manages the retention of reports (14 days by default).
deactivate node: Deactivates a decommissioned node so that its exported resources can't be collected anywhere. A node is reactivated if a new Puppet run is done on it.
All the executed commands are shown in /var/log/puppetdb/puppetdb.log.
When the Puppet Master receives a client's facts, it immediately submits them to PuppetDB:
2016-03-19 21:23:14,780 INFO [p.p.command] [51ab082d-a04f-4b11-a88e-ab38adc248d7] [replace facts] web01.example.com
Then the catalog is compiled, sent to the client, and stored on PuppetDB:
2016-03-19 21:23:26,050 INFO [p.p.command] [076e6a53-5d92-44a3-a550-05d9a99114fe] [replace catalog] web01.example.com
Finally, when the report of the Puppet run is received from the client, the Puppet Master submits it to PuppetDB:
2016-03-19 21:23:31,247 INFO [p.p.command] [ceff648b-f67d-44db-89c5-e0f9d1e936c4] [store report] puppet v4.3.1 - web01.example.com
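To follow this sequence for a given node, the command lines can be picked out of the log with a regular expression. The Python sketch below is written against the three sample lines shown above only; the pattern and the parse_command name are assumptions, and other PuppetDB versions may log in a different format.

```python
import re

# Pattern inferred from the sample log lines above.
COMMAND_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>\w+)\s+\[p\.p\.command\]\s+"
    r"\[(?P<id>[0-9a-f-]+)\]\s+"
    r"\[(?P<command>[^\]]+)\]\s+"
    r"(?P<detail>.*)$"
)


def parse_command(line):
    """Return the parsed fields of a command log line, or None."""
    match = COMMAND_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # In the samples, the certname is always the last token of the line.
    fields["certname"] = fields["detail"].split()[-1]
    return fields


LOG = """\
2016-03-19 21:23:14,780 INFO [p.p.command] [51ab082d-a04f-4b11-a88e-ab38adc248d7] [replace facts] web01.example.com
2016-03-19 21:23:26,050 INFO [p.p.command] [076e6a53-5d92-44a3-a550-05d9a99114fe] [replace catalog] web01.example.com
2016-03-19 21:23:31,247 INFO [p.p.command] [ceff648b-f67d-44db-89c5-e0f9d1e936c4] [store report] puppet v4.3.1 - web01.example.com
"""

parsed = [entry for entry in map(parse_command, LOG.splitlines()) if entry]
for entry in parsed:
    print(entry["timestamp"], entry["command"], entry["certname"])
```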