Working with Data and External APIs

If you develop web-based APIs, or just consume them, there are several tools that can make your life a little easier. jq is a command-line utility for parsing JSON responses, HTTPie makes reading results easier, and Siege lets you test how responsive your APIs are.

In Working with Web APIs, you used json-server[31] as a fake web API. You’ll use that in this section, too, along with the data.json file you created in that section. If you don’t have the file handy, create it again:

 $ cat << 'EOF' > ~/data.json
 > {
 >   "notes": [
 >     { "id": 1, "title": "Hello" }
 >   ]
 > }
 > EOF

Then start up json-server with npx and tell it to watch the data.json file:

 $ npx json-server -w ~/data.json

After the server starts, open a new terminal window and use curl to add some new notes:

 $ curl -X POST http://localhost:3000/notes \
 > -H 'Content-type: application/json' \
 > -d '{"title": "This is another note."}'

 $ curl -X POST http://localhost:3000/notes \
 > -H 'Content-type: application/json' \
 > -d '{"title": "This is a third note."}'

Now, access http://localhost:3000/notes and verify that you see all three notes:

 $ curl http://localhost:3000/notes
 [
   {
     "id": 1,
     "title": "Hello"
   },
   {
     "title": "This is another note.",
     "id": 2
   },
   {
     "title": "This is a third note.",
     "id": 3
   }
 ]

With the server running and displaying data, let’s look at some tools for accessing it.

Manipulating JSON with jq

The jq command-line program lets you format and manipulate JSON data. It’s perfect when exploring APIs from the command line. You provide jq some JSON to process and a filter to apply. It transforms the JSON and displays the results.
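
For example, here's a minimal sketch of that flow, piping a tiny made-up JSON document into jq and applying a filter that pulls out a single field:

 $ echo '{"greeting": "hello", "count": 3}' | jq '.greeting'
 "hello"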

Install jq from the package manager. On Ubuntu, use apt:

 $ sudo apt install jq

On macOS, use brew:

 $ brew install jq

One of the most common uses is to “pretty-print” JSON results from curl. When you execute curl http://localhost:3000/notes, the results are already nicely formatted. But not every API returns formatted results.

Open Notify[32] has several free APIs you can query to find information about the International Space Station. If you use curl to request data from this endpoint, which displays the current location of the ISS, the output might not be very readable:

 $ curl http://api.open-notify.org/iss-now.json
 {"timestamp": 1556777805, "iss_position": {"latitude": "29.0539",
 "longitude": "-58.5050"}, "message": "success"}

But jq can format the output. Take the output from curl and pipe it to the jq command, applying the “dot” filter, which doesn’t do any processing:

 $ curl http://api.open-notify.org/iss-now.json | jq '.'
 ...
 {
   "timestamp": 1556777874,
   "iss_position": {
     "latitude": "32.1566",
     "longitude": "-55.0392"
   },
   "message": "success"
 }

That by itself is incredibly helpful, but it’s not very nice to repeatedly hit an external API when you’re testing something. You should make a local cache of this data if you want to play with it more. When you make that local cache, you can format the text too. If you combine the previous command with output redirection, you can save this nicely formatted output to a local file:

 $ curl http://api.open-notify.org/iss-now.json | jq '.' > space.json

Now you can use space.json locally instead of pounding on someone else’s API:

 $ cat space.json
 {
   "timestamp": 1556777915,
   "iss_position": {
     "latitude": "33.9239",
     "longitude": "-52.8864"
   },
   "message": "success"
 }

Let’s go back to your JSON API of notes to explore a few other features. First, you might have noticed that when you fetch data from the JSON API, the first record has the id field first, followed by the title field, but the other records have the title field first, like this:

 $ curl http://localhost:3000/notes
 [
   {
     "id": 1,
     "title": "Hello"
   },
   {
     "title": "This is another note.",
     "id": 2
   },
   ...

If you use the -S argument, jq will sort the fields in the output, which puts each record’s fields in the same order:

 $ curl localhost:3000/notes | jq -S '.'
 [
   {
     "id": 1,
     "title": "Hello"
   },
   {
     "id": 2,
     "title": "This is another note."
   },
   {
     "id": 3,
     "title": "This is a third note."
   }
 ]

This is a handy way of cleaning up output.

You can use jq to extract little bits of data from the API. To display just the first note, use the filter .[0], like this:

 $ curl localhost:3000/notes | jq '.[0]'
 {
   "id": 1,
   "title": "Hello"
 }

Constantly using curl to re-fetch this data is getting a little repetitive—and like with the Open Notify API, it’s not a great practice to keep hitting an API unless you are looking for changes. Save the notes data to a local file:

 $ curl localhost:3000/notes > notes.json

To use jq with a local file, pass the file as the last argument, after you specify the filter. Test it out by retrieving the first note from the notes.json file:

 $ jq '.[0]' notes.json
 {
   "id": 1,
   "title": "Hello"
 }

Many times you’re not interested in all of the data in a record. You can use jq to filter out fields as well as records.

Execute this command to tell jq to return all of the results, but only show the title:

 $ jq '.[] | {title: .title}' notes.json
 {
   "title": "Hello"
 }
 {
   "title": "This is another note."
 }
 {
   "title": "This is a third note."
 }

Here, you’re telling jq to grab all of the records, then you’re piping the results to another filter which runs on each record.

With this filter, you're not just specifying which field you want; you're also specifying that you want it to have the label title. But you can change that. Use this filter to get all of the note titles, but change the label for each title from title to Name:

 $ jq '.[] | {Name: .title}' notes.json
 {
   "Name": "Hello"
 }
 {
   "Name": "This is another note."
 }
 {
   "Name": "This is a third note."
 }

This is handy if you need to take data from one source and merge it into another with different keys.
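
The object you build in the filter can contain as many fields as you like, each with whatever label you choose. For instance, this sketch keeps both fields from each note but relabels them:

 $ jq '.[] | {Name: .title, Id: .id}' notes.json
 {
   "Name": "Hello",
   "Id": 1
 }
 ...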

And if you just want the values of the fields, you can extract those:

 $ jq '.[] | .title' notes.json
 "Hello"
 "This is another note."
 "This is a third note."

The output has quotes around each entry because jq still emits the values as JSON strings. But if you use the -r switch, jq prints raw output instead:

 $ jq -r '.[] | .title' notes.json
 Hello
 This is another note.
 This is a third note.

The filter .[] | .title is taking every row and extracting the title. If you provide a specific index, you can extract a single row. Grab the second note’s title from the file:

 $ jq -r '.[1] | .title' notes.json
 This is another note.

Remember, the indexes are zero-based.
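
jq also understands negative indexes, which count backward from the end of the array. Assuming your version of jq supports them, this grabs the last note's title:

 $ jq -r '.[-1] | .title' notes.json
 This is a third note.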

This is a small example of what you can do with jq. You have the data.json, notes.json, and space.json files. Try using jq to transform and manipulate the data in these files. Then try it out with other JSON data sources.
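
To get you started, here's one way you might pull a nested value out of space.json, piping the iss_position object into a second filter:

 $ jq -r '.iss_position | .latitude' space.json
 33.9239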

Inspecting Requests with HTTPie

curl is a fine multipurpose tool for interacting with web APIs. But it’s a little cumbersome if you’re going to do it a lot. HTTPie is an alternative that makes working with web requests a little easier.

Install it through the package manager. On Ubuntu, use apt:

 $ sudo apt install httpie

And on macOS, use brew:

 $ brew install httpie

HTTPie supports downloading files, submitting forms, and even interacting with web apps that use sessions. But it really shines when making API requests.
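
For instance, assuming your version of HTTPie includes the --download flag, you can save a response to a file much as you would with wget:

 $ http --download http://localhost:3000/notes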

Use it to get the list of notes from the API:

 $ http localhost:3000/notes
 HTTP/1.1 200 OK
 Access-Control-Allow-Credentials: true
 Cache-Control: no-cache
 Connection: keep-alive
 Content-Length: 163
 Content-Type: application/json; charset=utf-8
 Date: Mon, 04 Mar 2019 04:21:25 GMT
 ETag: W/"a3-/ia+coieeQEhK+/GBIyl5YP/4Ns"
 Expires: -1
 Pragma: no-cache
 Vary: Origin, Accept-Encoding
 X-Content-Type-Options: nosniff
 X-Powered-By: Express

 [
   {
     "id": 1,
     "title": "Hello"
   },
   {
     "id": 2,
     "title": "This is another note."
   },
   {
     "id": 3,
     "title": "This is a third note."
   }
 ]

By default, HTTPie shows the headers and the response all at once. The command itself looks almost like what you’d type in a web browser.

If you only want to see the body, use the -b switch:

 $ http -b localhost:3000/notes
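
Likewise, if you only care about the headers, the -h switch shows just those:

 $ http -h localhost:3000/notes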

Let’s add a fourth note. Here’s what the command would look like if you used curl to do that. Don’t run this yet though:

 $ curl -X POST http://localhost:3000/notes \
 > -H "Content-type: application/json" \
 > -d '{"title": "This is a fourth note."}'

Instead, run this command to use HTTPie to create the new note:

 $ http POST localhost:3000/notes title="This is a fourth note."
 HTTP/1.1 201 Created
 ...

 {
   "id": 4,
   "title": "This is a fourth note."
 }

You don’t need to specify headers or content type. Most web APIs use JSON now, so HTTPie just assumes that’s what you want to do by default.
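
Each name=value pair you pass becomes a JSON string. If a field needs a non-string value, such as a number or a boolean, HTTPie uses := instead of =. For example, if the notes had a numeric priority field (they don't in this data set), the request would look something like this:

 $ http POST localhost:3000/notes title="A prioritized note." priority:=1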

Updating the note in the API is a nearly identical command. Use PUT for the HTTP method and use /notes/4 as the endpoint:

 $ http PUT localhost:3000/notes/4 title="This is the fourth note."
 HTTP/1.1 200 OK
 ...

 {
   "id": 4,
   "title": "This is the fourth note."
 }

You can remove the note with the DELETE method.
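
For example, this command would remove the fourth note. Treat it as a sketch: if you run it, the ids json-server assigns to later notes may not match the output shown in the rest of this section.

 $ http DELETE localhost:3000/notes/4
 HTTP/1.1 200 OK
 ...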

Finally, it’s often helpful to see the request that HTTPie is sending. If you use the -v switch, you’ll see the request headers too. Create a fifth note using the -v switch:

 $ http -v POST localhost:3000/notes title="This is a fifth note."
 POST /notes HTTP/1.1
 Accept: application/json
 Accept-Encoding: gzip, deflate
 Connection: keep-alive
 Content-Length: 34
 Content-Type: application/json
 Host: localhost:3000
 User-Agent: HTTPie/0.9.2

 {
   "title": "This is a fifth note."
 }
 ...

Testing Performance with Siege

If you’re building a web app, you’ve probably wondered how well it performs under load. Several tools can help you figure this out, and Siege is one of the most flexible. It has an interface similar to curl, and you can automate it.

Don’t run Siege against servers you don’t manage. Use it to stress-test your own stuff. You’ll use the Notes API for this.

Install Siege on Ubuntu with apt:

 $ sudo apt install siege

Or install it on macOS with brew:

 $ brew install siege

To use Siege with its default values, give it a URL:

 $ siege http://localhost:3000/notes
 New configuration template added to /home/brian/.siege
 Run siege -C to view the current settings in that file
 ** SIEGE 4.0.4
 ** Preparing 25 concurrent users for battle.
 The server is now under siege...

Siege makes concurrent connections to your API over and over until you tell it to stop by pressing Ctrl+c.

When you do, Siege generates a report on the screen:

 Lifting the server siege...
 Transactions: 9762 hits
 Availability: 100.00 %
 Elapsed time: 13.95 secs
 Data transferred: 2.64 MB
 Response time: 0.04 secs
 Transaction rate: 699.78 trans/sec
 Throughput: 0.19 MB/sec
 Concurrency: 24.93
 Successful transactions: 9762
 Failed transactions: 0
 Longest transaction: 0.10
 Shortest transaction: 0.01

It shows how many transactions Siege was able to make before it stopped, and it shows the success rate. In this example, Siege received successful responses 100 percent of the time. If that number were lower, you could look at the Successful transactions and Failed transactions lines for more detail.

Sometimes people will report that their connections took a long time. Siege shows you the times of the longest and shortest transactions, which can help you identify resources that struggle under stress.
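
You don't have to stop Siege by hand every time, either. You can cap the length of a run up front with the --time option, which you'll use again later in this section; when the time is up, Siege stops on its own and prints the same report:

 $ siege --time=30s http://localhost:3000/notes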

Siege uses a configuration file to control its options. It created one for you at ~/.siege/siege.conf on your first run.

Siege can keep a log of your runs, but the default logfile location is in the /var directory, and you’d need to modify permissions or run siege as a superuser to write results there.

So, enable logging and change the log location to your home directory. Set the value of the logfile entry in the configuration file to $HOME/siege.log and set the value of logging to true. You can do this with nano, or you can do it quickly with sed, which you learned how to use in Chapter 5, Streams of Text. To avoid escaping slashes in pathnames, use a pipe as the delimiter for your expression:

 $ sed -i -e 's|^# logfile =|logfile = $HOME/siege.log|' ~/.siege/siege.conf
 $ sed -i -e 's|^logging = false|logging = true|' ~/.siege/siege.conf

Use siege -C to verify the changes:

 $ siege -C
 ...
 URLs file: /etc/siege/urls.txt
 thread limit: 255
»logging: true
»log file: /home/brian/siege.log
 resource file: /home/brian/.siege/siege.conf
 ...

Now you can look in ~/siege.log for your results.
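
For example, after a run or two you can check the most recent entries (the exact columns depend on your Siege version):

 $ tail ~/siege.log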

As with curl and HTTPie, you can use Siege to make JSON requests, which means you can test the more intense parts of your web application.

Try this out: tell Siege to use 50 concurrent connections to update the fifth note’s title:

 $ siege --concurrent=50 \
 > --time=10s \
 > --content-type="application/json" \
 > 'http://localhost:3000/notes/5 PUT {"title":"A fifth note."}'

Finally, you can use Siege to simulate real Internet users. Siege can load a list of URLs from a file and hit them randomly.

Create a file named urls.txt and add some URLs to the file that hit various endpoints in the notes API:

 $ cat << 'EOF' > urls.txt
 > http://localhost:3000/notes
 > http://localhost:3000/notes/1
 > http://localhost:3000/notes/2
 > http://localhost:3000/notes/3
 > http://localhost:3000/notes/4
 > EOF

Execute this command to have 50 concurrent users hit these URLs randomly:

 $ siege --concurrent=50 --delay=5 --internet --file=urls.txt --time=1m

The --internet switch tells Siege to have each connection grab a random entry from the urls.txt file. The --delay switch tells Siege to make each connection wait for a random duration between zero and the specified number of seconds. With these options, you can simulate more realistic traffic.

Siege is a powerful tool for ensuring your applications hold up under pressure.
