Among the most important features of Gatsby is its ability to retrieve and handle data from a variety of disparate sources, like WordPress, Shopify, other GraphQL APIs external to Gatsby, and the local filesystem. Through its plugin ecosystem, Gatsby makes available a wide spectrum of backend services from which to pull data into a Gatsby site.
In Gatsby’s data layer, source plugins are responsible for retrieving data either internally from a local filesystem or externally from APIs, databases, third-party services, and especially content management and commerce systems. Regardless of what they’re responsible for, source plugins can be combined arbitrarily as part of Gatsby’s data layer, which contains data originating from many different sources. In this chapter, we’ll explore source plugins and how to use them to derive a range of data from the systems you want to pull from.
Source plugins are similar to the other Gatsby plugins you’ve seen earlier in this book. Unlike plugins that govern CSS or features like analytics, however, source plugins serve as the intermediary between a data source, such as a local filesystem or an external service, and the Gatsby site presenting that data. They are Gatsby’s canonical data retrieval system for data beyond that provided within the pages directory.
When you run the gatsby develop or gatsby build command, your source plugins will issue queries against the data source to retrieve the desired data. Gatsby will then populate its GraphQL API with the data retrieved and make it available to any page or component within the Gatsby site, as well as to other Gatsby plugins.
The Gatsby Plugin Library contains both officially maintained and community plugins.
The installation process for source plugins is the same as for other plugins, as we’ve seen in the previous chapters. In the Gatsby plugin ecosystem, feature plugins use the prefix gatsby-plugin-, while source plugins have the prefix gatsby-source-. Installing a source plugin requires executing the same command used for other plugins, where {source-name} is the unique identifier of the plugin:
    # If using NPM
    $ npm install --save gatsby-source-{source-name}
    # If using Yarn
    $ yarn add gatsby-source-{source-name}
From this point forward, in most cases only NPM installation scripts will be included in the text for brevity, but you can use either NPM or Yarn to manage your dependencies. For more information about how to migrate from NPM to Yarn, including Yarn equivalents for NPM commands, consult the Yarn documentation’s migration guide.
All source plugins share a required initial step, just like all other Gatsby plugins. After installation, for Gatsby to recognize and enable the plugin’s functionality you’ll need to add it to your gatsby-config.js file. Whenever you complete installation of a new source plugin, open this file and add the new member to the plugins array, as follows:
    module.exports = {
      siteMetadata: {
        title: `My Awesome Gatsby Site`,
      },
      plugins: [
        {
          resolve: `gatsby-source-{source-name}`,
        },
      ],
    }
Many plugins only require the resolve key in their object, but source plugins are different. We must explicitly define where the data is coming from and provide any additional information that is required for us to be able to access that data.
Every source plugin also includes an options object that identifies these key inputs (the source URL, API version, API token, etc.), and each source plugin’s documentation identifies what information should be supplied to the options object. Consider this example from a gatsby-config.js file that defines options for gatsby-source-filesystem:
    module.exports = {
      siteMetadata: {
        title: `My Awesome Gatsby Site`,
      },
      plugins: [
        {
          resolve: `gatsby-source-filesystem`,
          options: {
            name: `src`,
            path: `${__dirname}/src/`,
          },
        },
      ],
    }
To get a sense for what we need to supply in the options object, let’s turn our attention to some commonly used source plugins in Gatsby. We’ll first look at how to source data from the surrounding filesystem, before moving on to source plugins for databases, third-party services, and other software systems and APIs.
Many data sources, particularly CMSs and database systems, require some form of authentication token in order to access their data. This is sensitive information that you may wish not to expose to public eyes, particularly if you are using a source repository that is publicly accessible on a platform like GitHub.
Environment variables are variables that can be injected at certain points in the application depending on the environment in which code using them is executed. They are the primary way in which sensitive credentials such as authentication tokens can be used by Gatsby without being revealed publicly. Some data sources will have their own best practices for handling external authentication by source plugins, but environment variables are the most common mechanism.
Consider a hypothetical source plugin with one option, authToken, which represents a sensitive credential:
    module.exports = {
      siteMetadata: {
        title: `My Awesome Gatsby Site`,
      },
      plugins: [
        {
          resolve: `gatsby-source-mydatasource`,
          options: {
            authToken: `sensitive-token`,
          },
        },
      ],
    }
Instead of placing the value for that token into gatsby-config.js and potentially checking it into source control, exposing it publicly, you can obfuscate it using a library known as dotenv, which allows you to load environment variables into a Node.js process from a .env file that is not committed to a code repository:
    require('dotenv').config()
    module.exports = {
      siteMetadata: {
        title: `My Awesome Gatsby Site`,
      },
      plugins: [
        {
          resolve: `gatsby-source-mydatasource`,
          options: {
            authToken: process.env.MYDATASOURCE_AUTH_TOKEN,
          },
        },
      ],
    }
In this example, we require and configure the dotenv library, which grants the application access to environment variables through process.env, an object that will contain MYDATASOURCE_AUTH_TOKEN when provided in a .env file. To access this on your local machine, create a new file named .env in your project root containing the following:
MYDATASOURCE_AUTH_TOKEN=sensitive-token
Many infrastructure providers offer environment variable configuration in their user interfaces, allowing you to maintain a .env file on your local system for development and use configured environment variables for production. See Chapter 12 for more information about deployment with environment variables.
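A missing environment variable would otherwise reach the plugin as undefined, so it can help to fail early with a clear message. The following is a minimal sketch, not part of Gatsby or dotenv: requireEnv and myDataSourcePlugin are hypothetical helper names, and gatsby-source-mydatasource is the hypothetical plugin from the example above.

```javascript
// Hypothetical helper: read a required environment variable or fail early.
// `env` defaults to process.env but can be replaced with a plain object.
function requireEnv(name, env = process.env) {
  const value = env[name]
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`)
  }
  return value
}

// Builds the plugin entry for the hypothetical gatsby-source-mydatasource plugin.
function myDataSourcePlugin(env = process.env) {
  return {
    resolve: `gatsby-source-mydatasource`,
    options: {
      authToken: requireEnv("MYDATASOURCE_AUTH_TOKEN", env),
    },
  }
}
```

In gatsby-config.js, a call such as myDataSourcePlugin() could then be placed in the plugins array after dotenv has been configured.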
gatsby-source-filesystem is the source plugin responsible for retrieving data from the Gatsby site’s surrounding filesystem. Much like other static site generators, such as Jekyll, that can derive data from surrounding directories, Gatsby offers the same option for developers who wish to retrieve only local data rather than external data. Of course, gatsby-source-filesystem can be used in conjunction with other source plugins to retrieve both local and external data. In the next few sections, we’ll install and configure gatsby-source-filesystem and examine how to work with arbitrary directories.
Installing gatsby-source-filesystem works the same way as with any other plugin:
    $ npm install --save gatsby-source-filesystem
Where things differ is in the options object in gatsby-config.js. In order to ensure that Gatsby understands where our files are coming from, we need to identify the name of the directory containing the files we want to work with as well as the path to that directory (usually some variation of the path to the directory containing the Gatsby site):
    module.exports = {
      siteMetadata: {
        title: `My Awesome Gatsby Site`,
      },
      plugins: [
        {
          resolve: `gatsby-source-filesystem`,
          options: {
            name: `src`,
            path: `${__dirname}/src/`,
            ignore: [`**/.*`],
          },
        },
      ],
    }
In this example, the directory name we are targeting is src, and the path to that directory is the path to the working Gatsby directory (${__dirname}) and onwards to the directory we need (/src/). Finally, we also include an ignore key that identifies any files we wish to ignore, such as those starting with a dot, as an array of glob patterns.
One of the unique traits of gatsby-source-filesystem is that it can be used multiple times within a single Gatsby site, and therefore within a single gatsby-config.js file. For instance, you may wish to pull data from multiple local directories that are in separate locations, for example when you have some data serialized as JSON and other data serialized as CSV that you want to combine in a single site.
In the following example gatsby-config.js file, we have two instances of the source plugin pulling from discrete directories:
    module.exports = {
      siteMetadata: {
        title: `My Awesome Gatsby Site`,
      },
      plugins: [
        {
          resolve: `gatsby-source-filesystem`,
          options: {
            name: `json`,
            path: `${__dirname}/src/data/json`,
            ignore: [`**/.*`],
          },
        },
        {
          resolve: `gatsby-source-filesystem`,
          options: {
            name: `csv`,
            path: `${__dirname}/src/data/csv`,
            ignore: [`**/.*`],
          },
        },
      ],
    }
In addition to the files matched by the patterns you provide as members of the ignore array, Gatsby also ignores the following files by default when retrieving data:
**/*.un~
**/.DS_Store
**/.gitignore
**/.npmignore
**/.babelrc
**/yarn.lock
**/node_modules
../**/dist/**
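When several directories are sourced, the repeated plugin entries shown earlier can be generated from a small map of names to paths. This is only a sketch under assumptions: filesystemSources is a hypothetical helper, rootDir stands in for __dirname, and the ignore pattern simply repeats the one used in the earlier examples.

```javascript
// Hypothetical helper: one gatsby-source-filesystem entry per named directory.
// `rootDir` stands in for __dirname in gatsby-config.js.
function filesystemSources(rootDir, directories) {
  return Object.entries(directories).map(([name, relativePath]) => ({
    resolve: `gatsby-source-filesystem`,
    options: {
      name,
      path: `${rootDir}/${relativePath}`,
      ignore: [`**/.*`], // skip dotfiles, as in the earlier examples
    },
  }))
}

// Placeholder root path; in gatsby-config.js this would be __dirname.
const filesystemPlugins = filesystemSources("/path/to/site", {
  json: "src/data/json",
  csv: "src/data/csv",
})
```

The resulting array could be spread into the plugins array alongside any other plugin entries.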
As we saw in Chapter 4, GraphQL is the primary means in Gatsby’s data layer to access and read the data our source plugins retrieve. The gatsby-source-filesystem plugin takes the files that you’ve identified and converts the data they contain into file nodes in the GraphQL API. To see this in action, clone another version of the Gatsby blog starter, which uses gatsby-source-filesystem to retrieve its internal Markdown files:
    $ gatsby new gtdg-ch5-filesystem gatsbyjs/gatsby-starter-blog
    $ cd gtdg-ch5-filesystem
    $ gatsby develop
Now, open up the GraphQL API by navigating to http://localhost:8000/___graphql. If you look at the initial autocomplete list provided for an empty query (Figure 5-1), you’ll see that two additional GraphQL fields have been added at the top thanks to the gatsby-source-filesystem plugin: allFile (for all file objects) and file (for individual file objects).
Now, if we issue the following query, which is the base-level allFile query, we get a list of universally unique identifiers (UUIDs), as seen in Figure 5-2:
    {
      allFile {
        edges {
          node {
            id
          }
        }
      }
    }
What we’ve generated here through our GraphQL API is an array consisting of File nodes, each of which contains a variety of GraphQL fields that we can now retrieve as well. These include metadata such as the file’s extension, size, and relative path, as seen in Figure 5-3, as well as the file’s contents, which may require further transformation to be ready for prime time in Gatsby:
    {
      allFile {
        edges {
          node {
            relativePath
            extension
            size
          }
        }
      }
    }
In many cases, the data contained within individual files in a filesystem might not be in the format you need for Gatsby. Rather than performing the required postprocessing when rendering the data, Gatsby recommends using transformer plugins, covered later in this book, to convert File nodes into more consumable formats such as JSON.
As we saw earlier, it’s possible to pull data from multiple discrete directories by using multiple instances of the gatsby-source-filesystem plugin. But how do we access these individual directories uniquely within our GraphQL queries?
Gatsby’s internal GraphQL API uses the filter argument, which we covered in Chapter 4, to identify which individual gatsby-source-filesystem plugin’s configured directory to use. For instance, here we issue two separate queries that retrieve files from the two distinct directories we configured earlier in this chapter:
    {
      allFile(filter: { sourceInstanceName: { eq: "json" } }) {
        edges {
          node {
            relativePath
            extension
            size
          }
        }
      }
    }

    {
      allFile(filter: { sourceInstanceName: { eq: "csv" } }) {
        edges {
          node {
            relativePath
            extension
            size
          }
        }
      }
    }
Working with multiple directories requires us to filter based on the sourceInstanceName field, which is available on individual File nodes.
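Because sourceInstanceName is an ordinary field on File nodes, an alternative to issuing one filtered query per directory is to select the field once and group the results in JavaScript. The sketch below assumes a query result shaped like the allFile responses above; the sample file names and the groupBySourceInstance helper are invented for illustration.

```javascript
// Hypothetical sample shaped like the result of an allFile query that also
// selects sourceInstanceName; the file names are invented for illustration.
const result = {
  data: {
    allFile: {
      edges: [
        { node: { relativePath: "authors.json", sourceInstanceName: "json" } },
        { node: { relativePath: "sales.csv", sourceInstanceName: "csv" } },
        { node: { relativePath: "posts.json", sourceInstanceName: "json" } },
      ],
    },
  },
}

// Groups relative paths by the plugin instance that sourced them.
function groupBySourceInstance(edges) {
  const groups = {}
  for (const { node } of edges) {
    const key = node.sourceInstanceName
    if (!groups[key]) groups[key] = []
    groups[key].push(node.relativePath)
  }
  return groups
}

const grouped = groupBySourceInstance(result.data.allFile.edges)
// grouped.json → ["authors.json", "posts.json"]; grouped.csv → ["sales.csv"]
```

Filtering in GraphQL remains the better choice when only one directory is needed, since it avoids fetching nodes that are then discarded.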
More information about the gatsby-source-filesystem plugin can be found in its documentation page on the Gatsby website.
Another important source of data for Gatsby sites is external databases, which either operate as standalone database systems or connect with other third-party systems. Retrieving data by querying a database rather than through APIs often provides greater flexibility in terms of data processing. The Gatsby plugin ecosystem provides integrations with some of the most well-known and commonly used database systems, including proprietary database systems and open source solutions like MongoDB and MySQL.
In addition to plugins designed to work with a specific database system, the gatsby-source-sql plugin allows the connection of arbitrary SQL databases (including not only MySQL/MariaDB and PostgreSQL but also Amazon Redshift, SQLite3, Oracle, and MSSQL) to Gatsby.
Whether you’re working with MySQL, PostgreSQL, another SQL database, or MongoDB, a source plugin is available to retrieve the data you’ll need in your Gatsby site.
For more information about these database source plugins, consult the respective source plugin documentation pages for MongoDB, MySQL, PostgreSQL, and other SQL databases.
MongoDB is a NoSQL database with a focus on documents rather than tables. Because the gatsby-source-sql plugin is solely for SQL databases, we need a distinct MongoDB-oriented source plugin to work with this data source. In the Gatsby source plugin ecosystem, gatsby-source-mongodb, an officially supported plugin, will do the trick.
We can install the MongoDB source plugin in the usual way:
    $ npm install --save gatsby-source-mongodb
Then, in our plugins array in gatsby-config.js, we define some key information that MongoDB needs from us as well as the query that we wish to issue. The following MongoDB query requests documents whose as_of value is at or after the indicated Unix timestamp. Note that query sits inside the options object, alongside the other options listed in Table 5-1:
    plugins: [
      {
        resolve: `gatsby-source-mongodb`,
        options: {
          dbName: `local`,
          collection: `documents`,
          query: {
            documents: {
              as_of: {
                $gte: 1606850284,
              },
            },
          },
        },
      },
    ]
If you need to query from more than one collection, in your options object, add the additional collection as a second member of an array:
    options: {
      dbName: `local`,
      collection: [`documents`, `products`],
    },
Within the options object, Gatsby’s MongoDB source plugin offers a range of configuration options that relate to particular aspects of your MongoDB database, as seen in Table 5-1.
Option | Description |
---|---|
connectionString | MongoDB Atlas and later versions of MongoDB require a connection string that represents the full connection path; e.g., mongodb+srv://<USERNAME>:<PASSWORD>@<SERVERNAME>-fsokc.mongodb.net (for earlier versions, use dbName and extraParams for those respective values). This value should be obfuscated as an environment variable using a library such as dotenv. |
dbName | The MongoDB database name. |
collection | The name of the collection (or collections) to access within the MongoDB database; accepts a single string or an array of values. |
query | The MongoDB database query. Keys represent collection names, and values represent query objects. |
server | MongoDB server information. Defaults to a local running server on the default port; e.g.: server: { address: `ds143532.mlab.com`, port: 43532 } |
auth | An authentication object to authenticate into a MongoDB collection; e.g.: auth: { user: `root`, password: `myPassword` } |
extraParams | Additional parameters for the connection that can be appended as query parameters to the connection URI; examples include authSource, ssl, or replicaSet. |
clientOptions | Additional options for creating a MongoClient instance, specific to certain versions of MongoDB and MongoDB Atlas. |
preserveObjectIds | A Boolean to preserve nested ObjectIDs within documents. |
For more information about extraParams and clientOptions values, consult the MongoDB documentation for query parameters and MongoClient.
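The connectionString format in Table 5-1 embeds a username and password, and characters such as @ or : in either value need to be percent-encoded so the string still parses as a URI. The following sketch shows one way to assemble the string; buildMongoConnectionString is a hypothetical helper, and the sample user, password, and host are placeholders rather than real values.

```javascript
// Hypothetical helper: assemble a mongodb+srv connection string from parts.
// The user and password are percent-encoded so that reserved characters
// such as "@" and ":" cannot be mistaken for URI delimiters.
function buildMongoConnectionString({ user, password, host }) {
  const encodedUser = encodeURIComponent(user)
  const encodedPassword = encodeURIComponent(password)
  return `mongodb+srv://${encodedUser}:${encodedPassword}@${host}`
}

// Placeholder values for illustration only; real values would come from
// environment variables rather than being written into the file.
const exampleConnectionString = buildMongoConnectionString({
  user: "exampleUser",
  password: "p@ss:word",
  host: "example-fsokc.mongodb.net",
})
// exampleConnectionString →
//   "mongodb+srv://exampleUser:p%40ss%3Aword@example-fsokc.mongodb.net"
```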
Once your MongoDB source plugin is configured appropriately, you can query MongoDB document nodes within Gatsby’s GraphQL API as follows. Here, we’re accessing a database named Cloud and a collection named products:
    {
      allMongodbCloudProducts {
        edges {
          node {
            id
            name
            url
          }
        }
      }
    }
Next, we’ll take a look at the two SQL databases that Gatsby offers official source plugins for: MySQL and PostgreSQL.
Though the general SQL source plugin (which we’ll look at shortly) offers features that are agnostic to any SQL database, there are scenarios where, as a developer, you’ll prefer to use a source plugin that is more oriented toward those who are familiar with the inner workings of a particular database system. The gatsby-source-mysql source plugin works specifically with MySQL databases and allows developers to insert MySQL queries directly into the gatsby-config.js file.
To install the MySQL source plugin, use this command:
    $ npm install --save gatsby-source-mysql
Then, within your gatsby-config.js file, you’ll need to provide details for connecting to the database as well as any queries you wish to issue:
    plugins: [
      {
        resolve: `gatsby-source-mysql`,
        options: {
          connectionDetails: {
            host: `localhost`,
            user: `root`,
            password: `myPassword`,
            database: `user_records`,
          },
          queries: [
            {
              statement: `SELECT user, email FROM users`,
              idFieldName: `User`,
              name: `users`,
            },
          ],
        },
      },
    ]
You can issue multiple queries inside a single plugin object. To do this, simply add a second member to the queries array containing a unique name to differentiate this second query from the first:
    queries: [
      {
        statement: `SELECT user, email FROM users`,
        idFieldName: `User`,
        name: `users`,
      },
      {
        statement: `SELECT * FROM products`,
        idFieldName: `ProductName`,
        name: `products`,
      },
    ]
As you can see in Table 5-2, the MySQL source plugin offers a variety of MySQL-specific configuration options within each individual query object.
Option | Required? | Description |
---|---|---|
statement | Required | The SQL query statement to be executed (stored procedures are supported) |
idFieldName | Required | A column that is unique for each record; this column must be part of the returned statement |
name | Required | A name for the SQL query, used by Gatsby’s GraphQL API to identify the GraphQL type |
parentName | Optional | A name for the parent entity, if any (relevant for joins) |
foreignKey | Optional | The foreign key to join the parent entity (relevant for joins) |
cardinality | Optional | The cardinality relationship between the parent and this entity (e.g., OneToMany, OneToOne; defaults to OneToMany); relevant for joins |
remoteImageFieldNames | Optional | An array of columns containing image URIs that need to be downloaded for further image processing |
A full accounting of joins in MySQL queries is beyond the scope of this book, but the Gatsby documentation contains a description of how to use the parentName, foreignKey, and cardinality keys to perform a join.
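Since Table 5-2 marks statement, idFieldName, and name as required, and each query needs a unique name, a small check before the configuration is exported can catch mistakes early. The sketch below is an illustration only: validateQueries is a hypothetical helper and is not part of gatsby-source-mysql.

```javascript
const REQUIRED_KEYS = ["statement", "idFieldName", "name"]

// Hypothetical helper: returns a list of human-readable problems.
// An empty list means every query has the required keys and a unique name.
function validateQueries(queries) {
  const problems = []
  const seenNames = new Set()
  queries.forEach((query, index) => {
    for (const key of REQUIRED_KEYS) {
      if (!query[key]) problems.push(`queries[${index}] is missing "${key}"`)
    }
    if (query.name) {
      if (seenNames.has(query.name)) {
        problems.push(`queries[${index}] reuses the name "${query.name}"`)
      }
      seenNames.add(query.name)
    }
  })
  return problems
}

// The two queries from the example above pass the check.
const mysqlQueryProblems = validateQueries([
  { statement: `SELECT user, email FROM users`, idFieldName: `User`, name: `users` },
  { statement: `SELECT * FROM products`, idFieldName: `ProductName`, name: `products` },
])
// mysqlQueryProblems → []
```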
Now, you can query the results of your MySQL queries within the GraphQL API internal to Gatsby:
    {
      allMysqlUsers {
        edges {
          node {
            email
            id
          }
        }
      }
    }
Note here that the name that follows allMysql corresponds to the name you defined in the query object (users), capitalized as Users.
As mentioned earlier, you can use gatsby-source-sql to retrieve PostgreSQL data for common requirements, but a more specialized plugin is available for PostgreSQL databases. The gatsby-source-pg plugin’s goal is to retrieve results from a PostgreSQL database with as little overhead as possible.
To install the PostgreSQL source plugin, execute the following command:
    $ npm install --save gatsby-source-pg
Now, configure the plugin in gatsby-config.js to ensure Gatsby can import the database to make the data available through the GraphQL API:
    plugins: [
      {
        resolve: `gatsby-source-pg`,
        options: {
          connectionString: `postgres://user:pass@host/dbname`,
          schema: `public`,
          refetchInterval: 60,
        },
      },
    ],
Here, connectionString represents any valid PostgreSQL connection string (and should be obfuscated in an environment variable using a library such as dotenv), and refetchInterval represents the interval on which data should be retrieved again from the PostgreSQL database in question when the data needs to be updated.
Once you’ve configured your PostgreSQL options, you can access the entire database from within your GraphQL API using the postgres top-level field:
    {
      postgres {
        allArticlesList {
          id
          title
          authorId
          userByAuthorId {
            id
            username
          }
        }
      }
    }
A working example of the PostgreSQL source plugin is available on GitHub, and information about customizing the PostgreSQL source plugin is available on the Gatsby website.
Though Gatsby’s ecosystem provides source plugins specifically targeting well-known SQL databases like MySQL and PostgreSQL, the gatsby-source-sql plugin also contains out-of-the-box support for both of these and other SQL databases like MariaDB, Amazon Redshift, SQLite3, Oracle, and MSSQL. (The dedicated source plugins for MySQL and PostgreSQL offer a different feature set.)
To install gatsby-source-sql, execute the following command in the root of your Gatsby site:
    $ npm install --save git+https://github.com/mrfunnyshoes/gatsby-source-sql.git
Depending on the database you wish to integrate with, you’ll need to add the corresponding knex-compliant plugin (knex is the library gatsby-source-sql uses to work with databases directly). Run only the command that matches your database:
    $ npm install --save mysql
    $ npm install --save mysql2
    $ npm install --save pg
    $ npm install --save sqlite3
    $ npm install --save oracle
    $ npm install --save mssql
To configure gatsby-source-sql in gatsby-config.js, you use the normal approach of adding the source plugin to the plugins array. However, the options object in this case requires three things: a typeName string (describing each individual row in the results table), a fieldName string (in a future version of the plugin, this will determine the field name in Gatsby’s GraphQL API), and a dbEngine object. Consider the following example plugins array in gatsby-config.js:
    module.exports = {
      siteMetadata: {
        title: `gatsby-source-sql demo`,
      },
      plugins: [
        {
          resolve: `gatsby-source-sql`,
          options: {
            typeName: "User",
            fieldName: "postgres",
            dbEngine: {
              client: 'pg',
              connection: {
                host: 'my-db.my-host-sql.com',
                user: 'root',
                password: 'zs8Jy0DGg0kTlKUD',
                database: 'user_records'
              }
            },
          }
        },
      ],
    }
Notice the typeName defined as the string User in this example configuration. When you issue your first GraphQL query within Gatsby’s GraphQL API, the typeName string becomes the name that comes after the prefix all:
    {
      allUser {
        ...
      }
    }
The dbEngine object accepts a knex configuration object, which contains key information about the database system. For example, if you’re using gatsby-source-sql to retrieve data from a MySQL database, certain information is required in order to connect to the database:
    dbEngine: {
      client: 'mysql',
      connection: {
        host: 'my-db.my-host-sql.com',
        user: 'root',
        password: 'zs8Jy0DGg0kTlKUD',
        database: 'user_records'
      }
    }
Because this configuration involves highly sensitive database credentials, it’s strongly recommended to use environment variables to provide these values to gatsby-config.js.
The gatsby-source-sql plugin works a bit differently from the gatsby-source-filesystem plugin we reviewed earlier. Each database connection is the source plugin’s only opportunity to retrieve the data needed for the Gatsby site from the database, so each use of gatsby-source-sql in gatsby-config.js must also carry with it the database query you wish to issue. The results returned from the query will then populate the GraphQL API.
In gatsby-config.js, we need to add a queryChain function to identify the query we want to issue to the database. Keep in mind that this query must adhere to the specification of the database’s internal workings and not the GraphQL specification in Gatsby. Only the results of the database query enter the Gatsby GraphQL API; to retrieve additional results, another instance of the plugin is required in gatsby-config.js.
For example, to issue the following two MySQL queries on a MySQL database:
    SELECT user, email FROM users;
    SELECT user, email FROM users WHERE user.name = 'admin'
we would need to define two source plugins with distinct queryChain functions:
    plugins: [
      {
        resolve: `gatsby-source-sql`,
        options: {
          typeName: `User`,
          fieldName: `mysqlUser`,
          dbEngine: {
            client: 'mysql',
            connection: {
              host: 'my-db.my-host-sql.com',
              user: 'root',
              password: 'zs8Jy0DGg0kTlKUD',
              database: 'user_records'
            }
          },
          queryChain: function (x) {
            return x.select('user', 'email').from('users')
          }
        }
      },
      {
        resolve: `gatsby-source-sql`,
        options: {
          typeName: `Admin`,
          fieldName: `mysqlAdmin`,
          dbEngine: {
            client: 'mysql',
            connection: {
              host: 'my-db.my-host-sql.com',
              user: 'root',
              password: 'zs8Jy0DGg0kTlKUD',
              database: 'user_records'
            }
          },
          queryChain: function (x) {
            return x
              .select('user', 'email')
              .from('users')
              .where('user.name', '=', 'admin')
          }
        }
      },
    ]
In the queryChain function definitions shown here, the argument x represents a database connection object. Because we’re solely concerned with retrieving data, the gatsby-source-sql plugin only enables read operations, not write operations.
The knex library is used by the gatsby-source-sql source plugin as a utility for issuing queries to databases of various types. The knex documentation contains a full accounting of how to write queries in JavaScript according to various specifications.
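To see what a queryChain function does with its argument, it can be run against a simple stand-in object that records each chained call. This is only an illustration: createRecorder is a hypothetical mock, not knex and not part of gatsby-source-sql, and it supports just the three methods used in this chapter.

```javascript
// A stand-in for the query builder, NOT knex itself: each method records
// its call and returns the same object so that calls can be chained.
function createRecorder() {
  const calls = []
  const builder = { calls }
  for (const method of ["select", "from", "where"]) {
    builder[method] = (...args) => {
      calls.push([method, ...args])
      return builder
    }
  }
  return builder
}

// The second queryChain function from the configuration above.
const queryChain = function (x) {
  return x
    .select("user", "email")
    .from("users")
    .where("user.name", "=", "admin")
}

const recordedCalls = queryChain(createRecorder()).calls
// recordedCalls → [["select", "user", "email"], ["from", "users"],
//                  ["where", "user.name", "=", "admin"]]
```

The point of the sketch is that queryChain is an ordinary function: it receives a builder, chains method calls on it, and returns the result for the plugin to execute.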
Though heavy-duty databases are often appropriate for data destined for Gatsby sites, many developers prefer third-party hosted software-as-a-service (SaaS) services that limit the amount of upkeep required. Three of the most popular SaaS services used for Gatsby sites today are Airtable, AWS DynamoDB, and Google Docs. Each of these has its own Gatsby source plugin.
Some CMSs and commerce systems are SaaS services too, rather than being built on dedicated servers; we’ll cover those in the next section.
Airtable is a quick-and-easy solution for rudimentary data storage and management that’s quickly gaining popularity among developers. The gatsby-source-airtable source plugin offers a range of features that allow you to retrieve data from any table in an Airtable base.
To install the Airtable source plugin, execute the following command:
    $ npm install --save gatsby-source-airtable
Now you need to configure your Airtable source plugin. Airtable provides an API key through which data is accessed, located at Help→API Documentation within the Airtable interface. Because this API key is highly sensitive information, it’s strongly recommended that you inject it into your configuration using an environment variable, as described earlier in this chapter. Though you can hardcode your API key during development, for production your configuration should instead look like this:
    plugins: [
      {
        resolve: `gatsby-source-airtable`,
        options: {
          apiKey: process.env.AIRTABLE_API_KEY,
        },
      },
    ],
Within the options object, the Airtable source plugin also needs information about the tables you wish to query within Airtable. This takes the form of a tables array that can contain multiple table objects. Additionally, in Airtable, every individual table can have one or more named views, which allow for arbitrary filtering and sorting to occur before the data arrives in Gatsby’s data layer. If you don’t specify a view by setting tableView, you’ll simply receive raw data with no set order.
The following example demonstrates the retrieval of data from two separate tables. The concurrency value, by default set to 5, indicates how many concurrent requests the Airtable source plugin should issue to avoid overloading Airtable’s servers:
    plugins: [
      {
        resolve: `gatsby-source-airtable`,
        options: {
          apiKey: process.env.AIRTABLE_API_KEY,
          concurrency: 5,
          tables: [
            {
              baseId: `myAirtableBaseId`,
              tableName: `myTableName`,
              tableView: `myTableViewName`,
            },
            {
              baseId: `myAirtableBaseId`,
              tableName: `myTableName`,
              tableView: `myTableViewName`,
            },
          ],
        },
      },
    ],
Each table object in the tables array can take a variety of options, as seen in Table 5-3.
Option | Required? | Description |
---|---|---|
baseId | Required | Your Airtable base identifier. |
tableName | Required | The name of the table within your Airtable base. |
tableView | Optional | The name of the view for a given table; if unset, raw data is returned unsorted and unfiltered. |
queryName | Optional | A name to identify a table. If a string is provided, recasts all records in this table as a separate node type (useful if you have multiple bases with identical table or view names across bases). Defaults to false. |
mapping | Optional | Accepts a format such as mapping: { myColumnName: `text/markdown` } |
tableLinks | Optional | An array of field names identifying a linked record matching the name shown in Airtable; setting this creates nested GraphQL nodes from linked records, allowing deep linking to records across tables. |
separateNodeType | Optional | A Boolean describing whether there are two bases with a table having the same name and whether query names should differ from the default of allAirtable or airtable (this requires queryName to be set). Defaults to false. |
separateMapType | Optional | A Boolean describing whether a Gatsby node type should be created for each type of data (such as Markdown or other attachment types) to avoid type conflicts. Defaults to false. |
Once you have your Airtable source plugin populating your GraphQL API, you can start retrieving data from your Airtable tables. To retrieve all records from a given table myTableName where myField is equal to myValue, you can use a filter operation:
    {
      allAirtable(
        filter: {
          table: { eq: "myTableName" }
          data: { myField: { eq: "myValue" } }
        }
      ) {
        edges {
          node {
            data {
              myField
            }
          }
        }
      }
    }
To retrieve a single record from a given table (i.e., an individual table row where myField is equal to myValue), you can use the airtable field instead:
    {
      airtable(
        table: { eq: "myTableName" }
        data: { myField: { eq: "myValue" } }
      ) {
        data {
          myField
          myOtherField
          myLinkedField {
            data {
              myLinkedRecord
            }
          }
        }
      }
    }
In this example, note that we’re also accessing a linked record that assumes the tableLinks key is defined in gatsby-config.js.
GraphQL and Airtable differ in which characters they accept in field names. Because Airtable allows spaces in field names but GraphQL does not, the Airtable source plugin automatically rewrites keys such as column names to remove the spaces: for example, a column named My New Column becomes My_New_Column in GraphQL. Full gatsby-source-airtable documentation can be found on the Gatsby website.
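When writing queries by hand, it can be useful to predict the rewritten field name for a given Airtable column. The sketch below mirrors only the single rule described above (spaces become underscores); toGraphqlKey is a hypothetical helper, and the plugin’s actual rewriting may handle additional characters.

```javascript
// Assumption: only spaces are rewritten here; the plugin's real rules may
// cover additional characters that GraphQL does not accept.
function toGraphqlKey(columnName) {
  return columnName.replace(/ /g, "_")
}

const rewrittenKey = toGraphqlKey("My New Column")
// rewrittenKey → "My_New_Column"
```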
Another hosted SaaS database solution, Amazon’s AWS DynamoDB, is also gaining traction among developers (particularly among architects who prefer AWS products). To install the AWS DynamoDB source plugin, execute the following command:
    $ npm install --save gatsby-source-dynamodb
Just like with other source plugins, to use the DynamoDB source plugin you’ll need to configure it in your Gatsby configuration file. As with other sensitive information, it’s strongly recommended to use environment variables to inject the values for your AWS credentials:
    plugins: [
      {
        resolve: `gatsby-source-dynamodb`,
        options: {
          typeName: `myGraphqlTypeName`,
          accessKeyId: `myAwsAccessKeyId`,
          secretAccessKey: `myAwsSecretAccessKey`,
          region: `myAwsRegion`,
          params: {
            TableName: `myTableName`,
          },
        },
      },
    ],
More information is available in the AWS DynamoDB documentation about setting AWS credentials for IAM users, configuring permissions for IAM users, and available parameters on DynamoDB queries. Full documentation for the gatsby-source-dynamodb plugin is also available on the Gatsby website.
In recent years, Google Docs has become a compelling solution for developers who don’t wish to configure and maintain a full content management system or database. Though it’s not an optimal data source for heavy-duty content or commerce implementations due to possible long build times, it can be useful for smaller sites and blogs.
The Google Docs source plugin in Gatsby relies on two additional plugins known as transformer plugins, which we cover at length in the next chapter. For now, all you need to know about them is that transformer plugins handle the processing of images within a Google Docs document.
You can install the gatsby-source-google-docs source plugin in the usual way:
$ npm install --save gatsby-source-google-docs gatsby-transformer-remark
Next, you need to generate an OAuth token. In order to make this process easier, the source plugin exposes an additional script that you can use to generate a token. To do this, execute the following command in the root of your Gatsby project:
$ gatsby-source-google-docs-token
Alternatively, you can add the token generation script to your NPM or Yarn scripts:
"scripts": {
  "token": "gatsby-source-google-docs-token"
}
You can then generate a token by executing one of the following commands:
# If using NPM
$ npm run token

# If using Yarn
$ yarn token
The next step is to create three environment variables that identify your Gatsby site to the Google Docs service and save them into a .env file:
GOOGLE_OAUTH_CLIENT_ID=myGoogleOauthSubdomain.apps.googleusercontent.com
GOOGLE_OAUTH_CLIENT_SECRET=myGoogleOauthClientSecret
GOOGLE_DOCS_TOKEN={"access_token":"myAccessToken", "refresh_token":"myRefreshToken", "scope":"https://www.googleapis.com/auth/drive.metadata.readonly https://www.googleapis.com/auth/documents.readonly", "token_type":"Bearer","expiry_date":1606850284}
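Since a missing or malformed variable tends to surface only as a confusing build failure, it can help to check the three values before Gatsby starts sourcing. The following is a minimal sketch, not part of the plugin: the helper name checkGoogleDocsEnv is hypothetical, and it only verifies that the variables exist and that the token is JSON containing the access_token and refresh_token keys shown above.

```javascript
// Names of the three variables shown in the .env file above.
const REQUIRED = [
  "GOOGLE_OAUTH_CLIENT_ID",
  "GOOGLE_OAUTH_CLIENT_SECRET",
  "GOOGLE_DOCS_TOKEN",
];

// Hypothetical helper: returns a list of problems found in the environment,
// or an empty array when all three values look usable.
function checkGoogleDocsEnv(env = process.env) {
  const problems = REQUIRED.filter((name) => !env[name]).map(
    (name) => `${name} is not set`
  );
  if (env.GOOGLE_DOCS_TOKEN) {
    try {
      const token = JSON.parse(env.GOOGLE_DOCS_TOKEN);
      for (const key of ["access_token", "refresh_token"]) {
        if (!token[key]) problems.push(`GOOGLE_DOCS_TOKEN is missing ${key}`);
      }
    } catch (error) {
      problems.push("GOOGLE_DOCS_TOKEN is not valid JSON");
    }
  }
  return problems;
}
```

You might call this at the top of gatsby-config.js, after loading the .env file, and throw an error if the returned array is not empty.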
Finally, you can configure the Google Docs source plugin within your Gatsby configuration file. The first plugin object contains a folder option, which represents {folder_id} in the Google Drive folder URI, https://drive.google.com/drive/folders/{folder_id}. The second plugin object in the plugins array configures gatsby-transformer-remark, which the Google Docs source plugin uses to process images embedded in Google Docs documents:
plugins: [
  {
    resolve: `gatsby-source-google-docs`,
    options: {
      folder: `{folder_id}`,
    },
  },
  {
    resolve: `gatsby-transformer-remark`,
    options: {
      plugins: [`gatsby-remark-images`],
    },
  },
],
There are two approaches available for using Google Sheets as the data source for your Gatsby site. The Gatsby blog contains a tutorial on using Google Sheets directly as a data source.
For most developers, interacting with data sources requires interacting with a system that is oriented not only toward developers but also toward content editors, commerce site maintainers, and marketing teams. Many organizations use content management systems to work with content, while commerce systems are used to interact with commerce data such as product and pricing information.
Whereas many traditional CMSs and commerce systems have added APIs for data retrieval and management on top of their existing architectures, some newer CMS and commerce upstarts, commonly known as headless vendors, focus more of their attention on the APIs and software development kits (SDKs) developers use to retrieve data. In this section, we cover a variety of both traditional and headless content management and commerce systems and their respective source plugins.
Contentful is a headless CMS that offers rich data retrieval and management capabilities through its API. In addition, Contentful offers a first-class integration with Gatsby Cloud, a hosting provider for Gatsby. Today, Contentful is commonly used by developers who need a headless CMS without the overhead of some traditional CMS features.
To install the Contentful source plugin, gatsby-source-contentful, execute the following command:
$ npm install --save gatsby-source-contentful
Now, you can configure the source plugin in your Gatsby configuration file. The two most important items you need from Contentful are the spaceId, representing the Contentful space you wish to query, and the accessToken, which is available in Contentful’s settings. As always, with this sensitive information, remember to use environment variables rather than hardcoding the values into your configuration.
To use Contentful’s Content Delivery API, which exposes published content for production, add this to your gatsby-config.js:
plugins: [
  {
    resolve: `gatsby-source-contentful`,
    options: {
      spaceId: `mySpaceId`,
      accessToken: process.env.CONTENTFUL_ACCESS_TOKEN,
    },
  },
],
To use Contentful’s Content Preview API instead, which allows you to access unpublished content that isn’t ready for production, use this:
plugins: [
  {
    resolve: `gatsby-source-contentful`,
    options: {
      spaceId: `mySpaceId`,
      accessToken: process.env.CONTENTFUL_ACCESS_TOKEN,
      host: `preview.contentful.com`,
    },
  },
],
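If you want a single configuration file that can target either API, one possible approach is to choose the host and token from the environment. The sketch below is an assumption-laden illustration: the function name buildContentfulPlugin and the variables CONTENTFUL_USE_PREVIEW and CONTENTFUL_PREVIEW_TOKEN are hypothetical names invented for this example.

```javascript
// Sketch only: CONTENTFUL_USE_PREVIEW and CONTENTFUL_PREVIEW_TOKEN are
// hypothetical variable names chosen for this example.
const buildContentfulPlugin = (env = process.env) => {
  const usePreview = env.CONTENTFUL_USE_PREVIEW === "true";
  const options = {
    spaceId: `mySpaceId`,
    // The Preview API uses its own key, so pick the token to match the host.
    accessToken: usePreview
      ? env.CONTENTFUL_PREVIEW_TOKEN
      : env.CONTENTFUL_ACCESS_TOKEN,
  };
  if (usePreview) {
    // Omitting host leaves the plugin on its default delivery host.
    options.host = `preview.contentful.com`;
  }
  return { resolve: `gatsby-source-contentful`, options };
};
```

With this in place, switching a build between published and unpublished content becomes a matter of changing one environment variable rather than editing the configuration.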
To pull from multiple Contentful spaces, simply add another plugin object to identify the second space to Contentful:
plugins: [
  {
    resolve: `gatsby-source-contentful`,
    options: {
      spaceId: `myFirstContentfulSpaceId`,
      accessToken: process.env.CONTENTFUL_ACCESS_TOKEN,
    },
  },
  {
    resolve: `gatsby-source-contentful`,
    options: {
      spaceId: `mySecondContentfulSpaceId`,
      accessToken: process.env.CONTENTFUL_ACCESS_TOKEN,
    },
  },
],
The Contentful source plugin offers a variety of configuration options in the options object, as seen in Table 5-4.
Option | Required? | Description |
---|---|---|
spaceId | Required | The space identifier for a Contentful space. |
accessToken | Required | The API key for the Contentful Content Delivery API; if using the Content Preview API, use the Preview API key instead. |
host | Optional | The base host for all API requests. Defaults to cdn.contentful.com; for the Preview API, use preview.contentful.com. |
environment | Optional | The Contentful environment from which to retrieve content. |
downloadLocal | Optional | A Boolean that indicates whether all Contentful assets should be downloaded and cached to the local filesystem rather than referred to by CDN URL; defaults to false. |
localeFilter | Optional | A function that limits the number of locales and nodes created in GraphQL for given Contentful locales in order to reduce memory usage; e.g.: localeFilter: locale => locale.code === `tr-TR`. By default, no locales are filtered out. |
forceFullSync | Optional | A Boolean that prohibits the use of sync tokens upon accessing the Contentful API, forcing a full synchronization of content; defaults to false. |
proxy | Optional | An object containing proxy configuration for Axios (a promise-based HTTP client); defaults to undefined. |
useNameForId | Optional | A Boolean indicating whether the content type’s name should be used to identify an object in the GraphQL schema instead of the content’s internal identifier; defaults to true. |
pageLimit | Optional | The number of entries to retrieve from Contentful at a time; defaults to 100. |
assetDownloadWorkers | Optional | The number of workers to use to download assets from Contentful; defaults to 50. |
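To make the localeFilter option more concrete, the sketch below applies a predicate of that shape to a list of locales. The locale objects here are hypothetical sample data; the only property the example relies on is code, which is the property inspected in the table’s example.

```javascript
// Hypothetical locale objects; each has a `code` property, which is the
// property the localeFilter example in the table inspects.
const locales = [{ code: "en-US" }, { code: "tr-TR" }, { code: "de-DE" }];

// A predicate of the same shape as the table's example: keep only Turkish.
const localeFilter = (locale) => locale.code === `tr-TR`;

// Only locales for which the predicate returns true are kept.
const kept = locales.filter(localeFilter).map((locale) => locale.code);
console.log(kept);
```

Fewer locales means fewer nodes in the GraphQL schema, which is where the reduction in memory usage comes from.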
The Contentful source plugin makes available two node types in Gatsby’s GraphQL API:
Asset nodes, representing assets in Contentful, are created in the GraphQL schema under the fields contentfulAsset (single asset) and allContentfulAsset (all assets).
ContentType nodes, representing content items in Contentful, are created in the GraphQL schema under the fields contentful{TypeName} (single content item) and allContentful{TypeName} (all content items), where {TypeName} is the content type’s name, unless you have set useNameForId to false, in which case the content type’s internal identifier is used instead.
To query for all Asset nodes, you can use the allContentfulAsset field:
{ allContentfulAsset { edges { node { id file { url } } } } }
To query for all content items of the content type BlogPost, you can use the allContentfulBlogPost field, which takes this name unless you’ve set useNameForId to false, in which case the field name is derived from the content type’s internal identifier instead:
{ allContentfulBlogPost { edges { node { title } } } }
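To query for a single content item of the content type BlogPost whose title matches a particular string, you can use the contentfulBlogPost field. As with the other single-item fields in this chapter, the field criteria are passed directly as arguments rather than inside a filter argument:
{ contentfulBlogPost( title: { eq: "My Blog Post" } ) { title } }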
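Collection queries like allContentfulBlogPost return their results wrapped in edges and node objects, so a page or component usually unwraps them before rendering. Here is a minimal sketch of that step; the data object is hypothetical sample output shaped like the result of the allContentfulBlogPost query above.

```javascript
// Hypothetical result shape for the allContentfulBlogPost query above.
const data = {
  allContentfulBlogPost: {
    edges: [
      { node: { title: "My Blog Post" } },
      { node: { title: "Another Blog Post" } },
    ],
  },
};

// Each edge wraps one node, so unwrap the nodes before reading their fields.
const titles = data.allContentfulBlogPost.edges.map(({ node }) => node.title);
console.log(titles);
```

The same unwrapping pattern applies to the collection fields exposed by the other source plugins in this chapter.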
Contentful offers rich text capabilities for formatted text fields. A working live example of Gatsby with Contentful is available, and you can find full documentation about the gatsby-source-contentful plugin on the Gatsby website.
Drupal is a well-established CMS that powers more than 2% of the entire web. After nearly two decades as a monolithic CMS, Drupal has recently introduced headless CMS capabilities in an architectural paradigm known as decoupled Drupal. Drupal offers rich content modeling capabilities as well as an administrative interface that is user-friendly for editorial and marketing teams.
To install the Drupal source plugin, execute this command:
$ npm install --save gatsby-source-drupal
You then need to configure the plugin in your Gatsby configuration file. The only required option is baseUrl:
plugins: [
  {
    resolve: `gatsby-source-drupal`,
    options: {
      baseUrl: `https://my-drupal-site.com`,
    },
  },
],
The Drupal source plugin also accepts a variety of additional options in order to allow developers to have full access to the features of the JSON:API specification, on which Drupal’s REST API is based. Remember that sensitive information in these options should be kept out of your configuration file by supplying it through environment variables using a library such as dotenv. The options are summarized in Table 5-5.
Option | Required? | Description |
---|---|---|
baseUrl | Required | A string containing the full URL to the Drupal site. |
apiBase | Optional | A string containing the relative path to the root of the API; defaults to jsonapi. |
filters | Optional | An object containing filter parameters based on content item collections, which are then supplied to the query as query parameters (see below for more information). |
basicAuth | Optional | An object containing Basic Authentication credentials (username and password); e.g.: basicAuth: { username: process.env.DRUPAL_BASIC_AUTH_USERNAME, password: process.env.DRUPAL_BASIC_AUTH_PASSWORD, } |
fastBuilds | Optional | A Boolean indicating whether fast builds should be enabled on the Drupal site. The Gatsby Drupal module and an authenticated user with the “Sync Gatsby Fastbuild log entities” permission are required for this functionality. Defaults to false. |
headers | Optional | An object containing any request headers required for the query; e.g.: headers: { Host: `https://my-host.com`, } |
params | Optional | An object containing any additional parameters required for the request; e.g.: params: { "api-key": "myApiKeyHeader" } |
skipFileDownloads | Optional | A Boolean indicating whether Gatsby should refrain from downloading files from your Drupal site for future image processing; defaults to false. |
concurrentFileRequests | Optional | A number indicating how many simultaneous file requests should be made to the Drupal site; defaults to 20. |
disallowedLinkTypes | Optional | An array containing strings representing JSON:API link types that should be skipped, such as disallowedLinkTypes: [ `self`, `describedby`, `action--action` ], |
Drupal uses the JSON:API specification to drive its REST API, which makes available rich filtering capabilities based on JSON:API syntax. Consider an example in Drupal where the primary endpoint of our JSON:API-compliant API returns a series of collections:
// Response to GET https://my-drupal-site.com/jsonapi
{
  // ...
  links: {
    articles: "https://my-drupal-site.com/jsonapi/articles",
    products: "https://my-drupal-site.com/jsonapi/products",
    // ...
  }
}
The JSON:API specification defines filtering through query parameters, with nested fields exposed in square brackets. For instance, to target only products that are tagged with the Drupal tag “Holiday,” our Gatsby configuration file needs to contain an additional filters option defining the collection and the filter that should be applied to it:
plugins: [
  {
    resolve: `gatsby-source-drupal`,
    options: {
      baseUrl: `https://my-drupal-site.com`,
      filters: {
        // Collection: Filter criteria
        products: `filter[tags.name][value]=Holiday`,
      },
    },
  },
],
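To see how a filters entry relates to the request that reaches Drupal, the sketch below assembles the collection URL by hand. It is an illustration only and not the plugin’s own code: the helper name buildCollectionUrl is hypothetical, and it assumes (as in the sample response above) that each collection lives at a path matching its key under the API base.

```javascript
// Hypothetical helper: shows how a filter string ends up as query parameters
// on a collection URL. This is not the plugin's own implementation.
function buildCollectionUrl({ baseUrl, apiBase = "jsonapi", collection, filters = {} }) {
  const root = baseUrl.replace(/\/+$/, ""); // drop any trailing slash
  const url = `${root}/${apiBase}/${collection}`;
  const filter = filters[collection];
  // Collections without a configured filter are requested unfiltered.
  return filter ? `${url}?${filter}` : url;
}

const filters = { products: `filter[tags.name][value]=Holiday` };

console.log(
  buildCollectionUrl({
    baseUrl: `https://my-drupal-site.com`,
    collection: "products",
    filters,
  })
);
```

Requesting the resulting URL directly in a browser or with curl is a quick way to confirm that a filter returns the items you expect before running a full Gatsby build.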
Now, we can issue queries in the Gatsby GraphQL API to populate Gatsby pages and components. Note that because of the way Drupal handles content types, collections are accessed through the field allNode{TypeName} and individual items are accessed through the field node{TypeName}, where {TypeName} is the name of the Drupal content type. To retrieve articles in the collection, we can issue this query, which limits the returned results to 50 items:
{ allNodeArticle(limit: 50) { edges { node { title created(formatString: "MMM-DD-YYYY") } } } }
To retrieve only a single article, we can issue a query that targets only a single content item:
{ nodeArticle( uuid: { eq: "49346fb8-3574-11eb-adc1-0242ac120002" } ) { title uuid created(formatString: "MMM-DD-YYYY") } }
Full documentation about the gatsby-source-drupal plugin is available on the Gatsby website. For more information about Drupal’s JSON:API implementation and filtering capabilities, see my book Decoupled Drupal in Practice (Apress).
Another popular CMS for developers working with Gatsby sites is Netlify CMS, a free and open source application that facilitates editing of content and data directly in a Git repository. One of the traits that makes Netlify CMS unique is the fact that it is a Git-based CMS. This means that all content and data updates are implemented not through database operations but through source control and code commits.
The primary advantage of using a system like Netlify CMS is its suitability for static site generators like Gatsby. Because Netlify CMS merely provides a user interface that lies above code commits, it’s a compelling solution for content editors and marketers who need granular control over content changes. As one might expect, Netlify CMS works a bit differently from the other headless CMSs discussed in this section.
Unlike the other source plugins we’ve covered so far, Gatsby provides a full-fledged plugin for Netlify CMS that goes well beyond data retrieval use cases, due to the fact that Netlify CMS and Gatsby are capable of deeper levels of integration through an editorial interface built as a React application. For this reason, you may wish to install netlify-cms-app, the Netlify CMS interface, alongside the canonical Gatsby plugin for Netlify CMS, gatsby-plugin-netlify-cms:
$ npm install --save netlify-cms-app gatsby-plugin-netlify-cms
Now, add the plugin to the plugins array in your Gatsby configuration file. Note that here, we are solely providing the plugin name as a string rather than placing it inside a resolve object with a nested options object:
plugins: [
  `gatsby-plugin-netlify-cms`,
],
Together, the netlify-cms-app and gatsby-plugin-netlify-cms plugins will create a Netlify CMS application in your browser at the path /admin/index.html, where content editors can modify their content. Because Gatsby copies everything in the /static directory (where static assets unmanipulated by Gatsby are placed) to the /public folder, you’ll also need to create a Netlify CMS configuration file located at /static/admin/config.yml.
Your Netlify CMS configuration YAML file will look something like the following:
# static/admin/config.yml
backend:
  name: my-netlify-cms-repo

media_folder: static/assets
public_folder: /assets

collections:
  - name: blog
    label: Blog
    folder: blog
    create: true
    fields:
      - { name: path, label: Path }
      - { name: date, label: Date, widget: datetime }
      - { name: title, label: Title }
      - { name: body, label: Body, widget: markdown }
Once you save this file, you’ll be able to run gatsby develop and access the Netlify CMS editorial interface at https://my-gatsby-site.com/admin/ (the trailing slash is required). With the Netlify CMS application now running, you can make arbitrary edits to create and modify content. However, further authentication will be required in order to connect the Netlify CMS application with a working Git repository.
Because Netlify CMS will store any content you create as files that are committed to source repositories rather than to a database, your Netlify CMS “database” is in fact your local filesystem. Therefore, the queries you’ll issue within the Gatsby GraphQL API will match those implemented for gatsby-source-filesystem, which should also be installed and included in your Gatsby configuration if you wish to include Netlify CMS content within your Gatsby site. When you configure the source plugin, the path to your Markdown files should be defined as ${__dirname}/blog to adhere to the preceding configuration.
Full documentation about the gatsby-plugin-netlify-cms plugin is available on the Gatsby website. Because approaches differ across providers, describing how to integrate Netlify CMS with Git source control providers is outside the scope of this book. The Netlify CMS documentation contains information about integrations with GitHub and GitLab.
Prismic is a hosted headless CMS available as a SaaS solution for content management. As a CMS for both editorial teams and developer teams, Prismic makes available an editorial interface as well as an API. In addition to its core feature set of custom content modeling, content scheduling and versioning, and multilingual support, Prismic also offers a feature known as Content Slices, which facilitates the creation of dynamic layouts.
Once you’ve populated your Prismic content repository with some content, you can acquire an API access token by navigating to Settings→API & Security in the Prismic interface, creating a new application (the Callback URL field can remain empty), and clicking “Add this application.” You can then install the Prismic source plugin as usual:
$ npm install --save gatsby-source-prismic
Next, add the Prismic source plugin to your gatsby-config.js file in order to register it. As always, store your sensitive credentials as environment variables using a library such as dotenv whenever you’re using them in your Gatsby configuration file:
plugins: [
  {
    resolve: `gatsby-source-prismic`,
    options: {
      repositoryName: `myPrismicRepositoryName`,
      accessToken: process.env.PRISMIC_API_KEY,
      schemas: {
        page: require(`./src/schemas/page.json`),
        article: require(`./src/schemas/article.json`),
      },
    },
  },
],
Note that the repositoryName and schemas options are the only required options if your Prismic API does not require authentication; otherwise, the accessToken option is also required. Schemas are available by navigating to the “JSON editor” feature in the Prismic Custom Type Editor and copying the contents into the appropriate required files. Table 5-6 summarizes all of the configuration options available for the Prismic source plugin.
Option | Required? | Description |
---|---|---|
repositoryName | Required | A string containing the name of your Prismic repository (e.g., my-prismic-site if your prismic.io address is my-prismic-site.prismic.io). |
accessToken | Optional | A string containing the API access token for your Prismic repository. |
releaseId | Optional | A string containing a specific Prismic release, which is a collection of changes intended for preview within Gatsby Cloud. |
linkResolver | Optional | A function determining how links in content should be processed in order to generate the correct link URL. The document node, field key (API ID), and field value are provided; e.g.: linkResolver: ({ node, key, value }) => (doc) => { // Link resolver logic } |
fetchLinks | Optional | An array containing a list of links that should be retrieved and made available in the link resolver function so you can fetch multiple fields from a linked Prismic document; defaults to []. |
htmlSerializer | Optional | A function determining how fields with rich text formatting should be processed to generate correct HTML. The document node, field key (API ID), and field value are provided; e.g.: htmlSerializer: ({ node, key, value }) => ( type, element, content, children, ) => { // HTML serializer logic } |
schemas | Required | An object containing custom types mapped to Prismic schemas; e.g.: schemas: { page: require(`./src/schemas/page.json`), article: require(`./src/schemas/article.json`), } |
lang | Optional | A string containing a default language code for retrieving documents; defaults to *, which retrieves all languages. |
prismicToolbar | Optional | A Boolean indicating whether the Prismic Toolbar script should be added to the site; defaults to false. |
shouldDownloadImage | Optional | A function determining whether images should be downloaded locally for further processing. The document node, field key (API ID), and field value are provided; e.g.: shouldDownloadImage: ({ node, key, value }) => { // Return true to download // Return false to skip } |
imageImgixParams | Optional | An object containing a set of Imgix (a library for image processing) image transformations for future image processing; e.g.: imageImgixParams: { auto: `compress,format`, fit: `max`, q: 50, } |
imagePlaceholderImgixParams | Optional | An object containing a set of Imgix image transformations applied to placeholder images for future image processing; e.g.: imagePlaceholderImgixParams: { w: 50, blur: 20, q: 100, } |
typePathsFilenamePrefix | Optional | A string containing a prefix for filenames where type paths for schemas are stored, including the MD5 hash of your schemas after the prefix; defaults to `prismic-typepaths---{repositoryName}`, where {repositoryName} is your Prismic repository name. |
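As an illustration of the linkResolver option, the following sketch fills in the placeholder body using the signature shown in Table 5-6. The URL patterns are hypothetical, and the example assumes that each linked document exposes type and uid properties; adjust both to match your own content model.

```javascript
// Sketch of a linkResolver using the signature from Table 5-6.
// Assumptions: linked documents expose `type` and `uid`, and the URL
// patterns below are invented for this example.
const linkResolver = ({ node, key, value }) => (doc) => {
  if (doc.type === "article") {
    return `/articles/${doc.uid}`;
  }
  // Fall back to a top-level path for every other document type.
  return `/${doc.uid}`;
};

console.log(linkResolver({})({ type: "article", uid: "my-prismic-article" }));
```

The outer function receives the context of the field being processed, while the inner function is called once per linked document, which is why the resolver is written as a function returning a function.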
With the Prismic source plugin configured, you can now issue queries against the Gatsby GraphQL API to retrieve your Prismic data within Gatsby pages and components. To retrieve all content items of type Article from Prismic, you can issue a query like the following:
{ allPrismicArticle { edges { node { id first_publication_date last_publication_date data { title { text } content { html } } } } } }
You can also retrieve an individual content item by issuing a query like the following, with an argument supplied:
{ prismicArticle( id: { eq: "My Prismic Article" } ) { id first_publication_date last_publication_date data { title { text } content { html } } } }
More example queries are available on the NPM package page, and Gatsby provides full documentation about the gatsby-source-prismic plugin.
Sanity is a hosted service providing backends for structured content, together with a free and open source editorial interface built in React. With a focus on real-time APIs for retrieving and managing data, Sanity is a potential candidate as a headless CMS for developers working with Gatsby sites. To use Sanity as a data source for Gatsby, you’ll need to configure an instance of Sanity Studio (a React application for interacting with your Sanity content) and a GraphQL API that exposes your Sanity dataset.
To install the Sanity source plugin, execute the following command:
$ npm install --save gatsby-source-sanity
Then configure the plugin in your Gatsby configuration file. As always, any sensitive information should be provided as environment variables through dotenv:
plugins: [
  {
    resolve: `gatsby-source-sanity`,
    options: {
      projectId: `mySanityProjectId`,
      dataset: `mySanityDataset`,
    },
  },
],
The Sanity source plugin makes available a range of additional configuration options within the options object, apart from the required projectId and dataset options, as seen in Table 5-7.
Option | Required? | Description |
---|---|---|
projectId | Required | A string containing the Sanity project identifier. |
dataset | Required | A string containing the name of the Sanity dataset. |
token | Optional | A string containing the authentication token for retrieving data from private datasets (or when using overlayDrafts). |
overlayDrafts | Optional | A Boolean indicating whether drafts should replace published versions in delivery. Defaults to false. |
watchMode | Optional | A Boolean indicating whether a listener should be kept open and provide the latest changes in real time. Defaults to false. |
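One way these optional settings might be combined is to enable drafts and the real-time listener only during local development. The sketch below takes that approach; the function name buildSanityPlugin and the variable name SANITY_TOKEN are hypothetical, and the check on NODE_ENV is an assumption about how you distinguish development from production builds.

```javascript
// Sketch only: SANITY_TOKEN is a hypothetical variable name.
const buildSanityPlugin = (env = process.env) => {
  const isDevelopment = env.NODE_ENV === "development";
  return {
    resolve: `gatsby-source-sanity`,
    options: {
      projectId: `mySanityProjectId`,
      dataset: `mySanityDataset`,
      // Needed for private datasets and when overlayDrafts is enabled.
      token: env.SANITY_TOKEN,
      // Show drafts and listen for changes only while developing locally.
      overlayDrafts: isDevelopment,
      watchMode: isDevelopment,
    },
  };
};
```

This keeps unpublished drafts out of production builds while still letting editors preview their changes during development.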
With the configuration done, you can query your Sanity data within the Gatsby GraphQL API by using the top-level field allSanity{TypeName} (all items) or sanity{TypeName} (individual items), where {TypeName} is a Sanity document type name. For instance, if you have a Sanity document type known as article, you can retrieve data for all articles with this query:
{ allSanityArticle { edges { node { title description slug { current } } } } }
And you can retrieve an individual article with a query like this:
{ sanityArticle( title: { eq: "My Sanity Article" } ) { title description slug { current } } }
Full documentation regarding the gatsby-source-sanity plugin is available on the Gatsby website.
Shopify is a popular commerce system for building online storefronts. With the Shopify source plugin, Gatsby sites can retrieve data from the Shopify Storefront API and populate the internal GraphQL API. The gatsby-source-shopify plugin provides public shop data and also supports both the gatsby-transformer-sharp and gatsby-image plugins for image handling (covered at greater length in Chapter 7).
To install the Shopify source plugin, use this command:
$ npm install --save gatsby-source-shopify
In order to access the Shopify Storefront API, you need to acquire an access token that is permissioned such that your source plugin can read products, variants, and collections; read product tags; and read shop content such as articles, blogs, and comments. As always, this access token should be provided as an environment variable through the dotenv library to avoid revealing sensitive credentials. Once you have the access token, you can add the plugin to your Gatsby configuration file:
plugins: [
  {
    resolve: `gatsby-source-shopify`,
    options: {
      password: process.env.SHOPIFY_ADMIN_PASSWORD,
      storeUrl: process.env.SHOPIFY_STORE_URL,
    },
  },
],
Though these are the only required options for the Shopify source plugin, there are a variety of additional options that can be configured in the options object, as seen in Table 5-8.
Option | Required? | Description |
---|---|---|
password | Required | A string containing the administrative password for the Shopify store and application you are using. |
storeUrl | Required | A string containing your Shopify store URL, such as my-shop.myshopify.com. |
shopifyConnections | Optional | An array consisting of additional data types to source, such as orders or collections. |
downloadImages | Optional | A Boolean that, when set to true, indicates that images should be downloaded from Shopify and processed during the build (the plugin’s default behavior is to fall back to Shopify’s CDN). |
typePrefix | Optional | A string containing an optional prefix to add to a node type name (e.g., when set to MyShop, node names will be under allMyShopShopifyProducts instead of allShopifyProducts). |
salesChannel | Optional | A string containing an optional channel name (e.g., My Sales Channel) whose active products and collections will be the only data sourced. The default behavior is to source all that are available in the online store. |
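For reference, here is a sketch of a configuration that uses several of the optional settings from Table 5-8 alongside the two required ones. The function name buildShopifyPlugin is hypothetical, and the option values are simply the examples given in the table rather than recommendations.

```javascript
// Sketch only: option values are taken from the examples in Table 5-8.
const buildShopifyPlugin = (env = process.env) => ({
  resolve: `gatsby-source-shopify`,
  options: {
    // Required options, read from the environment.
    password: env.SHOPIFY_ADMIN_PASSWORD,
    storeUrl: env.SHOPIFY_STORE_URL,
    // Optional: source additional data types.
    shopifyConnections: [`orders`, `collections`],
    // Optional: download and process images during the build.
    downloadImages: true,
    // Optional: prefix added to node type names.
    typePrefix: `MyShop`,
  },
});
```

Keep in mind that setting typePrefix changes the names of the fields you query, so any existing queries need to be updated to match.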
Once you’ve included the Shopify source plugin in your Gatsby configuration file, you can query Shopify data through the Gatsby GraphQL API. To query all Shopify product nodes, you can issue a query such as the following:
{ allShopifyProduct( sort: { fields: [publishedAt], order: ASC } ) { edges { node { id storefrontId } } } }
For more information about the gatsby-source-shopify plugin and example queries, consult the documentation on the Gatsby website.
WordPress is a well-known free and open source CMS that is used by many websites on the internet. For developers building Gatsby sites, WordPress offers two means of retrieving data: WP-API, which is WordPress’s native REST API, and WPGraphQL, which is a GraphQL API contributed to the WordPress ecosystem (in addition to another, known as the GraphQL API for WordPress). In the Gatsby plugin ecosystem, the gatsby-source-wordpress source plugin is responsible for retrieving data through WPGraphQL and making it available to Gatsby’s internal GraphQL API.
To install the WordPress source plugin, use this command:
$ npm install --save gatsby-source-wordpress
As with the other source plugins, you need to add the plugin to your Gatsby configuration:
plugins: [
  {
    resolve: `gatsby-source-wordpress`,
    options: {
      url: process.env.WPGRAPHQL_URL,
    },
  },
],
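Because the endpoint URL comes from the environment, a build run without that variable would hand the plugin an undefined value. A small guard can surface the problem earlier with a clearer message. The following is a minimal sketch; the helper names requireEnv and buildWordPressPlugin are hypothetical and not part of Gatsby or the plugin.

```javascript
// Hypothetical helper: read a required variable or stop with a clear message.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Builds the plugin object, failing fast if the endpoint URL is absent.
const buildWordPressPlugin = (env = process.env) => ({
  resolve: `gatsby-source-wordpress`,
  options: {
    url: requireEnv("WPGRAPHQL_URL", env),
  },
});
```

The same guard works for any of the credentials used elsewhere in this chapter, since it only depends on the variable name.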
The url option is the only required key in the options object, but there are many other optional configuration options that the WordPress source plugin makes available. Table 5-9 shows a subset of these.
Option | Type | Description |
---|---|---|
url | String | The full URL of the WPGraphQL endpoint (required) |
verbose | Boolean | Indicates whether the terminal should display verbose output; defaults to true |
debug | Object | Commonly used debugging options (others include |
develop | Object | Options related to |
auth | Object | Options related to authentication: |
schema | Object | Commonly used options related to retrieving the remote schema (others include |
excludeFieldNames | Array | A list of field names to exclude from the newly generated schema; defaults to [] |
html | Object | Options related to processing of HTML fields: |
type | Object | Options related to types in the remote schema: |
Once you’ve saved your Gatsby configuration file, you’ll be able to issue queries against the Gatsby GraphQL API to extract data from your WordPress site. Because the configuration options determine to a great extent how your queries appear, it isn’t possible to provide a full accounting of all querying possibilities with the WordPress source plugin. For this reason, the gatsby-source-wordpress documentation recommends examining the wide range of examples that consume WordPress data.
Full documentation for the gatsby-source-wordpress plugin is available on GitHub. There’s also a WordPress plugin that optimizes WordPress sites to work as data sources for Gatsby.
Sometimes, you may need to pull directly from data that is serialized as JSON or YAML and isn’t included in a database. In other cases, you may need to pull directly from other GraphQL APIs. For pulling from GraphQL APIs you need a GraphQL source plugin, but for JSON and YAML you need to use a different approach to import the data. In this section, we’ll cover sourcing data from GraphQL APIs first before turning our attention to data housed in JSON and YAML documents.
It’s often the case that you’ll need to retrieve data from other GraphQL APIs to populate Gatsby’s internal GraphQL API. To accomplish this, the gatsby-source-graphql source plugin is capable of schema stitching, a process in which multiple external GraphQL schemas are combined to form a single cohesive schema.
The GraphQL source plugin creates an arbitrary type name that surrounds the schema’s overarching query type, while the external schema becomes available under a field within Gatsby’s GraphQL API.
Installing the GraphQL source plugin works the same way as with other source plugins:
$ npm install --save gatsby-source-graphql
Then, as usual, you need to configure the GraphQL source plugin within your Gatsby configuration file. The following example shows what a simple GraphQL source plugin configuration object might look like. The only required options are typeName (an arbitrary name that identifies the remote schema’s Query type), fieldName (the Gatsby GraphQL field under which the remote schema will be available), and url (the URL of the GraphQL endpoint):
plugins: [
  {
    resolve: `gatsby-source-graphql`,
    options: {
      typeName: `myGraphqlName`,
      fieldName: `contentGraphql`,
      url: `https://my-graphql-api.com/graphql`,
    },
  },
],
Some remote GraphQL APIs require authentication to access their data. Always remember to store sensitive credentials as environment variables and inject those values into your configuration file through a library such as dotenv. In the following more complex example, we see an HTTP header used to provide the authentication:
plugins: [
  {
    resolve: `gatsby-source-graphql`,
    options: {
      typeName: `GitHub`,
      fieldName: `github`,
      url: `https://api.github.com/graphql`,
      headers: {
        Authorization: `Bearer ${process.env.GITHUB_ACCESS_TOKEN}`,
      },
    },
  },
],
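If the environment variable is unset, the template literal above silently produces the header value "Bearer undefined", so it can help to fail fast when the configuration file loads. The following is a minimal sketch of that idea, not part of Gatsby or dotenv; the helper names requireEnv and buildGithubHeaders are hypothetical:

```javascript
// Hypothetical helper (not part of Gatsby or dotenv): return the value of a
// required environment variable, or throw a descriptive error if it is unset.
function requireEnv(name, env = process.env) {
  const value = env[name]
  if (value === undefined || value === ``) {
    throw new Error(`Missing required environment variable: ${name}`)
  }
  return value
}

// Hypothetical helper: build the headers object for the plugin options,
// reading the token from the given environment (process.env by default).
function buildGithubHeaders(env = process.env) {
  return {
    Authorization: `Bearer ${requireEnv(`GITHUB_ACCESS_TOKEN`, env)}`,
  }
}
```

Under these assumptions, the plugin options could use headers: buildGithubHeaders() in place of the inline object, so a missing token stops the build with a clear message rather than causing a rejected request later.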
The headers object also accepts functions as an alternative, which means it’s possible to use an async function to provide the credentials, such as a getGithubAuthToken() function defined in the same file:
headers: async () => {
  return {
    Authorization: await getGithubAuthToken(),
  }
},
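The getGithubAuthToken() function is left undefined here, so the following sketch shows one way it might look. It assumes, purely for illustration, that the token is stored in a local file named .github-token next to the configuration file; both the file name and the optional tokenPath parameter are hypothetical:

```javascript
const fs = require(`fs`)

// Hypothetical implementation: read the token from a local file (the default
// path is an assumption for illustration) and return a complete
// Authorization header value.
async function getGithubAuthToken(tokenPath = `${__dirname}/.github-token`) {
  const raw = await fs.promises.readFile(tokenPath, `utf8`)
  const token = raw.trim()
  if (!token) {
    throw new Error(`No token found in ${tokenPath}`)
  }
  return `Bearer ${token}`
}

// The function form of the headers option. When it is called without
// arguments, tokenPath is undefined and the default path above is used.
const headers = async tokenPath => {
  return {
    Authorization: await getGithubAuthToken(tokenPath),
  }
}
```

Because the function is only called when requests are made, the credentials can come from any asynchronous source, such as a file or a secrets service, rather than having to be available when the configuration file is first evaluated.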
The options object accepts several other configuration options, which are listed in Table 5-10.
Option | Required? | Description |
---|---|---|
typeName | Required | A string containing an arbitrary name for the remote schema Query type. |
fieldName | Required | A string containing an arbitrary name under which the remote schema will be made available in the Gatsby GraphQL API. |
url | Required | A string containing the URL for the GraphQL endpoint of the remote GraphQL API. |
headers | Optional | Accepts two types: an object containing the HTTP headers to send, or a function (optionally async) that returns such an object. |
fetchOptions | Optional | An object containing additional options to pass to the node-fetch library that the GraphQL source plugin uses. Defaults to {}. |
fetch | Optional | A function providing a fetch-compatible API to use when making requests. |
batch | Optional | A Boolean indicating whether queries should be batched to improve query performance rather than being executed individually in separate network requests; defaults to false. |
dataLoaderOptions | Optional | An object containing GraphQL data loader options, including maxBatchSize, a number indicating how many queries the GraphQL source plugin should batch; defaults to 5. |
createLink | Optional | A function providing the manual creation of an Apollo Link for Apollo users (Apollo Link is a library that offers fine-grained control over HTTP requests issued by Apollo Client). |
createSchema | Optional | A callback function providing an arbitrary schema definition (e.g., schema SDL or introspection JSON). Returns a GraphQLSchema instance or a Promise resolving to one. |
transformSchema | Optional | A function providing an arbitrary schema based on inputs from an object argument containing schema, link, resolver, defaultTransforms, and options. |
refetchInterval | Optional | A number indicating how many seconds the GraphQL source plugin should wait before refetching the data (by default, it only refetches data when the server is restarted). |
For example, the fetch option can wrap each request in custom logic:
fetch: (uri, options = {}) =>
  fetch(uri, { ...options, headers: sign(options.headers) }),
The createLink option can create the link manually:
createLink: pluginOptions => {
  return createHttpLink({
    uri: `https://api.github.com/graphql`,
    headers: {
      Authorization: `Bearer ${process.env.GITHUB_ACCESS_TOKEN}`,
    },
    fetch,
  })
},
The createSchema option can build the schema from introspection JSON or from schema SDL:
// Dependencies
const fs = require(`fs`)
const { buildSchema, buildClientSchema } = require(`graphql`)

// Inside plugin options:

// Create schema from introspection JSON
createSchema: async () => {
  const json = JSON.parse(
    fs.readFileSync(`${__dirname}/introspection.json`)
  )
  return buildClientSchema(json.data)
},

// Create schema from schema SDL
createSchema: async () => {
  const sdl = fs.readFileSync(`${__dirname}/schema.sdl`).toString()
  return buildSchema(sdl)
},
The transformSchema option can wrap the schema:
// Dependencies
const { wrapSchema } = require(`@graphql-tools/wrap`)
const { linkToExecutor } = require(`@graphql-tools/links`)

// Inside plugin options:
transformSchema: ({
  schema,
  link,
  resolver,
  defaultTransforms,
  options,
}) => {
  return wrapSchema(
    {
      schema,
      executor: linkToExecutor(link),
    },
    defaultTransforms
  )
},
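To illustrate how the batch, dataLoaderOptions, and refetchInterval options from Table 5-10 fit alongside the required ones, the following sketch enables query batching, raises the batch size, and refetches periodically. The endpoint URL reuses the placeholder from earlier in this section, and the specific values (10 queries per batch, a 60-second interval) are arbitrary assumptions rather than recommendations:

```javascript
// gatsby-config.js (sketch; the numeric values are illustrative assumptions)
const config = {
  plugins: [
    {
      resolve: `gatsby-source-graphql`,
      options: {
        typeName: `myGraphqlName`,
        fieldName: `contentGraphql`,
        url: `https://my-graphql-api.com/graphql`,
        // Batch queries rather than issuing one network request per query
        batch: true,
        // Allow up to 10 queries per batch (the table lists a default of 5)
        dataLoaderOptions: {
          maxBatchSize: 10,
        },
        // Refetch the remote data every 60 seconds
        refetchInterval: 60,
      },
    },
  ],
}

module.exports = config
```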
Transforming schemas and configuring data loader options are for advanced GraphQL requirements. For more details about schema wrapping and why the transformSchema option is useful, consult the graphql-tools documentation. For a full list of available dataLoaderOptions, see the graphql/dataloader documentation.
Now, you can query the GraphQL API using queries that match the fieldName values you’ve defined in your Gatsby configuration file. And by using multiple plugin definitions, as you’ve seen in previous sections, you can make two remote GraphQL APIs available to the GraphQL API in Gatsby, as follows:
plugins: [
  {
    resolve: `gatsby-source-graphql`,
    options: {
      typeName: `myGraphqlName`,
      fieldName: `remoteGraphql`,
      url: `https://my-graphql-api.com/graphql`,
    },
  },
  {
    resolve: `gatsby-source-graphql`,
    options: {
      typeName: `GitHub`,
      fieldName: `github`,
      url: `https://api.github.com/graphql`,
      headers: {
        Authorization: `Bearer ${process.env.GITHUB_ACCESS_TOKEN}`,
      },
    },
  },
],
With both GraphQL APIs now represented in your Gatsby GraphQL API, you can query both APIs in one GraphQL query, like this:
{
  remoteGraphql {
    allArticles {
      title
    }
  }
  github {
    viewer {
      email
    }
  }
}
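The result of such a query mirrors the query’s shape, with each remote API’s data nested under its fieldName. The sketch below shows how code in a page might pull values out of that result. The sample data object and the summarize helper are hypothetical, the titles and email address are made-up placeholders, and the sketch assumes that allArticles returns a list:

```javascript
// Hypothetical sample of the data a page might receive for the query above.
const data = {
  remoteGraphql: {
    allArticles: [{ title: `First article` }, { title: `Second article` }],
  },
  github: {
    viewer: { email: `user@example.com` },
  },
}

// Hypothetical helper: collect the fields of interest from both remote APIs.
function summarize(result) {
  return {
    titles: result.remoteGraphql.allArticles.map(article => article.title),
    email: result.github.viewer.email,
  }
}

console.log(summarize(data))
```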
Full documentation for the gatsby-source-graphql plugin is available on the Gatsby website.
Sometimes, you have raw data that isn’t in a database or other system; in fact, it’s simply a YAML file or JSON file that contains data you need to use to populate your Gatsby site. For raw JSON and YAML data housed in files, we can’t use a normal source plugin that retrieves data from external sources. Nor can we use gatsby-source-filesystem, because our data is stored in one file rather than in multiple files across directories.
Though this section is entitled “Sourcing Data from JSON and YAML,” in fact the approach required to import raw JSON and YAML data into a Gatsby site for use in Gatsby pages and components is a direct import that bypasses Gatsby’s GraphQL API entirely. Suppose you already have a YAML or JSON file containing data, in a format similar to one of the following:
# content/data.yaml
title: My YAML data
content:
  - item: Lorem ipsum dolor sit amet
  - item: Consectetur adipiscing elit
  - item: Curabitur ac elit erat
// content/data.json
{
  "title": "My JSON Data",
  "content": [
    {
      "item": "Lorem ipsum dolor sit amet"
    },
    {
      "item": "Consectetur adipiscing elit"
    },
    {
      "item": "Curabitur ac elit erat"
    }
  ]
}
You can create a new page component in src/pages that directly consumes the data in that file. If you’re importing YAML data, your import statements will look like the following:
import React from "react"
import ExternalData from "../../content/data.yaml"
For JSON data, refer to the JSON data file instead:
import React from "react"
import ExternalData from "../../content/data.json"
Now, you can create a rudimentary Gatsby page that generates a list of content based on the data you’ve retrieved from that file:
const DirectImportExample = () => (
  <div>
    <h1>{ExternalData.title}</h1>
    <ul>
      {ExternalData.content.map((data, index) => {
        return <li key={`content-item-${index}`}>{data.item}</li>
      })}
    </ul>
  </div>
)

export default DirectImportExample
After saving this page and running gatsby develop, you’ll see your Gatsby page populated with the data you imported. Note in this example that we’ve bypassed the GraphQL API internal to Gatsby entirely in favor of directly importing our data as a dependency.
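Since the imported file arrives in the component as a plain JavaScript object, the list-building logic can be followed without React at all. The sketch below mirrors the map call in the component, using an inline object with the same shape as content/data.json so that it runs on its own; the toListItems helper is hypothetical:

```javascript
// Same shape as content/data.json, inlined so the sketch is self-contained.
const ExternalData = {
  title: `My JSON Data`,
  content: [
    { item: `Lorem ipsum dolor sit amet` },
    { item: `Consectetur adipiscing elit` },
    { item: `Curabitur ac elit erat` },
  ],
}

// Hypothetical helper: compute the key and text that each <li> would receive.
function toListItems(externalData) {
  return externalData.content.map((data, index) => ({
    key: `content-item-${index}`,
    text: data.item,
  }))
}

console.log(toListItems(ExternalData))
```

The index-based keys are adequate here because the list is static; for data that can be reordered, a stable identifier from the data itself would be a safer key.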
It’s also possible to build a Gatsby site entirely based on a YAML data manifest, but that is beyond the scope of this book. For more information about this approach, consult the Gatsby documentation.
Source plugins, one of the most important aspects of the Gatsby ecosystem, are essential to the functioning of your Gatsby site because they are the conduit by which data from a local filesystem or external service or database is made available for use. This chapter covered only a selection of popular source plugins and sourcing approaches, and there are many more services you can integrate with. The Gatsby plugin ecosystem contains a wide variety of additional source plugins with guides for integration with many other services beyond those represented here.
Source plugins are also fundamental for Gatsby developers because they determine how Gatsby builds pages programmatically using templates and arbitrarily sourced data. In this chapter, we’ve focused primarily on how to source data with source plugins so that we can interact with that data in GraphQL queries within Gatsby pages and components. In the next chapter, we’ll turn our attention to connecting the dots between the createPages API, which generates programmatic Gatsby pages based on data and logic in the gatsby-node.js file; the GraphQL queries enabled by our newly configured source plugins; and the templates determining how those pages ought to look.