If you have experience with SQL, then sorting with Cypher is exactly the same as sorting with SQL. We can use the ORDER BY
clause to specify the columns to be used for sorting, as shown in the following query:
MATCH (b:Book)
WHERE ANY ( tag IN b.tags WHERE tag IN ['drama'] )
RETURN b.title
ORDER BY b.title
LIMIT 5
The preceding query looks for books tagged drama
in the database, then sorts them by title, and returns the first five book entries found. We can note the following:
ORDER BY
clause follows the RETURN
clauseLIMIT
or SKIP
clause so that we can sort the data before limiting our page+-----------------------------+ | b.title | +-----------------------------+ | "A Lover's Complaint" | | "A Midsummer Night's Dream" | | "All's Well That Ends Well" | | "Anthony and Cleopatra" | | "As You Like It" | +-----------------------------+
To sort inversely, just postpone the DESC
clause to the ORDER BY
clause, as shown as shown in the following query:
MATCH (b:Book) WHERE ANY ( tag IN b.tags WHERE tag IN ['drama'] ) RETURN b.title ORDER BY b.title DESC LIMIT 5
It gives the following result:
+------------------------+ | b.title | +------------------------+ | "Venus and Adonis" | | "Twelfth Night" | | "Troilus and Cressida" | | "Titus Andronicus" | | "Timon of Athens" | +------------------------+
The result rows are different because the LIMIT
clause is evaluated after the ORDER BY
clause, so Cypher is limiting the result set to five items, which are already sorted. This is important in real-world applications because it allows us to both page data in a small result set and sort it however we want.
How are null
values evaluated by the ORDER BY
clause? They are always considered the largest values; so, they are the last values in case of an ascending sort, while they are the first values in case of a descending sort. This is useful while we are looking for data that is sorted in an ascending manner, and indeed that's likely to be the most common situation. But with a descending sort, we can get unexpected results.
The following query returns books from our database, putting the most recently published books at the top:
MATCH (b:Book) OPTIONAL MATCH (b) -[p:PublishedBy]-> (c) RETURN b.title, p.year ORDER BY p.year DESC LIMIT 5
We are using an OPTIONAL MATCH
clause because we want to get all the books in the database, but as the database does not contain the publication year for some books, we get them at the first position:
+------------------------------+ | b.title | p.year | +------------------------------+ | "Fairy tales" | <null> | | "The Divine Comedy" | <null> | | "Epic of Gilgamesh" | <null> | | "Book of Job" | <null> | | "The Art of Prolog" | <null> | +------------------------------+
So, how to get the null
values to stay at the bottom of the list and the most recent book at the top?
For this, COALESCE
is the function we need. It takes a variable number of arguments and returns the first non-null value. A null
value is returned only if all arguments are null.
Using the COALESCE
function, we can write the preceding query in the ORDER BY
clause as follows:
MATCH (b:Book)
OPTIONAL MATCH (b) -[p:PublishedBy]-> (c)
RETURN b.title, p.year
ORDER BY COALESCE(p.year, -5000) DESC
LIMIT 5
Here, we have instructed Cypher to sort the dataset by a value, that is, the year of publication. If it is a null
value, it is considered as -5000
. Now, we have the result we were expecting, as shown in the following output code:
+-------------------------------------------+ | b.title | p.year | +-------------------------------------------+ | "Getting Started with Memcached" | 2013 | | "Java EE 7 Developer Handbook" | 2013 | | "Akka Essentials" | 2012 | | "Learning Ext JS 3.2" | 2010 | | "The Art of Prolog" | <null> | +-------------------------------------------+
18.118.24.124