Appendix A. Introduction to Kusto Query Language

By Mike Kassis,
Senior Program Manager
Microsoft CxE Security

The Kusto Query Language, referred to as KQL in this book, is the language you will use to work with and manipulate the data consumed by Azure Sentinel. The logs you feed into your workspace aren’t worth much if you can’t visualize and analyze the important data therein. The best part of KQL is that the power and flexibility of the language are matched by its simplicity. If you have a background in scripting or working with databases, much of what I cover here will feel very familiar. If not, don’t worry; you will walk away from this appendix ready to start writing your own queries and driving value for your organization.

This appendix introduces many of the foundational concepts of KQL without getting too bogged down in the details. I will cover some of the most used functions and operators, which should address 75 to 80 percent of the queries you will write day to day. While KQL basics are rather simple, there are times when you will need to run more advanced queries, so I encourage you to carry your learning to more comprehensive resources, such as the official KQL documentation and online courses.

The KQL query structure

A good place to start learning KQL is to develop an understanding of the overall query structure and how it compares to a few other common languages. I have always found that KQL feels like a hybrid of SQL and PowerShell. The former is a mainstay for database administrators, while the latter is the scripting tool of choice for IT operations teams in Windows-heavy environments. Let’s start by taking a quick look at SQL.

SQL

In SQL, we make use of keywords to structure the query:

  1. SELECT TOP(5)
  2.     Country,
  3.     Count(Country) as CountryCount
  4. FROM contact
  5. WHERE Country IS NOT NULL
  6. GROUP BY Country
  7. ORDER BY CountryCount DESC

The SELECT and FROM keywords let us specify which columns we want returned, how many records we want returned, and from which table they should be taken. The WHERE keyword on line 5 lets us filter the dataset based on one or more variables. We use the GROUP BY keyword to say that we want to summarize our data in some way. In this case, we used the Count() function on line 3, so we are summarizing the count of records associated with each country. Finally, we can sort our data by using the ORDER BY keyword.

In the case of SQL, the structure of the query is largely determined by the keywords and the text included with them. Notice that some things seem to happen in a non-intuitive order. For example, we specified that we wanted the top 5 results on line 1, but SQL won’t use that information until the very end of the query, where it will keep only 5 records. Wouldn’t it make more sense to specify TOP(5) at the end of the query?

Also, another minor annoyance about SQL’s structure is that we had to specify how we wanted to summarize our data in two places: on line 3, we needed our aggregation function, and on line 6, we had to specify what value we wanted that function to summarize by. In KQL, we can do all of this in one line, as we’ll see in a moment.

PowerShell

Let’s look at PowerShell now, which is not a DBA-centric language, but it still serves an important purpose for retrieving and manipulating data.

  1. Get-Process | `
  2. Where CPU -gt 100 | `
  3. Group ProcessName | `
  4. Sort Count -descending | `
  5. Select Count, Name -first 5

I broke this query into multiple lines (using the backtick character) for readability, but think for a moment about how this example differs from the SQL example. The first thing I notice is the use of the pipe symbol ( | ). The structure of a PowerShell command is one where you pass your data down a “pipeline,” and each step provides some level of processing. At the end of the pipeline, you get your final result. In effect, this is our pipeline:

Get Data | Filter | Summarize | Sort | Select

I would argue that this concept of passing data down the pipeline for further processing is a more intuitive structure than what we saw with SQL because it is easier to create a mental picture of your data at each step. We know that on line 1, our pipeline contains every process running on the system. We know that on line 2, we are only keeping processes with a CPU time of more than 100 seconds. On line 3, we know that we are summarizing our data to show the count of processes by process name. Finally, on lines 4 and 5, we know that the data has been sorted and that we kept only the rows and columns we want.

Obviously, SQL and PowerShell serve two very different purposes, but as we look at KQL’s query structure, you should notice how it seamlessly combines the best components of each language into something that is simple, flexible, and, most importantly, intuitive.

Here is a KQL query that examines Azure Active Directory (AAD) sign-in logs. As you read through each line, you should start to see the SQL and PowerShell similarities quite clearly.

  1. SigninLogs
  2. | evaluate bag_unpack(LocationDetails) //Don't worry about this line for now.
  3. | where RiskLevelDuringSignIn == 'none'
  4.    and TimeGenerated >= ago(7d)
  5. | summarize Count = count() by city
  6. | order by Count desc
  7. | take 5

The use of the pipe symbol between each step works much the same way we saw with PowerShell. We are passing our set of data down the “pipeline,” and at each step, we have a keyword, like SQL, in which we can specify the type of processing we want done. One of the best parts of KQL is that within reason, you can make the steps happen in any order you choose. The pipeline for our above example looks like this:

Get Data | Filter | Summarize | Sort | Select

        Get Data = Line 1
        Filter = Lines 3 and 4
        Summarize = Line 5
        Sort = Line 6
        Select = Line 7

Like most languages, however, the more flexible the language is, the more prone to mistakes and performance issues it can be; KQL is no exception. The order of the steps we used above can easily be rearranged, but depending on the order, you may get better or worse query performance. A good rule of thumb is to filter your data early, so you are only passing relevant data down the pipeline. This will drastically increase performance and ensure that you aren’t accidentally including irrelevant data in summarization steps.

Hopefully, you now have an appreciation for the overall structure of a KQL query. Now let’s look at the actual KQL operators themselves, which are used to create a KQL query.

Note

KQL has both tabular and scalar operators. In the remainder of this appendix, if you simply see the word “operator,” you can assume it means tabular operator, unless otherwise noted.

Data types

Before we get into the actual KQL operators, let’s first touch on data types. As in most languages, the data type determines what calculations and manipulations can be run against a value. For example, if you have a value that is of type string, you won’t be able to perform arithmetic calculations against it.
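
To make this concrete, here is a minimal sketch using the print operator, which simply returns the values you compute; the column names StrValue, IntValue, and Converted are just illustrative:

print StrValue = strcat("1", "2"),   // concatenating strings returns "12"
      IntValue = 1 + 2,              // integer arithmetic returns 3
      Converted = toint("1") + 2     // convert the string first, then add: 3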

In KQL, most of the data types follow traditional names you are used to seeing, but there are a few that you might not have seen before such as dynamic and timespan. Table A-1 provides a look at the full list:

TABLE A-1 Data Type Table

Type        Additional name(s)    Equivalent .NET type
bool        Boolean               System.Boolean
datetime    Date                  System.DateTime
dynamic                           System.Object
guid        uuid, uniqueid        System.Guid
int                               System.Int32
long                              System.Int64
real        Double                System.Double
string                            System.String
timespan    Time                  System.TimeSpan
decimal                           System.Data.SqlTypes.SqlDecimal

While most of the data types are standard, dynamic, timespan, and guid are less commonly seen.

Dynamic has a structure very similar to JSON (JavaScript Object Notation) with one key difference: It can store KQL-specific data types that traditional JSON cannot, such as a nested dynamic value or a timespan. Here’s an example of a dynamic value:

{
    "countryOrRegion": "US",
    "geoCoordinates": {
        "longitude": -122.12094116210936,
        "latitude": 47.68050003051758
    },
    "state": "Washington",
    "city": "Redmond"
}

Timespan is a data type that refers to a measure of time such as hours, days, or seconds. Do not confuse timespan with datetime, which is an actual date and time, not a measure of time. Table A-2 shows a list of timespan suffixes.

TABLE A-2 Timespan suffixes

Suffix        Description
d             days
h             hours
m             minutes
s             seconds
ms            milliseconds
microsecond   microseconds
tick          100 nanoseconds
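
To see these suffixes in action, here is a small sketch using the print operator; the column names are just illustrative:

print OneDay = 1d,             // a timespan of one day
      HalfDay = 12h,           // a timespan of twelve hours
      Difference = 1d - 12h,   // timespan arithmetic yields another timespan
      WeekAgo = ago(7d)        // ago() takes a timespan and returns a datetime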

Guid is a data type representing a 128-bit, globally unique identifier, which follows the standard format of [8]-[4]-[4]-[4]-[12], where each [number] represents the number of characters and each character can be 0-9 or a-f.

Getting, limiting, sorting, and filtering data

When learning any new language, we want to start with a solid foundation. For KQL, this foundation is a collection of operators that let you start to filter and sort your data. What’s great about KQL is that this handful of operators will make up about 75 percent of the querying you will ever need to do. The remaining 25 percent will be stretching the language to meet your more advanced needs. Let’s expand a bit on some of the operators we used in our earlier example and look at take, order, and where.

For each operator, we will examine its use in our previous SigninLogs example. Additionally, for each operator, I’ll provide either a useful tip or a best practice.

Getting data

The first line of any basic query in KQL specifies which table you want to work with. In the case of Azure Sentinel, this will likely be the name of a log type in your workspace, such as SigninLogs, SecurityAlert, or ThreatIntelligenceIndicator. For example:

SigninLogs

Note that log names are case sensitive, as is KQL in general, so SigninLogs and signinLogs would be interpreted as different tables. Take care when choosing names for your custom logs so that they are easily identifiable and not too similar to another log.

Limiting data: take

The take operator is used to limit your results to a specified number of rows. It accepts an integer argument and is typically used at the end of a query, after you have determined your sort order.

Using take earlier in the query can be useful for limiting large datasets for testing; however, you run the risk of unintentionally excluding records from your dataset if you have not determined the sort order for your data, so take care. Here’s an example of using take:

SigninLogs
      | take 5

Tip

When working on a brand-new query where you may not know what the query will look like, it can be useful to put a take statement at the beginning to artificially limit your dataset for faster processing and experimentation. Once you are happy with the full query, you can remove the initial take step.
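
For example, here is a sketch of that pattern; the 1,000-row cap is arbitrary:

SigninLogs
| take 1000                        // temporary cap for faster experimentation
| where TimeGenerated >= ago(7d)   // continue building the query as usual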

Sorting data: order

The order operator is used to sort your data by a specified column. For example, here we ordered the results by TimeGenerated and set the order direction to descending (desc), which places the highest values first; the inverse is ascending, which is denoted asc.

SigninLogs
| order by TimeGenerated desc
| take 5

Note that we put the order operator before the take operator. We need to sort first to make sure we get the appropriate five records.

In cases where two or more records have the same value in the column you are sorting by, you can be explicit in how the query handles these situations by adding a comma-separated list of variables after the by keyword, but before the sort order keyword (desc), like so:

SigninLogs
| order by TimeGenerated, Identity desc
| take 5

Now, if TimeGenerated is the same between multiple records, it will then try to sort by the value in the Identity column.

Filtering data: where

The where operator is arguably the most important operator because it is key to making sure you are only working with the subset of data that is valuable to your use case. You should do your best to filter your data as early in the query as possible because doing so will improve query performance by reducing the amount of data that needs to be processed in subsequent steps; it also ensures that you are only performing calculations on the desired data. See this example:

SigninLogs
| where TimeGenerated >= ago(7d)
| order by TimeGenerated, Identity desc
| take 5

The where operator accepts the name of a variable, a comparison (scalar) operator, and a value. In our case, we used >= to denote that the value in the TimeGenerated column needs to be greater than or equal to (later than) seven days ago.

There are two types of comparison operators in KQL: string and numerical. Table A-3 shows the full list of numerical operators:

TABLE A-3 Numerical operators

Operator   Description
+          Add
-          Subtract
*          Multiply
/          Divide
%          Modulo
<          Less
>          Greater
==         Equals
!=         Not equals
<=         Less or Equal
>=         Greater or Equal
in         Equals to one of the elements
!in        Not equals to any of the elements
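
For instance, here is a quick sketch of the in operator; the application names are purely illustrative:

SigninLogs
| where AppDisplayName in ('Azure Portal', 'Microsoft Teams')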

However, the list of string operators is much longer because it includes permutations for case sensitivity, substring locations, prefixes, suffixes, and much more. Note that the == operator is both a numeric and a string operator, meaning it can be used for both numbers and text. For example, both of the following would be valid where statements:

| where ResultType == 0
| where Category == 'SignInLogs'
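
To give you a taste of the string operators, here is a hedged sketch; endswith, contains, and startswith are all case-insensitive matches, and the filter values are just illustrative:

SigninLogs
| where UserPrincipalName endswith "contoso.com"   // suffix match, case-insensitive
    and AppDisplayName contains "portal"           // substring match, case-insensitive
    and Identity startswith "john"                 // prefix match, case-insensitive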

Best Practice: Almost certainly, you will want to filter your data by more than one column or filter the same column in more than one way. In these instances, there are two best practices you should keep in mind.

  1. You can combine multiple where statements into a single step by using the and keyword. For example:

    SigninLogs
    | where Resource == ResourceGroup
        and TimeGenerated >= ago(7d)
  2. When you have multiple where clauses joined with the and keyword, like above, you will get better performance by putting clauses that only reference a single column first. So, a better way to write the above query would be:

    SigninLogs
    | where TimeGenerated >= ago(7d)
        and Resource == ResourceGroup

Summarizing data

Summarizing is one of the most important tabular operators in KQL, but it also is one of the more complex operators to learn if you are new to query languages in general. The job of summarize is to take in a table of data and output a new table that is aggregated by one or more columns.

Structure of the summarize statement

The basic structure of a summarize statement is as follows:

| summarize <aggregation> by <column>

For example, the following would return the count of records for each CounterName value in the Perf table:

Perf
| summarize count() by CounterName

Because the output of summarize is a new table, any columns not explicitly specified in the summarize statement will not be passed down the pipeline. To illustrate this concept, consider this example:

Perf
| project ObjectName, CounterValue , CounterName
| summarize count() by CounterName
| order by ObjectName asc

On the second line, we are specifying that we only care about the columns ObjectName, CounterValue, and CounterName. We then summarized to get the record count by CounterName, and finally, we attempt to sort the data in ascending order based on the ObjectName column. Unfortunately, this query will fail with an error indicating that ObjectName is unknown. This is because when we summarized, we only included the count_ and CounterName columns in our new table. To fix this, we can simply add ObjectName to the end of our summarize step, like this:

Perf
| project ObjectName, CounterValue , CounterName
| summarize count() by CounterName, ObjectName
| order by ObjectName asc

The way to read the summarize line in your head would be: “summarize the count of records by CounterName, and group by ObjectName”. You can continue adding comma-separated columns to the end of the summarize statement.

Building on the previous example, if we want to aggregate multiple columns at the same time, we can achieve this by adding a comma-separated list of aggregations. In the example below, we are getting a sum of the CounterValue column in addition to getting a count of records:

Perf
| project ObjectName, CounterValue , CounterName
| summarize count(), sum(CounterValue) by CounterName, ObjectName
| order by ObjectName asc

This seems like a good time to talk about column names for these aggregated columns. At the start of this section, we said the summarize operator takes in a table of data and produces a new table, and only the columns you specify in the summarize statement will continue down the pipeline. Therefore, if you were to run the above example, the resulting columns for our aggregation would be count_ and sum_CounterValue.

The KQL engine will automatically create a column name without us having to be explicit, but often, you will find that you will prefer your new column have a friendlier name. To do this, you can easily name your column in the summarize statement, like so:

Perf
| project ObjectName, CounterValue , CounterName
| summarize Count = count(), CounterSum = sum(CounterValue) by CounterName, ObjectName
| order by ObjectName asc

Now, our summarized columns will be named Count and CounterSum.

There is much more to the summarize operator than we can cover in this short section, but I encourage you to invest the time to learn it because it is a key component to any data analysis you plan to perform on your Azure Sentinel data.

Aggregation reference

There are many aggregation functions, but some of the most commonly used are sum(), count(), and avg(). Table A-4 shows the full list.

TABLE A-4 Aggregation Functions

Function        Description
any()           Returns a random non-empty value for the group
arg_max()       Returns one or more expressions when the argument is maximized
arg_min()       Returns one or more expressions when the argument is minimized
avg()           Returns the average value across the group
buildschema()   Returns the minimal schema that admits all values of the dynamic input
count()         Returns the count of the group
countif()       Returns the count with the predicate of the group
dcount()        Returns an approximate distinct count of the group elements
make_bag()      Returns a property bag of dynamic values within the group
make_list()     Returns a list of all the values within the group
make_set()      Returns a set of distinct values within the group
max()           Returns the maximum value across the group
min()           Returns the minimum value across the group
percentiles()   Returns the approximate percentiles of the group
stdev()         Returns the standard deviation across the group
sum()           Returns the sum of the elements within the group
variance()      Returns the variance across the group
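
As a sketch of several of these functions working together, consider the following query against the Perf table from our earlier examples:

Perf
| summarize AvgValue = avg(CounterValue),            // average per group
            MaxValue = max(CounterValue),            // maximum per group
            DistinctCounters = dcount(CounterName)   // approximate distinct count
    by ObjectName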

Adding and removing columns

As you start working more with KQL, you will find that you either have more columns than you need from a table, or you need to add a new calculated column. Let’s look at a few of the key operators for column manipulation.

Project and project-away

Project is roughly equivalent to many languages’ select statements. It allows you to choose which columns to keep. The order of the columns returned will match the order of the columns you list in your project statement, as shown in this example:

Perf
| project ObjectName, CounterValue , CounterName

As you can imagine, when you are working with very wide datasets, you may have lots of columns you want to keep, and specifying them all by name would require a lot of typing. For those cases, you have project-away, which lets you specify which columns to remove, rather than which ones to keep, like so:

Perf
| project-away MG, _ResourceId, Type

Tip

It can be useful to use project in two locations in your queries, both at the beginning as well as the end. Using project early in your query can provide performance improvements by stripping away large chunks of data you don’t need to pass down the pipeline. Using it at the end lets you strip away any columns that were created in previous steps but that you do not need in your final output.
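
Here is a sketch of that pattern; the column choices are just illustrative:

Perf
| project TimeGenerated, CounterName, CounterValue   // early: keep only what we need
| where TimeGenerated >= ago(1h)
| extend DoubledValue = CounterValue * 2
| project CounterName, DoubledValue                  // late: trim the final output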

Extend

Extend is used to create a new calculated column. This can be useful when you want to perform a calculation against existing columns and see the output for every row. Let’s look at a simple example where we create a new column called KBytes by multiplying the MB value by 1,024.

Usage
| where QuantityUnit == 'MBytes'
| extend KBytes = Quantity * 1024
| project ResourceUri, MBytes=Quantity, KBytes

On the final line, in our project statement, we renamed the Quantity column to MBytes so that we can easily tell which unit of measure is relevant to each column. It is worth noting that extend also works with previously calculated columns. For example, we can add one more column called Bytes that is calculated from KBytes:

Usage
| where QuantityUnit == 'MBytes'
| extend KBytes = Quantity * 1024
| extend Bytes = KBytes * 1024
| project ResourceUri, MBytes=Quantity, KBytes, Bytes

Joining tables

Much of your work in Azure Sentinel can be carried out by using a single log type, but there are times when you will want to correlate data together or perform a lookup against another set of data. Like most query languages, KQL offers a few operators used to perform various types of joins. In this section, we will look at the most-used operators, union and join.

Union

Union simply takes two or more tables and returns all the rows. For example:

OfficeActivity
| union SecurityEvent

This would return all rows from both the OfficeActivity and SecurityEvent tables. Union offers a few parameters that can be used to adjust how the union behaves. Two of the most useful are withsource and kind:

OfficeActivity
| union withsource = SourceTable kind = inner SecurityEvent

The parameter withsource lets you specify the name of a new column whose value will be the name of the source table from which the row came. In the example above, we named the column SourceTable, and depending on the row, the value will either be OfficeActivity or SecurityEvent.

The other parameter we specified was kind, which has two options: inner or outer. In the example we specified inner, which means the only columns that will be kept during the union are those that exist in both tables. Alternatively, if we had specified outer (which is the default value), then all columns from both tables would be returned.
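
One common use of withsource, sketched below, is counting how many rows came from each source table:

union withsource = SourceTable OfficeActivity, SecurityEvent
| summarize Count = count() by SourceTable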

Join

Join works similarly to union, except instead of joining tables to make a new table, we are joining rows to make a new table. Like most database languages, there are multiple types of joins you can perform. The general syntax for a join is:

T1
| join kind = <join type>
(
    T2
) on $left.<T1Column> == $right.<T2Column>

After the join operator, we specify the kind of join we want to perform followed by an open parenthesis. Within the parentheses is where you specify the table you want to join as well as any other query statements you wish to add. After the closing parenthesis, we use the on keyword followed by our left ($left) and right ($right) columns separated with a ==. Here’s an example of an inner join:

OfficeActivity
| where TimeGenerated >= ago(1d)
    and LogonUserSid != ''
| join kind = inner (
    SecurityEvent
    | where TimeGenerated >= ago(1d)
        and SubjectUserSid != ''
) on $left.LogonUserSid == $right.SubjectUserSid

Note

If both tables have the same name for the columns on which you are performing a join, you don’t need to use $left and $right; instead, you can just specify the column name. Using $left and $right, however, is more explicit and generally considered to be a good practice.

For your reference, Table A-5 shows a list of available types of joins.

TABLE A-5 Types of Joins

Join type              Description
inner                  One row returned for each combination of matching rows.
innerunique            Inner join with left-side deduplication. (Default)
leftouter/rightouter   For a leftouter join, returns all records from the left table and matched records from the right table; unmatched values will be null.
fullouter              Returns all records from both the left and right tables, matching or not; unmatched values will be null.
leftanti/rightanti     For a leftanti join, returns records from the left table that did not have a match in the right table. Only columns from the left table are returned.
leftsemi/rightsemi     For a leftsemi join, returns records from the left table that had a match in the right table. Only columns from the left table are returned.

Tip

It is best practice to have your smallest table on the left. In some cases, following this rule will give you huge performance benefits, depending on the types of joins you are performing and the size of the tables.

Evaluate

You may remember that in the first KQL example, I used the evaluate operator on one of the lines. The evaluate operator is less commonly used than the ones we have touched on previously. However, knowing how the evaluate operator works is well worth your time. Once more, here is that first query, where you will see evaluate on the second line.

SigninLogs
| evaluate bag_unpack(LocationDetails)
| where RiskLevelDuringSignIn == 'none'
   and TimeGenerated >= ago(7d)
| summarize Count = count() by city
| order by Count desc
| take 5

This operator allows you to invoke available plugins (essentially service-side functions). Many of these plugins are focused on data science, such as autocluster, diffpatterns, and sequence_detect. Some plugins, like R and Python, allow you to run scripts in those languages within your queries.

The plugin used in the above example was called bag_unpack, and it makes it very easy to take a chunk of dynamic data and convert it to columns. Remember, dynamic data is a data type that looks very similar to JSON, as shown in this example:

{
    "countryOrRegion": "US",
    "geoCoordinates": {
        "longitude": -122.12094116210936,
        "latitude": 47.68050003051758
    },
    "state": "Washington",
    "city": "Redmond"
}

In this case, I wanted to summarize the data by city, but city is contained as a property within the LocationDetails column. To use the city property in my query, I had to first convert it to a column using bag_unpack.
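
As an aside, when you only need a single property, an alternative approach (sketched here, not what the original query used) is to extract it with dot notation and extend, rather than unpacking the entire bag:

SigninLogs
| extend city = tostring(LocationDetails.city)   // pull one property from the dynamic column
| summarize Count = count() by city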

Let statements

Now that we have covered many of the major KQL operators and data types, let’s wrap up with the let statement, which is a great way to make your queries easier to read, edit, and maintain.

If you are familiar with programming languages and setting variables, let works much the same way. Let allows you to bind a name to an expression, which could be a single value or a whole query. Here is a simple example:

let daysAgo = ago(7d);
SigninLogs
| where TimeGenerated >= daysAgo

Here, we gave the name daysAgo to the output of the ago() function, which takes a timespan and returns a datetime value. We then terminate the let statement with a semicolon to denote that we are finished. Now we have a new variable called daysAgo that can be used anywhere in our query.

As mentioned earlier, you can wrap a whole query into a let statement as well. Here’s a slight modification on our earlier example:

let daysAgo = ago(7d);
let getSignins = SigninLogs
| where TimeGenerated >= daysAgo;
getSignins

In this case, we created a second let statement, where we wrapped our whole query into a new variable called getSignins. Just like before, we terminate the second let statement with a semicolon and call the variable on the final line, which will run the query. Notice that we were able to use daysAgo in the second let statement. This was because we specified it on the previous line; if we were to swap the let statements so that getSignins came first, we would get an error.

Let statements are very easy to use, and they make it much easier to organize your queries. They truly come in handy when you are organizing more complex queries that may be doing multiple joins.
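
For example, here is a sketch of our earlier OfficeActivity/SecurityEvent join reorganized with let statements:

let timeWindow = ago(1d);
let officeLogons = OfficeActivity
| where TimeGenerated >= timeWindow
    and LogonUserSid != '';
let securityLogons = SecurityEvent
| where TimeGenerated >= timeWindow
    and SubjectUserSid != '';
officeLogons
| join kind = inner (securityLogons) on $left.LogonUserSid == $right.SubjectUserSid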

Suggested learning resources

As you can probably tell, we only scratched the surface on KQL, but the goal here was simply to demystify the basics of the language. In order to keep building your expertise around KQL, we recommend taking an online course and reading through the formal documentation.

The following list of resources is by no means exhaustive; however, the information here will help you create your own custom Azure Sentinel notebooks.

https://aka.ms/KQLDocs [Official Documentation for KQL]

https://aka.ms/KQLFromScratch [Pluralsight Course: KQL From Scratch]

https://aka.ms/KQLCheatSheet [KQL Cheat Sheet made by Marcus Bakker]
