Introduction to XQL: Writing Your First Query
In this blog
Introduction
Cortex XDR and XSIAM used a query language named XQL, which stands for XDR Query Language. With XQL, we can form complex queries against data stored in Cortex XDR/XSIAM from various datasets, data models and stories. XQL has similar functions to SQL, such as join and union, but XQL queries are written in stages instead of statements. XQL queries are also non-destructive, and you don't need to worry about altering the data from the queries. The results of XQL are displayed in table format or graph format. XQL results can be used in custom widgets, dashboards, correlation rules and Behavioral Indication of Comprise (BIOC) rules or as part of your threat-hunting investigations. As part of the parsing rules for custom datasets to fit your needs.
The following flowchart is a high-level view of how an XQL Query is formed:
The Query Builder
The Query Builder is a guided search menu that assists you with basic queries. You can investigate alerts quickly by using the pre-built query lookup to search datasets using templates such as identity, endpoints, network, cloud, basic and free text search.
The templates are an entry to XQL Queries to search for data based on common searches quickly.
In addition to the guided searches, we can write our own XQL queries from scratch using the built-in Integrated Developer Environment or IDE. The IDE will assist you with defining the query using color coding for ease of reading and autocomplete.
The IDE can also assist us in correcting mistakes in the query statement that would cause the execution to fail by hovering the mouse over the error or warning.
Forming our hypothesis
For our scenario, we've been tasked by senior SOC Analysts to gather all the DNS queries made by our fictional user, Stanley Hudson, by a senior SOC Analyst. The Network Operations team offered to search the traffic and URL Logs of our NGFW, but due to the placement of the user's endpoint and the internal Active Directory server hosting DNS, the complete picture of these requests wouldn't be available in the NGFW URL logs alone. Our goals for this request are listed below:
- We need to search the endpoint logs for all DNS requests made by Stanley Hudson.
- We must include the DNS Request, DNS Record Type and DNS Resolution results.
- We need to export the results in an Excel document, attach it to the incident for archival and include a chart for quick analysis.
- We've been asked to save this query for future use by the other SOC analysts.
Defining the dataset
Since the XDR Agent is installed on all endpoints where Stanley Hudson can logon, we can use the default dataset to search DNS requests from his account. We'll start by defining the dataset as xdr_data
. The IDE's autocomplete function will also display other datasets available to us.
After specifying the dataset, we can use the bottom half of the IDE to view its schema or available columns. As of this article, at least 987 columns could contain data for our query.
Since we're looking for specific data about DNS, we can use the column filters to limit the rows displayed. We can also use the description column to view what each row's results could contain.
After reviewing the field descriptions, we're interested in the results from dns_query_name
, dns_resolutions
and dns_query_type
for our investigation.
Defining the filters
With XQL, there is a style guide to make the queries easier to read at each stage. Each stage is separated by the pipe (|
) character. We'll start the query with the fields we're interested in such as agent_hostname
, user_id
, dns_query_name
, dns_resolutions
and dns_query_type
.
If we click the run button, we'll notice that some results are returned without results in the columns that we're interested in.
To limit results, we'll use a special keyword named null
to return only results that are not blank in the dns_query_name
column.
As we review the results this time, we see that they include all endpoints including those that do not have a user identified.
For our next stage, we'll add another stage to limit the rows with the username ad\stanley.hudson
in the dataset.
At this stage, our results include the expected results from our query. We can download the results in a tab-separated file for archival by clicking the download button highlighted in green.
Like adding comments in code or scripts, we can include comments in the XQL Query to explain to other SOC Analysts the stages of the query using the double slash (//
) character set.
This is especially important as you create more complex queries or when troubleshooting stages.
Visualize the results
At this stage, we could scroll through the over 3,500 results and try to analyze them; however, it would be easier to visualize this table in a pie chart.
We can use a special stage named comp
to count results in each row of the dns_query_name
column into attempts based on the same request. The as
function renames the column into attempts
.
As we review the results table this time, the table is getting closer to our final idea. However, we think it would be more beneficial to reset the data based on the number of attempts.
We can update our query to sort the attempts column by the descending value by using a another stage called sort
.
At this stage, our results are sorted by attempts
of the dns_query_name
column for ease of analysis.
Finally, we use the Graph option of the results to build a pie chart based on our dns_query_name
and attempts
columns to visualize the results even further.
Saving the query
We can save this query in our Cortex tenant by clicking the Save As
button, and selecting the Query to Library
option. We can save the query in our personal library or share with others.
Clicking on the Query Library tab allows us to reuse or re-visit the query. Other queries from Palo Alto are also made available, and you can reuse or borrow query snippets to use in your XQL Queries.
Conclusion
Cortex XQL offers an intuitive and powerful way to query and analyze data within the Cortex datasets. Its straightforward syntax, intuitive IDE, and extensive functionality make it accessible for beginners to quickly meet our objectives of querying the dataset that matched our hypothesis, limiting the results using stages, exporting the results for archival and quickly creating a pie chart for analysis.
Stay tuned for the next part of our series where we'll use an XQL Query to create a BIOC Rule using XSIAM's data models.