Introduction to XQL: Writing Your First Correlation Rule
Introduction
Recently, I was talking with a colleague about Palo Alto Networks CVE-2024-0012, where an NGFW management interface exposed to the internet or another broadly accessible network, against best practices, could allow an attacker to exploit the administrative web server. In our conversation, I asked whether administrators and SOC analysts were tracking failed login attempts to the NGFW's administrative web interface, and my colleague didn't know that was possible. In this article, we'll continue to build on my last article, "Introduction to XQL: Writing Your First Query," by building an XQL query around failed NGFW logins and creating an alert in the Cortex XDR/XSIAM console.
As a note, simpler use cases might be better served by a BIOC Rule instead. However, BIOC rules are limited to the xdr_data and cloud_audit_log datasets, and to presets built on those datasets. Even though we're starting with a simple XQL Query against a single dataset, we may want to expand this query to datasets covering other critical administrative interfaces, including but not limited to network, out-of-band hardware management, and hypervisor infrastructure.
What is a correlation rule?
Correlation Rules use the Cortex Query Language (XQL) to analyze multi-event correlations from various sources. Scheduled rules trigger alerts based on these correlations, with customizable time frames such as every X minutes, daily, weekly, or at custom times.
Usage requires the Cortex XDR Pro license, and it is important to note that Cortex XDR automatically disables Correlation rules that reach 5,000 or more hits over a 24-hour period.
Forming our hypothesis
First, we'll return to our XQL Query Flow Chart to follow the flow and form our hypothesis. We'll start by simulating a failed login to our NGFW administrative console, and then review the System logs under the Monitor tab, focusing on the relevant events with the log filter ( eventid eq 'auth-fail' ).
Success! We can see the logs with all the pertinent information in the Description field. Our NGFW is set up to forward these System logs into the Strata Logging Service, and our Cortex XSIAM tenant already has this dataset onboarded for us to query.
For our results, we want to structure the query to filter on failed login attempts and extract select fields in the Description column into brand new columns to build our alert.
Writing the query
We'll begin by returning to the XQL IDE and querying the dataset panw_ngfw_system_raw, where the System logs are stored, and then filter the events based on the auth-failed event we saw on the firewall.
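As a starting sketch, the query might look like the following, assuming the failed logins surface in the event_name column with a value of auth-failed (verify the exact value against your own results):
dataset = panw_ngfw_system_raw
| filter event_name = "auth-failed"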
In reviewing the returned results, we notice many columns that aren't relevant to our hypothesis, as well as a new issue: our Prisma Access Portal is also reporting failed login attempts in addition to the failed logins in the management console. We'll keep these results for now with future use cases in mind, but we'll filter them out at the end.
There's a lot of good information in the event_description column that we want to extract into new columns to make analysis easier. We're going to use a new stage in XQL called alter. With the alter stage, we can create or modify fields within a dataset, deriving new columns from existing fields.
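As a quick illustration using columns that already exist in this dataset, an alter stage could derive a new combined column like this (event_summary is just a name chosen for the example):
| alter event_summary = concat(event_name, ": ", event_description)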
In the event_description results, we're interested in extracting the username, the reason, the authentication profile being used, and the source IP address of the failed login. We'll combine the alter stage with arrayindex and regextract to pull the specific elements out of the event_description field using our old friend, RegEx.
In the example result below, the sections we want to capture in our query are the username, the reason, and the source IP address:
failed authentication for user 'hacker'. Reason: Authentication profile not found for the user. From: 192.168.1.4
Let's step through each stage of our new query.
| alter username = arrayindex(regextract(event_description, "for user '(.*?)'"), 0)
Using arrayindex with regextract, we'll use RegEx to look into the event_description results for the literal text for user followed by any sequence of characters, captured by (.*?). Because regextract returns an array of matches, the 0 tells arrayindex to retrieve the first (and, in this case, only) match.
| alter source_ip = arrayindex(regextract(event_description, "From: ([0-9.]+[^.])"), 0)
Continuing to use arrayindex, we'll use RegEx to look into the event_description results for From: followed by an IP address pattern, retrieving the first match at index 0.
Next, we're going to future-proof the query by extracting the authentication profile, so we can differentiate failed management login attempts from failed GlobalProtect and Captive Portal login attempts.
| alter reason = arrayindex(regextract(event_description, "Reason: (.*?)(?:\.| auth profile)"), 0)
| alter auth_profile = arrayindex(regextract(event_description, "auth profile '(.*?)'"), 0)
| alter auth_profile = if(auth_profile = null, "No Auth Profile or Mgmt Login Attempt", auth_profile )
Continuing to use arrayindex, we'll use RegEx to look into the event_description results for Reason: followed by a lazy match that stops at either a period or the phrase auth profile, with the alternation denoted by the pipe character. There is a possibility that auth_profile could return no value, so using the special value null, we'll replace the empty field with No Auth Profile or Mgmt Login Attempt.
Finally, we'll limit the results to select fields: the device serial number using _device_id, the friendly name of the device using _reporting_device_name, our new columns reason, username, source_ip, and auth_profile, and the original dataset columns event_description and event_name.
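As a sketch, the corresponding fields stage might look like this (the column order is only a suggestion):
| fields _device_id, _reporting_device_name, reason, username, source_ip, auth_profile, event_description, event_name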
We can see our new columns and structured results.
For our use case, we're only interested in the failed management logins, so we'll add another filter to exclude the failed Prisma Access Portal login attempts. We'll keep only the results whose reporting device name does not match GP cloud service by using | filter _reporting_device_name !~= "GP cloud service".
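Putting all of the stages together, the full query might look something like this sketch (the initial filter and the fields stage reflect the assumptions above):
dataset = panw_ngfw_system_raw
| filter event_name = "auth-failed"
| alter username = arrayindex(regextract(event_description, "for user '(.*?)'"), 0)
| alter source_ip = arrayindex(regextract(event_description, "From: ([0-9.]+[^.])"), 0)
| alter reason = arrayindex(regextract(event_description, "Reason: (.*?)(?:\.| auth profile)"), 0)
| alter auth_profile = arrayindex(regextract(event_description, "auth profile '(.*?)'"), 0)
| alter auth_profile = if(auth_profile = null, "No Auth Profile or Mgmt Login Attempt", auth_profile)
| filter _reporting_device_name !~= "GP cloud service"
| fields _device_id, _reporting_device_name, reason, username, source_ip, auth_profile, event_description, event_name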
Here is our final result:
Since we're happy with our results, we'll click on the Save As button and select Correlation Rule.
Creating a Correlation Rule
After clicking the Save As button, we'll be presented with a wizard to build the correlation rule. We'll start by entering a rule name and a description under General.
Our XQL Query comes pre-populated in the XQL Search section, and we only need to specify when the query runs. In Cortex XDR, we can run this query on a time schedule such as:
- Every 10 Minutes: Runs at preset 10-minute intervals from the beginning of the hour.
- Every 20 Minutes: Runs at preset 20-minute intervals from the beginning of the hour.
- Every 30 Minutes: Runs at preset 30-minute intervals from the beginning of the hour.
- Hourly: Runs at the beginning of the hour.
- Daily: Runs at midnight, where you can set a particular Timezone.
- Custom: Displays the Time Schedule as Cron Expression fields, where you can set the cron expression in each time field to define the schedule frequency for running the XQL Search. The minimum query frequency is every 10 minutes.
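For example, assuming the Cron Expression fields follow standard cron syntax, setting the minute field to */30 and leaving the remaining fields as * would run the query every 30 minutes.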
With Cortex XSIAM, in addition to a time schedule, we can query the data in real time as it is ingested. Cortex XSIAM can detect that the query is eligible to run in real time and recommend that you select the Real Time tab.
Since our query is eligible for real time, we'll select the Real Time tab.
Next is the Alert Suppression section, where we can define alerts to be suppressed when they contain the same field data within a specific time window, or simply within a specific time window. For our scenario, I'm going to assume that an admin could be having a bad day and simply mistyping their password a few times. Since our Account Policy locks out an account after five failed login attempts in 15 minutes, I'll set the alert suppression on the fields _reporting_device_name, username, and source_ip.
Our final section is Action. We'll select the radio button to Generate an Alert, but we could also save these results to a dataset, or add to or remove from a lookup dataset.
Next, we can define the Alert Name the SOC Analyst will see, and then define the Domain, Severity, and Category of the alerts.
Next, we can define the Alert Description to quickly describe to the SOC Analyst why the alert was triggered. We can use variables from the fields in our XQL Query to dynamically generate the values.
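For example, assuming the description supports the same $field variable syntax used for the Drill-Down Query below, it might read: Failed login for user $username from $source_ip on $_reporting_device_name ($reason).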
Drill-Down Query is an optional XQL Query where you can pivot using the output results for additional information about the alert. An example might be using the $source_ip or $username to continue the investigation. For now, we'll leave this blank to keep things simple.
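Purely as an illustration of the idea (the exact variable substitution syntax is an assumption), a drill-down could re-query the raw System logs for other events mentioning the same source address:
dataset = panw_ngfw_system_raw
| filter event_description contains $source_ip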
Next, we can map our alert to the MITRE ATT&CK framework. If you're new to MITRE, check out the article The MITRE ATT&CK Framework: A Beginner's Guide. For our simple use case, we'll select Valid Accounts in the MITRE Matrix.
Finally, we can map our XQL Query fields in the Alert Fields Mapping section to provide this information quickly to the SOC Analyst. In addition, mapping the fields helps improve incident grouping logic and enables Cortex XSIAM to list the artifacts and assets in the incident based on the mapped fields.
We can then click the Create button to enable the Correlation Rule. You can also disable the rule during maintenance or while improving the XQL Query so it doesn't impact the SOC.
We can test our new Correlation Rule to verify that the rule is generating the alert as expected by trying a bad username and password on our PAN-OS WebUI.
Conclusion
Correlation Rules are exactly how we can take the XQL Queries we generate during threat hunting and turn the results into actionable incidents that SOC Analysts can work and investigate. By starting with our simple use case and hypothesis, we were able to generate the exact data we needed from the datasets to quickly present and alert our SOC on potentially malicious behavior.
Stay tuned for the next part of our series where we'll use an XQL Query to create a dashboard for reporting.