Accelerating URL Filtering with AI/ML
Traditional URL filtering relied mostly on databases and web crawlers to build a catalog of known good and bad websites across the internet. This information was compiled and then sent out to your devices as updates so that you could decide what is allowed for which users. With new websites being spun up almost in real time to trap your users, this approach is no longer enough on its own. Security should be treated like an onion, built up in layers. The legacy approach is still viable even though it is no longer sufficient, so we now layer on additional tools and techniques to cover the gaps each one leaves. The database still exists, and when sites are found through these AI/ML techniques, they are categorized and added to the database for all Palo customers, but only after they have been seen in the wild. Zero-day access is where you want to make sure you also have protection.
With these new AI/ML features we are now able to categorize and block malicious behavior in real time. Nothing is foolproof, so these features need to be combined with the other layers of security tools to capture as much information as possible and make an informed decision, but they allow Palo to be much more agile in its detection and response as things come up. This means your teams spend less time on containment, less time verifying the spread of attacks, and more time on the other projects that are already underway. When it comes to URL filtering in general, you are not looking to babysit but rather to set it and forget it. You want to make sure everything is categorized accurately so that, once you have spent the time setting up your policies, you are not worrying about whether anything is getting through.
Advanced URL Filtering
This is one of what Palo Alto Networks calls Cloud-Delivered Security Services (CDSS). These are subscriptions enabled on your firewall that have a local engine and database but also a connection to the Palo cloud, which allows additional resources to be used for faster and more thorough detections. The Advanced URL Filtering subscription secures your traffic against phishing, malware, and command-and-control attacks and works in conjunction with the other security subscriptions to protect your entire network. The service now uses machine learning to categorize sites and traffic in real time, bridging the gaps where the database has no information on a particular site or where a site is now hosting different content than the last time it was seen. The beauty of Palo's approach to the cloud connectivity of these subscriptions is that when something is detected in one network, the update is pushed out to all customers. Because these engines can also run locally, you are still protected if there is a connection issue with the cloud backend. You are also protected by having all this information shared across all customers.
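The layering described above (check the database first, fall back to a real-time verdict, then feed new findings back into the database) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not how the firewall is implemented: `categorize`, `toy_classifier`, the keyword rule, and the sample hostnames are all hypothetical.

```python
from typing import Callable, Dict


def toy_classifier(hostname: str) -> str:
    """Hypothetical stand-in for a real-time ML verdict.

    A real model inspects far more than the hostname; this keyword
    rule exists only so the example runs end to end.
    """
    return "phishing" if "login-verify" in hostname else "unknown"


def categorize(
    hostname: str,
    db: Dict[str, str],
    classify: Callable[[str], str] = toy_classifier,
) -> str:
    """Return a category, preferring the database over the classifier."""
    host = hostname.lower().rstrip(".")

    # Layer 1: the known-site database answers immediately when it can.
    if host in db:
        return db[host]

    # Layer 2: fall back to a real-time verdict for unseen sites.
    verdict = classify(host)

    # Feed confident verdicts back so the next lookup hits the database.
    if verdict != "unknown":
        db[host] = verdict
    return verdict


# Illustrative database contents (assumed, not real category data).
known_sites = {"example.com": "business"}
print(categorize("example.com", known_sites))
print(categorize("login-verify.example.org", known_sites))
```

The second lookup is not in the database, so the classifier decides, and its verdict is stored for every later request.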
The other side of this coin has nothing to do with malicious activity. Let's say you want to make sure your employees are not able to upload company information to unauthorized storage sites because you need to ensure the security of your data. Often, one of the countless vendors and other companies you work with asks your employees to use "Boxupload.com" instead of OneDrive because they don't use OneDrive. Your teams don't think anything of it because they work in marketing and not IT, so they attempt to share the files that way. This use of unsanctioned services is known as Shadow IT. You can't be everywhere at once, so you need a way to make sure everything is blocked except the sites you allow. Accurate categories are just as important as accurate identification of malicious traffic.
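As a rough sketch of the "block everything except what you allow" idea, the snippet below applies a per-category action with an explicit allowlist for the sanctioned storage site. The category names, the action table, and the `decide` function are assumptions made for illustration; they are not the firewall's actual policy model.

```python
# Hypothetical site-to-category mapping; real categories come from the vendor.
SITE_CATEGORIES = {
    "onedrive.live.com": "online-storage",
    "boxupload.com": "online-storage",
    "example.com": "business",
}

# Sites the company has explicitly approved, regardless of category.
SANCTIONED_SITES = {"onedrive.live.com"}

# Action per category; anything not listed falls through to the default.
CATEGORY_ACTIONS = {
    "online-storage": "block",
    "business": "allow",
}

# Assumption: uncategorized sites are blocked (default deny).
DEFAULT_ACTION = "block"


def decide(hostname: str) -> str:
    """Return 'allow' or 'block' for a requested hostname."""
    host = hostname.lower()
    if host in SANCTIONED_SITES:
        return "allow"
    category = SITE_CATEGORIES.get(host, "unknown")
    return CATEGORY_ACTIONS.get(category, DEFAULT_ACTION)


for site in ("onedrive.live.com", "boxupload.com", "example.com"):
    print(site, "->", decide(site))
```

The policy is only as good as the category data behind it: if "boxupload.com" were miscategorized as a business site, it would be allowed.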
Features
- Caching
  - The size of the firewall you have deployed will determine the size of the caching DB. The cache allows much faster classification of sites that have already been seen. This DB is updated dynamically and does not depend on a file update like the AV or Content and Applications updates.
- SSL Sites
  - If you are not using decryption, the firewall inspects the SSL/TLS handshake and the hostname of the requested site. This hostname is used to match the site to a specific category.
- Inline Machine Learning
  - ML models leverage the cloud as well as WildFire to help with zero-day detections.
- Cloud Connectivity
  - To help maintain the best efficacy, the cloud is used as often as possible for additional lookups and to update the ML models in real time.
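To make the caching feature concrete, here is a minimal sketch of a size-bounded category cache that evicts the least recently used entry and only calls the slower lookup on a miss. The `CategoryCache` class, its capacity, and the `slow_lookup` function are hypothetical; the actual cache design and sizing on the firewall are not described in this article.

```python
from collections import OrderedDict
from typing import Callable


class CategoryCache:
    """Least-recently-used cache in front of a slower category lookup."""

    def __init__(self, capacity: int, lookup: Callable[[str], str]) -> None:
        self.capacity = capacity      # stands in for "size of the firewall"
        self.lookup = lookup          # e.g. a cloud query (assumed)
        self.misses = 0
        self._entries: "OrderedDict[str, str]" = OrderedDict()

    def get(self, hostname: str) -> str:
        host = hostname.lower()
        if host in self._entries:
            # Cache hit: mark as most recently used and answer locally.
            self._entries.move_to_end(host)
            return self._entries[host]

        # Cache miss: ask the slower source, then remember the answer.
        self.misses += 1
        category = self.lookup(host)
        self._entries[host] = category
        if len(self._entries) > self.capacity:
            # Evict the least recently used entry to stay within capacity.
            self._entries.popitem(last=False)
        return category


def slow_lookup(host: str) -> str:
    """Hypothetical stand-in for a cloud lookup."""
    return "business" if host.endswith(".example") else "unknown"


cache = CategoryCache(capacity=2, lookup=slow_lookup)
cache.get("shop.example")
cache.get("shop.example")  # served from the cache, no second lookup
print(cache.misses)
```

A larger capacity means more sites are answered locally, which matches the idea that a bigger firewall gets a bigger cache.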
Business example
The largest threat to any company is often not some state-sponsored actor hacking into your network like in the movies. It is often low-effort phishing or social engineering used against unsuspecting employees. Derrick, in accounting, is tired from working to hit an important deadline and doesn't notice that an email appearing to come from a contact at one of your vendors has a very slight change in the URL for their domain. He responds by going to this site to upload tax forms without realizing he has been phished. Now credentials and private company information are in the hands of someone who can easily impersonate your employees to gather even more information. This happens all the time and is very simple, but it goes under the radar because of the sheer volume of attempts that bad actors can send out using automation. They only need to be right one time for it to be a major problem.
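The "very slight change in the URL" that Derrick missed is the kind of thing a simple lookalike check can flag. The sketch below compares a domain against a list of trusted vendor domains using Levenshtein edit distance. It is only one illustrative heuristic, not a description of how Advanced URL Filtering detects phishing; the domains, the threshold, and the function names are assumptions.

```python
from typing import Iterable, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits from a to b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def find_lookalike(
    domain: str, trusted: Iterable[str], max_distance: int = 2
) -> Optional[str]:
    """Return the trusted domain this one imitates, or None.

    An exact match is the real site, so it is never reported.
    """
    candidate = domain.lower()
    for known in sorted(trusted):
        distance = edit_distance(candidate, known)
        if 0 < distance <= max_distance:
            return known
    return None


# Hypothetical vendor domain, using the reserved .example TLD.
trusted_vendors = {"vendor.example"}
print(find_lookalike("vend0r.example", trusted_vendors))
```

A threshold of one or two edits catches single-character swaps like the one in the story, at the cost of occasional false positives on short, legitimately similar names.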