The AI Summit New York

Winning versus the Cyber adversary: AI to the rescue

Deloitte's Keynote Speech

Join Mike Morris, Managing Director – Risk & Financial Advisory, and Abdul Rahman, AVP – AI CoE, Deloitte, in this deep-dive panel discussion as they explore winning AI strategies in the cybersecurity space.

Introduction and Overview

Mike Morris
So the goal of the session today is to walk you through some of the problems that we see in the cyberspace. We'll set that stage up front, take you through kind of how AI can be applied to the cybersecurity approach, and then some of the lessons learned that we've observed over the last year and a half of creating some technologies.

Cybersecurity Threats and Challenges

Mike Morris
So, a couple of concepts to consider. Actions in cyberspace are near instantaneous without regard to geography. So the attacker has the advantage in cybersecurity. They don't have rules.

They can pivot from country to country while the defenders have their hands tied because they do have limitations as to what they can defend, and they can only defend their own perimeter. That makes it challenging for the defense to ever achieve any true form of security.

If you take a look through history, we've seen analogs similar to this in the past where you had airplanes or air power in the military, as well as submarines and ships. In order for them to be attacked, you had to attack them where they were, right?

The Role of AI in Cyber Defense

Mike Morris
As we continue to evolve, technology solutions such as radar and sonar have come online. What we propose essentially and what we're going to talk about is we're looking to use artificial intelligence in cyber to help push the game in the same way that sonar and radar did.

The Challenges of Data Overload and Alert Fatigue

Mike Morris
A couple of things to consider for cyber. One, there are a plethora of cybersecurity tools already out there in the space. There's no shortage of data. The problem with cyber is getting through the data.

It's a very manual process in most cases today. And many of the operators or analysts that are sitting in defense end up having alert fatigue. They can't keep up with the amount of alerts that are being generated off of the products that are there.

There's increasing threat complexity. So attackers are getting smarter, and they have all the same tools that the defenders do. So they get to test and figure out ways around those different security products.

So you can't just rely on security products alone to stop everything. I've already talked about alert fatigue. The time to detect and respond becomes a challenge. In many cases for service providers like Deloitte, our service level agreements will say that if we see an attacker or a critical event come off one of those products, we have 30 minutes for a human to go through and do an initial triage, or identify whether it's malicious or an anomaly.

In that case, the attacker has the advantage. They've got 30 minutes to maneuver, and they can maneuver pretty quickly. And so by the time you get through a full triage, it becomes very manual. There's a lack of visibility in cyber.

So CISOs and defense teams don't fully understand their entire threat landscape. They also don't understand in many cases the threats or the actors that would be coming after them based off of their motive.

Talent is scarce. I think there's a shortage of around 2 million cybersecurity personnel, and that number changes every day. Meanwhile, there are a couple hundred thousand detections of attacks after the fact, so it's very reactive in nature. And then there are high false positive rates. So when we take a look at the trends in cybersecurity over the past several years, we keep seeing the attackers increase their knowledge.

If you take a look at 2020, 60% of attacks were using malware or malware-identified signatures. If you move up to 2021, that reduced down to 49%. And then as you get even more recent, it's at about 32%.

Attackers are manipulating operating systems better than they were able to before, and they're hiding in plain sight. In many cases, they're using the security products themselves to allow for them to maneuver inside of the environment.

Reducing False Positives with AI

Abdul Rahman
There are opportunities to leverage AI to do a lot of things, including reducing the false positive rate that you see, which really addresses that alert fatigue.

Essentially, there's more data and more events occurring than there are people to process them. So operating at machine speed to keep up with the adversary requires quite a bit more. The final point is that things that are known are typically codified within rules.

To detect things that are new, or what we'll call zero days, requires behavior-based techniques.

Client Expectations and Automation in Cybersecurity

Mike Morris
Meanwhile, we talked about how the adversary has the advantage over the defender today, based off of the manual requirements for security. However, clients are looking for outcomes.

They're looking to defense teams and service providers, in Deloitte's case, or in the case of the organization, they may have their own security team. They're looking for the defenders to reduce the mean time to prevent, the mean time to detect, and the mean time to respond.

So as we went through this journey in over the last 18 months or so, what we've taken a look at is, how can we take care of some of those manual processes? How can we identify lateral movement quicker?

How can we focus on unknown-unknown identification and hook that up with things like orchestration or automation, to pivot to the endpoint or the network resource and get collection that would normally take a human 30 minutes, and drive that down to minutes?

And so that essentially was the premise that we set out to go and achieve. In order to do that though, we needed to get the data into one spot. So it's great to have 15 different products that do different things and all supplement each other, but the reality is those products in many cases store their data in a separate database that the operators and analysts can't get to.

Data Lakes and Predictive AI

Mike Morris
So what we did is we created our own data lake, essentially for storage capacity. So on AWS and GCP, we store clients' data in individual S3 buckets, and then we run analytics over top of each of those so that we can get to trend analysis.

So think of verticals. If I have 12 retail clients, I should be able to take a look over the course of three or five years, however much data I have, and start to identify what trends I'm seeing in specific verticals.

To do that though, storage is one component. We also need to be able to process and read the different databases in order to get to analytics.
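
To make the data-lake idea above concrete, here is a minimal sketch of per-client log storage in S3 and a simple cross-client trend rollup, assuming boto3 and pandas; the bucket naming scheme, key layout, and event schema are illustrative assumptions, not Deloitte's actual implementation.

```python
# Minimal sketch: per-client log storage in S3 plus a simple monthly trend rollup.
# Bucket names, key layout, and the event schema are hypothetical.
import json
import boto3
import pandas as pd

s3 = boto3.client("s3")

def store_events(client_id: str, day: str, events: list) -> None:
    """Write one day's normalized events into the client's own bucket."""
    s3.put_object(
        Bucket=f"cyber-lake-{client_id}",      # one bucket per client
        Key=f"events/{day}.json",
        Body=json.dumps(events).encode("utf-8"),
    )

def monthly_alert_trend(client_ids: list) -> pd.DataFrame:
    """Count events per month across the clients in a vertical (e.g. retail)."""
    rows = []
    for client_id in client_ids:
        resp = s3.list_objects_v2(Bucket=f"cyber-lake-{client_id}", Prefix="events/")
        for obj in resp.get("Contents", []):
            body = s3.get_object(Bucket=f"cyber-lake-{client_id}", Key=obj["Key"])["Body"].read()
            for event in json.loads(body):
                rows.append({"client": client_id, "month": event["timestamp"][:7]})
    return pd.DataFrame(rows).groupby(["month", "client"]).size().unstack(fill_value=0)
```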

Abdul Rahman
There's a ton of data, and one of the key things we want to do with this data is be able to predict when an attack will occur. And you're not going to be able to do this in every case.

So for adversaries that come into your network that are utilizing a certain tactic or technique or procedure or TTP, they may exploit a certain protocol, they may exploit a certain application, they may go after a specific application that operates at a certain time of day.

When they do that, it leaves like what we call a signature. And those types of signatures are what are called indicators of compromise or IOCs. And so what we want to do is collect data and we want to construct or codify a capability that will train over that data and then be able to identify with a high accuracy and high precision the type of behavior there.

So this is basically setting the stage for something like machine learning or artificial intelligence, which will come in and be able to read through all of the data, apply the model, and then be able to say, yes, we have in fact seen this behavior.

Now moving into step four: people that are data scientists, AI subject matter experts, or mathematicians want to build models and then be able to pass off alerts so that analysts can actually use them.

So once they receive the alerts, the key is what are we going to do with this information? So if the alert says we saw a type of behavior that we've never seen before, where it looks like someone is going to compromise this part of the network, the analysts go do something about it. That workflow is really how the AI and ML experts interact with the cybersecurity experts.
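
As a rough illustration of that hand-off, the sketch below scores a batch of events with an already-trained model and queues anything suspicious for analyst triage. The model file, feature columns, and alert fields are hypothetical; this is the general shape of the workflow, not the speakers' actual pipeline.

```python
# Illustrative hand-off from model to analyst: score events, emit alerts.
# Model file name, columns, and threshold are assumptions.
import joblib
import pandas as pd

model = joblib.load("behavior_model.joblib")   # a previously trained classifier

def score_and_alert(events: pd.DataFrame, threshold: float = 0.9) -> list:
    """Return alert records for events the model flags as likely malicious."""
    probs = model.predict_proba(events[model.feature_names_in_])[:, 1]
    alerts = []
    for (_, event), p in zip(events.iterrows(), probs):
        if p >= threshold:
            alerts.append({
                "host": event["host"],
                "behavior": event["technique_guess"],
                "confidence": float(p),
                "recommended_action": "analyst triage",
            })
    return alerts
```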

So let's dig a little bit deeper in terms of volumes of data and magnitude. In a lot of cases what we'll call cyber is a place where you have a large volume, high velocity of data with very, very specific variety.

I'm characterizing a data space where data is usually highly structured to semi-structured. But what does this really mean? This means in a lot of ways that you know what the labels are.

When you ingest data from a bunch of different sources, you have to achieve alignment for all of the labels of the data in order to be able to build a model for the phenomena we're trying to look for.

And in AI, a phenomenon like that is really called a feature. So what we're trying to do is take the labeled data that we have and be able to build a model that will predict that kind of phenomenon, or basically build a kind of feature.

So we engineer these features based on the data that we collect. Now, because the volume is here, we have certain things that we want to achieve. We want to achieve a low false positive rate. We want very, very fast detection times.
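
A hedged sketch of what that feature engineering might look like over aligned flow logs follows; the column names and derived features (off-hours activity, distinct destination ports, and so on) are illustrative assumptions rather than a real schema.

```python
# Sketch of feature engineering over aligned flow logs.
# Column names (timestamp, src_ip, bytes_out, dst_port, dst_ip) are assumptions.
import pandas as pd

def engineer_features(flows: pd.DataFrame) -> pd.DataFrame:
    """Derive per-source-host features that a detector could train on."""
    flows = flows.copy()
    flows["hour"] = pd.to_datetime(flows["timestamp"]).dt.hour
    flows["off_hours"] = (~flows["hour"].between(8, 18)).astype(int)
    features = flows.groupby("src_ip").agg(
        total_bytes_out=("bytes_out", "sum"),
        distinct_dst_ports=("dst_port", "nunique"),
        off_hours_ratio=("off_hours", "mean"),
        connection_count=("dst_ip", "count"),
    )
    return features.reset_index()
```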

So it's no good if we see an alert at 9am, only process it at 3pm, and then hand it over at 4pm. We want to be able to see the alert and, within minutes, hand it off and say yes, something bad is happening on the network.

So the idea here is that if you get 10 terabytes of logs every single day, and you can reduce the volume of that, something like taking a logarithm, or be able to point an analyst in an appropriate direction, you've essentially done a kind of reduction.

I don't want to say a dimensional reduction, but really an order of magnitude or two on the types of events that are critical for analysts to see. And this helps with workflow. So like I said earlier, it's difficult because there's more events than there are analysts.
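
One simple way to picture that order-of-magnitude reduction is to keep only the highest-scoring events per host before anything reaches an analyst, as in the sketch below; the score column, threshold, and top-N cutoff are assumptions for illustration.

```python
# Sketch: shrink a day's events to the slice analysts should look at first.
# The model_score column, threshold, and per-host cap are illustrative.
import pandas as pd

def reduce_for_triage(events: pd.DataFrame, top_n_per_host: int = 5,
                      min_score: float = 0.8) -> pd.DataFrame:
    """Keep only the top-scoring suspicious events for each host."""
    suspicious = events[events["model_score"] >= min_score]
    reduced = (suspicious.sort_values("model_score", ascending=False)
                         .groupby("host", group_keys=False)
                         .head(top_n_per_host))
    print(f"{len(events):,} events in, {len(reduced):,} out "
          f"(~{len(events) / max(len(reduced), 1):.0f}x reduction)")
    return reduced
```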

So helping them in any way, shape, or form, pointing them to the type of behavior or the time of day or the type of event is really critical. The other thing that I did want to add just briefly, there's this notion of dwell time.

The Importance of Reducing Dwell Time

Abdul Rahman
Dwell time in a network is the time from when a threat is deployed to the time that it's detected. And in 2022, the average dwell time for a threat to go undetected in a network at a minimum was 24 to 30 days.

In some cases, where there are many attacks on different government installations or commercial spaces, this goes into the months, six months, or even possibly a year. So being able to deploy AI to detect some of these indicators of compromise from behavior-based sources, and being able to bring the data together, offers a lot of value in addition to rules-based techniques for different tools.
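
For reference, dwell time as defined above is just the elapsed time between deployment and detection; a minimal worked example with made-up timestamps:

```python
# Dwell time: time from when a threat is deployed to when it is detected.
# Timestamps are made up for illustration.
from datetime import datetime

deployed = datetime(2022, 3, 1, 2, 15)
detected = datetime(2022, 3, 27, 9, 40)
dwell_days = (detected - deployed).total_seconds() / 86400
print(f"Dwell time: {dwell_days:.1f} days")   # ~26.3 days, inside the 24-30 day range cited
```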

Mike Morris
I think the only thing I would add is where we're trying to go and where cyber needs to move as a whole, is drive to response activities because the adversary right now maneuvers with freedom inside of the environments that they're exploiting.

So the real focus in order to change the game of cyber, specifically with AI, is to reduce the time for collection so that a human operator can get into an active engagement or a cyber knife fight. That's really the way that the industry needs to move.

Abdul Rahman
So, if you can ingest the data fast and then score a model over it and provide a prediction that gives an alert, that's really the kind of workflow that will cut down and save human calories. So, if you can take those human calories and apply them toward something that's been reduced or given a direction through this AI, then that would be a lot more efficient than combing through a lot more data.

Many individuals have heard of what we'll call, affectionately, the cyber kill chain. And this has been, I would say, implemented in a couple different ways, but one way that the industry has really adopted is the MITRE ATT&CK framework.

It really supplies the notion of thinking from the left, from when an attack starts, how it propagates, and then finally to the completion of it. And what I'd like to also add as a footnote is that on the bookends of the MITRE ATT&CK framework, there's a lot of behavior that occurs on the network.

Within the MITRE ATT&CK framework, inside of that, there's a lot of activity that occurs on the endpoint. And this should really direct the types of characteristic data that's needed in order to do holistic detection.

You do need network-based detection for things like initial foothold or exfiltration or command and control, but a lot of activity that's occurring within the actual endpoint itself has to be done with something like an EDR.

As you collect this data and you build up this profile, there's different members of, say, Mike's team, for example, that will want to look for certain characteristics that are seen in threat intelligence.

So what does this mean? That means we saw a variant of Petya. We saw a variant of WannaCry. We saw some other variant out there for which no rule exists. So just some introductory material as another footnote.

Whenever there's a known threat that comes from threat intelligence, we process the threat intelligence, we understand what the port and protocol is, we write a rule, we deploy it within the firewall or IDS, and that's essentially a countermeasure.

So threat intelligence that exists today corresponds to known threats and is deployed as countermeasures. You need those to be able to stop things that you know. But the moment that rule is slightly perturbed or varies based upon a new variant, it will no longer supply the kind of protection you need.
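
A toy sketch of that limitation: an exact-match rule stops the IOC that threat intelligence already knows about, but a slightly perturbed variant slips straight past it, which is the gap behavior-based detection is meant to cover. The hash values and function are purely illustrative, not a real rule format.

```python
# Toy illustration of an exact-match countermeasure and its blind spot.
# Hashes are placeholders; real rules live in firewalls/IDS, not Python.
KNOWN_IOC_HASHES = {"0123456789abcdef0123456789abcdef"}   # hypothetical known-bad hash

def rule_based_block(file_hash: str) -> bool:
    """Blocks only what threat intelligence has already codified."""
    return file_hash in KNOWN_IOC_HASHES

print(rule_based_block("0123456789abcdef0123456789abcdef"))  # True: known threat stopped
print(rule_based_block("0123456789abcdef0123456789abcde0"))  # False: slight variant gets through
```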

So this behavior-based capability, coupled with hunting, describes the ways that AI can, if you will, be utilized by hunting teams.

This is sort of a general aggregation of a lot of different opportunities to use AI around logs and events to support the security analyst teams. One of the things that we're going to need to take into account is knowing what is a threat versus not a threat.

And that is essentially what in AI is called building a classifier. So we use different techniques to build different kinds of classifiers. In some cases, is it a threat or not? It might be a binary class, but there may be other considerations.
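
A minimal sketch of such a binary threat/not-threat classifier, assuming a labeled feature table like the one engineered earlier; the file name, label column, and model choice are illustrative, not the speakers' production setup.

```python
# Minimal binary threat/not-threat classifier sketch (scikit-learn assumed).
# The training file, columns, and model choice are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

labeled = pd.read_parquet("labeled_events.parquet")     # hypothetical labeled dataset
X = labeled.drop(columns=["is_threat"])
y = labeled["is_threat"]                                # 1 = threat, 0 = benign

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

clf = RandomForestClassifier(n_estimators=200, class_weight="balanced")
clf.fit(X_train, y_train)
```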

In addition, we want to be able to work with folks to be able to improve the quality of our models. So models are typically measured on different kinds of metrics. These can be accuracy, precision, F1, recall, et cetera.

And as they're deployed, Mike and his team will say we missed one, we missed two, or there's a false positive rate. And what that becomes is the opportunity for us to acquire more data and retrain the model and then redeploy it.

That interval of retraining, retesting, and revalidating varies per network. There is no one tried-and-true rule. But what I will say is that if you wait too long, the model's predictive power becomes stale.

Adversaries adapt. So it's necessary to retest and retrain AI. And this is part of the life cycle. Are there ways to do this automatically? Yes.
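
A hedged sketch of that evaluate-and-retrain cycle: score recent analyst-confirmed outcomes with the metrics mentioned above and retrain once precision drops below some agreed floor. The precision threshold, and the assumption that the expanded labeled set is already assembled, are illustrative.

```python
# Sketch of a retrain-when-stale loop. The precision floor and the assumption
# that X_all/y_all already include newly labeled data are illustrative.
from sklearn.metrics import precision_score, recall_score, f1_score

def evaluate_and_maybe_retrain(clf, X_recent, y_recent, X_all, y_all,
                               min_precision: float = 0.9):
    """Check recent performance; retrain if the model has gone stale."""
    preds = clf.predict(X_recent)
    precision = precision_score(y_recent, preds)
    recall = recall_score(y_recent, preds)
    print(f"precision={precision:.2f} recall={recall:.2f} "
          f"f1={f1_score(y_recent, preds):.2f}")

    if precision < min_precision:      # adversary behavior has drifted past the model
        clf.fit(X_all, y_all)          # retrain on the expanded labeled set
        print("Model retrained; ready to redeploy.")
    return clf
```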

I want to talk a little bit about an example: unknown unknowns. Maybe about 11 or 12 years ago, there were a lot of ideas about detecting unknown threats. And these are the equivalent of what are called zero days.

So remember what I said: if you have threat intelligence, and you come up with a rule, and you have a countermeasure, that's for a known threat. So the idea here is, how do you build an AI that will predict, or look at something in general, or pick up a kind of behavior about something that's not been seen before, but still looks suspicious?

So we've built a model, and we've deployed it with a couple of different customers that Mike's team is working with. And we've done this by building out anomaly detection over different kinds of flows.

We have deployed and built this through some autoencoding technology.
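
In the same spirit as the approach described above, though not the deployed model itself, here is a small autoencoder sketch for flow features: train it on (mostly) normal traffic and treat high reconstruction error as a sign of unfamiliar, potentially zero-day behavior. PyTorch, the layer sizes, and the scoring choice are all assumptions.

```python
# Illustrative autoencoder for flow-feature anomaly detection (PyTorch assumed).
# Architecture, training loop, and scoring are a sketch, not the deployed system.
import torch
import torch.nn as nn

class FlowAutoencoder(nn.Module):
    def __init__(self, n_features: int, bottleneck: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(),
                                     nn.Linear(32, bottleneck), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(bottleneck, 32), nn.ReLU(),
                                     nn.Linear(32, n_features))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def train(model, normal_flows: torch.Tensor, epochs: int = 20):
    """Fit on mostly-benign flows so the model learns what 'normal' looks like."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(normal_flows), normal_flows)
        loss.backward()
        opt.step()

def anomaly_scores(model, flows: torch.Tensor) -> torch.Tensor:
    """Per-flow reconstruction error: high error = behavior the model hasn't seen."""
    with torch.no_grad():
        return ((model(flows) - flows) ** 2).mean(dim=1)
```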

Collaborative Cybersecurity and Final Thoughts

Q&A

You mentioned all the variants and fluidity in this environment; things are happening all the time. Deloitte's built an expertise around this, but is there a larger body or forum that shares information about all the cyber threats that you guys can tap into? Kind of like it takes a village to attack this, as opposed to doing it company by company, sector by sector?

Mike Morris
So they have what they call ISACs. So, for example, the financial sector has an ISAC where all the financial firms will come together. They'll share information about what they're seeing. They'll do the same thing in retail.

So what you have is very industry-specific, but there's really no overarching governance. Data will go back to DHS. People can subscribe to DHS to receive some data feeds, but it's not really tied in in the way that it should be.

That's a maturation of the environment. Everyone sees that as a hurdle, and it is starting to be worked through, but yeah, they do have individual ISACs. There is one for healthcare: the H-ISAC.

My question is mostly around the interfacing of the business teams and the ML teams. I think we have a very similar approach at Bank of New York as well, where we are trying to do operational trade and only situations. The one challenge we'd really like to hear more about is how you create a pipeline of feedback that feeds into the model, because the languages or the toolkits used by operational teams or the business teams tend to be very disparate from the toolkits or the pre-processing required by the ML models. We've typically been taking a batch approach, but if you have any ideas on how to make it more continuous and free-flowing, we'd love to hear that.

Abdul Rahman
On batch-based feeds for data: let's break up the problem in terms of data, data management, data engineering, application of business rules, and then really the scoring of models and the alerting side.

The whole data engineering, data management side in terms of the life cycle of AI and ML represents about 70 to 80% of a lot of the major muscle movements for data scientists today, especially in cyber.

However, there's only a finite number of feeds, approximately 30 data types that are relevant to cyber. So knowing that and understanding that, and understanding what those labels are on the data, you can actually build models appropriately and then be able to utilize certain tools to be able to feed that data in.

The question becomes, what's an acceptable timing delay and timing advance on data that's ingested to support your capability? And I think that comes on a customer by customer basis. So if they have a temporal requirement to respond within four hours, but data is only refreshed every day, there's obviously going to be, if my math's correct, a mismatch there on the order of about six, right?
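
One hedged way to make that feedback loop more continuous than a daily batch is to fold analyst dispositions back into labeled training rows on a short, rolling cadence; the table columns, the verdict-to-label mapping, and the hourly schedule below are all assumptions, not a recommendation specific to either organization.

```python
# Sketch: turn analyst verdicts into labeled training rows on a rolling cadence.
# Column names, the verdict mapping, and the cadence are assumptions.
import pandas as pd

DISPOSITION_TO_LABEL = {"true_positive": 1, "false_positive": 0}

def feedback_batch(dispositions: pd.DataFrame, features: pd.DataFrame) -> pd.DataFrame:
    """Join analyst verdicts onto the model's features to produce new labeled rows."""
    labeled = dispositions.merge(features, on="alert_id", how="inner")
    labeled["is_threat"] = labeled["analyst_verdict"].map(DISPOSITION_TO_LABEL)
    return labeled.dropna(subset=["is_threat"])

# Running this hourly from a scheduler, and appending each output to the
# training store used for the next retrain, keeps the loop closer to continuous
# than one large end-of-day batch.
```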

Find Out More

Explore prices and passes for the 2024 edition and secure your spot: Prices and Passes

Or check out the highlights from the 2023 event to see what you can be part of: The AI Summit New York 2023 Highlights
