
Cloud services as exfiltration mechanisms

Written by Airwalk Reply Principal Security Consultant Costas Kourmpoglou

We’re naturally inclined to think about the starting position of an adversary outside of our trust zone. But what if the adversary is already in our trust zone and they want to get (data) out?

Maybe it’s an insider threat or a continuation of the external actor’s path. The ubiquity of cloud storage services helps. Put another way, if you have access to AWS resources, can you use them as an exfiltration mechanism?

I’m not talking about how to get data out of an AWS service, which is where most threat models focus. The question here is: as an adversary, how can I use an AWS service to exfiltrate data? For example, circa 2017, domain fronting techniques started making an appearance in Amazon CloudFront. AWS responded to the issue by enhancing domain security in CloudFront.

As with CloudFront, blocking access to S3 is the equivalent of blocking access to the web: a lot of websites load artefacts directly from S3. Your corporate proxy most likely allows access to any S3 bucket, or maybe there’s an allow list. The trust boundary is your corporate network. I started exploring the threat model and use cases of Amazon S3. For the rest of the article, I’m going to focus on S3 and a specific technique I’ve discovered to use S3 as an exfiltration mechanism.

The setup typically looks like this:

[Diagram: egress proxy, firewall, CASB]
A request made from inside the corporate network would first hit an egress proxy, firewall or CASB. These capabilities might be chained, deployed singly, or bundled into an all-in-one solution. The outcome they seek to achieve is the same: Data Loss Prevention.

A CASB should prevent adversaries from logging on to any Cloud Service Provider accounts other than the corporate ones.

A firewall or an egress proxy should prevent adversaries from establishing connections to arbitrary or otherwise known malicious domains (among other things).

Inside the corporate network, an adversary can have either:

1. Access to S3 with an organisational AWS account, maybe with no permissions or just s3:GetObject
2. Access to S3 anonymously, i.e., can curl S3

There’s a third path: bringing your own credentials, from your own account, inside the corporate network. For now, let’s falsely assume that there are no threats down that path; I might explore it as a topic for another post.

A reminder that our constraint across the board is that we don’t have any permissions or we at best have s3:GetObject. 


The Plan

With our setup complete, let’s establish the fundamentals with an example.

[Diagram: GetObject example]
An attacker has control over RoleA in the victim’s account. RoleA has an explicit deny to every action and every resource. Would RoleA be able to successfully exfiltrate data to their AttackerBucket in a different account?

RoleA will get an AccessDenied; nothing can override an explicit deny statement. Plus, the attacker is GETting an object, which is not a write operation. You’d be inclined to answer that no, RoleA wouldn’t be able to exfiltrate data. On top of that, you might assume, based on our setup, that the proxy/firewall/CASB would stop the request from reaching the attacker’s bucket.
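The intuition that an explicit deny always wins can be sketched with a toy evaluator. This is a deliberately simplified model, not how AWS implements policy evaluation: the `evaluate` function, the statement layout, and the wildcard matching are all assumptions made for illustration. It models only the authorisation decision returned to the caller, and says nothing about whether the request reaches the service.

```python
from fnmatch import fnmatchcase


def evaluate(statements, action, resource):
    """Toy model: explicit Deny beats Allow, and no match means implicit deny."""
    decision = "ImplicitDeny"
    for stmt in statements:
        matches = fnmatchcase(action, stmt["Action"]) and fnmatchcase(
            resource, stmt["Resource"]
        )
        if not matches:
            continue
        if stmt["Effect"] == "Deny":
            # An explicit deny ends the evaluation immediately.
            return "ExplicitDeny"
        decision = "Allow"
    return decision


# Hypothetical policy for RoleA: an allow that is overridden by a blanket deny.
role_a = [
    {"Effect": "Allow", "Action": "s3:GetObject", "Resource": "*"},
    {"Effect": "Deny", "Action": "*", "Resource": "*"},
]

result = evaluate(role_a, "s3:GetObject", "arn:aws:s3:::attackerbucket/some-key")
print(result)
```

Whatever order the statements appear in, the deny determines the response RoleA receives.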

Courtesy of Andy Warfield's amazing USENIX FAST '23 talk

Well, S3 is a complex beast. It has this great feature called server access logging. Server access logging provides detailed records for the requests that are made to a bucket. 

It acts in the same way you’d expect a web server to act, but instead of delivering the request logs under /var/log, you configure another bucket, a logging bucket.

If you’ve ever configured this feature – which you should for specific workloads – you will notice request logs from internal AWS services like IAM Access Analyzer. 
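For readers who haven't configured the feature, here is a minimal sketch of the request that enables it. The bucket names and the `logging_request` helper are hypothetical. The parameter structure is the one accepted by boto3's `put_bucket_logging`, and the call itself is left in a comment because it needs boto3 and valid credentials. The logging bucket also needs a bucket policy that lets the S3 logging service write to it, which is not shown here.

```python
def logging_request(source_bucket, target_bucket, prefix="logs/"):
    """Build keyword arguments for the S3 client's put_bucket_logging call."""
    return {
        "Bucket": source_bucket,
        "BucketLoggingStatus": {
            "LoggingEnabled": {
                "TargetBucket": target_bucket,
                "TargetPrefix": prefix,
            }
        },
    }


# Hypothetical bucket names, for illustration only.
params = logging_request("example-source-bucket", "example-logging-bucket")

# With boto3 installed and credentials configured (not executed here):
#   import boto3
#   boto3.client("s3").put_bucket_logging(**params)
print(params)
```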

attackerbucket [19/Oct/2023:12:48:23 +0000] arn:aws:sts::123456789012:assumed-role/AWSServiceRoleForAccessAnalyzer/access-analyzer [..] - "GET /?location HTTP/1.1" 200 - 137 - 28 - "-" "aws-sdk-java/2.20.162 Linux/4.14.255-318-256.530.amzn2.x86_64 OpenJDK_64-Bit_Server_VM/25.382-b05 Java/1.8.0_382 vendor/Amazon.com_Inc. md/internal exec-env/AWS_Lambda_java8.al2 io/sync http/Apache cfg/retry-mode/legacy" - [..] SHA256 AuthHeader TLSv1.2 - -

You can see the role being used, where it originated from and the user-agent. In this case, we can tell that IAM Access Analyzer is running on Lambda on a Java8 runtime.

In the docs, AWS calls out that

“Amazon S3 does not support the delivery of server access logs to the requester or the bucket owner for VPC endpoint requests when the VPC endpoint policy denies them.”

Does that mean that when we’re outside a VPC (using a corporate firewall), or when the VPC doesn’t have a restrictive endpoint policy, Amazon S3 supports delivery of server access logs? Yes, it does.

The execution

Here’s the adversary’s setup, a logging bucket.

[Diagram: AWS logging bucket]
An attacker can enable S3 server access logging on a bucket they control and exfiltrate data from within a victim's network using a victim's identity, with no permissions, bypassing data perimeter IAM conditions on that identity. The same principle works by simply issuing an anonymous GET request to the 'AttackerBucket'; the difference is how your proxy/firewall/CASB would treat the request. If you only allow authenticated requests made with your AWS organisation’s credentials, an attacker can still reach the S3 Service and use this technique.

What happens when the adversary is inside a VPC? The VPC endpoint policy would need to allow that request to reach the S3 Service.

[Diagram: Amazon S3]
Inside a VPC, the adversary requires a leaky VPC endpoint, i.e. an endpoint that would allow the request to be forwarded to the S3 Service. The adversary doesn’t require any permissions on their IAM Role for the request to reach the S3 Service.

This is the sequence diagram of how server access logging works: 

The S3 Service will receive and evaluate the request. It will return the appropriate response, in our case Access Denied. It will then record the request, collect the logs, and deliver them to the logging bucket. The fact that the requestor doesn’t have IAM access is part of the request evaluation, which will be logged.

When a VPC endpoint sits between the adversary and the S3 Service, it all depends on how the VPC endpoint evaluates the request. If the request never reaches the S3 Service in the first place, there aren’t any logs to generate and deliver. For example, the endpoint policy can list your buckets explicitly or use resource conditions to block access to external resources.
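As an illustration of a resource condition on a VPC endpoint, the sketch below builds an endpoint policy that only allows requests to S3 resources owned by your AWS organisation, using the `aws:ResourceOrgID` global condition key. The organisation ID, the `Sid` and the `endpoint_policy` helper are placeholders, and the policy is an assumption about one reasonable shape rather than a recommended baseline. A policy this strict may also block AWS-owned buckets that your workloads legitimately depend on.

```python
import json


def endpoint_policy(org_id):
    """Return a VPC endpoint policy limited to S3 resources owned by org_id."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowOnlyOrgOwnedResources",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": "*",
                # Requests to buckets outside the organisation do not match
                # this statement, so the endpoint does not forward them.
                "Condition": {"StringEquals": {"aws:ResourceOrgID": org_id}},
            }
        ],
    }


# "o-exampleorgid" is a placeholder, not a real organisation ID.
policy = endpoint_policy("o-exampleorgid")
print(json.dumps(policy, indent=2))
```

With a policy like this, a request to an external bucket is rejected at the endpoint, so S3 never evaluates it and has nothing to log.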

The format of the log entries that the attacker will receive in their logging bucket is well documented by AWS. In our case, an entry would look like this.

[..] attackerbucket [..] - [..] REST.GET.OBJECT PIIorOtherDataToExfilGoesHere "GET /PIIorOtherDataToExfilGoesHere HTTP/1.1" 403 AccessDenied 243 - 18 - "-" "UserAgentAlsoHasData" - [..]

Where 'PIIorOtherDataToExfilGoesHere' is the object key that we’ve requested from S3 and 'UserAgentAlsoHasData' is the User-Agent we’ve set for this request.
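To make the field positions concrete, here is a minimal sketch that splits a server access log entry into named fields. The field order follows the documented log format. The sample line is entirely made up, including the IP address, request ID and host ID, and the tokeniser is a simplification that assumes no quoted field contains a double quote.

```python
import re

# Leading fields of the documented server access log format, in order.
FIELDS = [
    "bucket_owner", "bucket", "time", "remote_ip", "requester", "request_id",
    "operation", "key", "request_uri", "http_status", "error_code",
    "bytes_sent", "object_size", "total_time", "turn_around_time",
    "referer", "user_agent",
]

# A token is a [bracketed] span, a "quoted" span, or a run of non-spaces.
TOKEN = re.compile(r'\[[^\]]*\]|"[^"]*"|\S+')


def parse_entry(line):
    """Map the leading tokens of a log entry to field names."""
    tokens = TOKEN.findall(line)
    return {name: tok.strip('"[]') for name, tok in zip(FIELDS, tokens)}


# Hypothetical entry: every value here is invented for illustration.
sample = (
    "ownerid attackerbucket [19/Oct/2023:12:48:23 +0000] 203.0.113.10 "
    "arn:aws:sts::123456789012:assumed-role/RoleA/session REQID "
    'REST.GET.OBJECT some-object-key "GET /some-object-key HTTP/1.1" '
    '403 AccessDenied 243 - 18 - "-" "example-user-agent" - HOSTID'
)

entry = parse_entry(sample)
print(entry["http_status"], entry["error_code"], entry["key"], entry["user_agent"])
```

The request is denied, yet the key and the User-Agent chosen by the requester are both recorded in the entry.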

An attacker can place arbitrary data in the `Key` and `User-Agent`, and potentially in other S3 GetObject request parameters. The resulting 'Access Denied' server access log entries will contain the attacker-controlled data and will be delivered to the attacker’s server access logging bucket.

With object keys of up to 1,024 bytes per request, this is sufficient to exfiltrate large amounts of data in practice. Keep in mind that the logs are sometimes delivered out of order, but I’ll leave solving that as an exercise for the reader.
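A rough sizing estimate shows the request volume involved, which is also the volume a defender would need to spot in proxy logs. The sketch assumes the payload is base64-encoded so that it fits in a key. That encoding is my assumption and not something the technique requires, and the estimate ignores the User-Agent and any per-request framing such as sequence numbers.

```python
import math

MAX_KEY_BYTES = 1024  # maximum S3 object key length, in bytes


def requests_needed(payload_bytes, key_bytes=MAX_KEY_BYTES):
    """Estimate the GET requests needed to carry payload_bytes in object keys."""
    # base64 turns every 3 payload bytes into 4 key characters.
    payload_per_request = (key_bytes // 4) * 3
    return math.ceil(payload_bytes / payload_per_request)


one_mib = requests_needed(1024 * 1024)
print(one_mib)  # 1366
```

Under these assumptions each request carries 768 bytes of payload, so one mebibyte takes 1,366 requests.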

The summary

In summary, AWS IAM alone is not an effective data perimeter. Without additional network perimeter controls, server access logs can be used as an exfiltration mechanism, because they record request information that can carry arbitrary data and they end up in the bucket owner’s account. In this article, I have shown that if you can reach the S3 Service, then account filtering, bucket filtering, and IAM conditions offer no mitigation for this scenario. The same principle holds for a straightforward HTTP GET request.

I suspect that a similar attack path can be realised in both Azure and GCP.

Summarising our options with a decision tree, which seems to be obligatory to include in my posts.

[Decision tree: using S3 as an exfiltration mechanism]

The decision tree is not exhaustive. It focuses on the technique of using server access logs. You can find the diagrams and YAML here; feel free to expand and iterate, as you should. If you want to map this to ATT&CK, we’re firmly in Exfiltration Over Web Service territory.

If you’re thinking 'I need to review proxy configuration', 'only allow list known S3 buckets', or 'find a list of known S3 buckets à la web of trust', I’d advise that you’re going down the wrong path. The solution sits outside of this threat model. You need to keep people away from the data. This design principle, outlined in the AWS Well-Architected Framework, is key in building secure products and services, not only in AWS but everywhere.

I’d suggest reviewing your threat models and feel free to reach out to discuss any of your security or engineering concerns in AWS or elsewhere.

P.S. I believed that this was a vulnerability and I’ve reported it as such to AWS security. The response summarised that what I’ve reported “[..] is specific to a customer application and/or how an AWS customer has chosen to use an AWS product or service.”, i.e. it’s on the customer side of the shared responsibility model.
