AWS CloudFront
CloudFront access logs record every request to your site - including the user-agent string that identifies AI crawlers like GPTBot, ClaudeBot, and PerplexityBot. By giving sitefire read-only access to these logs, we can show you exactly which AI bots visit which pages, how often, and how that changes over time.
Time needed: ~15 minutes. Two steps: enable logging, create an IAM role.
How It Works
CloudFront writes a gzip-compressed log file to S3 for every batch of requests. Each line includes the URL path, timestamp, and user-agent. sitefire assumes a read-only IAM role in your account, syncs new log files, filters for AI bot user-agents, and surfaces the insights in your dashboard.
This is the same cross-account IAM pattern used by Datadog, New Relic, and other SaaS tools. No credentials are shared. You stay in full control.
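The filtering step described above can be sketched in a few lines of Python. This is only an illustration, not sitefire's actual pipeline: it assumes the W3C output format (a `#Fields:` header followed by tab-separated rows), assumes user-agent values arrive percent-encoded, and uses a hypothetical marker list `AI_BOT_MARKERS` and helper name `count_ai_bot_hits`.

```python
import gzip
import io
from collections import Counter
from urllib.parse import unquote

# Assumption: substrings used to spot AI crawlers; illustrative, not exhaustive.
AI_BOT_MARKERS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def count_ai_bot_hits(gz_bytes: bytes) -> Counter:
    """Count (bot, path) pairs in one gzip-compressed W3C-style log file."""
    hits = Counter()
    fields = []
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.rstrip("\n")
            if line.startswith("#Fields:"):
                # The header row names the columns in order.
                fields = line[len("#Fields:"):].split()
                continue
            if not line or line.startswith("#"):
                continue  # skip blank lines and other header rows
            row = dict(zip(fields, line.split("\t")))
            # Decode %20 and similar escapes before matching.
            agent = unquote(row.get("cs(User-Agent)", "")).lower()
            for bot in AI_BOT_MARKERS:
                if bot.lower() in agent:
                    hits[(bot, row.get("cs-uri-stem", ""))] += 1
                    break
    return hits
```

Reading the column order from the `#Fields:` header, rather than hard-coding positions, keeps the sketch working when a distribution logs a different set of fields.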
Step 1: Enable CloudFront Logging
If your distribution already has logging enabled, skip to Step 2.
New setup (v2)
Standard Logging v2 (launched November 2024) is the recommended option for new setups. It delivers logs to S3 without requiring bucket ACLs, and the console handles all permissions automatically.
Open the Logging tab
Go to CloudFront > Distributions > select your distribution > Logging tab > click Add.
Configure S3 delivery
- Select Amazon S3 as the destination
- Choose or create an S3 bucket (e.g., yourcompany-cf-logs)
- Optionally set a prefix
- Output format: any format works - we recommend JSON or W3C
- Field selection: make sure cs(User-Agent) is included (it is by default)
The console automatically creates the required S3 bucket policy. No manual permission setup needed.
Save
Logs start appearing in your bucket within a few minutes.
Step 2: Create an IAM Role for sitefire
This role grants sitefire read-only access to your log bucket - nothing else.
Before you start, email support@sitefire.ai to get your Account ID and External ID. You’ll need both values for the trust policy below.
Create a new IAM role
Go to IAM > Roles > Create Role > select Custom trust policy.
Paste the following trust policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::SITEFIRE_ACCOUNT_ID:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "UNIQUE_EXTERNAL_ID"
        }
      }
    }
  ]
}
Replace SITEFIRE_ACCOUNT_ID and UNIQUE_EXTERNAL_ID with the values you received from us.
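If you prefer to fill in the placeholders programmatically, a minimal sketch along these lines could generate the trust policy and catch a leftover placeholder before you paste it. The function name `render_trust_policy` is hypothetical, and the 12-digit check reflects the standard AWS account ID format.

```python
import json
import re


def render_trust_policy(sitefire_account_id: str, external_id: str) -> str:
    """Return the trust policy JSON with both placeholders filled in."""
    # AWS account IDs are exactly 12 digits; this also rejects an
    # unreplaced placeholder such as "SITEFIRE_ACCOUNT_ID".
    if not re.fullmatch(r"\d{12}", sitefire_account_id):
        raise ValueError("account ID should be exactly 12 digits")
    if not external_id or external_id == "UNIQUE_EXTERNAL_ID":
        raise ValueError("external ID is missing or still a placeholder")
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": f"arn:aws:iam::{sitefire_account_id}:root"
                },
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"sts:ExternalId": external_id}
                },
            }
        ],
    }
    return json.dumps(policy, indent=2)
```

Calling `print(render_trust_policy("123456789012", "example-external-id"))` with your real values would produce JSON ready for the Custom trust policy editor; the two arguments shown here are made-up examples.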
Click Next.
Attach a permission policy
Click Create policy (opens in a new tab), switch to the JSON editor, and paste:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::YOUR-LOG-BUCKET",
        "arn:aws:s3:::YOUR-LOG-BUCKET/*"
      ]
    }
  ]
}
Replace YOUR-LOG-BUCKET with your bucket name from Step 1.
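To see why the policy lists two resources, the sketch below models the statement with a deliberately simplified check. It is not IAM's real evaluation logic (it ignores Deny statements, conditions, and action wildcards); the bucket name and the helper `is_allowed` are assumptions for illustration only.

```python
from fnmatch import fnmatchcase

# Placeholder standing in for YOUR-LOG-BUCKET; purely illustrative.
BUCKET = "yourcompany-cf-logs"

STATEMENT = {
    "Effect": "Allow",
    "Action": ["s3:GetObject", "s3:ListBucket"],
    "Resource": [f"arn:aws:s3:::{BUCKET}", f"arn:aws:s3:::{BUCKET}/*"],
}


def is_allowed(action: str, resource_arn: str) -> bool:
    """Simplified check: the action is listed and the ARN matches a pattern."""
    action_ok = action in STATEMENT["Action"]
    resource_ok = any(
        fnmatchcase(resource_arn, pattern) for pattern in STATEMENT["Resource"]
    )
    return action_ok and resource_ok


# Listing targets the bucket ARN itself.
print(is_allowed("s3:ListBucket", f"arn:aws:s3:::{BUCKET}"))
# Reading targets an object ARN under the bucket.
print(is_allowed("s3:GetObject", f"arn:aws:s3:::{BUCKET}/logs/example.gz"))
# Writes are not listed, so they fall through to the default deny.
print(is_allowed("s3:PutObject", f"arn:aws:s3:::{BUCKET}/logs/example.gz"))
```

s3:ListBucket applies to the bucket ARN, while s3:GetObject applies to object ARNs, which is why both the bare bucket and the `/*` form appear under Resource.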
Name the policy something like sitefireLogReaderPolicy, then save it.
Go back to the role creation tab, refresh the policy list, and attach the policy you just created. Click Next.
Name and create the role
Name the role sitefireLogReader (or similar) and click Create role.
Send us the details
Email support@sitefire.ai with:
- The Role ARN (e.g., arn:aws:iam::123456789012:role/sitefireLogReader)
- Your S3 bucket name and prefix (if any)
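Before sending the Role ARN, a quick format check can catch copy-paste mistakes. The sketch below assumes the standard `aws` partition (roles in other partitions, such as GovCloud, use a different prefix); `parse_role_arn` is a hypothetical helper, not part of any AWS SDK.

```python
import re

# 12-digit account ID, then "role/" followed by an optional path and the role name.
ROLE_ARN_RE = re.compile(r"arn:aws:iam::(\d{12}):role/([A-Za-z0-9_+=,.@/-]+)")


def parse_role_arn(arn: str) -> tuple[str, str]:
    """Split a role ARN into (account_id, role_name_with_path) or raise."""
    match = ROLE_ARN_RE.fullmatch(arn.strip())
    if match is None:
        raise ValueError(f"not a valid IAM role ARN: {arn!r}")
    return match.group(1), match.group(2)


account_id, role_name = parse_role_arn(
    "arn:aws:iam::123456789012:role/sitefireLogReader"
)
```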
That’s it. We configure the sync on our end, and you’ll see AI bot data in your dashboard within a few hours.