An aggregator site is a niche, frequently-updated source of clean RSS items. That happens to be exactly what AI builders need to feed their agents fresh data. If you’re already running one, you have a paid product sitting half-built on your server.
Our co-founder Jean Galea wrote about this angle recently on WP Mayor (Your WordPress Aggregator Site Is Now an AI Product), making the case for why aggregator owners have a second revenue line they probably haven’t noticed. This post picks up where his left off, with the actual code: how to gate the feed, deliver it in formats agents can use, plug it into a working AI pipeline, and price it.
What does an AI agent actually need from an RSS feed?
Most public RSS feeds are useless to AI agents. They’re stale, they cover too much ground, and the field shapes drift between items. The agent ends up doing more work parsing the feed than it does answering the user.
Your aggregator site solves all three problems before the agent ever sees the URL. The feed updates every 30 minutes (so the data is fresher than a foundation model’s knowledge cutoff by months or years), the sources you’ve added are a hand-picked list inside one specific niche, and every item passes through the same import pipeline so the fields line up the same way every time. An agent can parse it with one library call and move on.
You probably built all of that accidentally on day one. The work is already done. The question is how to package it.
How does a WordPress aggregator site become an agent data source?
WP RSS Aggregator already handles the hard part. Every feed you’ve added is a vetted publication, and that list of publications is the part competitors can’t copy. Per-source keyword filters drop the noise before items even hit your database, so what makes it through is what an agent would want. And under Settings → Custom Feed, the plugin exposes a single URL (defaulting to /wprss) that combines everything into one Atom feed. That URL is what you sell.

For this article, I built a quick test site with five sources, 15 most-recent items per pull, all served as a clean Atom feed. Nothing fancy, just the plugin doing what it does.

And here’s what the feed itself looks like when you hit the URL directly:

Predictable, well-formed XML. Any agent (or any tool that talks to one) can parse it with a one-line library call.
What the plugin doesn’t do natively is gate that feed. Anyone with the URL can read it. That’s fine for a public news widget, but it’s a problem the moment you want to charge anyone for access. So you need a thin layer of access control in front of it.
How do you gate access so the feed can be sold?
There are two reasonable places to put this layer. You can run it as a Cloudflare Worker in front of your site, which keeps the gating logic out of WordPress entirely. Or you can drop a tiny mu-plugin into your install that intercepts the feed request before WordPress assembles a response. The contract for the customer is identical either way: they append ?token=THEIR_TOKEN to the feed URL, the gate validates it, and the feed comes back if the token is good.

Path A: Cloudflare Worker (recommended if you already use Cloudflare)
Put a Worker in front of feeds.yourdomain.com/*. Origin is your WordPress site. Tokens live in a KV namespace, so revoking a customer is a single kv:key delete command.
// cloudflare-worker-token-gate.js
export default {
  async fetch(request, env, ctx) {
    const url = new URL(request.url);
    const token = url.searchParams.get('token');
    if (!token) return json({ error: 'missing token' }, 401);

    const raw = await env.TOKENS.get(token);
    if (!raw) return json({ error: 'invalid token' }, 401);

    // Token records are JSON objects: { customer, expires }. The rate-limit
    // counters below share this namespace, so anything that doesn't parse to
    // an object (such as an "rl:..." counter key replayed as a token) is rejected.
    let meta = null;
    try { meta = JSON.parse(raw); } catch { /* malformed record */ }
    if (meta === null || typeof meta !== 'object') {
      return json({ error: 'invalid token' }, 401);
    }
    if (meta.expires && meta.expires < Math.floor(Date.now() / 1000)) {
      return json({ error: 'token expired' }, 401);
    }

    // 60 req/min per token, in fixed one-minute windows. KV is eventually
    // consistent and this read-then-write isn't atomic, so the cap is approximate.
    const rlKey = `rl:${token}:${Math.floor(Date.now() / 60000)}`;
    const count = parseInt((await env.TOKENS.get(rlKey)) || '0', 10);
    if (count >= 60) return json({ error: 'rate limit: 60 req/min' }, 429);
    ctx.waitUntil(env.TOKENS.put(rlKey, String(count + 1), { expirationTtl: 120 }));

    // Strip token before proxying upstream
    url.searchParams.delete('token');
    const originUrl = new URL(env.ORIGIN_URL);
    originUrl.search = url.search;
    const upstream = await fetch(originUrl.toString(), {
      headers: { 'User-Agent': 'aggregator-feed-gate/1.0' },
      cf: { cacheTtl: 60, cacheEverything: true },
    });

    // Stamp the customer ID and keep shared caches from storing the response.
    const headers = new Headers(upstream.headers);
    headers.set('X-Feed-Customer', meta.customer ?? 'unknown');
    headers.set('Cache-Control', 'private, max-age=60');
    return new Response(upstream.body, { status: upstream.status, headers });
  },
};

function json(body, status) {
  return new Response(JSON.stringify(body), {
    status,
    headers: { 'Content-Type': 'application/json' },
  });
}
The token gets stripped from the URL before the Worker proxies the request upstream. That keeps customer secrets out of your WordPress access logs, which matters because logs end up in support tickets, in screenshots, in error reports, and in places you don’t fully control.
Each token is rate-limited to 60 requests per minute. Enough headroom for a normal polling agent, and not enough to let one customer take your origin down by accident.
Every response carries an X-Feed-Customer header with the customer ID. When something breaks and you need to know which customer was making which requests, having that already in your logs is the difference between a 30-second lookup and an evening of correlation work.
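The Worker above reads each KV value as JSON with a customer field and an expires field (Unix seconds). Here’s a minimal sketch of minting such a record in Node. The issueToken name and its ttlDays parameter are my own illustration, not part of the plugin or of Cloudflare’s tooling, and writing the pair into KV is left to whatever you already use.

```javascript
const { randomBytes } = require('node:crypto');

// Hypothetical helper: mint a random token plus the metadata the gate reads.
// ttlDays = 0 means "no expiry" (the gate skips the check when expires is falsy).
function issueToken(customer, ttlDays = 0, nowMs = Date.now()) {
  const token = randomBytes(32).toString('hex'); // 64 hex characters
  const expires = ttlDays > 0 ? Math.floor(nowMs / 1000) + ttlDays * 86400 : 0;
  return { token, value: JSON.stringify({ customer, expires }) };
}

// Example: a 7-day trial token for a customer labelled "acme-trial".
const issued = issueToken('acme-trial', 7);
console.log(issued.token, issued.value);
```

The token is the KV key and the JSON string is the value, so revoking a customer stays a single key delete.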
Path B: WordPress mu-plugin (no Cloudflare needed)
If you’d rather not stand up a separate edge layer, this drop-in mu-plugin does the same token check from inside WordPress. It hooks parse_request and rejects requests to your custom-feed slug that don’t have a valid token, so the rejection happens before WordPress wastes effort assembling a feed for a request it’s about to throw away. Unlike the Worker, it doesn’t rate-limit, so that has to be handled separately. No activation step needed; just drop it into wp-content/mu-plugins/.
<?php
/**
 * Plugin Name: Aggregator Feed Token Gate
 */
defined( 'ABSPATH' ) || exit;

const AGGREGATOR_FEED_SLUG = 'wprss';

add_action( 'parse_request', function ( $wp ) {
    // Only gate the custom-feed slug; everything else passes through untouched.
    if ( ( $wp->request ?? '' ) !== AGGREGATOR_FEED_SLUG ) return;

    $presented = isset( $_GET['token'] )
        ? sanitize_text_field( wp_unslash( $_GET['token'] ) ) : '';
    if ( $presented === '' ) return aggregator_feed_reject( 401, 'missing token' );

    // Stored as [ token => [ 'customer' => ..., 'expires' => ... ] ].
    $tokens = get_option( 'aggregator_feed_tokens', [] );
    if ( ! is_array( $tokens ) ) $tokens = [];

    $match = null;
    foreach ( $tokens as $stored => $meta ) {
        // Constant-time comparison; the cast covers numeric-looking array keys.
        if ( hash_equals( (string) $stored, $presented ) ) {
            $match = $meta;
            break;
        }
    }
    if ( $match === null ) return aggregator_feed_reject( 401, 'invalid token' );
    if ( ! empty( $match['expires'] ) && $match['expires'] < time() ) {
        return aggregator_feed_reject( 401, 'token expired' );
    }

    // Strip the token so nothing downstream sees it.
    unset( $_GET['token'], $_REQUEST['token'] );
    $_SERVER['QUERY_STRING'] = ltrim( preg_replace(
        '/(^|&)token=[^&]*/', '', $_SERVER['QUERY_STRING'] ?? ''
    ), '&' );

    header( 'X-Feed-Customer: ' . ( $match['customer'] ?? 'unknown' ) );
    header( 'Cache-Control: private, max-age=60' );
}, 1 );

// Prefixed so it can't collide with another plugin's function of the same name.
function aggregator_feed_reject( $code, $msg ) {
    status_header( $code );
    header( 'Content-Type: application/json' );
    echo wp_json_encode( [ 'error' => $msg ] );
    exit;
}
Note that the token comparison uses hash_equals rather than a plain ==. A regular string comparison can leak token contents through timing side channels, and PHP ships hash_equals as a constant-time alternative for precisely this reason. The presented token also runs through sanitize_text_field before anything else touches it, because tokens arrive from the open internet and the only safe assumption is that someone, somewhere, will eventually send something nasty.
Things that look like a gate but aren’t
A few things look like gates but don’t really hold up. IP allowlists tend to fall apart because customers move and CI runners rotate, so you end up maintaining allowlists instead of building product. Referer-header checks are trivially spoofable, and many agents don’t send a real Referer anyway, so you’d risk blocking legitimate traffic. And keeping the URL secret only works as long as nobody screenshots their dashboard or pastes the URL into a Slack thread. Once a feed URL is public, it stays public.
How do you deliver the feed in a format agents can actually consume?
Ship the gated RSS or Atom feed first. It’s what your plugin already produces, and every workflow tool worth using speaks RSS natively. n8n, Make, Zapier, IFTTT, and any custom script your customer has lying around will work without modification.
If you want to make life easier for the LLM-side tools, add a JSON Feed mirror at a parallel URL. Most modern agents would rather parse JSON than XML, and a small Worker route can do the conversion on the fly. Follow the JSON Feed 1.1 spec and existing JSON Feed clients will pick it up without any custom adapter.
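To give a feel for the conversion, here’s a rough sketch that maps an Atom document onto JSON Feed 1.1 fields. The function names are mine, and the regex extraction is a stand-in for a real XML parser: it assumes simple, well-formed entries and double-quoted attributes, so treat it as an illustration of the field mapping rather than something to ship as-is.

```javascript
// Decode the few XML entities that commonly appear in feed text.
function decodeEntities(s) {
  return s
    .replace(/&lt;/g, '<').replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"').replace(/&apos;/g, "'")
    .replace(/&amp;/g, '&');
}

// Text content of the first <tag>...</tag> in the fragment, or undefined.
function textOf(xml, tag) {
  const m = xml.match(new RegExp(`<${tag}(?:\\s[^>]*)?>([\\s\\S]*?)</${tag}>`));
  if (!m) return undefined;
  const raw = m[1].trim();
  const cdata = raw.match(/^<!\[CDATA\[([\s\S]*)\]\]>$/);
  return cdata ? cdata[1] : decodeEntities(raw);
}

// href of the first <link> in the fragment (a fuller version would prefer rel="alternate").
function linkOf(xml) {
  const m = xml.match(/<link\b[^>]*\bhref="([^"]*)"/);
  return m ? decodeEntities(m[1]) : undefined;
}

function atomToJsonFeed(atomXml, feedUrl) {
  const head = atomXml.split(/<entry[\s>]/)[0]; // feed-level metadata only
  const entries = atomXml.match(/<entry[\s>][\s\S]*?<\/entry>/g) || [];
  return {
    version: 'https://jsonfeed.org/version/1.1',
    title: textOf(head, 'title') || 'Untitled feed',
    feed_url: feedUrl,
    items: entries.map((e) => ({
      id: textOf(e, 'id'),
      url: linkOf(e),
      title: textOf(e, 'title'),
      content_html: textOf(e, 'content') || textOf(e, 'summary') || '',
      date_published: textOf(e, 'published') || textOf(e, 'updated'),
    })),
  };
}
```

Atom timestamps are already RFC 3339, which is what JSON Feed expects for date_published, so they pass through unchanged.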
And then there’s MCP, which is the format I’d reach for first if your customer is building on Claude Desktop or any MCP-aware agent. An MCP server lets the customer skip the workflow tooling entirely. They add your server to their config and your feed becomes an attachable resource in any chat. Here’s the minimal version, about thirty lines:
// mcp-server.js: minimal MCP server exposing a gated feed
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { ListResourcesRequestSchema, ReadResourceRequestSchema }
  from '@modelcontextprotocol/sdk/types.js';

const FEED_URL = process.env.FEED_URL;
const FEED_TOKEN = process.env.FEED_TOKEN;

const server = new Server(
  { name: 'aggregator-feed', version: '1.0.0' },
  { capabilities: { resources: {} } }
);

server.setRequestHandler(ListResourcesRequestSchema, async () => ({
  resources: [{
    uri: 'aggregator://feed/latest',
    name: 'Latest fashion industry feed',
    description: 'Fresh items from the fashion aggregator (15 most recent).',
    mimeType: 'application/atom+xml',
  }],
}));

server.setRequestHandler(ReadResourceRequestSchema, async () => {
  const url = new URL(FEED_URL);
  url.searchParams.set('token', FEED_TOKEN);
  const res = await fetch(url, { headers: { 'User-Agent': 'aggregator-mcp/1.0' } });
  if (!res.ok) throw new Error(`Feed returned ${res.status}`);
  return {
    contents: [{
      uri: 'aggregator://feed/latest',
      mimeType: 'application/atom+xml',
      text: await res.text(),
    }],
  };
});

await server.connect(new StdioServerTransport());
From the customer’s side, they never see RSS or JSON or workflow plumbing. They just get fresh feed items inside their conversation when they need them.
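For context on the “add your server to their config” step: MCP clients such as Claude Desktop read server definitions from a JSON config under an mcpServers key. The sketch below just prints an entry in that shape. The helper name, the file path, and the token value are placeholders the customer would swap for their own.

```javascript
// Build the entry a customer would merge into their MCP client config.
// Every value passed in is a placeholder; the mcpServers / command / args / env
// shape is the part the client reads.
function mcpConfigEntry(serverPath, feedUrl, feedToken) {
  return {
    mcpServers: {
      'aggregator-feed': {
        command: 'node',
        args: [serverPath],
        env: { FEED_URL: feedUrl, FEED_TOKEN: feedToken },
      },
    },
  };
}

console.log(JSON.stringify(
  mcpConfigEntry('/path/to/mcp-server.js', 'https://feeds.yourdomain.com/wprss', 'THEIR_TOKEN'),
  null, 2
));
```

The token sits in the env block rather than in the feed URL, so it reaches the server as FEED_TOKEN without appearing in anything the customer is likely to paste elsewhere.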
How do you connect the feed to an actual AI workflow?
n8n makes the cleanest demo because each step in the pipeline shows up as a labelled box on a canvas. The workflow below polls your gated feed every 30 minutes, dedupes the items by ID, sends each new one to Claude for a tight one-paragraph summary, and drops the result into a Slack channel.

The feed token lives in an n8n credential, not in the URL field on the HTTP Request node. Workflows get shared as screenshots all the time, in tutorials, in support threads, in pitch decks. A credential reference stays masked in those screenshots. A token in a URL field does not.
The system prompt to Claude is also marked with cache_control: { type: "ephemeral" }. Anthropic’s prompt cache means you pay the full system-prompt cost on the first call and roughly a tenth of that on every subsequent call inside the cache window. For a workflow processing a hundred items a day, that’s the equivalent of about eleven full sends instead of a hundred (1 + 99 × 0.1 ≈ 11), provided every call lands inside the cache window. Real numbers depend on your model and prompt length, so spot-check the math in Anthropic’s prompt-caching docs before you build a business case on it.
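Here’s a small sketch of both halves of that: the shape of a Messages API request body with the system prompt marked cacheable, and the arithmetic behind the savings. The function names, the batchSizes parameter, and the max_tokens value are my own choices for illustration, and the model name is left to the caller. The arithmetic takes the one-tenth read cost from the paragraph above and ignores any extra charge for writing to the cache, so it’s a rough estimate only.

```javascript
// Request body for Anthropic's Messages API with a cacheable system prompt.
function buildSummaryRequest(model, systemPrompt, itemText) {
  return {
    model,
    max_tokens: 300, // arbitrary cap for a one-paragraph summary
    system: [
      { type: 'text', text: systemPrompt, cache_control: { type: 'ephemeral' } },
    ],
    messages: [{ role: 'user', content: itemText }],
  };
}

// System-prompt cost expressed in "full sends". Each batch is a run of calls
// that all land inside one cache window: the first call is a miss (cost 1),
// the rest are cache reads (cost readMultiplier each).
function equivalentFullSends(batchSizes, readMultiplier = 0.1) {
  return batchSizes.reduce(
    (sum, n) => sum + (n > 0 ? 1 + (n - 1) * readMultiplier : 0), 0);
}

console.log(equivalentFullSends([100]).toFixed(1));             // one window, 100 calls
console.log(equivalentFullSends(Array(48).fill(2)).toFixed(1)); // 48 polls, 2 items each
```

The second line is the case worth checking against your own workflow. If the cache window is shorter than the polling interval, every poll starts with a miss, and 96 calls spread across 48 half-hourly polls cost closer to 53 full sends than 11.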
Workflow JSON is in the downloads at the bottom of the post. Import it, plug in two credentials, hit run.
How do you price and package this?
The simplest packaging is a flat monthly subscription for one feed URL with a usage cap, something like 10,000 requests a month. Easy to sell, easy to bill. A predictable monthly line item is generally what a buyer wants out of an infrastructure feed. Anchor the price on what they’d otherwise have to build in-house: maintaining a list of 30 vetted publications is at least an hour of dev time a month, so $200 isn’t a stretch.
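To sanity-check a cap like that against real polling behaviour, the arithmetic is short enough to script. This sketch assumes a 30-day month and a single client polling at a fixed interval; the function names are illustrative.

```javascript
const MINUTES_PER_MONTH = 30 * 24 * 60; // 43,200, assuming a 30-day month

// Requests one polling client makes in a month at a fixed interval.
function monthlyRequests(intervalMinutes) {
  return Math.floor(MINUTES_PER_MONTH / intervalMinutes);
}

// Tightest polling interval (in minutes) that stays within a monthly cap.
function minIntervalForCap(cap) {
  return MINUTES_PER_MONTH / cap;
}

console.log(monthlyRequests(30));      // 1440
console.log(minIntervalForCap(10000)); // 4.32
```

A client polling every 30 minutes uses well under a fifth of a 10,000-request cap, and it would have to poll about every four minutes all month to hit it.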
If a customer’s usage really is unpredictable, you can move them to per-request or per-item metering, but only if their usage is what’s driving your hosting bill. Metering adds real operational complexity: real meters, real reconciliation, real edge cases. It usually isn’t worth that overhead until you have enough volume that the alternative is leaving money on the table.
Where revenue starts to get interesting is the agency white-label arrangement. One agency builds 20 client agents, wants a single bill, a single contract, and 20 tokens issued on demand. A 3x-to-5x markup over the per-feed price is a reasonable starting point, though you’ll want to calibrate based on what an agency actually pays for similar infrastructure. This shape rarely shows up as a first deal, but it’s the one that scales once it does.
| Packaging | Best for | Operational lift | Typical entry price |
|---|---|---|---|
| Per-feed subscription | Most first customers, predictable usage | Low (flat billing) | $99 to $300 per month |
| Per-request or per-item metering | Customers whose usage is genuinely spiky | High (real metering and reconciliation) | Variable, sometimes a base fee plus overage |
| Agency white-label | Agencies provisioning many client agents | Medium (token issuance, SLA) | 3x to 5x the per-feed price, usually annual |
Whichever shape you go with, what you’re charging for is the curation. Your customer is paying for your taste in publications, and for the months of reading they’d otherwise have to do themselves to develop that taste. The bandwidth bill is a rounding error.
What are the common mistakes building this?
A few traps to watch for as you build this out. Most are easy to avoid if you know they’re there.
Watch your CDN cache keys. If ?token=A and ?token=B hit the same cache key, customer B can end up with customer A’s response and your audit headers will be wrong about who made the request. Either include the token in the cache key, or set Cache-Control: private at the gate so the response never gets cached publicly. The Worker example above does the second, which is usually simpler.
Think about deduplication early. Aggregator can serve the same item twice when two of your sources cross-post a piece. The fashion test feed I built for this article already has a few duplicates in it from exactly that. Either canonicalize on item GUID upstream, or document that customers should dedupe on id+link on their side. The n8n workflow above does it downstream as a safety net, but you don’t really want every customer rediscovering the issue independently.
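The customer-side version of that dedupe is only a few lines. A minimal sketch, assuming each parsed item exposes id and link properties (those names depend on whichever feed parser the customer uses):

```javascript
// Keep the first occurrence of each item, keyed on id + link.
function dedupeItems(items) {
  const seen = new Set();
  return items.filter((item) => {
    const key = `${item.id}\n${item.link}`; // newline can't appear in either field
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

const unique = dedupeItems([
  { id: 'a1', link: 'https://example.com/post', title: 'Original' },
  { id: 'a1', link: 'https://example.com/post', title: 'Cross-post' },
  { id: 'b2', link: 'https://example.com/other', title: 'Different' },
]);
console.log(unique.map((i) => i.title));
```

This only dedupes within a single pull. A polling agent would also need to persist the seen keys between runs.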
Don’t ship without a rate limit. One enthusiastic customer polling every 5 seconds can tank your origin and your hosting bill quietly. The Worker caps each token at 60 requests per minute, which is generous for normal polling and tight enough that abusive use stands out in the logs.
Check whether your item IDs leak your editorial roadmap. If they’re sequential and a customer sees id=4823, they can poll for id=4824 before you’ve published it. Aggregator uses UUIDs by default, so this is mostly a problem if you’ve customized your IDs to be predictable for some other reason. Hash them at the gate if you’ve gone that route.
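One way to do that hashing is a keyed hash (HMAC), so the mapping is stable but can’t be reproduced without a secret. This is a sketch using Node’s crypto module; in a Worker the Web Crypto API would be the equivalent. The opaqueId name, the truncation to 32 hex characters, and the secret handling are all assumptions of mine.

```javascript
const { createHmac } = require('node:crypto');

// Replace a predictable ID with a keyed hash. The same input always maps to
// the same output, so customers can still dedupe on it, but without the secret
// nobody can derive the opaque ID for "4824" from having seen "4823".
function opaqueId(rawId, secret) {
  return createHmac('sha256', secret)
    .update(String(rawId))
    .digest('hex')
    .slice(0, 32);
}

// Placeholder secret; in practice this would come from a secret store.
console.log(opaqueId(4823, 'replace-with-a-real-secret'));
console.log(opaqueId(4824, 'replace-with-a-real-secret'));
```

Rotating the secret changes every ID at once, which would break customers’ dedupe state, so it’s worth treating the secret as long-lived.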
Set up an audit log from day one, even if it feels like overkill. The moment a customer asks why their feed didn’t deliver something specific, you’ll want a record of what they actually requested and what came back. The X-Feed-Customer header the gate stamps is enough to start with; pipe Worker logs into whatever log aggregator you already use.
Finally, watch how you talk about the product. The temptation is to sell the feed itself, the technology, the gate, the workflow. Those are the easy things to demo, but they’re not what someone is paying for. They’re paying for the curation, the source list, the years of judgement that went into picking which thirty publications matter. A customer can build an RSS aggregator in a weekend. They can’t build that.
How do you test the demand in a week?
You can run the whole demand-validation loop in a working week. The temptation is to skip the people-talking step at the end, but that’s the only step that actually validates anything, so resist.
Start with the niche, and pick one you already aggregate for. Don’t invent a new aggregator to chase a market you don’t know. If you’re running a fashion news site, your niche is fashion industry intelligence, full stop. Pivoting to “AI agent feeds in general” sounds like a bigger market, but it strips away the only edge you have: the curation you’ve already done.
Once the niche is locked, deploy the gate, issue yourself a token, and verify with curl that the gated feed returns the same Atom your unguarded feed does. If it doesn’t, you have a configuration problem to fix before talking to anyone else about anything.
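If you’d rather script that check than eyeball curl output, something like the sketch below works on Node 18 or later. The verifyGate name and the injectable fetchImpl parameter are my own; the URL and token in the usage line are placeholders. It confirms the gate rejects missing and invalid tokens and serves something feed-shaped for a good one. It doesn’t diff the body against the ungated feed, so that comparison is still worth doing by hand.

```javascript
// Smoke-test the gate: no token and a bad token should get a 401,
// a good token should return something that looks like a feed.
// fetchImpl is injectable so the check can be exercised without a network.
async function verifyGate(feedUrl, goodToken, fetchImpl = fetch) {
  const withToken = (t) => {
    const u = new URL(feedUrl);
    if (t !== null) u.searchParams.set('token', t);
    return u.toString();
  };
  const none = await fetchImpl(withToken(null));
  const bad = await fetchImpl(withToken('definitely-not-a-token'));
  const good = await fetchImpl(withToken(goodToken));
  const body = good.status === 200 ? await good.text() : '';
  return {
    rejectsMissing: none.status === 401,
    rejectsInvalid: bad.status === 401,
    servesFeed: good.status === 200 && /<(feed|rss)[\s>]/.test(body),
  };
}

// Usage (placeholder URL and token):
// verifyGate('https://feeds.yourdomain.com/wprss', 'THEIR_TOKEN').then(console.log);
```

All three flags should come back true before anyone else gets a token.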
Then write a landing page. Not a marketing site. One short page: a paragraph that describes the curated feed, what it’s good for, the price, and an email address for a demo token. Post the link in two or three communities where your buyer actually hangs out. For fashion that might be a Slack group, a subreddit, and a couple of the busier LinkedIn AI-agent communities. For SEO, try r/SEO, Indie Hackers, and one of the MCP server directories.
Issue a free 7-day token to anyone who emails. Watch your audit log to see whether they actually poll the feed; a request for a token isn’t the same signal as actual usage. If you get zero emails after two or three days, either the niche is wrong or the pitch is wrong. Iterate the page, or pick a different niche, before deciding the whole idea is dead.
Then get on a call with anyone who took a token. Ask them one question: what were you going to do with this if it worked? The answer is what your actual product is. The feed is just the wrapper.
If you’re already running an aggregator site, the technical build is hours of work, not weeks. The real question is whether one specific group of AI builders wants what you’ve already been curating. The only way to find that out is to ask them.
The next step
WP RSS Aggregator has been the data layer for niche publishers for ten years. The thing that’s new is the buyer: AI agents and the people building them now want exactly that kind of data layer, fresh and pre-curated, and they’re willing to pay for it.
The technical build is around fifty lines of code. The pricing page is a paragraph. The hardest part of this whole thing is picking the right niche, and if you’re reading this, you did that years ago.
If you want the broader business case for why this window is open right now, our co-founder Jean Galea wrote it up on WP Mayor: Your WordPress Aggregator Site Is Now an AI Product. This post is the build manual for the case he made there.
If you’ve already tried selling a feed to AI builders, or you’re sitting on a niche aggregator and wondering whether to bother, drop a note in the comments. Curious to hear what’s working and what isn’t.


