Category: SEO

  • Watching the Watchers

    I’m collecting data on AI crawlers and search spiders and learning what they like to eat. It’s true that few are eating llms.txt, but that doesn’t tell the full story.

    To have good SEO, you might not need to add structured content, but some formats, like JSON-LD, question-answer pairs, and named entities, do seem to help and attract visits. Google likely published its recent report on how to optimize for answer engines because it is uncommon for websites to include this type of content.

    But since optimization is the name of the game, don’t just listen to what Google says you should do; run your own experiments and see what the bots visit.

    If you run a WordPress-powered website, grab my free, open-source plugin and see for yourself. It is called AEO Pugmill.

    Fun fact: This blog was the first to run my AEO plugin.

  • AEO Experiment

    This is a quick thought experiment, and also an example of writing content that helps humans but is also tailored for AI crawlers.

    Question: Why does AI keep referencing your competitor instead of you?

    Backstory: I spotted this question on Reddit and answered it. The problem is emerging because human behavior is shifting from using search to find links to places that might provide answers, to using AI to get those answers directly. After all, that is the core purpose of search in the first place: to find the answer you’re looking for.

    Answer: Try this little experiment: ask an AI to explain which it prefers, your competitor’s content or yours.

    1. Visit the competitor homepage and your homepage (or multiple corresponding pages on each site to create a matching set).
    2. Grab the page source (e.g., View > Page Source).
    3. Copy and paste the code into a simple text editor.
    4. Drop those text files into an LLM and ask it to analyze them without leading questions. AI crafts answers based on probability calculations and attempts to give you the answer it thinks you want. When you ask leading questions, it can, just like a person, sway its answer toward the result you want to hear, so craft your prompt to make clear that your objective is the truth and its honest assessment, without revealing your bias.
    5. Ask it which website does a better job for answer engines and explain how and why.
    6. Then ask how you could use the learnings to improve your site.
    7. Implement the improvements.
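
    Before handing the saved page sources to an LLM, it can help to take your own inventory of what structured data each page exposes, so you have something to check the model's claims against. The following is a minimal sketch, not part of the plugin: the helper name jsonld_types is hypothetical, it uses only the Python standard library, and it ignores nested @graph structures for brevity.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def jsonld_types(page_source):
    """Return the sorted top-level @type values found in a page's JSON-LD.

    Hypothetical helper for illustration; nested @graph entries are not walked.
    """
    collector = JsonLdCollector()
    collector.feed(page_source)
    types = set()
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than guess at their content
        items = data if isinstance(data, list) else [data]
        for item in items:
            if isinstance(item, dict) and "@type" in item:
                value = item["@type"]
                values = value if isinstance(value, list) else [value]
                types.update(v for v in values if isinstance(v, str))
    return sorted(types)
```

    Running jsonld_types on the text file from step 3 for each site gives a quick side-by-side list of schema types to compare with the LLM's answer in step 5.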

    If your site runs on WordPress, I made a little free plugin that helps you fine-tune content and put it in a form AI crawlers seem to like. Answer Engine Optimization is still evolving, so the content types (endpoints) it creates just for the bots are experimental.

    Visit AEOPugmill.com to get the WordPress plugin and see the data I’m collecting on bot behavior.

  • What AI Bots Actually See When They Crawl a WordPress Site

    AEO Pugmill operates as a network tracking how AI answer engines consume WordPress content, paired with a plugin that formats site data for these systems. AI answer engines extract and cite facts, so content needs specific structuring to be machine-readable.

    Adding the plugin to a WordPress installation generates structured data and machine-readable endpoints. Serving specific outputs as distinct URLs allows bots to request resources independently. The trackable endpoints include a plain-text llms.txt index. This index functions as a table of contents, helping crawlers determine which pages to fetch. The system produces structured Markdown renderings of individual posts. This gives bots a clean version of the text, including publication dates, summaries, entity lists, and Q&A pairs, omitting HTML markup and theme elements.
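
    To make the "table of contents" idea concrete, here is a minimal sketch of how an llms.txt index could be assembled. It assumes the layout from the public llms.txt proposal (a title heading, a short summary, then a list of links); the plugin's real output may differ, and the function name, dictionary keys, and example.com URLs are all hypothetical.

```python
def build_llms_txt(site_name, summary, posts):
    """Build a plain-text llms.txt index from a list of post dictionaries.

    Each post is assumed (for illustration) to carry 'title',
    'markdown_url', and 'summary' keys.
    """
    lines = [f"# {site_name}", "", f"> {summary}", "", "## Posts", ""]
    for post in posts:
        # One link per post, pointing at its Markdown rendering.
        lines.append(
            f"- [{post['title']}]({post['markdown_url']}): {post['summary']}"
        )
    return "\n".join(lines) + "\n"


example_index = build_llms_txt(
    "Example Blog",
    "Notes on answer engine optimization.",
    [
        {
            "title": "Watching the Watchers",
            "markdown_url": "https://example.com/watching-the-watchers.md",
            "summary": "What AI crawlers request.",
        }
    ],
)
```

    A crawler that reads this index can pick the Markdown URL of a single post and fetch only that, which is why each output being a distinct URL matters.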

    The plugin generates standalone JSON-LD files containing FAQPage schema, entity mentions, and citations. Updating the standard WordPress XML sitemap adds alternate links pointing to the Markdown endpoints. Additions to the robots.txt file signal the availability of the structured content index. Enriching the standard RSS feed incorporates AEO elements like structured summaries and named entities alongside the post content.
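
    For readers unfamiliar with FAQPage schema, the sketch below builds one from question-answer pairs using the schema.org vocabulary (FAQPage, Question, Answer, mainEntity, acceptedAnswer). The function name and sample pair are hypothetical, and this is an illustration of the schema shape rather than the plugin's own code.

```python
import json


def faq_page_jsonld(qa_pairs):
    """Return a schema.org FAQPage document for (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }


faq_document = faq_page_jsonld(
    [("What is AEO?", "Answer Engine Optimization structures content for AI answer engines.")]
)
# Serialized form, as it might be served from a standalone .json endpoint.
faq_json = json.dumps(faq_document, indent=2)
```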

    Embedding outputs directly into the HTML places data where search engines and crawlers expect to find it. The plugin injects FAQPage JSON-LD derived from post metadata. Entities stored in the metadata become typed mentions with links to authoritative references, assisting AI systems in disambiguating subjects. Extracting external links populates the citation JSON-LD. The plugin injects structured data derived from the post summary, falling back to the WordPress excerpt. These embedded elements register as standard HTML page requests. Separating schema into standalone files reduces utility for traditional search while providing no added benefit for AI crawlers that already parse the full page. The distinction matters for understanding the limits of bot analytics, as parsing a specific embedded element remains indistinguishable from a full page load.
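
    As an illustration of typed mentions embedded in the page, the sketch below builds an Article document whose mentions carry a schema.org type and a sameAs link, then wraps it in a script tag. Everything here is an assumption for demonstration: the function names, the entity dictionary keys, and the choice of Article as the wrapper type are not taken from the plugin.

```python
import json


def mentions_jsonld(headline, entities):
    """Build an Article document with typed entity mentions.

    Each entity is assumed to have 'type', 'name', and 'same_as' keys,
    where 'same_as' links to an authoritative reference page.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "mentions": [
            {"@type": e["type"], "name": e["name"], "sameAs": e["same_as"]}
            for e in entities
        ],
    }


def as_script_tag(document):
    """Serialize a JSON-LD document into an embeddable <script> element."""
    payload = json.dumps(document, ensure_ascii=False)
    # Escape "<" so text such as "</script>" inside a value cannot
    # terminate the element early; \u003c is a valid JSON escape.
    payload = payload.replace("<", "\\u003c")
    return f'<script type="application/ld+json">{payload}</script>'


tag = as_script_tag(
    mentions_jsonld(
        "What AI Bots Actually See",
        [
            {
                "type": "SoftwareApplication",
                "name": "WordPress",
                "same_as": "https://en.wikipedia.org/wiki/WordPress",
            }
        ],
    )
)
```

    Because the tag travels inside the normal HTML response, a server log only ever shows the page request, which is the analytics limit described above.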

    The plugin evaluates bot activity by checking incoming user-agent strings against a list of 25 recognized signatures, including GPTBot, ClaudeBot, PerplexityBot, CCBot, Bytespider, DeepSeekBot, and traditional search crawlers. When a user agent matches, the plugin records the canonical bot name, the requested resource type, and the date in a local daily summary table. The system does not keep a per-request log. For HTML requests, it also captures content signals such as word count brackets, freshness, fact density, and URL depth. Sharing data with the wider aggregation network is opt-in. When enabled, the plugin transmits daily count summaries using a one-way hashed identifier, so no URLs, content, or user data leave the server. When a post goes live, participating search engines receive a notification through an automated ping system that respects a 30-minute burst limit between updates.
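
    The tracking flow described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the plugin's code: the signature table lists only the six bots named above (the real list has 25), substring matching is assumed, SHA-256 with a salt stands in for whatever one-way hash is actually used, and the "burst limit" is read as a minimum gap between pings. All function and variable names are hypothetical.

```python
import hashlib
from collections import Counter
from datetime import date, datetime, timedelta

# Lowercased substring -> canonical bot name (partial, illustrative list).
BOT_SIGNATURES = {
    "gptbot": "GPTBot",
    "claudebot": "ClaudeBot",
    "perplexitybot": "PerplexityBot",
    "ccbot": "CCBot",
    "bytespider": "Bytespider",
    "deepseekbot": "DeepSeekBot",
}


def canonical_bot(user_agent):
    """Return the canonical bot name for a user-agent string, or None."""
    ua = user_agent.lower()
    for needle, name in BOT_SIGNATURES.items():
        if needle in ua:
            return name
    return None


def record_hit(summary, user_agent, resource_type, day):
    """Increment the daily count for a recognized bot; store nothing else."""
    bot = canonical_bot(user_agent)
    if bot is None:
        return False
    # Only (date, bot, resource type) is kept: no URL, no per-request row.
    summary[(day.isoformat(), bot, resource_type)] += 1
    return True


def site_identifier(site_url, salt):
    """One-way identifier for opt-in sharing (SHA-256 is an assumption)."""
    return hashlib.sha256((salt + site_url).encode("utf-8")).hexdigest()


def should_ping(last_ping, now, min_gap=timedelta(minutes=30)):
    """Allow a ping only if at least min_gap has passed since the last one."""
    return last_ping is None or now - last_ping >= min_gap


daily_summary = Counter()
record_hit(daily_summary, "Mozilla/5.0 (compatible; GPTBot/1.2)", "markdown", date(2025, 1, 15))
```

    Keying the counter on the date, bot, and resource type is what keeps the table a summary: two requests from the same bot for the same resource type on the same day collapse into a single row with a count of two.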

    Full architecture and technical implementation details are available at https://www.aeopugmill.com/about.

    The plugin is available for WordPress installation at https://www.aeopugmill.com/plugin.