OpenAI just announced that ChatGPT, its wildly popular LLM chatbot, now supports plugins that let users connect to other apps and pull data into the chat. One of these is a web browsing plugin, which can access websites on the open web and do things like summarize an article so the user never has to read it.
While OpenAI is trying to reassure website owners by stating that plugins are not “crawling the web in any automatic fashion”, this could spell bad news for websites that rely on advertising income to sustain themselves. That largely means internet publishers, many of whom are trying to find ways to use ChatGPT and GPT-4 to write articles for them while still needing advertising revenue to turn a profit.
Thankfully, OpenAI has published documentation on how website owners can manage plugin access via robots.txt directives. This puts website owners in charge of how at least this one LLM system can access their websites and use their content.
Big shoutout to our friend Chaz for finding this:
ChatGPT’s plugins use the following user-agent token / string:
- Token: ChatGPT-User
- String: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot
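If you want to spot plugin traffic in your own server logs or application code, matching on the token is usually enough. Here is a minimal Python sketch. The helper name `is_chatgpt_plugin` is our own invention, not something from OpenAI, and since a user-agent header can be spoofed, a match is a hint rather than proof.

```python
# Token OpenAI lists for ChatGPT plugin requests.
CHATGPT_TOKEN = "ChatGPT-User"


def is_chatgpt_plugin(user_agent: str) -> bool:
    """Return True if a User-Agent header contains the ChatGPT-User token.

    Hypothetical helper. The comparison ignores case, which is an
    assumption on our part about how loosely you want to match.
    """
    return CHATGPT_TOKEN.lower() in user_agent.lower()


full_string = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); "
    "compatible; ChatGPT-User/1.0; +https://openai.com/bot"
)
print(is_chatgpt_plugin(full_string))
print(is_chatgpt_plugin("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```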
If you are new to SEO, please read up on how to edit the robots.txt file, or send a link to this article to your SEO. Done incorrectly, a robots.txt file can cause problems for your site.
If you are not new, you’ll see how simple it is. Just copy and paste the code below into your robots.txt file to block all ChatGPT plugins.
User-agent: ChatGPT-User
Disallow: /
This keeps all ChatGPT plugins from accessing any part of your website.
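Before you upload the file, you can sanity-check the rules locally. The sketch below uses Python's standard-library `urllib.robotparser` to confirm that the two lines block `ChatGPT-User` while leaving other crawlers alone. The `example.com` URL and the `Googlebot` comparison are placeholders we picked for illustration.

```python
from urllib.robotparser import RobotFileParser

# The same two lines you would paste into robots.txt.
ROBOTS_TXT = """\
User-agent: ChatGPT-User
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# ChatGPT plugins should be refused everywhere on the site.
blocked = parser.can_fetch("ChatGPT-User", "https://example.com/any-page")
# Crawlers with no matching group are not affected by these rules.
others = parser.can_fetch("Googlebot", "https://example.com/any-page")

print("ChatGPT-User allowed:", blocked)
print("Googlebot allowed:", others)
```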
You can corral ChatGPT plugins to specific parts of your website by using the “Allow” directive to specify which parts of the site they can access. OpenAI’s documentation included this demo code for doing this:
User-agent: ChatGPT-User
Disallow: /
Allow: /directory-1/
Allow: /directory-2/
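Mixing Allow and Disallow only works because of how crawlers pick a winner. Under the Robots Exclusion Protocol (RFC 9309), the longest matching path wins, and Allow wins a tie. The Python sketch below shows that precedence with a blanket `Disallow: /` plus two Allow lines. It is a simplification that handles plain path prefixes only (no wildcards), `is_allowed` is a hypothetical helper of ours, and we are assuming that ChatGPT plugins follow this standard precedence.

```python
def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Simplified longest-match check for one user-agent group.

    `rules` holds (directive, path_prefix) pairs. Wildcards are not
    supported in this sketch.
    """
    best_len = -1
    allowed = True  # a path that matches no rule is allowed
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # skip empty values and prefixes that do not match
        is_allow = directive.lower() == "allow"
        # The longer prefix wins; on equal length, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


rules = [
    ("Disallow", "/"),
    ("Allow", "/directory-1/"),
    ("Allow", "/directory-2/"),
]
print(is_allowed("/directory-1/page.html", rules))
print(is_allowed("/private/page.html", rules))
```

Not every parser works this way. Python's built-in `urllib.robotparser`, for example, applies rules in file order, so it would stop at `Disallow: /` and report `/directory-1/` as blocked for this rule set.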
Frequently Asked Questions
Question: How do I completely stop ChatGPT / OpenAI from using my website’s content for training their models, etc.?
Answer: Unfortunately, there is no known way to do this yet, except possibly putting that information behind a password wall or some other mechanism that keeps the text off the open web. Even that wouldn’t stop users from copying and pasting information into a ChatGPT session, however.
Question: Can I block ChatGPT plugins from accessing just specific directories on my site?
Answer: Yes. Since this is robots.txt, you can list the directories you do not want the plugins accessing under Disallow instead of Allow, like this:
User-agent: ChatGPT-User
Disallow: /directory-1/
Disallow: /directory-2/
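If you have more than a handful of directories to block, you can generate the lines instead of typing them. This is a small Python sketch. `build_rules` is a hypothetical helper, and the directory names are the same placeholders used above.

```python
def build_rules(user_agent: str, blocked_dirs: list[str]) -> str:
    """Build a robots.txt group that disallows each listed directory."""
    lines = [f"User-agent: {user_agent}"]
    for directory in blocked_dirs:
        # Normalize "directory-1" or "/directory-1/" to "/directory-1/".
        path = "/" + directory.strip("/") + "/"
        lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


print(build_rules("ChatGPT-User", ["directory-1", "/directory-2/"]))
```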
Question: Can I hire you / your team to do this for me?
Answer: While we would love to work with you, we do not do one-off fixes or short-term projects. If you are interested in a long-term SEO partner that knows how to win and stays ahead of the game, then please Contact Us and we will set up a call to learn more about your needs.