How to let the ChatGPT bot crawl your website?

Rohit N, 8 months ago

OpenAI has announced a new web crawler for ChatGPT: GPTBot. Just like Google, Yahoo, and Bing, OpenAI now uses a bot to crawl your website and gather useful information. The company will use this information to provide more accurate, up-to-date answers drawn from a variety of sources.

What is a bot?

A bot is a small program that performs certain tasks automatically. Bots are mostly used in online services such as chat, messaging, and web crawling.

Connect one dot to another

ChatGPT is a question-and-answer chat service. You ask a question and you get an answer. When you ask Google a question, you get links to different websites, and you need to visit those sites to find the answer. ChatGPT instead gives you a direct answer to your question rather than just a list of links.

So what does GPTBot do?

Before GPTBot, the company found websites manually to train its AI, but now the bot crawls websites, scrapes the data, and uses it to train the AI. You can identify the bot by its user agent.

User agent. GPTBot’s user agent token is “GPTBot”. Its full user-agent string is:
“Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +”.
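
As a minimal sketch of how a site owner might spot the crawler in request logs, the Python function below checks whether a User-Agent header contains the “GPTBot” token. The function name is_gptbot is hypothetical, and the sample string is the truncated one quoted above. A user agent can be spoofed, so a match is a hint rather than proof.

```python
def is_gptbot(user_agent: str) -> bool:
    """Return True if the User-Agent header contains the GPTBot token."""
    # Case-insensitive substring match on the token named in the article.
    return "gptbot" in user_agent.lower()


# Sample header, truncated exactly as quoted in the article.
sample = "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +"

print(is_gptbot(sample))
print(is_gptbot("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```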

Robots.txt. You can use robots.txt to block GPTBot from accessing your website, or parts of it. To stop GPTBot from accessing your site, add the following to your site’s robots.txt:

User-agent: GPTBot
Disallow: /

To allow GPTBot to access only parts of your site, you can add the GPTBot token to your site’s robots.txt like this:

User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
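
Before publishing rules like these, it can help to check how a standard parser reads them. The sketch below uses Python's built-in urllib.robotparser to test both rule sets shown above. The helper name allowed, the constants BLOCK_ALL and PARTIAL, and the example.com URLs are assumptions made for illustration. It shows how Python's parser interprets the rules, not necessarily how GPTBot itself evaluates them.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The two rule sets from the article.
BLOCK_ALL = """\
User-agent: GPTBot
Disallow: /
"""

PARTIAL = """\
User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
"""

site = "https://example.com"  # placeholder domain

print(allowed(BLOCK_ALL, "GPTBot", site + "/any-page"))            # False
print(allowed(BLOCK_ALL, "Googlebot", site + "/any-page"))         # True
print(allowed(PARTIAL, "GPTBot", site + "/directory-1/a.html"))    # True
print(allowed(PARTIAL, "GPTBot", site + "/directory-2/b.html"))    # False
```

Note that the first rule set only names GPTBot, so other crawlers are unaffected by it.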

If you don’t want ChatGPT to crawl your website, use the same robots.txt protocol. To learn more, check out OpenAI’s documentation on GPTBot.

Written By

I completed a Bachelor in Computer Application in 2012 in UP, India. I also studied digital marketing in New Delhi. I can help you run better Google and Facebook ads. Contact me if you have any questions.
