---
title: 'Robots.txt Recipes'
description: 'Several recipes for configuring your robots.txt.'
---

## Introduction

At a minimum, the only recommended configuration for robots is to [disable indexing for non-production environments](/docs/robots/guides/disable-indexing).
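
If you manage this through your Nuxt config, the linked guide's approach boils down to marking the site as non-indexable outside production. A minimal sketch, assuming the `site.indexable` option from nuxt-site-config and a hypothetical `DEPLOY_ENV` variable you set per environment:

```ts [nuxt.config.ts]
export default defineNuxtConfig({
  site: {
    // when `indexable` is false, the robots module should serve a
    // "Disallow: /" robots.txt and send noindex signals
    indexable: process.env.DEPLOY_ENV === 'production'
  }
})
```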

Many sites will never need to configure their [`robots.txt`](https://nuxtseo.com/learn/controlling-crawlers/robots-txt) or [`robots` meta tag](https://nuxtseo.com/learn/controlling-crawlers/meta-tags) beyond this, as [controlling web crawlers](/learn/controlling-crawlers)
is an advanced topic.

However, if you're looking to get the best SEO and performance results, you may want to consider some of the recipes on this page for
your site.

## Robots.txt Recipes

### Blocking Bad Bots

If you find your site is getting hit by a lot of bots, you may consider enabling the `blockNonSeoBots` option.

```ts [nuxt.config.ts]
export default defineNuxtConfig({
  robots: {
    blockNonSeoBots: true
  }
})
```

This mostly blocks web scrapers; the full list is: `Nuclei`, `WikiDo`, `Riddler`, `PetalBot`, `Zoominfobot`, `Go-http-client`, `Node/simplecrawler`, `CazoodleBot`, `dotbot/1.0`, `Gigabot`, `Barkrowler`, `BLEXBot`, `magpie-crawler`.
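
With this enabled, the generated `/robots.txt` should gain a group roughly along these lines (illustrative output only; the exact formatting depends on the module version and your other rules):

```robots-txt
User-agent: Nuclei
User-agent: WikiDo
User-agent: Riddler
# ...one line per bot in the list above
Disallow: /
```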

### Blocking AI Crawlers

AI crawlers can be beneficial as they can help users find your site, but for some educational sites, or those not
interested in being indexed by AI crawlers, you can block them using the `blockAiBots` option.

```ts [nuxt.config.ts]
export default defineNuxtConfig({
  robots: {
    blockAiBots: true
  }
})
```

This will block the following AI crawlers: `GPTBot`, `ChatGPT-User`, `Claude-Web`, `anthropic-ai`, `Applebot-Extended`, `Bytespider`, `CCBot`, `cohere-ai`, `Diffbot`, `FacebookBot`, `Google-Extended`, `ImagesiftBot`, `PerplexityBot`, `OmigiliBot`, `Omigili`.
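
If you only want to block a subset of these, you can skip `blockAiBots` and declare the rules yourself. A minimal sketch using the module's `groups` option; which user agents you include is up to you:

```ts [nuxt.config.ts]
export default defineNuxtConfig({
  robots: {
    groups: [
      {
        // block only the AI crawlers you choose, rather than the full list
        userAgent: ['GPTBot', 'ChatGPT-User', 'CCBot'],
        disallow: ['/']
      }
    ]
  }
})
```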

### Blocking Privileged Pages

If you have pages that require authentication or are only available to certain users, you should block these from being indexed.

```robots-txt [public/_robots.txt]
User-agent: *
Disallow: /admin
Disallow: /dashboard
```
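
Keep in mind that a robots.txt `Disallow` only stops compliant crawlers from fetching these pages; they can still be indexed if linked from elsewhere. To also send a `noindex` signal, you can pair this with route rules; a minimal sketch, assuming the module's `robots` route rule and `/**` wildcards that match your routes:

```ts [nuxt.config.ts]
export default defineNuxtConfig({
  routeRules: {
    // adds noindex via the robots meta tag / X-Robots-Tag header
    '/admin/**': { robots: false },
    '/dashboard/**': { robots: false }
  }
})
```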

See [Config using Robots.txt](/docs/robots/guides/robots-txt) for more information.

### Whitelisting Open Graph Tags

If you have certain pages that you don't want indexed but you still want their [Open Graph Tags](/learn/mastering-meta/open-graph) to be crawled, you can target the specific
user-agents.

```robots-txt [public/_robots.txt]
# Block search engines
User-agent: Googlebot
User-agent: Bingbot
Disallow: /user-profiles

# Allow social crawlers
User-agent: facebookexternalhit
User-agent: Twitterbot
Allow: /user-profiles
```

See [Config using Robots.txt](/docs/robots/guides/robots-txt) for more information.

### Blocking Search Results

You may consider blocking search result pages from being indexed, as they can be seen as duplicate content
and provide a poor user experience.

```robots-txt [public/_robots.txt]
User-agent: *
# block search results
Disallow: /*?query=
# block pagination
Disallow: /*?page=
# block sorting
Disallow: /*?sort=
# block filtering
Disallow: /*?filter=
```
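
If you'd rather manage these rules from your Nuxt config instead of a `public/_robots.txt` file, the same patterns can be passed to the module; a sketch assuming the top-level `disallow` option, which applies to all user agents:

```ts [nuxt.config.ts]
export default defineNuxtConfig({
  robots: {
    // mirrors the robots.txt rules above
    disallow: ['/*?query=', '/*?page=', '/*?sort=', '/*?filter=']
  }
})
```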