
Is There a Programmatic Way to Force a Crawler Not to Index Specific Images?

- 1 answer

I want to stop crawlers from indexing specific images on my website, but only if they're older than a specific date. However, the crawler should not stop indexing the page where the image is currently linked.

My initial approach was to write a script that adds the URL of each image to 'robots.txt', but I think the file would become huge, since we're talking about a very large number of potential images.

My next idea was to use the <meta name="robots" content="noimageindex"> tag, but I think this approach is error-prone, as I could forget to add the tag to a template where I want to stop crawlers from indexing the image. It's also redundant, and it makes the crawler ignore all images on the page, not just the old ones.

My question is: do you know a programmatic way to force a crawler not to index an image if a condition (in my case, the date) is true? Or is my only option to stop the crawler from indexing the whole page?


Answer

Building on what you had in mind, you could create a separate location for the images that you don't want indexed, write a script that moves files there once they're "expired", and add that location's URL to the robots.txt file. Perhaps something like /expired_images*.
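
A minimal sketch of such a script, in Python. The function names, the list of image suffixes, and the use of the file's modification time as the "date" are all assumptions made for illustration; the question doesn't say where the date comes from, so swap in whatever your application actually stores.

```python
import shutil
from pathlib import Path

# Assumed set of image extensions; adjust to what the site actually serves.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def move_expired_images(image_dir, expired_dir, cutoff_timestamp):
    """Move images last modified before cutoff_timestamp into expired_dir.

    cutoff_timestamp is seconds since the epoch. Using the modification
    time as the expiry date is an assumption, not something the question
    specifies. Returns the new paths of the moved files.
    """
    image_dir = Path(image_dir)
    expired_dir = Path(expired_dir)
    expired_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    for path in sorted(image_dir.iterdir()):
        # Skip subdirectories and anything that is not an image.
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        if path.stat().st_mtime < cutoff_timestamp:
            target = expired_dir / path.name
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved


def robots_txt_rule(url_prefix="/expired_images/"):
    """Return a robots.txt block that disallows everything under url_prefix."""
    return f"User-agent: *\nDisallow: {url_prefix}\n"
```

You could run it from a scheduled job, for example `move_expired_images("static/images", "static/expired_images", datetime(2020, 1, 1).timestamp())` (hypothetical paths and date). Since Disallow rules match by URL prefix, a single rule covers every file in the directory, so robots.txt stays one entry long no matter how many images expire. Keep in mind that pages embedding a moved image need to point at its new URL.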

source: stackoverflow.com