- I use Alta-Vista to search for a Tweet I sent last month.
- The search engine returns a selection of pages it thinks contain the text.
- Looking at the pages shows the text isn’t present - but, say, an embedded Twitter stream is.
It’s quite common for a search engine to index a page without realising that some of the information on it is highly dynamic (say, an RSS feed of headlines, a “most read” list, or a “top comment from elsewhere” box).
It would be useful to be able to tell crawlers (and, potentially, screen readers, find-in-page boxes, etc.) “ignore this bit of the page.”
<irrelevant> is a bit tongue-in-cheek. I suppose that <aside> is the correct tag, but perhaps with an added attribute
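
As a rough sketch of how a crawler might honour such a marker, here is a small Python example that extracts a page's text while skipping anything inside an element carrying the attribute. The attribute name `data-irrelevant` and the names `IndexableTextExtractor` and `indexable_text` are made up for illustration; nothing here is a standard or an existing crawler's behaviour, and the sketch assumes reasonably well-formed markup.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class IndexableTextExtractor(HTMLParser):
    """Collects page text, skipping any element marked as irrelevant."""

    def __init__(self, marker="data-irrelevant"):
        super().__init__()
        self.marker = marker  # hypothetical attribute name, not a standard
        self.skip_depth = 0   # > 0 while inside a marked element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth > 0:
            # Already inside a marked element: track nesting so we know
            # when the marked element itself closes.
            self.skip_depth += 1
        elif any(name == self.marker for name, _ in attrs):
            self.skip_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that sits outside every marked element.
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def indexable_text(html):
    """Return the text a crawler would index, ignoring marked sections."""
    parser = IndexableTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = """
<article><p>My post about crawlers.</p></article>
<aside data-irrelevant><p>Embedded tweet: hello world</p></aside>
<footer>Contact</footer>
"""
print(indexable_text(page))  # prints "My post about crawlers. Contact"
```

With something like this in place, the embedded Twitter stream would still be shown to visitors, but its text would never reach the search index, so a search for the Tweet would not return the page.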