@Sean – thanks for commenting

Yeah, I originally thought about using CSS selectors, but after some reading I decided that the jQuery approach allows a cleverer means of DOM selection. For example, you can exclude certain elements based on filters. I’m not a massively good CSS’er (like everything else, I hack it to make it work), but my impression was that DOM selection via plain CSS is a bit more limited?

Re. RDFa and microdata: yes, absolutely. If you have a look at my comment on this post http://doofercall.blogspot.com/2008/05/screen-scraping-and-posh.html you’ll see that I suggest a hierarchical approach to grabbing data from the page, with “most accurate” at the top (including API, microformats, RDFa) and “least accurate” (the DOM approach) at the bottom. I think remaining compatible with these emerging approaches, as well as with what has gone before, is pretty key.
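
To make the hierarchy idea a bit more concrete, here is a rough sketch of how the fallback could work. It is only an illustration: the extractor functions, the `page` dict and its keys are all made up for the example, and a real version would call an actual API or parse actual markup at each level.

```python
from typing import Callable, Optional

# Hypothetical extractors. Each one looks at a "page" (here just a dict
# standing in for whatever was fetched) and returns a title, or None if
# that source is not available on the page.
def from_api(page: dict) -> Optional[str]:
    return page.get("api_title")

def from_microformats(page: dict) -> Optional[str]:
    return page.get("microformat_title")

def from_rdfa(page: dict) -> Optional[str]:
    return page.get("rdfa_title")

def from_dom(page: dict) -> Optional[str]:
    return page.get("dom_title")

# Ordered from "most accurate" at the top to "least accurate" at the bottom.
EXTRACTORS: list[tuple[str, Callable[[dict], Optional[str]]]] = [
    ("api", from_api),
    ("microformats", from_microformats),
    ("rdfa", from_rdfa),
    ("dom", from_dom),
]

def grab_title(page: dict) -> Optional[tuple[str, str]]:
    """Return (source_name, title) from the first source that has a value."""
    for name, extract in EXTRACTORS:
        value = extract(page)
        if value is not None:
            return name, value
    return None  # nothing on the page yielded a title
```

So a page that only has RDFa and plain DOM data would be read via RDFa, and the DOM approach only kicks in when nothing better is on offer.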