Topic: Rendering html page after javascript

I'm trying to write a website that extracts all images from a given webpage. I initially tried to collect the image links by looking for <img> tags with the Nokogiri HTML parser, and that works well for pages that don't rely on JavaScript.

Some pages use JavaScript to render the view, and with Nokogiri I only get the raw HTML as it was before rendering.

How can I get the page as it looks after the JavaScript has run?

Re: Rendering html page after javascript

Think of parsers like Nokogiri and Hpricot as working on what a wget or curl request to the page would return. They take the initial HTML the server sent and give you a DOM to parse. JavaScript manipulates the DOM, but only after the response has arrived, because JavaScript runs on the client side. So the server sends the HTML and JavaScript to the client (usually a browser), but it is the client's responsibility to run the JavaScript and update its local DOM. A parser never runs the scripts, so it never sees those updates.

For situations like this, I've used Xvfb (http://www.xfree86.org/4.0.1/Xvfb.1.html) as a virtual display, launched Firefox inside it, and then driven Firefox with Selenium.
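
Roughly, the Selenium side of that could look like the sketch below. This is not the exact setup I ran, just an outline under some assumptions: the selenium-webdriver gem, Firefox, and geckodriver are installed; the method names headless_firefox and rendered_image_sources are made up; the URL in the usage comment is a placeholder. It also uses Firefox's own headless flag instead of Xvfb. To use Xvfb, drop that argument and start the script under a virtual display (for example via xvfb-run where that wrapper is available).

```ruby
# Starts a Firefox session without a visible window.
# Assumes the selenium-webdriver gem, Firefox, and geckodriver are installed.
def headless_firefox
  require "selenium-webdriver"
  options = Selenium::WebDriver::Firefox::Options.new
  options.add_argument("-headless") # assumption: replaces the Xvfb display
  Selenium::WebDriver.for(:firefox, options: options)
end

# Loads the page in the browser, gives its scripts a moment to run, and
# returns the src of every <img> in the live DOM (duplicates removed).
# The fixed pause is a crude stand-in for waiting on a specific element.
def rendered_image_sources(driver, url, settle_seconds: 2)
  driver.get(url)
  sleep(settle_seconds)
  driver.find_elements(tag_name: "img")
        .map { |img| img.attribute("src") }
        .compact
        .uniq
end

# Usage (placeholder URL):
#   driver = headless_firefox
#   begin
#     puts rendered_image_sources(driver, "http://example.com/")
#   ensure
#     driver.quit
#   end
```

Because the browser runs the page's JavaScript before you query it, images added by scripts are included. If a page loads images slowly, the pause will need tuning or replacing with an explicit wait.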

I don't know exactly how johnson and taka are meant to be used (I just found them today), but maybe they will help here: http://tenderlovemaking.com/2009/04/05/ … e-browser/