Topic: How to get all image, pdf and other files links from a website?

I am working on an application where I have to:

1) get all the links of a website, and
2) get the list of all the files and file extensions on each web page/link.

I am done with the first part. :)
Now I have to get all the files/file extensions on each page.

Can anybody guide me on how to parse the links/web pages and get the file extensions on each page?

Re: How to get all image, pdf and other files links from a website?

There are so many ways. Below are some links. They mostly address the part you've already done, but the tools may (I don't know) also handle files and extensions. There is also a good wiki page on screen scraping (the last link):

http://railscasts.com/episodes/173-scre … ith-scrapi

http://railscasts.com/episodes/190-scre … h-nokogiri

http://stackoverflow.com/questions/8484 … uced-by-ja

http://en.wikipedia.org/wiki/Web_scraping
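
For the second part (files and extensions on each page), here is a rough sketch of one way to do it. It is written in Python with only the standard library, not with the Ruby tools from the links above, and the names FileLinkParser and files_by_extension are made up for illustration. It assumes that the files you care about are referenced through href or src attributes and that the extension can be read from the URL path; files served without an extension in the URL would be missed.

```python
from collections import defaultdict
from html.parser import HTMLParser
from posixpath import splitext
from urllib.parse import urljoin, urlparse


class FileLinkParser(HTMLParser):
    """Hypothetical helper: collects every href/src URL found in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # Covers <a href>, <link href>, <img src>, <script src>, etc.
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links against the page URL.
                self.urls.append(urljoin(self.base_url, value))


def files_by_extension(html, base_url):
    """Group the URLs found in `html` by the extension of their path."""
    parser = FileLinkParser(base_url)
    parser.feed(html)
    grouped = defaultdict(list)
    for url in parser.urls:
        # Use only the path, so query strings and fragments are ignored.
        ext = splitext(urlparse(url).path)[1].lower()
        if ext:  # skip links without an extension, e.g. /about
            grouped[ext].append(url)
    return dict(grouped)


# Example with an inline page; example.com is a placeholder.
page = (
    '<a href="/docs/report.pdf">Report</a>'
    '<img src="img/logo.PNG">'
    '<a href="/about">About</a>'
)
result = files_by_extension(page, "http://example.com/site/")
for ext, urls in sorted(result.items()):
    print(ext, urls)
```

You would call files_by_extension once for every page you collected in part 1, passing the page's HTML and its URL. The same idea should carry over to Nokogiri: select the elements that have href or src attributes and take the extension of each URL's path.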
