Topic: Scraping blog posts
On a site that I am building, users have the option to write blog posts. Since some of them already have blogs on other platforms (Blogspot, WordPress, ...), they obviously would not want to write everything twice just to have it displayed on my site. What I would like to do is let the user either write the post or provide a link to an existing one, which would then be scraped to populate the correct fields (title, content and publish date).
I thought I could achieve this fairly simply with Nokogiri, but the problem is that as soon as the blogger has customized their theme, the markup and CSS classes are completely different. For example, I checked two different WordPress blogs: on the first, the post title was a link with a class of 'entry-title', and on the second it was just a plain h2 tag. So I cannot simply take the contents of a predefined set of CSS selectors.
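To make the problem concrete, here is a minimal sketch of one possible workaround: trying a list of candidate selectors per field and taking the first one that yields non-empty text. The selector lists and the helper names (CANDIDATES, first_match, extract_fields, scrape_post) are assumptions made up for illustration, not a known-good set for any platform; only Nokogiri::HTML, at_css and text are real Nokogiri calls.

```ruby
# Candidate CSS selectors per field, tried in order.
# These lists are guesses for illustration, not a definitive set.
CANDIDATES = {
  title:        [".entry-title", ".post-title", "h1", "h2"],
  content:      [".entry-content", ".post-body", "article"],
  published_at: ["time", ".published", ".date"]
}.freeze

# Returns the stripped text of the first selector that matches a node
# with non-empty text, or nil if none of the selectors match.
def first_match(doc, selectors)
  selectors.each do |selector|
    node = doc.at_css(selector)
    next if node.nil?

    text = node.text.strip
    return text unless text.empty?
  end
  nil
end

# Builds a hash of field name => extracted text (or nil) for a parsed document.
# `doc` can be any object that responds to at_css, such as a Nokogiri document.
def extract_fields(doc, candidates = CANDIDATES)
  candidates.transform_values { |selectors| first_match(doc, selectors) }
end

# Hypothetical wrapper: parses raw HTML with Nokogiri and extracts the fields.
# The gem is required lazily so the helpers above can be used without it.
def scrape_post(html)
  require "nokogiri"
  extract_fields(Nokogiri::HTML(html))
end
```

With this, scrape_post(html) would return a hash with :title, :content and :published_at keys, where a field is nil if no candidate matched. The obvious weakness is the one described above: a heavily customized theme may match none of the candidates (or the wrong element), so the user would still need a chance to correct the fields before saving.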
Is there something that can be done here?
Thank you in advance