Q&A: PROGRAMMING TO GRAB NEWS HEADLINES FROM ANOTHER WEBSITE?

Question by Robert:
Programming to grab news headlines from another website?
Hi, I want to write a script that grabs the news headlines and links from another site and displays them on my own website. I want to grab the latest news shown in the top right corner of this site:
http://www.grooveasia.com/default.aspx?aspxerrorpath=/news/item.aspx
How would I go about doing that, and what language would be best? Preferably no database, because I want it to run on page load on my website.
Thanks
——————————————
Answer by P L
It doesn’t look like they have an RSS or Atom feed, so the only way to do it would be to scrape the HTML directly for what you are looking for, for example with regular expressions. This is fragile and likely to break whenever they change their Web site. You could do it in any language that can open an HTTP connection to that page and download the HTML text into memory. Then you can pull the headlines and links out of the text with regular expressions or an HTML parser. Perl, Python, and PHP will all work for this.
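As a rough illustration of the fetch-and-scrape approach, here is a minimal Python sketch using only the standard library. The link pattern (anchors whose href contains "/news/"), the User-Agent string, and the function names fetch_html, extract_headlines, and render_list are all assumptions made for the example. The real page's markup would have to be inspected and the pattern adjusted to match it.

```python
import html
import re
from urllib.parse import urljoin
from urllib.request import Request, urlopen

# Assumption: headline links are <a> tags whose href contains "/news/".
# The real page's markup is not known here, so adjust the pattern to fit it.
HEADLINE_RE = re.compile(
    r'<a\s[^>]*?href=["\']([^"\']*/news/[^"\']*)["\'][^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)
TAG_RE = re.compile(r"<[^>]+>")


def fetch_html(url, timeout=10):
    """Download a page and return its text (decoded as UTF-8, bad bytes replaced)."""
    request = Request(url, headers={"User-Agent": "headline-fetcher-example"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_headlines(page_html, base_url):
    """Return a list of (title, absolute_url) pairs found in page_html."""
    headlines = []
    for href, inner in HEADLINE_RE.findall(page_html):
        # Drop nested tags, decode entities, and collapse whitespace in the title.
        title = " ".join(html.unescape(TAG_RE.sub("", inner)).split())
        if title:
            headlines.append((title, urljoin(base_url, html.unescape(href))))
    return headlines


def render_list(headlines):
    """Build an HTML <ul> snippet that can be embedded in your own page."""
    items = "\n".join(
        f'  <li><a href="{html.escape(url, quote=True)}">{html.escape(title)}</a></li>'
        for title, url in headlines
    )
    return f"<ul>\n{items}\n</ul>"
```

On page load you would call fetch_html with the source URL, pass the result to extract_headlines along with the same URL as the base, and print the output of render_list into your page. No database is needed, though every page view then triggers a request to the other site.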
However, you should be aware that re-posting their content on your site might be a copyright violation unless you have permission to do so. And if they catch you doing it, they might block you from accessing their site.
——————————————
about 1 year ago
Upload your page online and make use of the following free service.
http://www.websiteoptimization.com/services/analyze/
Hope it helps.