Thursday, August 10, 2006

web crawlers and JavaScript


Many web pages use JavaScript to modify their content.  Anyone with JavaScript enabled in their browser sees the result, but most web crawlers do not interpret or run scripts when they crawl a page.  That is a problem when the script-generated content is meant to be seen by the crawlers.


For example, BlogRolling is a site that lets you store a list of links.  You then insert a JavaScript snippet into your web page to display them as a blogroll.  One use of a blogroll is to link to other sites that you find interesting.  Ideally these links would count toward the linked sites' Google PageRank or Technorati rank.  But it appears these crawlers are not interpreting JavaScript, so they are not picking up the links.
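
Here is a rough sketch of why the links go missing, assuming a crawler that only parses the HTML it downloads and never runs scripts. The page and URLs are made up for illustration (this is not BlogRolling's actual snippet), and `LinkCollector` and `extract_links` are names I picked for the example. It uses Python's standard `html.parser`, which treats everything inside a script element as plain text.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags found in the static markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only real <a> tags in the markup reach this method; text inside
        # a <script> element is delivered as data, not as tags.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical page: one link in plain HTML, one written out by a script.
PAGE = """
<html><body>
<a href="http://example.com/static">Static link</a>
<script type="text/javascript">
document.write('<a href="http://example.org/scripted">Scripted link</a>');
</script>
</body></html>
"""


def extract_links(html):
    """Return the links a non-JavaScript crawler would see in this HTML."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


if __name__ == "__main__":
    print(extract_links(PAGE))
```

Only the static link should come back. The scripted link exists only after a browser runs `document.write`, so a crawler that works this way never sees it.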




I am going to switch away from BlogRolling for my blogroll and use the Links feature in LifeType instead.  This not only stops relying on JavaScript, but also removes a dependency on an external site to display the content on these pages.


Interestingly, there are some web crawlers that do interpret JavaScript while crawling web pages.  Alexa crawls web pages and generates previews of them.  It appears that when a preview is generated, the scripts on the page are run.



