Thursday, August 10, 2006

web crawlers and JavaScript

Many web pages use JavaScript to modify their content.  Anyone with JavaScript enabled in their browser sees the result, but most web crawlers do not interpret or run scripts when crawling pages.  This is a problem when the script-generated content is meant to be seen by the crawlers.

For example, BlogRolling is a site that lets you store links and then insert a script into your web page to display them as a blogroll.  One use of a blogroll is to link to other sites that you find interesting.  Ideally these links would count toward the linked sites' Google PageRank or Technorati rank.  But it appears these crawlers are not interpreting JavaScript, so they are not picking up the links.
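To make the difference concrete, here is a minimal sketch of what a crawler that does not run scripts would find on two versions of a page.  It uses Python's standard `html.parser` module as a stand-in for a crawler's link extractor.  The page contents and the `example.com` URLs are made up for illustration, and real crawlers are certainly more sophisticated than this.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags without executing any scripts."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchors present in the raw HTML are ever seen here.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the link targets found in the static markup of a page."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page with the blogroll written directly into the HTML.
STATIC_PAGE = """
<html><body>
<ul>
  <li><a href="http://example.com/friend-blog">Friend's blog</a></li>
</ul>
</body></html>
"""

# Hypothetical page where the blogroll is produced by an external script.
# The links only exist after a browser fetches and runs blogroll.js.
SCRIPTED_PAGE = """
<html><body>
<script type="text/javascript" src="http://example.com/blogroll.js"></script>
</body></html>
"""

if __name__ == "__main__":
    print("static page links:", extract_links(STATIC_PAGE))
    print("scripted page links:", extract_links(SCRIPTED_PAGE))
```

The static page yields its one link, while the scripted page yields none, because the only thing in its markup is the `<script>` tag itself.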

I am going to switch away from BlogRolling for my blogroll and use the Links feature in LifeType instead.  This not only stops using JavaScript, it also removes a dependency on an external site to display content on these pages.

Interestingly, there are some web crawlers that do interpret JavaScript while crawling web pages.  Alexa crawls web pages and generates previews of them, and it appears that it runs the scripts on the page when a preview is generated.

