web spider, crawler

A web spider is an automated, methodical program that crawls the internet. Each spider is programmed for a specific task, and it is built around that task.
Some web spiders are developed to find email addresses in web pages (these are also known as email-harvesting crawlers); others check links in web pages, find inactive links, validate links, or validate HTML.
Spiders are also developed to extract content from web pages, to generate statistics, or to create copies of page content. Search engines in particular use web crawling to provide up-to-date data.
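
To make the crawling idea concrete, here is a minimal sketch of a link-following spider. It is only an illustration under stated assumptions: the PAGES dictionary is a made-up in-memory site that stands in for real HTTP requests, and the names LinkExtractor and crawl are hypothetical, not taken from any particular crawler.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl; `fetch` maps a URL to HTML text or None."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # inactive link: nothing to parse
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


# Hypothetical in-memory "site" (assumption, not real pages).
PAGES = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/">home</a>',
    "http://example.test/b": '<a href="/missing">gone</a>',
}

print(crawl("http://example.test/", PAGES.get))
```

A real spider would replace PAGES.get with a function that downloads the page, and it would also need politeness rules such as rate limiting.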

Web crawler History:


different results from AWStats and Google Analytics

I found that Google Analytics and AWStats report different results, and I had not noticed this before today. I was happy to see the results from AWStats and, at the same time, worried by the results from Google Analytics. AWStats seems to over-count, whereas Google Analytics seemed not to be reporting all of the data.

I think the following might be reasons for the difference in statistical results between Google Analytics and AWStats:

1. AWStats counts visitors over a shorter time span, whereas Google Analytics counts visitors over a longer time period, which reduces the visitor count.
2. Automated spam and robots affect the AWStats figures, whereas Google Analytics statistics are not affected.
3. Other JavaScript interfering with the Google tracking code, or cookies being turned off, can prevent Google Analytics from tracking visits.
4. Google Analytics counts a visit only if the page loads fully. If a visitor navigates away before the page has fully loaded, no data is collected for that action, because the tracking code never gets to load and run.
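
The second reason can be illustrated with a small sketch. It assumes a simplified, made-up list of (IP address, user agent) hits in place of a real server log, and a short, non-exhaustive list of bot markers; it is not how AWStats itself is implemented.

```python
# Substrings that suggest an automated client (assumed, not exhaustive).
BOT_MARKERS = ("bot", "spider", "crawler")


def is_bot(user_agent):
    """Guess whether a user agent string belongs to a robot."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def unique_visitors(hits, skip_bots):
    """Count distinct IP addresses, optionally ignoring robot traffic."""
    return len({ip for ip, ua in hits if not (skip_bots and is_bot(ua))})


# Hypothetical hits as a log-based tool would see them.
HITS = [
    ("10.0.0.1", "Mozilla/5.0 (Windows NT 10.0)"),
    ("10.0.0.2", "Mozilla/5.0 (compatible; Googlebot/2.1)"),
    ("10.0.0.3", "ExampleSpider/1.0"),
    ("10.0.0.1", "Mozilla/5.0 (Windows NT 10.0)"),
]

print(unique_visitors(HITS, skip_bots=False))  # 3: every client in the log
print(unique_visitors(HITS, skip_bots=True))   # 1: human-looking clients only
```

A tag-based tool only sees clients that run its JavaScript, which most robots do not, so its numbers tend to sit closer to the second count.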

Also, I found some information on this in the Google Analytics Help Center;
here's the link:

Some tips to increase your traffic rank in Alexa

"Alexa computes traffic rankings by analyzing the Web usage of millions of Alexa Toolbar users. The information is sorted, sifted, anonymized, counted, and computed, until, finally, we get the traffic rankings shown in the Alexa service." Source:
  1. Install the Alexa Toolbar and make your website your homepage.
  2. Ask a few of your friends to install the Alexa Toolbar and surf your website. Also encourage others to use the Alexa Toolbar, and get reviews from them.
  3. Participate in webmaster forums where you can submit your website URL and get feedback.
  4. Use Alexa redirects e.g. ( in place of your URL.
  5. Buy ads, banners, and links on webmasters' websites and forums; these drive a lot of webmaster traffic to your website, which boosts your rank significantly.
  6. Post and advertise articles with tips on how to increase your Alexa traffic rankings.

increase traffic rank in Alexa

  • Alexa computes traffic rankings by analyzing the Web usage of millions of Alexa Toolbar users.
  • The traffic rank is based on three months of aggregated historical traffic data from millions of Alexa Toolbar users and is a combined measure of page views and users (reach).
  • Page views measure the number of pages viewed by Alexa Toolbar users. Multiple page views of the same page made by the same user on the same day are counted only once.
  • Reach measures the number of users.
  • Alexa expresses reach as number of users per million. For example, if a site has a reach of 28%, this means that if you took random samples of one million Internet users, you would on average find that 280,000 of them visit that site.
  • The daily traffic rank reflects the traffic to the site based on data for a single day. The Trend graph shows you a three-day moving average of the site's daily traffic rank, charted over time.
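
The arithmetic behind these bullets can be sketched in a few lines. The function names and the sample numbers below are made up for illustration; this is not Alexa's actual code, only the calculations the text describes.

```python
def count_page_views(views):
    """Count page views; repeats by the same user on the same page and day count once."""
    return len(set(views))


def reach_per_million(reach_percent):
    """Convert a reach percentage into users per million."""
    return round(reach_percent * 1_000_000 / 100)


def moving_average(values, window=3):
    """Average each value with the (window - 1) values before it."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical (user, page, day) records; the second one is a repeat.
views = [
    ("user1", "/home", "2024-01-01"),
    ("user1", "/home", "2024-01-01"),
    ("user1", "/about", "2024-01-01"),
    ("user2", "/home", "2024-01-01"),
]
print(count_page_views(views))       # 3

print(reach_per_million(28))         # 280000

daily_ranks = [120, 90, 150, 60, 90]  # made-up daily traffic ranks
print(moving_average(daily_ranks))   # [120.0, 100.0, 100.0]
```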

SEO: few tips

Some basic things to consider for Search Engine Optimization (SEO for short):
Writing rich text Content:
For any web page, content is king. So try to have rich content in your pages, and use that content to serve the purpose of the page.
Selecting correct keywords:
Choosing the right keywords is essential, since the optimization process targets those keywords. The right keywords help people find your site in the search engines.
Page Title:
Optimizing your HTML page title is very important for the search engines. The page title shows up as the first clickable link in the search result, so it should be compelling and should match the landing page it points to.
Meta Description Tags:
Meta tags, such as the Meta Description Tag and the Meta Keywords Tag, are text elements that are not visible on the page.
  • The Meta Description Tag should be informative enough to describe your page.
  • Keep the text in the Meta Description Tag informative but brief.
  • Pair it up with the Title Tag, but do not repeat your title text in your Meta Description Tag.
  • Try including some of your valuable keywords, which influence page ranking, but do not stuff the description tag with a long list of keywords.
Meta Keywords Tag
  • Although Meta Keywords Tags do not play a big role in search engine optimization, they should not be left out.
  • The Meta Keywords Tag, along with the Meta Description Tag, helps influence the search engine rank.
  • The Keywords Tag holds words or phrases that describe the page.
Use of robots.txt
  • A robots.txt file tells spiders what they can and can't crawl. This is helpful for keeping spiders out of folders that you do not want indexed, such as admin or secured folders, or away from any content that you don't want spiders to index.
  • Most spiders index any page they come across and follow the links on it, but it is still a good idea to add a robots meta tag with the index and follow statements.
For example, this lets spiders index the content of all pages:
User-agent: *
Disallow:
Here's another example that blocks spiders from indexing admin.php and the cgi-bin and admin folders:
User-agent: *
Disallow: /admin.php
Disallow: /cgi-bin/
Disallow: /admin/
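
One way to check rules like these before publishing them is Python's standard urllib.robotparser module. The sketch below feeds it the blocking example; the spider name ExampleSpider and the example.test host are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The blocking rules from the example above.
RULES = """\
User-agent: *
Disallow: /admin.php
Disallow: /cgi-bin/
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())  # parse from text instead of fetching a URL

# Ask whether a (hypothetical) spider may fetch each path.
for path in ("/index.html", "/admin.php", "/cgi-bin/form.cgi", "/admin/users"):
    allowed = parser.can_fetch("ExampleSpider", "http://example.test" + path)
    print(path, "allowed" if allowed else "blocked")
```

Only /index.html should come back as allowed; the other three paths match a Disallow line.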

Use title and alt attributes; they tell search engines how relevant your links and images are.
Creating a Site Map Page - Since every page on the website will be linked from the sitemap, it allows web crawlers (and users) to find the content of the website quickly and easily.

Validate your HTML and CSS, and make sure that there are no broken links or images in the pages.

Inbound links send visitors to your site, so link building is important for your website. Directories also give you a chance to describe your site with content targeted at your visitors. Submitting to the Open Directory Project and Yahoo! Directory therefore also plays a role in search engine optimization.

QA critical to web development company

Web Quality
Quality is a tough concept when you are dealing with a web site. Web Quality refers to measurable characteristics: things we are able to compare to known standards, such as:
HTML standards
HTML Validation
Validating links
CSS Validation
Validating accessibility

In practical terms, this requires a site to validate against a series of checkpoints. These include:
  • Checking for broken links.
  • Checking for missing content, e.g. images.
  • Checking for missing page titles.
  • Checking the spelling and grammar of content.
  • Checking for missing metadata.
  • Checking the file sizes of pages to ensure they are not too large.
  • Checking for browser compatibility.
  • Checking that applications are functioning correctly, e.g. online forms.
  • Checking that any Server Side Scripting or other languages function correctly.
  • Checking that legal and regulatory guidelines are being adhered to, e.g. data protection and privacy.
  • Checking that pages conform to your organisation's Web-Accessibility standard (if any), e.g. missing 'alt-tags'.
  • Checking that the Website Design standard is maintained.
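
A few of these checkpoints are easy to automate. The sketch below, using only Python's standard html.parser, covers two of them (missing page titles and images without alt text) and collects links so they can be checked for breakage separately. The class name PageAudit, the audit function, and the sample HTML are hypothetical; a real QA process would cover far more of the list.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Records the page title, images lacking alt text, and link targets."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.images_missing_alt = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "img" and "alt" not in attrs:
            self.images_missing_alt.append(attrs.get("src") or "(no src)")
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def audit(html):
    """Return (list of problems, list of links found) for one page."""
    page = PageAudit()
    page.feed(html)
    problems = []
    if not page.title.strip():
        problems.append("missing page title")
    for src in page.images_missing_alt:
        problems.append(f"image without alt text: {src}")
    return problems, page.links


# Made-up page with an empty title and one image lacking alt text.
SAMPLE = (
    "<html><head><title></title></head><body>"
    '<img src="logo.png"><img src="a.png" alt="A">'
    '<a href="/contact">Contact</a>'
    "</body></html>"
)

problems, links = audit(SAMPLE)
print(problems)
print(links)
```

The collected links could then be requested one by one to find broken ones, which is the first checkpoint in the list.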