Top 500 Sites in Australia According to Alexa

I needed to analyse a list of the top sites in Australia, and the following Python script helped me pull the site names and domain names from Alexa with minimal effort:

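#!/usr/bin/env python
# Python 2 script: print the top 500 Australian sites listed on Alexa.
import re
import urllib

# Capture the domain (from the siteinfo link) and the site name (link text).
pattern = r'<a  href="/siteinfo/(.*?)"  ><strong>(.*?)</strong>'
for page in range(25):  # 25 pages x 20 sites per page = 500 sites
    url = 'http://www.alexa.com/topsites/countries;%d/AU' % page
    html = urllib.urlopen(url).read()
    for idx, (domain, name) in enumerate(re.findall(pattern, html)):
        print '%d. %s (%s)' % (idx + 1 + page * 20, name, domain.strip())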

YMMV. Considering how Alexa has been trying to obfuscate their HTML pages to prevent scraping, I won’t be surprised if this script stops working tomorrow…
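
Because the markup can change at any time, it may help to keep the parsing separate from the fetching, so the pattern can be checked offline against a saved page. The following is only a sketch under a few assumptions: it targets Python 3 rather than the Python 2 of the script, the names `parse_page`, `PATTERN`, `SITES_PER_PAGE` and `SAMPLE` are hypothetical, and the sample markup is made up to mimic what the regular expression expects rather than copied from Alexa.

```python
import re

# Same regular expression as the script; the double spaces mirror the
# markup the script was written against and are part of the match.
PATTERN = re.compile(r'<a  href="/siteinfo/(.*?)"  ><strong>(.*?)</strong>')
SITES_PER_PAGE = 20  # the script assumes 20 sites per results page


def parse_page(html, page):
    """Return (rank, name, domain) tuples for one zero-based results page."""
    return [
        (i + 1 + page * SITES_PER_PAGE, name, domain.strip())
        for i, (domain, name) in enumerate(PATTERN.findall(html))
    ]


# Hypothetical sample markup, invented for illustration only.
SAMPLE = (
    '<a  href="/siteinfo/example.com.au"  ><strong>Example</strong></a>\n'
    '<a  href="/siteinfo/example.org "  ><strong>Example Org</strong></a>\n'
)

rows = parse_page(SAMPLE, page=1)
if not rows:
    # An empty result is the first sign that the page layout has changed.
    print('No matches: the markup has probably changed.')
for rank, name, domain in rows:
    print('%d. %s (%s)' % (rank, name, domain))
```

With the parsing isolated like this, an empty result from a freshly downloaded page is a quick signal that the pattern needs updating.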

Category: Technology | Fri, 29 May 2009 2:10 pm

Comments

1.
Posted by Stewart on Mon, 6 July 2009 9:20 am

Great, thanks for this. Will this work for the UK or another country? Well, I suppose there's only one way to find out.

Thanks Scott


2.
Posted by Brisbane Translator on Tue, 25 August 2009 8:57 pm

Hi,
Translation is the process through which we change one language into another desired language, provided we have some idea of the subject or the language in which the discussion or communication is going to take place.

