And, happily, they had a Query Example in Ruby! (as well as Java, Perl, PHP, and C#)
The docs are a bit unclear about where to get the country codes, although I see that you can run a query with ResponseGroup=ListCountries to get the list. (I just cheated and used the Alexa site to find the country code I wanted.) I pasted in the access key and the double-secret code key and fired it off…
Only it didn’t seem to work! WTF! The error said “The URI http://awis.amazonaws.com/onca/xml is not valid”, but I didn’t change that! Carefully rereading the docs (and remembering to breathe), I noticed that they referred to the base URI as “http://ats.amazonaws.com” rather than what was on line 27 of the topsites.rb file: “http://awis.amazonaws.com/onca/xml”. So I tried that and it worked! I guess they changed some stuff and have not updated the sample code? Sloppy!
The next issue was that a single query returns at most 100 sites, and I wanted thousands! (The Alexa site already shows the top 100 by country.)
I quickly wrote up some Ruby code to work out the start count and generate a filename for each increment, then passed both to a modified AWS top sites query (changed to write to a named file rather than standard output, i.e. the console). Many XML files later, I’m done. (Now to import the mess, which proved to be easy to do in Excel 2003.)
begincount = 1
incr = 100
# nol must be set beforehand; the loop runs nol + 1 times, one query per batch
for x in 0..nol
  start = begincount + (x * incr)
  filename = "c://aws/aww_ts_" + x.to_s + ".xml"
  QueryAWS_topSite(start, incr, filename)
end
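The batching arithmetic can also be sketched on its own, so the start positions and filenames can be checked before any query is sent. This is only an illustration: the helper name batch_plan, its parameters, and the relative "aws_ts_" filename prefix are made up here and are not part of the Amazon sample code.

```ruby
# Hypothetical helper (not from the Amazon sample): list the start position,
# count, and output filename for each batch needed to cover `total` sites.
def batch_plan(total, incr = 100, begincount = 1, prefix = "aws_ts_")
  batches = (total + incr - 1) / incr          # integer ceiling of total / incr
  (0...batches).map do |x|
    start = begincount + (x * incr)
    count = [incr, total - (x * incr)].min     # the last batch may be smaller
    { start: start, count: count, filename: "#{prefix}#{x}.xml" }
  end
end

# Example: plan the queries for the top 250 sites.
batch_plan(250).each do |b|
  puts "#{b[:start]} #{b[:count]} #{b[:filename]}"
end
```

Each entry could then be handed to the modified query method, one call per batch.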
Maybe I will mess with it some more to create one giant XML file (return the XML object and parse out the elements I want before writing to one file?) and otherwise make it more elegant, but for now it is “good enough”. And geeky fun too!
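For the "one giant file" idea, here is a rough sketch of pulling a single element out of several XML responses with REXML. The element name "DataUrl", the namespace URI, and the sample XML are assumptions made for illustration; I have not checked them against the real response format, so adjust the name to whatever the actual files contain.

```ruby
require "rexml/document"

# Collect the text of every element with the given local name (namespace
# prefix ignored) from a list of XML strings.
def extract_values(xml_strings, element_name)
  values = []
  xml_strings.each do |xml|
    doc = REXML::Document.new(xml)
    doc.each_recursive do |el|
      values << el.text.to_s.strip if el.name == element_name
    end
  end
  values
end

# Made-up sample data standing in for two saved response files.
sample = [
  '<aws:Sites xmlns:aws="http://example.com/ns"><aws:Site>' \
  '<aws:DataUrl>example.com</aws:DataUrl></aws:Site></aws:Sites>',
  '<aws:Sites xmlns:aws="http://example.com/ns"><aws:Site>' \
  '<aws:DataUrl>example.org</aws:DataUrl></aws:Site></aws:Sites>'
]

puts extract_values(sample, "DataUrl")
```

With real files, the list of strings could come from something like Dir.glob("*.xml").map { |f| File.read(f) }, and the combined values could be written out with a single File.write.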