Sunday, January 8, 2012

Mining service providers from the Citrix site using AutoIt

Someone I know needed to create a list of Citrix solution providers in every country.
Because I manage the automated tools at my company, he asked whether I had a way to export the data from the database automatically, so he would not have to go through more than a hundred pages by hand (one for each country).

After a quick glance at the site, which consisted of a simple form, it was clear that getting the whole list at once was not possible. My next move was to check whether I could simulate a search for the providers in a specific country.

Luckily for me, the form sent its fields over GET, including the selected country. I could just copy the URL from the browser, replace the country code and press Enter.

I used AutoIt, a nifty scripting language for Windows, to create a quick application that downloads all the pages and extracts just the relevant information. I copied the country codes from the Citrix site's source and used AutoIt's regular expression support in Find & Replace to turn this (the code from the page's source):
<option value="AL">Albania</option>
<option value="AO">Angola Republica</option>
<option value="AR">Argentina</option>
<option value="AU">Australia</option>
...
into this (AutoIt's array syntax):
["AL","AO","AR","AU"... ]
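The same conversion can also be done from a script instead of the editor. The following is only a sketch of that alternative, not what I actually ran: it assumes the option tags are already in a string ($sHtml here holds just the four sample lines above), and the variable names are made up for illustration.

```autoit
; Sample of the page source (only the four options shown above).
Local $sHtml = '<option value="AL">Albania</option>' & @CRLF & _
        '<option value="AO">Angola Republica</option>' & @CRLF & _
        '<option value="AR">Argentina</option>' & @CRLF & _
        '<option value="AU">Australia</option>'

; Flag 3 makes StringRegExp return an array holding every match
; of the capture group (the two-letter country code).
Local $aCodes = StringRegExp($sHtml, '<option value="([A-Z]{2})">', 3)
If @error Then Exit 1 ; no option tags were found

; Join the codes into the text of an AutoIt array literal.
Local $sLiteral = "["
For $i = 0 To UBound($aCodes) - 1
    If $i > 0 Then $sLiteral &= ","
    $sLiteral &= '"' & $aCodes[$i] & '"'
Next
$sLiteral &= "]"

ConsoleWrite($sLiteral & @CRLF) ; prints ["AL","AO","AR","AU"]
```

Since StringRegExp already returns the codes as an array, a script could also loop over $aCodes directly and skip the literal altogether.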



The script goes over the array and downloads each page, then extracts the list of providers with the _StringBetween function which, as the name implies, returns the text found between two strings (as an array of matches):
Local $url = "http://www.citrix.com/partners/locator/results?program=SOLUTION_PROVIDER&page=0&companyName=&countryCode="&$arrcountry[$i]&"&countryCodeDist=&searchMethod=LOCATION&zipCode=&searchRadius=10&stateRegion=&city=&product=&pType=&pLevel="
Local $sData = InetRead ( $url )
Local $ReadData = BinaryToString($sData)
Local $StringToWrite = _StringBetween($ReadData, '<table id="alternatePartnerMembersTable" width="100%">', '<div align="right" id="pagination2">')
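To see how _StringBetween behaves on its own, here is a small sketch that runs it against a made-up string instead of a downloaded page. The HTML in $sPage and the two markers are invented for the example. The function returns an array of matches and sets @error when the markers are not found, so the result is worth checking before indexing into it.

```autoit
#include <String.au3>

; Made-up stand-in for a downloaded results page.
Local $sPage = '<html><table id="t"><tr><td>Acme Ltd</td></tr><div id="end"></html>'

; Returns an array with every piece of text found between the two markers.
Local $aBetween = _StringBetween($sPage, '<table id="t">', '<div id="end">')
If @error Then
    ConsoleWrite("Markers not found" & @CRLF)
Else
    ConsoleWrite($aBetween[0] & @CRLF) ; prints <tr><td>Acme Ltd</td></tr>
EndIf
```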
Finally, the script writes the first match to a file:
FileWrite($filehandle, $StringToWrite[0])
Since downloading all the pages can take a few minutes, I added a progress notification in the form of a tray icon tooltip:

TraySetToolTip ( Round(($i/105*100), 0) & "%" )
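The 105 in that line is the last index of the country array. A variation that would not need updating if the list changes could take the size from UBound instead. This is a sketch with a three-element stand-in array and a Sleep in place of the download, not the code I ran:

```autoit
; Three-element stand-in for the country array.
Local $aCountries[3] = ["AL", "AO", "AR"]
Local $iTotal = UBound($aCountries)

For $i = 0 To $iTotal - 1
    Sleep(500) ; stands in for downloading one page

    ; $i + 1 pages are done at this point, so the last pass shows 100%.
    TraySetToolTip(Round(($i + 1) / $iTotal * 100, 0) & "%")
Next
```

Hovering over the script's tray icon while it runs shows the current percentage.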


My friend then took the file and imported it into Excel, which parsed the HTML tables into his sheet automatically. A good couple of hours of manual work turned into a little less than 30 minutes.
The complete code, with comments, follows.
#include <String.au3>

;create array with list of countries
Dim $arrcountry[106] = ["AL","AO","AR","AU","AT","BH","BD","BY","BE","BJ","BM","BO","BR","BG","CA","KY","CL","CN","CO","CR","CI","HR","CY","CZ","DK","EC","EG","EE", _
"FI","FR","GA","DE","GH","GR","GT","HN","HK","HU","IS","IN","ID","IE","IL","IT","JM","JP","JO","KZ","KE","KW","LV","LB","LI","LT","LU","MO","MY","MT","MU", _
"MX","MD","MC","MA","NL","NZ","NG","NO","OM","PK","PA","PE","PH","PL","PT","PR","QA","RO","RU","SA","SN","RS","SG","SK","SI","ZA","KR","ES","LK","SE","CH", _
"TW","TH","TT","TN","TR","US","UG","UA","AE","GB","UY","VE","VN","YE","ZM","ZW"]

;open file for writing
local $filehandle = FileOpen("table.html", 10)

;go over array and download page
for $i = 0 to 105
    local $url = "http://www.citrix.com/partners/locator/results?program=SOLUTION_PROVIDER&page=0&companyName=&countryCode="&$arrcountry[$i]&"&countryCodeDist=&searchMethod=LOCATION&zipCode=&searchRadius=10&stateRegion=&city=&product=&pType=&pLevel="
   
    ;read the url into a binary variable
    Local $sData = InetRead ( $url )
   
    ;turn the binary data into regular string
    Local $ReadData = BinaryToString($sData)

    ;take the data which is in the table
    Local $StringToWrite = _StringBetween($ReadData, '<table id="alternatePartnerMembersTable" width="100%">', '<div align="right" id="pagination2">')
    ;_StringBetween returns an array only when both markers were found
    If IsArray($StringToWrite) Then
        ;write the data to file
        FileWrite($filehandle, $StringToWrite[0])
    EndIf

    ;update the tooltip to display the percentage of pages processed so far
    TraySetToolTip ( Round(($i/105*100), 0) & "%" )
Next

;close the file
FileClose($filehandle)
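The script above does not check whether a download failed. If that matters, the download step could be wrapped in a small helper along these lines. _DownloadPage is a hypothetical name of my own, not a built-in function, and this is only a sketch of one way to do it:

```autoit
; Hypothetical helper: returns the page as a string,
; or "" with @error set to 1 if the download failed.
Func _DownloadPage($sUrl)
    Local $dData = InetRead($sUrl)
    If @error Then Return SetError(1, 0, "")
    Return BinaryToString($dData)
EndFunc

; Usage inside the loop:
; Local $ReadData = _DownloadPage($url)
; If @error Then ContinueLoop ; skip this country and move on
```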


