iTunes XML Syntax

While working on one of my projects, iTunes Portable AppStore, I needed to figure out how to read the same data iTunes does while browsing the AppStore. It took a good amount of investigation, but I eventually found out how to read and parse the data for the information I required. Here are the conclusions I came to.

First, iTunes uses a basic browser implementation with a custom formatting engine as the front-end. Therefore, we can read the same data iTunes has in a standard browser. With some use of Wireshark, I figured out that the URL to search the AppStore is (using PHP syntax):

"http://ax.phobos.apple.com.edgesuite.net/WebObjects/MZSearch.woa"
  . "/wa/advancedSearch?submit=seeAllLockups&media=software"
  . "&entity=software&softwareTerm=" . $searchTerm;

where $searchTerm is the term you are looking for.
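As a rough sketch of the same URL construction in Python, the snippet below assembles the query string and percent-encodes the search term. The function name `build_search_url` is made up for illustration, and the assumption that the server accepts `+` for spaces is mine, not something I verified.

```python
from urllib.parse import urlencode

# Base of the search URL found with Wireshark (taken from the text above).
SEARCH_BASE = ("http://ax.phobos.apple.com.edgesuite.net/WebObjects/MZSearch.woa"
               "/wa/advancedSearch")


def build_search_url(search_term):
    """Return the AppStore search URL for search_term (hypothetical helper)."""
    params = [
        ("submit", "seeAllLockups"),
        ("media", "software"),
        ("entity", "software"),
        # urlencode percent-encodes the term; spaces become '+' (assumed OK).
        ("softwareTerm", search_term),
    ]
    return SEARCH_BASE + "?" + urlencode(params)


print(build_search_url("tower defense"))
```

Encoding the term matters as soon as it contains a space or an ampersand; otherwise the rest of the term would be read as a separate query parameter.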

Of course, if you put this URL in your browser, you will get an error asking you to load iTunes. This can be solved by masquerading your user-agent as:

iTunes/8.0.2 (Macintosh; U; PPC Mac OS X 10.4.1)

The data returned from this call is compressed using gzip (http://en.wikipedia.org/wiki/Gzip). Your browser will decompress it for you automatically; however, if you are doing this in a custom program, you may need to do it yourself. Here is a C# example:

using System.IO;
using System.IO.Compression;
using System.Net;

// URL holds the search URL built above.
HttpWebRequest reqFP = (HttpWebRequest)WebRequest.Create(URL);
// Masquerade as iTunes so the server returns data instead of an error.
reqFP.UserAgent = "iTunes/8.0.2 (Macintosh; U; PPC Mac OS X 10.4.1)";
HttpWebResponse rspFP = (HttpWebResponse)reqFP.GetResponse();

Stream responseStream = rspFP.GetResponseStream();

// Wrap the stream in a decompressor only when the server reports gzip.
if (rspFP.ContentEncoding.ToLower().Contains("gzip"))
  responseStream = new GZipStream(responseStream,
    CompressionMode.Decompress);
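For comparison, here is a minimal Python sketch of the same two steps: sending the iTunes user-agent and decompressing the body only when the response is marked as gzip. The helper names (`build_request`, `decode_body`, `fetch`) are hypothetical, and sending an explicit `Accept-Encoding: gzip` header is my assumption rather than something the capture above shows.

```python
import gzip
import urllib.request

ITUNES_USER_AGENT = "iTunes/8.0.2 (Macintosh; U; PPC Mac OS X 10.4.1)"


def build_request(url):
    """Create a request that identifies itself as iTunes."""
    return urllib.request.Request(url, headers={
        "User-Agent": ITUNES_USER_AGENT,
        "Accept-Encoding": "gzip",  # assumption: ask for gzip explicitly
    })


def decode_body(body, content_encoding):
    """Decompress body only if the server reported a gzip encoding."""
    if "gzip" in (content_encoding or "").lower():
        return gzip.decompress(body)
    return body


def fetch(url):
    """Download url and return the decoded bytes (performs a network call)."""
    with urllib.request.urlopen(build_request(url)) as response:
        encoding = response.headers.get("Content-Encoding")
        return decode_body(response.read(), encoding)
```

Keeping `decode_body` separate from the network call makes it easy to check the decompression logic against a saved response.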

After this, the data is an XML document with the following format:

  • Document
    • Protocol
      Settings and metric links?  Not sure about this one.
    • Path
      Page name and Page URL.
    • View
      • ScrollView
        The search results.
      • VBoxView
        The links at the bottom of the page to view more results.
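When exploring a document like this, it helps to print its skeleton rather than read the raw XML. The sketch below does that with Python's `xml.etree.ElementTree`; the `outline` and `local_name` helpers are illustrative names, and the sample string is a hand-written stand-in for the real response, not actual iTunes output.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix, if any, from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def outline(element, max_depth=3, depth=0):
    """Return indented tag names for element and its descendants."""
    lines = ["  " * depth + local_name(element.tag)]
    if depth < max_depth:
        for child in element:
            lines.extend(outline(child, max_depth, depth + 1))
    return lines


# Hand-written sample mirroring the top-level layout described above.
sample = ("<Document><Protocol/><Path/>"
          "<View><ScrollView/><VBoxView/></View></Document>")
print("\n".join(outline(ET.fromstring(sample))))
```

Raising `max_depth` a level at a time is a convenient way to work down through the nesting described in the following sections.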

Since I am looking for the results of that search, we'll take a look at the ScrollView element. This element contains a few things:

  • VBoxView
    • Include
      Not sure what this does.
    • VBoxView
      Advertisement data based on test values.
    • View 
      • PictureView
        Some sort of gradient divider.
      • MatrixView 
        • VBoxView
          The search results.
        • VBoxView
          The copyright information at the foot of the page.

Again, we only need to look at the second-to-last VBoxView element, which contains the following:

  • MatrixView
    • VBoxView
      • FontStyle
        Setting title font color.
      • FontStyle
        Setting text font color.
      • VBoxView
        Header text for the Application section.
      • View
        • VBoxView
          Different image designs based off of iTunes version.
        • VBoxView
          The search results.

By now, you have probably realized that we have already nested ten (10) levels deep and still haven't found the data we want. Don't worry, we'll find it soon. Up next is that last VBoxView we found. Here are its contents:

  • VBoxView
    • VBoxView
      • MatrixView
        • HBoxView
          A search result!
        • HBoxView
          Another search result!
        • HBoxView
          etc.

Finally, fourteen (14) levels down, we have found our actual search results. For future reference, the full path to this element is:

Document.View.ScrollView.VBoxView.View.MatrixView
  .VBoxView.MatrixView.VBoxView.View.VBoxView
  .VBoxView.VBoxView.MatrixView
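The path can be followed mechanically. Below is a minimal Python sketch that walks it one step at a time with `xml.etree.ElementTree`. Which sibling to pick at each step is my reading of the outlines above (the first match everywhere, except the last VBoxView under the inner View), so treat `RESULTS_PATH` as an assumption to check against a real response. The helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix, if any, from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


# (tag, index) steps below the root Document; index -1 picks the last match.
RESULTS_PATH = [
    ("View", 0), ("ScrollView", 0), ("VBoxView", 0), ("View", 0),
    ("MatrixView", 0), ("VBoxView", 0), ("MatrixView", 0), ("VBoxView", 0),
    ("View", 0), ("VBoxView", -1), ("VBoxView", 0), ("VBoxView", 0),
    ("MatrixView", 0),
]


def descend(root, steps):
    """Follow steps from root; return None as soon as a step is missing."""
    current = root
    for tag, index in steps:
        matches = [c for c in current if local_name(c.tag) == tag]
        if not matches:
            return None
        current = matches[index]
    return current


def find_results(document):
    """Return the HBoxView search results, or [] if the layout differs."""
    matrix = descend(document, RESULTS_PATH)
    if matrix is None:
        return []
    return [c for c in matrix if local_name(c.tag) == "HBoxView"]
```

Returning an empty list instead of raising keeps the caller simple if Apple changes the layout, though it also hides which step failed; logging the failing step would be a reasonable addition.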

That is really ugly, but that's what we need. Next comes the breakdown of each HBoxView. I'm only going to list the important elements. Note: [x] stands for the xth instance of that element, and ["name"] stands for the attribute of that name.

  • HBoxView.VBoxView.MatrixView.GotoURL["url"]
    Application's detail URL
  • HBoxView.VBoxView.MatrixView.GotoURL["draggingName"]
    Application's name
  • HBoxView.VBoxView.MatrixView.GotoURL.View.PictureView["url"]
    Application's icon
  • HBoxView.VBoxView.MatrixView.VBoxView.HBoxView[2].TextView
    .SetFontStyle.GotoURL["url"]
    Application's genre's URL
  • HBoxView.VBoxView.MatrixView.VBoxView.HBoxView[2].TextView
    .SetFontStyle.GotoURL["draggingName"]
    Application's genre's name
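The element paths listed above translate fairly directly into code. The following Python sketch pulls those five values out of a single HBoxView. It assumes that [2] is 1-based (the second HBoxView) and that unlisted steps mean the first matching child; the names `child`, `walk`, `attr`, and `parse_result` are hypothetical helpers, not part of any iTunes API.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix, if any, from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def child(element, tag, nth=1):
    """Return the nth (1-based) direct child named tag, or None."""
    if element is None:
        return None
    matches = [c for c in element if local_name(c.tag) == tag]
    return matches[nth - 1] if len(matches) >= nth else None


def walk(element, *steps):
    """Follow steps, each either 'Tag' or ('Tag', nth)."""
    for step in steps:
        tag, nth = step if isinstance(step, tuple) else (step, 1)
        element = child(element, tag, nth)
    return element


def attr(element, name):
    """Return the named attribute, or None if the element is missing."""
    return element.get(name) if element is not None else None


def parse_result(hbox):
    """Extract the fields listed above from one HBoxView search result."""
    cell = walk(hbox, "VBoxView", "MatrixView")
    app_link = walk(cell, "GotoURL")
    icon = walk(app_link, "View", "PictureView")
    genre_link = walk(cell, "VBoxView", ("HBoxView", 2), "TextView",
                      "SetFontStyle", "GotoURL")
    return {
        "detail_url": attr(app_link, "url"),
        "name": attr(app_link, "draggingName"),
        "icon_url": attr(icon, "url"),
        "genre_url": attr(genre_link, "url"),
        "genre_name": attr(genre_link, "draggingName"),
    }
```

Any field whose path is missing comes back as `None`, so a partially changed layout still yields the values that can be found.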

Okay, that is enough information for today. The only other interesting piece of data available is the price, but you have to do some parsing of the Buy element deep in the HBoxView.

I hope this will help anyone looking to scan the iTunes AppStore. Next I'll post about the data available in the iTunes Application page.

  • License

    Unless otherwise noted, all source code and compiled files published on this website are released under the terms of the GNU Lesser General Public License.