Why So Scared

A blog about; Programming, Music and Random Stuff

Parse MusicBrainz XML with jQuery

For a project I’m working on to help people make posts over at http://www.warez-dnb.com/, I’ve been looking into parsing MusicBrainz data. I thought I’d make a quick post to discuss the challenges and solutions I’ve come up with so far.

<html>
 <head>
 <script type="text/javascript" src="jquery.js"></script>
 <script type="text/javascript">
 $(document).ready(function(){
	$.ajax({
		type: "GET",
		url: "http://www.warez-dnb.com/test/getid.php?name=Caspa",
		dataType: "xml",
		success: function(xml) {
			$(xml).find('artist-list').each(function() {

				alert($(this).find("name").text());
			});
		}
	});
 });
 </script>
 </head>
 <body>

 </body>
</html>



This is the index.html file with the jQuery code. It’s pretty simple: I was just testing that I could get a valid response, then parsing the XML within artist-list and outputting the text of every name element inside it.

One question that might arise from looking at this code is why the URL is hosted at warez-dnb. The browser’s same-origin policy stops jQuery’s $.ajax from loading XML from a different host, so the request has to go to the same server that serves the page. I’ve written an extremely simple PHP script that queries MusicBrainz and echoes back the result, so the XML is served locally.

<?php
// Set your return content type
header('Content-type: application/xml');

// Website url to open; urlencode the name so spaces and & survive
$daurl = 'http://musicbrainz.org/ws/1/artist/?type=xml&name=' . urlencode($_GET["name"]);

// Get that website's content (needs allow_url_fopen enabled in php.ini)
$handle = fopen($daurl, "r");

// If there is something, read and return
if ($handle) {
    while (!feof($handle)) {
        $buffer = fgets($handle, 4096);
        echo $buffer;
    }
    fclose($handle);
}
?>
posted by Juo in Programming, jQuery, tidbit and have No Comments

Using Scrubyt to Screenscrape

Over at http://www.warez-dnb.com/ I’m working on a bit of Ruby code that, when finished, will make 20-second samples for every single post.

I decided the simplest way to do this would be to have a bot that scrapes the website for Rapidshare links, downloads the files, extracts them, makes the sample and then uploads it to our FTP server. (I did say simplest, not most elegant.)

After a file is uploaded, this PHP script tests whether the samples have been uploaded yet: http://www.whysoscared.com/php-function-check-if-a-file-or-url-exists

To make the Ruby script I needed something that could not only screen-scrape the webpage but also log in. I was using Hpricot, but then we decided to make Rapidshare links on Warez-DnB visible only to registered users.

require 'rubygems'
require 'scrubyt'

# only the following parts should need editing
baseurl = "http://www.warez-dnb.com/"
username = "User"
password = "Password"

data = Scrubyt::Extractor.define do
  fetch "#{baseurl}?action=login"
  fill_textfield 'user', username
  fill_textfield 'passwrd', password
  submit
  fetch "#{baseurl}?topic=672"
  link '//div/code'
end

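This is the script that will actually scan each page. Using Scrubyt is super simple: first it fetches the login page at Warez-DnB, fills in the username and password fields and submits the form. Now it’s logged in, so it fetches the topic page and extracts everything matching the XPath //div/code.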

After this, the plan is to use a loop based on the RSS feed to check whether each post is in the listings category and, if it is, extract its Rapidshare links.
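
That loop might look roughly like the sketch below. It is only a sketch under some loud assumptions: the feed entries are hard-coded hashes standing in for parsed RSS items, and the "Listings" category name, the topic URLs, the example link and the rapidshare.com/files/... URL pattern are all made up for illustration rather than taken from the real site.

```ruby
# Hypothetical stand-ins for parsed RSS items; the real feed layout may differ.
FEED_ITEMS = [
  { title: "Example release", category: "Listings",
    link: "http://www.warez-dnb.com/?topic=672",
    body: "Grab it here: http://rapidshare.com/files/12345/example.rar enjoy" },
  { title: "General chat", category: "Discussion",
    link: "http://www.warez-dnb.com/?topic=673",
    body: "No downloads in this one" }
]

# Assumed link shape: http(s)://rapidshare.com/files/<id>/<filename>
RAPIDSHARE_LINK = %r{https?://(?:www\.)?rapidshare\.com/files/\d+/[^\s"'<>]+}

# Keep only the items filed under the given category (case-insensitive).
def listings(items, category = "Listings")
  items.select { |item| item[:category].to_s.casecmp?(category) }
end

# Pull every distinct Rapidshare link out of a chunk of post text.
def rapidshare_links(text)
  text.scan(RAPIDSHARE_LINK).uniq
end

# Walk the feed, skip anything outside the listings category,
# and print the links found in each remaining post.
listings(FEED_ITEMS).each do |item|
  puts "#{item[:link]}: #{rapidshare_links(item[:body]).join(', ')}"
end
```

In the real script the body text would presumably come from the Scrubyt extractor rather than from a hash, with the feed only deciding which topics are worth fetching.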

It’s in fairly simple stages at the moment, but I’ll post the majority of the script when it’s completed.

posted by Juo in Programming, Ruby and have No Comments