BlogMatrix
 

A typical day working with RDF and FOAF

David P. Janes 2008-03-16 09:52 UTC 3 comments

I’ve been trying to use FOAF to get profile and friendship/contact information across social networks. I’ve done the “friend” part; I just need to fill in the profile information.
Now, getting this information out of FOAF is problematic at best. Using Python, the rdflib library, and SPARQL I’ve managed to coax data out one painful step at a time. For example, here’s my “friend-getter” code:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?bfoaf ?bname ?bnick ?bmbox_sha1sum ?bimage ?bweblog
WHERE {
    ?a foaf:knows ?b .
    ?b rdfs:seeAlso ?bfoaf .
    OPTIONAL { ?b foaf:name ?bname } .
    OPTIONAL { ?b foaf:nick ?bnick } .
    OPTIONAL { ?b foaf:mbox_sha1sum ?bmbox_sha1sum } .
    OPTIONAL { ?b foaf:image ?bimage } .
    OPTIONAL { ?b foaf:weblog ?bweblog } .
}

Clear enough, I guess. Unfortunately, I can’t just look at bnick and stuff it into my results, because bnick might be some sort of “resource” which then also has to be traversed programmatically (see http://api.hi5.com/rest/profile/foaf/208329359). I admit that this might be, maybe even probably is, a problem with me: maybe I don’t understand SPARQL well enough.

But that’s old business. The way I’ve been doing this is CURLing down the FOAF file, manually inspecting it, writing some Python/rdflib/SPARQL code and seeing what happens.

This morning I decided to try a new approach: look for a SPARQL and/or RDF browser and figure out the correct queries online, then just write the code once, correctly. In my mind, this was all very sweet: an INPUT field for the FOAF/RDF URI, a TEXTAREA for the SPARQL query, a TABLE for the SPARQL results, and a TABLE showing all the RDF triples, since it’s triples “all the way down”.

Here’s what I did find:

  • Google “rdf browser”
  • Check out Brown Sauce; have to install a local massive development environment – remember now, I’m trying to save time, not lose it
  • Check out http://browserdf.org/: “Faceted Navigation for arbitrary Semantic Web data”. Very promising. Unfortunately, “arbitrary” seems to mean three different data sets
  • Check out Stefano’s Linotype -- a high quality information source usually; find out about Welkin
  • Try Welkin
  • Find out Welkin doesn’t browse the web
  • Download the FOAF file from http://kitschbitch.vox.com/profile/foaf.rdf into test.foaf.
  • Discover that Welkin doesn’t like “*.foaf”
  • Try again with “*.xml”
  • Try again with “*.rdf”
  • Success, except no results. Why? Oops … I was downloading the wrong URI
  • Try again with the correct URI
  • Verify that it’s a FOAF file
  • Stare at nothingness coming out of Welkin
  • Write a blog post about it; partially regret losing 50 minutes of my morning

The problem – a problem – with FOAF and RDF is quite simple. People don’t want formats that can do anything, they want formats that can do something. I got a Flickr API downloader going in about 30 minutes, taking my time. I’ve put hours into FOAF and still am unhappy.

Comment #1 Daniel O'Connor

2008-03-24 03:55:30

It sounds like we're interested in the same problem.

Why not take a more pragmatic approach: xpath and foaf, or other RDF?

Basically:

  1. Find a FOAF or other descriptive bit of RDF
  2. Extract out the name & profile information you need
  3. Present the user with interface: I found these facts, pick which ones you want to use

Just because it's RDF doesn't mean everything must be done with a triplestore.

If, however, you find yourself doing the same read-a-document, extract-information routine over and over, then I'd suggest moving to a triplestore and SPARQL.
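The pragmatic approach in this comment might look like the sketch below, which uses only Python's standard-library ElementTree and its limited XPath support. The sample document and the function name extract_profile_facts are hypothetical. It also assumes the FOAF is serialized as RDF/XML with a foaf:Person element holding simple child properties, which is only one of the many equivalent shapes RDF/XML allows.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}
RDF_RESOURCE = "{%s}resource" % NS["rdf"]
FOAF_PREFIX = "{%s}" % NS["foaf"]

# Hypothetical FOAF document, for illustration only.
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Example Person</foaf:name>
    <foaf:nick>example</foaf:nick>
    <foaf:weblog rdf:resource="http://example.org/blog"/>
  </foaf:Person>
</rdf:RDF>
"""


def extract_profile_facts(xml_text):
    """Return (property, value) pairs for the first foaf:Person found."""
    root = ET.fromstring(xml_text)
    person = root.find(".//foaf:Person", NS)
    if person is None:
        return []
    facts = []
    for child in person:
        if not child.tag.startswith(FOAF_PREFIX):
            continue  # skip properties from other vocabularies
        name = child.tag[len(FOAF_PREFIX):]
        # A property is either element text or an rdf:resource attribute.
        value = (child.text or "").strip() or child.get(RDF_RESOURCE)
        if value:
            facts.append((name, value))
    return facts


# Step 3 of the list above: show the facts and let the user pick.
for name, value in extract_profile_facts(SAMPLE):
    print("%s: %s" % (name, value))
```

The trade-off is that this depends on how a particular site happens to write its XML. A property nested inside another element, or a person described with rdf:Description instead of foaf:Person, would be missed.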

Comment #2 Daniel O'Connor

2008-03-24 03:57:17

But of course, do republish the available RDF for fuller semantic web agents...

Comment #3 Danny

2008-03-24 18:56:04

Must admit I have lots of days like that, whatever tech I'm playing with. The problem is usually the same - knowing which tools are available and what they can do. That often means having to try the things, which is a huge thief of time if they aren't the right tools. Need better docs.

"People don’t want formats that can do anything, they want formats that can do something" - fair enough, but viewing RDF as a format is bound to lead to misery...

Once you've got something together - say a Flickr API downloader, and you want to do something more interesting than a straight display, you'll likely need some kind of data or object model in which to manipulate/integrate the data. If the data's got lots of URIs, RDF makes a good model.

"just write the code once, correctly" - I must try that some time :-)
