BlogMatrix
 

A typical day working with RDF and FOAF

David P. Janes 2008-03-16 09:52 UTC 3 comments

I’ve been trying to use FOAF to get profile and friendship/contact information across social networks. I’ve done the “friend” part; I just need to fill in the profile information.
Now, getting this information out of FOAF is problematic at best. Using Python, the rdflib library, and SPARQL I’ve managed to coax data out one painful step at a time. For example, here’s my “friend-getter” code:

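# Prefixes are declared inline so the query stands on its own
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?bfoaf ?bname ?bnick ?bmbox_sha1sum ?bimage ?bweblog
WHERE {
    ?a foaf:knows ?b .
    ?b rdfs:seeAlso ?bfoaf .
    OPTIONAL { ?b foaf:name ?bname } .
    OPTIONAL { ?b foaf:nick ?bnick } .
    OPTIONAL { ?b foaf:mbox_sha1sum ?bmbox_sha1sum } .
    OPTIONAL { ?b foaf:image ?bimage } .
    OPTIONAL { ?b foaf:weblog ?bweblog } .
}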

Clear enough, I guess. Unfortunately, I can’t just look at bnick and stuff it into my results, because bnick might be some sort of “resource” which then has to be programmatically traversed too (see http://api.hi5.com/rest/profile/foaf/208329359). I admit that this might be, maybe even probably is, a problem with me: maybe I don’t understand SPARQL well enough.
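To make the “literal or resource” problem concrete, here is a minimal sketch of the kind of traversal a value like bnick can require. It is deliberately library-free: triples are plain Python tuples, and the `Literal` class, the `resolve` and `first_text` helpers, the sample triples, and the list of predicates worth following are all assumptions made for illustration. None of it is rdflib’s API and none of the data comes from the hi5 file.

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """A plain text value, as opposed to a node that must be followed."""
    text: str


# Hypothetical triples: the nick of _:b is not a literal but a node, _:n1.
TRIPLES = [
    ("_:a", "foaf:knows", "_:b"),
    ("_:b", "foaf:name", Literal("Alice Example")),
    ("_:b", "foaf:nick", "_:n1"),
    ("_:n1", "rdf:value", Literal("alice")),
]

# Predicates to try when a value turns out to be a resource (an assumption).
FOLLOW = ("rdf:value", "rdfs:label", "foaf:name")


def objects(triples, subject, predicate):
    """Return every object o of a triple (subject, predicate, o)."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def resolve(triples, value, depth=3) -> Optional[str]:
    """Turn a literal-or-resource value into text, following at most `depth` hops."""
    if isinstance(value, Literal):
        return value.text
    if depth == 0:
        return None  # give up rather than loop forever on cyclic data
    for predicate in FOLLOW:
        for obj in objects(triples, value, predicate):
            text = resolve(triples, obj, depth - 1)
            if text is not None:
                return text
    return None


def first_text(triples, subject, predicate) -> Optional[str]:
    """First usable text for (subject, predicate), whether literal or nested."""
    for obj in objects(triples, subject, predicate):
        text = resolve(triples, obj)
        if text is not None:
            return text
    return None
```

Calling `first_text(TRIPLES, "_:b", "foaf:nick")` follows the intermediate node and returns the nested text, while a property that is missing altogether comes back as `None`, much like an unmatched OPTIONAL.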

But that’s old business. The way I’ve been doing this is CURLing down the FOAF file, manually inspecting it, writing some Python/rdflib/SPARQL code and seeing what happens.

This morning I decided to try a new approach: look for a SPARQL and/or RDF browser and figure out the correct queries online, then just write the code once, correctly. In my mind, this was all very sweet: an INPUT field for the FOAF/RDF URI, a TEXTAREA for the SPARQL query, a TABLE for the SPARQL results, and a TABLE showing all the RDF triples, since it’s triples “all the way down”.

Here’s what I did find:

  • Google “rdf browser”
  • Check out Brown Sauce; have to install a massive local development environment – remember now, I’m trying to save time, not lose it
  • Check out http://browserdf.org/: “Faceted Navigation for arbitrary Semantic Web data”. Very promising. Unfortunately, “arbitrary” seems to mean three different data sets
  • Check out Stefano’s Linotype -- a high quality information source usually; find out about Welkin
  • Try Welkin
  • Find out Welkin doesn’t browse the web
  • Download the FOAF file from http://kitschbitch.vox.com/profile/foaf.rdf into test.foaf.
  • Discover that Welkin doesn’t like “*.foaf”
  • Try again with “*.xml”
  • Try again with “*.rdf”
  • Success, except no results. Why? Oops … I was downloading the wrong URI
  • Try again with the correct URI
  • Verify that it’s a FOAF file
  • Stare at nothingness coming out of Welkin
  • Write a blog post about it; partially regret losing 50 minutes of my morning

The problem – a problem – with FOAF and RDF is quite simple. People don’t want formats that can do anything, they want formats that can do something. I got a Flickr API downloader going in about 30 minutes, taking my time. I’ve put hours into FOAF and still am unhappy.

The Google Social Graph API

David P. Janes 2008-02-21 19:41 UTC

After attending SGFooCamp (photos), I've been meaning to play with Google's new Social Graph API a little more.

The SGAPI is a fairly simple and powerful tool:

  • it looks for data captured on the web by the Google crawler, namely:
    • XFN links (normal webpage links with rel="something" markers),
    • FOAF information, which I won't go into
  • this data defines a graph, with nodes being webpages (which are URLs, and URLs are people) and edges being relationships
  • it's not a proprietary Google data store
  • there's one RESTful GET call in the API that returns the graph given a starting URL

That sounds kind of abstract, but it's really quite simple. For example, consider all the web services you use: your blogger.com account, your flickr account, your del.icio.us account, your twitter account, and your home page. They're all "you" and should have additional information to indicate that they belong to the same person. XFN defines this as the rel="me" relationship.
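For a sense of what a crawler has to do to find these markers, here is a minimal sketch using only Python's standard library. The `RelLinkParser` class, the `rel_links` helper, and the sample HTML with its example URLs are all made up for illustration; this is not how Google's crawler or the SGAPI is implemented.

```python
from html.parser import HTMLParser


class RelLinkParser(HTMLParser):
    """Collect (href, rel-values) pairs from A tags that carry a rel attribute."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        rel = attributes.get("rel")
        if href and rel:
            # rel holds a space-separated list of values, e.g. rel="friend met"
            self.links.append((href, set(rel.lower().split())))


def rel_links(html, wanted):
    """Return the hrefs of all links whose rel attribute includes `wanted`."""
    parser = RelLinkParser()
    parser.feed(html)
    return [href for href, rels in parser.links if wanted in rels]


# Hypothetical page fragment, for illustration only.
SAMPLE = """
<p><a href="http://flickr.example/photos/me" rel="me">My photos</a>
<a href="http://alice.example/" rel="friend met">Alice</a>
<a href="http://news.example/">No rel here</a></p>
"""
```

With this in place, `rel_links(SAMPLE, "me")` picks out only the first link, and `rel_links(SAMPLE, "friend")` only the second.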

Let's look at a specific example: Mark Kuznicki's account. Using the magic of the demo application, we can see that Mark has a number of accounts that are claiming that they're him:

Why? Because Mark entered the URL of his home page and these applications added a link marked with rel="me" on the appropriate A tag. (If you don't believe me, follow the links and look at the source.)

Now, why does it (currently -- 2008-02-20) call these "possible" connections? Simple: because Mark hasn't linked back to these sites on his home page with A links marked rel="me".
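The difference between a "possible" connection and a confirmed one comes down to whether the rel="me" link is reciprocated. A rough sketch, assuming the rel="me" links have already been gathered into a dict mapping each page to the set of pages it points at; the `classify_claims` helper and the example URLs are hypothetical, and real URLs would need normalizing (trailing slashes, http vs. https) before comparing:

```python
def classify_claims(me_links, home):
    """Split pages that claim `home` via rel="me" into confirmed and possible.

    A claim is confirmed only when `home` links back to the claiming page.
    """
    links_from_home = me_links.get(home, set())
    claimants = {
        page
        for page, targets in me_links.items()
        if home in targets and page != home
    }
    confirmed = sorted(claimants & links_from_home)
    possible = sorted(claimants - links_from_home)
    return confirmed, possible


# Hypothetical data: two accounts claim the home page, but it links back to only one.
ME_LINKS = {
    "http://home.example/": {"http://photos.example/mark"},
    "http://photos.example/mark": {"http://home.example/"},
    "http://micro.example/mark": {"http://home.example/"},
}

confirmed, possible = classify_claims(ME_LINKS, "http://home.example/")
```

Here the photos account ends up in `confirmed` and the micro account stays in `possible` until the home page links back to it.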

Why should Mark do this? Simple:

  • because if he does this he can have identity consolidation on all his public social networks, that is, by giving only his home page URL we can definitely know all the social networks that Mark belongs to (and not just claim that Mark is a member of)
  • and by knowing this, we can start exploring other (web) relationships that Mark is involved in, such as rel="friend"
  • and if he starts doing this, pretty soon we have a solution to the YANS problem: when Mark joins a site, he need merely specify his home page and his friends are automatically found -- and without having to screw around with the password anti-pattern or manually re-entering all friends.
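The chain of reasoning in this list (consolidate the identity first, then read off the friends) can be sketched as a small graph walk. This is only an illustration under stated assumptions: the edge data is invented, the `consolidated_identity` and `friends_of` helpers are hypothetical names, and nothing here calls the SGAPI itself.

```python
from collections import deque


def consolidated_identity(edges, home):
    """Pages reachable from `home` through rel="me" links that are reciprocated."""

    def mutual(a, b):
        return ("me" in edges.get(a, {}).get(b, set())
                and "me" in edges.get(b, {}).get(a, set()))

    seen, queue = {home}, deque([home])
    while queue:
        page = queue.popleft()
        for target in edges.get(page, {}):
            if target not in seen and mutual(page, target):
                seen.add(target)
                queue.append(target)
    return seen


def friends_of(edges, home):
    """rel="friend" targets found on any page of the consolidated identity."""
    identity = consolidated_identity(edges, home)
    found = set()
    for page in identity:
        for target, rels in edges.get(page, {}).items():
            if "friend" in rels and target not in identity:
                found.add(target)
    return sorted(found)


# Hypothetical graph: page -> {linked page -> set of rel values on that link}.
EDGES = {
    "http://home.example/": {
        "http://photos.example/mark": {"me"},
        "http://alice.example/": {"friend"},
    },
    "http://photos.example/mark": {
        "http://home.example/": {"me"},
        "http://photos.example/bob": {"friend", "met"},
    },
    "http://micro.example/mark": {
        "http://home.example/": {"me"},
        "http://carol.example/": {"friend"},
    },
}
```

In this sample the friends listed on the micro account are left out, because the home page never links back to that account, so it is only a claimed identity and not a confirmed one.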