BlogMatrix
 

A typical day working with RDF and FOAF

David P. Janes 2008-03-16 09:52 UTC 3 comments

I’ve been trying to use FOAF to get profile and friendship/contact information across social networks. I’ve done the “friend” part; I just need to fill in the profile information.
Now, getting this information out of FOAF is problematic at best. Using Python, the rdflib library, and SPARQL I’ve managed to coax data out one painful step at a time. For example, here’s my “friend-getter” code:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?bfoaf ?bname ?bnick ?bmbox_sha1sum ?bimage ?bweblog
WHERE {
    ?a foaf:knows ?b .
    ?b rdfs:seeAlso ?bfoaf .
    OPTIONAL { ?b foaf:name ?bname } .
    OPTIONAL { ?b foaf:nick ?bnick } .
    OPTIONAL { ?b foaf:mbox_sha1sum ?bmbox_sha1sum } .
    OPTIONAL { ?b foaf:image ?bimage } .
    OPTIONAL { ?b foaf:weblog ?bweblog } .
}

Clear enough, I guess. Unfortunately, I can’t just look at bnick and stuff it into my results, because bnick might be some sort of “resource” which then has to be programmatically traversed as well (see http://api.hi5.com/rest/profile/foaf/208329359). I admit that this might be – maybe even probably is – a problem with me; maybe I don’t understand SPARQL well enough.

But that’s old business. The way I’ve been doing this is CURLing down the FOAF file, manually inspecting it, writing some Python/rdflib/SPARQL code and seeing what happens.

This morning I decided to try a new approach: look for a SPARQL and/or RDF browser and figure out the correct queries online, then just write the code once, correctly. In my mind, this was all very sweet: an INPUT field for the FOAF/RDF URI, a TEXTAREA for the SPARQL query, a TABLE for the SPARQL results, and a TABLE showing all the RDF triples, since it’s triples “all the way down”.

Here’s what I did find:

  • Google rdf browser
  • Check out Brown Sauce; have to install a local massive development environment – remember now, I’m trying to save time, not lose it
  • Check out http://browserdf.org/: “Faceted Navigation for arbitrary Semantic Web data”. Very promising. Unfortunately, “arbitrary” seems to mean three different data sets
  • Check out Stefano’s Linotype -- a high quality information source usually; find out about Welkin
  • Try Welkin
  • Find out Welkin doesn’t browse the web
  • Download the FOAF file from http://kitschbitch.vox.com/profile/foaf.rdf into test.foaf.
  • Discover that Welkin doesn’t like “*.foaf”
  • Try again with “*.xml”
  • Try again with “*.rdf”
  • Success, except no results. Why? Oops … I was downloading the wrong URI
  • Try again with the correct URI
  • Verify that it’s a FOAF file
  • Stare at nothingness coming out of Welkin
  • Write a blog post about it; partially regret losing 50 minutes of my morning

The problem – a problem – with FOAF and RDF is quite simple. People don’t want formats that can do anything, they want formats that can do something. I got a Flickr API downloader going in about 30 minutes, taking my time. I’ve put hours into FOAF and still am unhappy.

Visibly RDF-free Semantic Web

David P. Janes 2007-01-09 20:57 UTC

Bill de hÓra (on a Danny Ayers post):

In other words, the work of generating RDF will be placed on people who want to use RDF. I think this idea of extracting RDF from published markup instead of using RDF as the backing data to generate the published markup is a big deal. For one, it will mean less RDF tax on existing publishers, who seem to be happy to stay with HTML, RSS and microformats (uF). Second it distributes costs fairly - RDF proponents will be forced to derive value from what they extract instead of playing schedule chicken with publishers, and pushing costs back onto them to supply the data just so. Third, from a systems design viewpoint, extraction is a much cleaner design than trying to kludge RDF support on top of existing RDBMS storage and web frameworks. It's cheaper today to publish uF via web frameworks, databases and templates than retool internally with RDF based technology - uF by being HTML is a relatively low-impact upgrade on the templating tier, not a rip and replace of the data/object tiers. I've been saying for some time that the Semweb is missing a layer, the one that infers the useful information from syntactic markup. Maybe uF and GRDDL are that layer's ingredients.
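As a toy illustration of the "extract RDF from published markup" direction described above, the sketch below pulls two hCard properties out of ordinary HTML and emits them as subject/predicate/object tuples. It uses only the Python standard library, it is not GRDDL (which is XSLT-based), and both the class name `HCardExtractor` and the mapping of `fn` to foaf:name and `url` to foaf:homepage are assumptions made for the example.

```python
from html.parser import HTMLParser

FOAF = "http://xmlns.com/foaf/0.1/"

# Invented page fragment carrying an hCard.
PAGE = ('<div class="vcard">'
        '<a class="fn url" href="http://example.org/">Example Person</a>'
        '</div>')


class HCardExtractor(HTMLParser):
    """Collect (subject, predicate, object) tuples from 'fn' and 'url'.

    Simplification: it does not check that the properties sit inside a
    'vcard' container, and it handles a single card per page.
    """

    def __init__(self, subject):
        super().__init__()
        self.subject = subject
        self.triples = []
        self._capture_name = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "url" in classes and attrs.get("href"):
            self.triples.append(
                (self.subject, FOAF + "homepage", attrs["href"]))
        if "fn" in classes:
            # The formatted name is the text content that follows.
            self._capture_name = True

    def handle_data(self, data):
        if self._capture_name and data.strip():
            self.triples.append(
                (self.subject, FOAF + "name", data.strip()))
            self._capture_name = False


extractor = HCardExtractor("http://example.org/#me")
extractor.feed(PAGE)
```

The publisher only had to add two class names to existing HTML; all of the RDF-specific work happens on the consuming side.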

RDF Semantic web research isn't working

David P. Janes 2006-09-16 20:49 UTC

Zack Rosen has a post called "RDF Semantic web research isn't working". It's a very easy read and yet so packed full of interesting points that I won't quote any of it and will just say "go read it".

A few additional comments:

  • many SW people "don't get it". Sorry, we don't model the world in triples so starting your sales pitch with that just doesn't cut it; and I'm not an idiot for disagreeing with you
  • the SW missed a wonderful opportunity by not jumping on the "mashup" bandwagon, where it would have been a natural fit for arbitrary data passing between apps (rather than crud-o hand rolled XML formats)
  • every page produced by the BlogMatrix Platform has a corresponding XML/RDF page. Placing the structured data into the RDF shouldn't be too difficult, except I'm really not going to make the effort if there isn't the demand
  • I'm working on articulating an alternative vision to the Semantic Web called the Datasphere, built on microformats (for data sharing), structured blogging (for ad hoc data creation), tagging (for fluid structure) and directories (for inherent structure). Stay tuned.

Another RDF vocabulary: SIOC

David P. Janes 2006-07-10 15:11 UTC

Here's another useful RDF vocabulary: SIOC, Semantically-Interlinked Online Communities:

SIOC (Semantically Interlinked Online Communities) is an ontology for describing discussion forums and posts on topic threads in online community sites. This includes but is not limited to: blogs, bulletin boards, mailing lists, newsgroups, etc.

My major issue with SIOC is that it's working in the same knowledge space as Atom N3 (previously mentioned here) but uses a different vocabulary. Obviously, RDF knows how to work around this but it's a shame that SIOC didn't use Atom's painstakingly thought out terminology. I suspect this happened because SIOC comes more from the Bulletin Board world than the Blogging one.


RDF Vocabularies

David P. Janes 2006-07-04 00:37 UTC

One of the key concepts for the BlogMatrix platform is that we should be able to produce RDF output backing all pages of the site. To do this, we need RDF vocabularies to describe the encoded data, and of course the best way to do this is to reuse existing ones. Here are a few:

  • FOAF - persons, projects, organizations, groups, documents, images and online groups
  • GEO - lat, lon, WGS84
  • RDFizers - tools for converting existing data types into RDF (to be mined for information)

Idea: we really need to make the Link extension handle multiple links per post. 

Timeline

David P. Janes 2006-07-03 23:03 UTC

Here's a pretty cool widget:

Timeline is a DHTML-based AJAXy widget for visualizing time-based events. It is like Google Maps for time-based information. Below is a live example that you can play with. Pan the timeline by dragging it horizontally.

Since we've already got the concept of a "Calendar of Events" (example), look forward to an implementation here on BlogMatrix soon!


eRDF, microformats and what we're doing

David P. Janes 2006-06-19 22:06 UTC

I just want to make a brief post about this "web clipboard" using eRDF, acknowledging its existence. What I'm hoping to do with the BlogMatrix Platform is have every page available in alternate formats -- in particular, RDF in XML and N3 -- and then use some sort of microformat way of linking "important" objects between the HTML and the RDF. This would allow for clever things like copying full address book entries (even though we may only be displaying a name), reblogging, copying events...

Yes, all very vague. We'll see how it works out.

RDF questions

David P. Janes 2006-06-16 15:11 UTC

Here's what we're trying to accomplish with RDF. The BlogMatrix Platform is extending the blogging paradigm (yeah, yuck) with arbitrary data -- you can see map, event and address data being added to blog entries over here. I won't go into the particulars of the mechanisms involved here (yet) except to say it's pretty flexible.

We'd like everything produced in the BlogMatrix Platform (i.e. this blogging application) to be visible as RDF. In the future, we'd like to use some sort of microformats type linking to tie the HTML version to the RDF version, but we'll also leave that issue to the side for now.  Here's this entry shown three different ways:

The RDF model is based (for now) on the Atom N3 Ontology. Here are our questions:

  • does this RDF/XML look even remotely right (beyond syntax -- we're using RDFLib to generate)?
  • is there some easy way in RDFLib to create an order of items? If you look at the home page in RDF you'll note the order of entries is totally wrong.

Note that we've just implemented the bare minimum. We're hoping in the upcoming weeks to make all the extensions we've developed export reasonable looking RDF. 


RDFLib Questions

David P. Janes 2006-06-16 14:35 UTC

Do you know anything about RDFLib, and in particular, N3 serialization? I've been working on plugging this into our page generator and I'm expecting to see something like this, but what I'm seeing instead is this.

Is this just a RTFM question and if so, where is the FM, or is something not implemented yet?

Atom N3

David P. Janes 2006-06-14 19:49 UTC

Atom N3 is ... well, I'll let The Sun BabelFish Blog explain it:

Atom N3 is an ontology that closely maps the Atom feed format to N3. It clearly reveals the logical structure of the atom feed format, and is what is needed to make atom:

  • easily and clearly extensible
  • available to SPARQL queries
  • easily mappable to java objects through frameworks such as So(m)mer

Read more here. I mention N3 because we're planning to provide an RDF interface to all the data you see here and N3 is one of the formats we plan to provide it in. If we can figure it out.