BlogMatrix
 

The Google Base data model as a Semantic Web language

David P. Janes 2006-07-17 11:39 UTC 2 comments

What is the Semantic Web? Here's Wikipedia's definition, which is probably as good as any. A good working definition, though, is a layer of the World Wide Web that is meant to be read and understood by computer programs (as opposed to the traditional web, where humans are the end consumers).

I believe the Google Base data model provides an excellent addition to the tools and languages currently being used to bootstrap the Semantic Web. In particular:

  1. The GBase data model is easy to produce.
    This is a huge advantage. When I read articles like this (Danny Ayers comments) about the semantic web, I get the impression of ivory towers and massive queries that take weeks to write just to search parts of protein databases. Maybe that's not fair, but my vision of the Semantic Web is something much more personal, something almost trivial to produce as a byproduct of day-to-day activities such as blogging, wikiing, e-mailing and so forth.
  2. The GBase data model is easy to consume.
  3. The GBase data model is easy to transform into RDF (or anything else).
  4. The GBase data model is easy to understand (RDF's biggest problem, ahem: "A triple can simply be described as three URIs. A language which utilises three URIs in such a way is called RDF" -- that explains a lot!)
  5. There is a lot of data being produced by a lot of different people for Google Base.
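
To make point 3 a little more concrete, here is a minimal sketch of how a flat set of attribute/value pairs could be turned into RDF triples in N-Triples form. This is only an illustration under stated assumptions: the namespace, the item URL, the attribute names and the helper functions are all made up for the example, and none of it is Google Base's actual schema or API.

```python
# Hypothetical sketch: a flat item of attribute/value pairs turned into
# N-Triples lines. The namespace, item URL, and attribute names are
# illustrative assumptions, not Google Base's actual schema.

EXAMPLE_NS = "http://example.org/gbase-attr/"  # assumed namespace


def escape_literal(value: str) -> str:
    """Escape the characters N-Triples requires inside a quoted literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def item_to_ntriples(item_url: str, attributes: dict[str, str]) -> list[str]:
    """Emit one triple per attribute: <item> <namespace + name> "value" ."""
    lines = []
    for name, value in attributes.items():
        # Spaces are not allowed in a URI, so replace them in the attribute name.
        predicate = EXAMPLE_NS + name.replace(" ", "_")
        lines.append(f'<{item_url}> <{predicate}> "{escape_literal(str(value))}" .')
    return lines


if __name__ == "__main__":
    item = {"title": "My first post", "author": "David Janes", "label": "semantic web"}
    for line in item_to_ntriples("http://example.org/posts/1", item):
        print(line)
```

Each attribute simply becomes one subject/predicate/object line, which is why a flat model of this kind maps onto triples with so little effort.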
There are a few things that would greatly improve the utility of Google Base and its data model:
  1. Google should export its database in XML
  2. Google should consider modifying (upgrading?) its data model as per the suggestions below
  3. We need to see "open" or at least non-Google consumers of GBase data
  4. We need more Google Base data producers. BlogMatrix is doing its part. (We also produce RDF and N3, thank you very much).

I know the word language probably isn't the right one, but it feels right to me.

Comment #1 Henry Story

2006-07-18 09:31:53

In your point 1 you say that the SemWeb is complex. In fact it is a lot more basic than XML. Just think "everything is related in at least one way". Think of two things. Give them each a name. (A URL is a good idea, since then it won't clash with the name anyone else has assigned.) Then think of a relation between them and give that relation a name (take a URL again). And you have RDF. It is as simple as that.

XML is a lot more complicated than that, as it is a tree structure, and a document. Long before people were able to write, they were able to communicate using language, with words naming things and relations between things.

You need to read up a little on the SemWeb for it to become clearer, just as you spent some time learning HTML. Learn N3 or some other simple-to-write language, and it will become clearer.

Comment #2 David Janes

2006-07-18 13:00:06

Yes, I understand what you're saying and I understand well enough the words you are saying. However: I've been working as a professional computer guy for 24 years, in computer aided software engineering, banking, treasury/finance/risk management, and in the automotive sector -- all dealing with massive amounts of data -- so I can say with a fair amount of certainty that no one I've ever dealt with outside of the semantic web community thinks about their data as triples. It doesn't matter that it can be modelled that way, just that they don't think about it that way.

How you think about data determines how you create your solutions. The SW community (as I see it) says "well, just think about data our way and you'll see the beauty/power/expressiveness". That's asking for a lot more than just "read[ing] up a little", especially since most people in the data field (once again, that I know) are quite happy with the way they think about data and have created some very impressive, useful applications without the help of triples.
