BlogMatrix
 

Visibly RDF-free Semantic Web

David P. Janes 2007-01-09 20:57 UTC

Bill de hÓra (on a Danny Ayers post):

In other words, the work of generating RDF will be placed on people who want to use RDF. I think this idea of extracting RDF from published markup instead of using RDF as the backing data to generate the published markup is a big deal. For one, it will mean less RDF tax on existing publishers, who seem to be happy to stay with HTML, RSS and microformats (uF). Second it distributes costs fairly - RDF proponents will be forced to derive value from what they extract instead of playing schedule chicken with publishers, and pushing costs back onto them to supply the data just so. Third, from a systems design viewpoint, extraction is a much cleaner design than trying to kludge RDF support on top of existing RDBMS storage and web frameworks. It's cheaper today to publish uF via web frameworks, databases and templates than retool internally with RDF based technology - uF by being HTML is a relatively low-impact upgrade on the templating tier, not a rip and replace of the data/object tiers. I've been saying for some time that the Semweb is missing a layer, the one that infers the useful information from syntactic markup. Maybe uF and GRDDL are that layer's ingredients.

Semantic Web Update

David P. Janes 2006-09-22 23:20 UTC

I like these two posts about the Semantic Web by Daniel Lemire, so I'll quote both.

One:

I honestly do not see the Semantic Web being about to take off. As Bob DuCharme pointed out, people are doing “ontologies for the sake of ontologies”. This will get old very quickly. If 8 years and millions of dollars was not enough to produce a single remotely useful application, what will it take?

Two:

I am sorry, but if the expert system debacle taught us anything, it is that, in Computer Science, it is not enough for an idea to sound intuitively useful. Ideas must be put to the test and they must provide value to users. Otherwise, users do not want to be bothered with it. In this sense, Information Technology is an experimental science. Some ideas are useful, others are not. So far, RDF and ontologies have not been shown to be useful. The burden of the proof is not on the users or on those who do not believe. The burden of the proof lies squarely on those promoting the idea. I do not have to argue against ontologies or RDF: if you disagree with me, you have to prove me wrong. That is how Information Technology works: you convince people by changing their life for the best. The reason for this is simple: there are too many good looking ideas out there for us to consider them all, and so we prune them out by whether or not they are proving useful in practice.

His point about Google getting better (in post One) is an important concept to the Datasphere: we have great tools usable by humans to quickly get to relevant data -- the trick is to bridge the HTML to the data; hence, microformats.

RDF Semantic web research isn't working

David P. Janes 2006-09-16 20:49 UTC

Zack Rosen has a post called "RDF Semantic web research isn't working". It's a very easy read and yet so packed full of interesting points that I won't quote any of it and will just say "go read it".

A few additional comments:

  • many SW people "don't get it". Sorry, we don't model the world in triples so starting your sales pitch with that just doesn't cut it; and I'm not an idiot for disagreeing with you
  • the SW missed a wonderful opportunity by not jumping on the "mashup" bandwagon, where it would have been a natural fit for arbitrary data passing between apps (rather than crud-o hand-rolled XML formats)
  • every page produced by the BlogMatrix Platform has a corresponding XML/RDF page. Placing the structured data into the RDF shouldn't be too difficult, except I'm really not going to make the effort if there isn't the demand
  • I'm working on articulating an alternative vision to the Semantic Web called the Datasphere, built on microformats (for data sharing), structured blogging (for ad hoc data creation), tagging (for fluid structure) and directories (for inherent structure). Stay tuned.

Gartner's 2006 Emerging Technologies Hype Cycle

David P. Janes 2006-08-21 13:15 UTC
I'm finding this report rather interesting, if only for the number of areas where it touches on what BlogMatrix is doing (rather well, I may add), which I've marked in bold:
Gartner's 2006 Emerging Technologies Hype Cycle Highlights Key Technology Themes

Web 2.0 technologies and business models dominate emerging technologies together with Real World Web and Applications Architecture
“The emerging technologies hype cycle covers the entire IT spectrum but we aim to highlight technologies that are worth adopting early because of their potentially high business impact,” said Jackie Fenn, Gartner Fellow and inventor of the first hype cycle. One of the features highlighted in the 2006 Hype Cycle is the growing consumerisation of IT. “Many of the Web 2.0 phenomenon have already reshaped the Web in the consumer world”, said Ms Fenn. “Companies need to establish how to incorporate consumer technologies in a secure and effective manner for employee productivity, and also how to transform them into business value for the enterprise”

Link:

Google Base -- summing up

David P. Janes 2006-07-17 11:53 UTC 4 comments

I hope you enjoyed this series of posts about Google Base and found it at least a little bit useful. I'm sure there are a few mistakes and I'll correct them as I -- or you, there's a comments section, you know -- find them.

Everything I've written about Google Base is here.

Broken links are now fixed. 

The Google Base data model as a Semantic Web language

David P. Janes 2006-07-17 11:39 UTC 2 comments

What is the Semantic Web? Here's Wikipedia's definition, which is probably as good as any, but a good working definition is a layer of the World Wide Web that is meant to be read and understood by computer programs (as opposed to the traditional web, where humans are the end consumer).

I believe the Google Base data model provides an excellent addition to the tools and languages currently being used to bootstrap the Semantic Web. In particular:

  1. The GBase data model is easy to produce.
    This is a huge advantage. When I read articles like this (Danny Ayers comments) about the semantic web, I get the impression of ivory towers and massive queries taking weeks to write to query parts of protein databases. Maybe that's not fair, but my vision of the Semantic Web is something much more personal, something almost trivial to produce as a byproduct of day-to-day activities, such as blogging, wikiing, e-mailing and so forth
  2. The GBase data model is easy to consume
  3. The GBase data model is easy to transform into RDF (or anything else)
  4. The GBase data model is easy to understand (RDF's biggest problem, ahem: "A triple can simply be described as three URIs. A language which utilises three URIs in such a way is called RDF" -- that explains a lot!)
  5. There is a lot of data being produced by a lot of different people for Google Base

There are a few things that would greatly improve the utility of Google Base and its data model:
  1. Google should export its database in XML
  2. Google should consider modifying (upgrading?) its data model as per the suggestions below
  3. We need to see "open" or at least non-Google consumers of GBase data
  4. We need more Google Base data producers. BlogMatrix is doing its part. (We also produce RDF and N3, thank you very much).
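
To give a feel for the "easy to transform into RDF" claim above, here is a minimal sketch that turns a flat set of Google Base style attributes into N-Triples. The item URI, the sample attributes and the "#"-terminated predicate prefix are all assumptions made up for illustration; this is not how Google or BlogMatrix actually does it.

```python
# Assumed predicate prefix built from the Google Base "g" namespace.
# The trailing "#" is a convention chosen for this sketch only.
G_PREFIX = "http://base.google.com/ns/1.0#"


def escape(value):
    """Escape a string so it can sit inside an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_ntriples(item_uri, attributes):
    """Emit one triple per attribute: <item> <prefix + name> "value" ."""
    lines = []
    for name, value in attributes.items():
        lines.append(f'<{item_uri}> <{G_PREFIX}{name}> "{escape(value)}" .')
    return "\n".join(lines)


# Hypothetical item; neither the URI nor the values come from a real feed.
item = {
    "title": "Bike for sale",
    "location": "1 Bank Street, Ottawa, Ontario, Canada",
}
triples = to_ntriples("http://example.com/items/1", item)
print(triples)
```

Because every attribute is just a name and a value hanging off one item, the item becomes the subject, the attribute name the predicate and the value the object, with nothing left over to model.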

I know the word language probably isn't the right one, but it feels right to me.

Improving Google Base: simple structure

David P. Janes 2006-07-17 09:43 UTC

Another useful feature for Google Base would be to allow "simple" structure to be added to Attribute Types. In this simple structure, readers (i.e. Google Base) are free to move the inner structure elements "up a level" with the net result that there would be no change needed for their DB model.

For example, here's a "location" (from here):

<g:location>
1 Bank Street
Ottawa, Ontario
Canada

</g:location>

I propose they also accept: 

<g:location>
    <g:street-address>1 Bank Street</g:street-address>
    <g:locality>Ottawa</g:locality>, <g:region>Ontario</g:region>
    <g:country-name>Canada</g:country-name>
</g:location> 

The 'location' attribute gets stored exactly the way it would be in the first case (we strip the inner markup) BUT we get the additional benefit of all the new attributes AND we don't have to throw away information we already know!
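
Here is a rough sketch of what "strip the inner markup and move the inner elements up a level" could look like on the reader's side, using Python's standard xml.etree.ElementTree. The namespace URI bound to "g" and the flatten function are assumptions for illustration, and whitespace is collapsed for simplicity rather than kept line by line.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the "g" prefix; the post does not spell it out.
SAMPLE = """
<g:location xmlns:g="http://base.google.com/ns/1.0">
    <g:street-address>1 Bank Street</g:street-address>
    <g:locality>Ottawa</g:locality>, <g:region>Ontario</g:region>
    <g:country-name>Canada</g:country-name>
</g:location>
"""


def flatten(xml_text):
    """Return (flat_text, inner_attributes) for one structured attribute."""
    root = ET.fromstring(xml_text.strip())
    # itertext() walks all text nodes, which strips the inner markup.
    # Runs of whitespace are collapsed into single spaces.
    flat = " ".join("".join(root.itertext()).split())
    # Each child element moves "up a level" and becomes its own attribute,
    # keyed by its local name (the part after the "{namespace}" prefix).
    inner = {child.tag.split("}")[1]: (child.text or "").strip() for child in root}
    return flat, inner


flat, inner = flatten(SAMPLE)
print(flat)
print(inner)
```

The flat string is what would be stored as the plain 'location' value, and the dictionary holds the extra attributes that come along for free.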

One could also see this being used in the proposed "person" attribute:

<g:person>
    <g:given-name>David</g:given-name> <g:family-name>Janes</g:family-name>
</g:person>

Note that the new attribute names I'm using are based on the vCard standard.

Improving Google Base: reusing attribute definitions

David P. Janes 2006-07-16 21:28 UTC

The next several posts will be about using the Google Base data model as a language for the Semantic Web.

As outlined in this series of posts, Google Base is very flexible in defining new attributes for the database. Unfortunately, new attributes have to be defined in terms only of base data types and the definition of the type is implied, not defined, by the tag name the user assigns. This is overly simplistic, an unnecessary restriction and inflexible.

For example, let's say you want to define a new attribute. Let's say the person to contact. Since there's no "contact" defined in the standard Attribute Types under the current model, this is what you'd add:

    <gc:contact_person type="string">Johnny Chase</gc:contact_person>

("gc" is the "http://base.google.com/cns/1.0" namespace, as defined here.) Let's face it: this is pretty thin gruel. The computer knows that this is a string and -- if you can read English -- humans can infer that this is a "Contact Person". Google Base is so close and can do so much better.

We propose that Google Base should allow new Attribute Types to be defined based on existing Attribute Types. For example:

    <gc:contact base="person">Johnny Chase</gc:contact>

That is, we've defined a new type called Contact that's based on an existing Attribute Type called "person"*. Ooooooo ... very nice, very simple, and we've already gained a lot of knowledge -- from a computer point of view -- about what "Johnny Chase" is all about. And Google Base hasn't lost anything either -- deep down, it knows it's just a string.
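
As a sketch of how a consumer might handle the proposed "base" attribute next to today's "type" attribute, something like the following would do. The KNOWN_TYPES table and the describe function are hypothetical, and "person" is the proposed Attribute Type from this post, not one Google Base defines.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: Attribute Type name -> underlying storage type.
# "person" is the proposed type discussed here, assumed to be a string.
KNOWN_TYPES = {"person": "string", "location": "string"}

CNS = "http://base.google.com/cns/1.0"  # the "gc" namespace from the post


def describe(xml_text):
    """Work out what a custom attribute is and how it would be stored."""
    elem = ET.fromstring(xml_text)
    name = elem.tag.split("}")[1]  # local name without the namespace
    base = elem.get("base")        # proposed: inherit from an Attribute Type
    if base is not None:
        if base not in KNOWN_TYPES:
            raise ValueError(f"unknown base Attribute Type: {base}")
        storage = KNOWN_TYPES[base]
    else:
        # Current model: only a primitive type is declared.
        storage = elem.get("type") or "string"
    return {
        "name": name,
        "base": base,
        "storage": storage,
        "value": (elem.text or "").strip(),
    }


new_style = f'<gc:contact xmlns:gc="{CNS}" base="person">Johnny Chase</gc:contact>'
old_style = f'<gc:contact_person xmlns:gc="{CNS}" type="string">Johnny Chase</gc:contact_person>'
print(describe(new_style))
print(describe(old_style))
```

Both forms end up stored as a string, but only the new form tells the program that the value is a person.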

* we know. See the next message.