Once again it’s time for Jazz!

If you are going to the Rochester International Jazz Festival, you’ll see us there. Last night was the first night, and it was great as always. Once again I’ll mention that the only way to do this festival right is to buy a club pass. With a pass and an early arrival time it’s easy to catch anywhere from 3-5 acts a night. Fantastic. And here’s another tip: the festival guide this year does not include a map, so if you’re not familiar with our downtown venues, print out the online map so you know where to go. See you on Jazz Street.

Art
Music

The new way is already old

Some may be thinking that the emerging semantic web is “just a new representation of data” (RDF, OWL, etc.). But I think it’s a lot more. It’s also, and possibly more importantly, a representation of data where none has existed before. I think we sometimes need to be careful not to lose sight of the obvious, especially in academic discussions (which often fling the obvious far into the weeds). The obvious here is that vast multitudes of information are pouring onto the web every day and every minute in a form which is unstructured and difficult to reuse, discover, and make sense of.

Ok, so as I understand it, the real value of the emerging semantic web (once fully realized) will be the new ability to share, harvest, read, process, and “reason over” the world’s published knowledge programmatically. The alternative is, well, rather disconcerting. How many monkeys can you muster to sit and read the web and make sense of it to get what you need today? Not many, I’ll suggest; monkeys are not cheap and they eat a lot. And yet that is just what businesses are doing today. The number of “knowledge workers” in the modern business office goes up and up each year. Where do all these people get their information in the “information age”? The answer: first scan the web, then maybe pick up the phone. Honestly, think about it: when you need an answer, where does your hand go first? The door knob? The phone? Or the mouse?

Is the new paradigm still new?

Here’s the rapidly emerging issue: the new paradigm of gathering knowledge (and thus power) from the web is also the old paradigm that must become obsolete. It is the old paradigm in the sense that burning up valuable “human eyeball time” scanning for information on the web is just not very efficient. We need a better way.

The consensus on this need shows itself in many ways. The sheer mountainous popularity of Google is (obvious again) a vote in favor of the validity of this need. In this Google Tech Talk, Doug Lenat from Cycorp points out the limitations of Google: just try to ask Google a question like “Which is taller the Space Needle or the Eiffel Tower?” Doug goes on to describe how (in 2006) Google returns no reasonable answer. I tried it today and the first hit led me to an answer. Or did it?

Did Google really answer the question? Sadly, no. We only get lucky (or think we did) because someone created a forum post with the same question in the title, so we get a match. Guess what: if the forum post did not really resolve to an answer, we still fail. How many times has this happened to you? (And by the way, how willing are you to trust an answer from a forum post?)

Good answers are hard to find

So you still can’t really get a reliable answer without some manual work of your own, sorting out bits and pieces from the search results and then doing some verification (or some math). It should be dirt simple for computers to answer these sorts of questions in a reliable fashion. But today it is not.

One step closer is Wolfram Alpha. Doug Lenat is bullish on Wolfram Alpha and so am I. I played around with it the other day and had surprisingly good results. This tool has great potential and sports a reasonable amount of DWIM (Do What I Mean, not what I say). However, today I asked the Space Needle/Eiffel Tower question there and lo and behold it’s… not understanding me. I type in the exact phrase “Which is taller the space needle or the Eiffel Tower?” and I get back “Wolfram Alpha isn’t sure what to do with your input.” Ok, but I’m willing to try again… so I enter “Space Needle Eiffel Tower”. Aha, I get results: a very nicely formatted page with facts and maps about each tower, complete with height values. Very cool.

But the current limitation is that Wolfram Alpha did not understand English grammar well enough. Optimistically speaking, this should be doable today; we can write programs that parse simple sentence structures in question format. It’s not guaranteed to work perfectly every time, but I think the folks at Wolfram Alpha should be able to do better and I’m sure they will.
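To give a feel for what “parsing simple sentence structures in question format” might look like, here is a minimal sketch in Python. It is only an illustration, not how Wolfram Alpha or Google work: the `answer_taller` function, the regular expression, and the tiny `HEIGHTS_M` table (with approximate heights in metres) are all assumptions made up for this example.

```python
import re

# Hypothetical mini knowledge base: approximate heights in metres,
# included only so the sketch has something to look up.
HEIGHTS_M = {
    "space needle": 184,
    "eiffel tower": 324,
}

# Matches questions shaped like "Which is taller [the] X or [the] Y?"
QUESTION = re.compile(
    r"^which is taller,?\s+(?:the\s+)?(?P<a>.+?)\s+or\s+(?:the\s+)?(?P<b>.+?)\??$",
    re.IGNORECASE,
)


def answer_taller(question):
    """Return the taller of the two named things, or None if not understood."""
    match = QUESTION.match(question.strip())
    if match is None:
        return None  # the sentence does not fit the one pattern we know
    a = match.group("a").lower()
    b = match.group("b").lower()
    if a not in HEIGHTS_M or b not in HEIGHTS_M:
        return None  # we parsed the question but lack the facts
    return a if HEIGHTS_M[a] > HEIGHTS_M[b] else b


print(answer_taller("Which is taller the space needle or the Eiffel Tower?"))
```

A single pattern like this is obviously brittle; the point is only that the question form itself is easy to recognize, and the hard part is having trustworthy structured facts to look up.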

The ultimate answering machines cometh

Meanwhile, similar but different solutions are on the way. Google has announced Google Squared. TechCrunch is quick to claim it will “Crush Wolfram Alpha”. And Paul (The Content Guy) is quick to remind us that we should all calm down, because Google Squared is not the same as Wolfram Alpha and therefore should not be compared head to head and presented as an either/or choice. Indeed I agree, it isn’t the same. And as those annoying optimists are so quick to say, “It’s all good”. Anyway, I don’t think “really smart” search engines are the end of this story. What is the end game for sites like Ask.com, Wolfram Alpha and Google Squared, sites whose motivation and reason for being is helping people get to an answer to a question? What might the ideal “well of web wisdom” look like? Well, here’s what I want. Imagine this if you will: a web site that can answer my questions and:

  • Do a reasonable job at natural language interpretation
  • Provide suggested questions when mine is not understood (i.e., “Did you mean…” on steroids)
  • Truly assemble real quality answers with associated facts and links
  • Draw material from the vast semantically rich web when needed
  • Draw on large Linked Open Data sources when needed
  • Provide links to sources for verification
  • Provide deep linking to other, possibly more relevant, sites that focus on vertical domains for a deep dive (engineering, health, sciences, etc.)
  • And so on…

Sound unrealistic? I don’t think so. This is not at all utopian at this point. And once we have these kinds of resources, the cost of hunting around for information will be greatly reduced, saving us time, money and monkeys, and we may just find we have more time to pick up the phone again and call a friend, or even (heaven forbid) reach for the door.

Semantic Technology

Helping Machines Read, in three easy pictures.

Fabien Tiburce has just posted a simple-to-understand example of how and why microformats (rich snippets) can be used. What I love about this example is that it’s simply three illustrative pictures. Now come on, we all love pictures, admit it. Nice work, Fabien!

Semantic Technology
Tech

Rich Snippets: Tipping point for the semantic web

Ok, this is it, I’m calling it. In my opinion we are now at a tipping point that hails the large-scale emergence of the next generation of the web, the “semantic web”. (Ok, call it web 3.0 if it helps.)

The big event that marks this on the calendar (this last Tuesday) is Google’s announcement of support for microformats and RDFa, which they call rich snippets. While the support of both formats is interesting, I believe the support for RDFa may be of somewhat more significance. To be clear, I really think this is the “beginning” of a significant transition period that may take a year or so. I have observed that major sea changes of this nature don’t happen overnight. For example, the web was technically fully functional in the 92/93 time frame, but it wasn’t until 94/95 that the wave of popularity started to swell. I believe it will take a year or two for the tide to fully change the course of the web semantically as well, but it will. You can’t stop the tide short of blowing up the moon, and in this case the moon is Google. By semantic change I mean that a vast majority of “actively developed” web sites will (in a span of only about two years) contain a rich variety of structured data that is both readable by humans and discoverable by machines.

So how does all this happen in such a big way? The key to understanding this level of change is human behavior, not technology. The technology is simple, and it turns out that’s almost a requirement for major change. Making fire is simple, the wheel is simple, bicycles are simple, and telephones, believe it or not, are pretty simple (got some tin cans?). And so it goes that the basics of the web, HTML and HTTP, are pretty simple, and so are RDFa and microformats.
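To show just how simple, here is a rough sketch in Python that pulls a few properties out of an hCard-style microformat using only the standard library. The class names `vcard`, `fn`, `org` and `url` come from the hCard microformat; everything else (the `HCardParser` class, the `parse_hcard` helper, and the sample markup with its made-up person and museum) is an assumption for illustration, and a real parser would need to handle nesting and many more properties.

```python
from html.parser import HTMLParser


class HCardParser(HTMLParser):
    """Collect a few hCard properties (fn, org, url) from simple markup."""

    TEXT_PROPERTIES = {"fn", "org"}

    def __init__(self):
        super().__init__()
        self.card = {}
        self._pending = None  # text property still waiting for its text

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        for name in classes:
            if name in self.TEXT_PROPERTIES:
                self._pending = name
        # A link marked class="url" carries its value in the href attribute.
        if "url" in classes and attributes.get("href"):
            self.card["url"] = attributes["href"]

    def handle_data(self, data):
        text = data.strip()
        if self._pending and text:
            self.card[self._pending] = text
            self._pending = None


def parse_hcard(html):
    """Return a dict of the hCard properties found in the given HTML."""
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return parser.card


# Hypothetical sample markup: same text a human reads, plus class hints.
SAMPLE = """
<div class="vcard">
  <span class="fn">Jane Example</span> works at
  <a class="org url" href="http://example.org/">Example Museum</a>
</div>
"""

print(parse_hcard(SAMPLE))
```

The markup a person reads and the markup a machine reads are the same few lines of HTML; the only addition is a handful of agreed-upon class names.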

It’s all about human behavior

So what’s the behavior game here? Well, it turns out that the behavioral needs also have to be simple for mass adoption. We all get cold (fire), we all have to move heavy stuff (wheels) and we all need to get places (bicycles). Well, on the web there is a simple need: we all need to get noticed. The need for search engine optimization (SEO) will drive this change. In this case it is the behavior of the people (and their tools) that publish on the web that is the change agent. It’s really that simple. The sequence is surprisingly predictable.

The first generation will be (and probably already has been) hand coding rich snippets into web pages. Next will come the changes in the tools (actually already underway). Web publication tools with built-in functionality to easily insert rich snippets will make it painless and invite technical and non-technical publishers alike. SEO marketing wonks will then start to talk about <deep voice>“how important it is”</deep voice> to add rich snippets, and finally it will become just the way pages are authored and not even mentioned much. When was the last time you heard any web authors talk about inserting metadata keywords in their work? You don’t hear about that much, but you can bet all the good developers follow this best practice regularly. It’s just “how it’s done”. In some ways it’s all very reminiscent of how RSS took off.

As all this happens, the back end applications start to blossom and provide very large, rich data sets for new semantically enabled search and discovery tools, which in turn feeds back to drive more people to populate their web sites with semantically rich data. It’s a positive feedback loop that ramps up in a big way. Oh, and don’t forget that the VCs begin to take notice as well, pumping more dollars and energy into the system. Google may lead the way in terms of popular drive, but you can be sure that a whole host of others have been and will be finding ways to leverage the semantically enabled web for other purposes.

Is it really that simple?

And so the behavior changes. Or does it? Well, now comes the messy part. Change is messy and right off the bat there will be naysayers, and thankfully so, because the first implementation will not be perfect. There will be much thrashing of ideas and opinions and gnashing of teeth, and this is how stuff gets done on the Internet (and where all the fun is). Bloggers will complain. For example, already today I read reasonable critiques of Google’s intentions, but then again, read their readers’ responses: the perceived problems are probably not insurmountable. Then later the big players start making public statements this way or that, and so it goes. The key thing is that the developer dialogue continues to unfold, course corrections are made, and stuff gets better. Huzzah!

And for me, well, I’ll be watching this closely, and of course my site should support RDFa now, shouldn’t it? It already has a bunch of RDF metadata (see my previous post from last year). I suppose now I must venture on and try out some of the RDFa and microformat solutions for WordPress right here on this blog. Hello? (tap tap) Google? Can you hear me now?

Innovation
Semantic Technology

Dries Buytaert on Open Source, Open Data and government

“The Obama administration recently excited the world of open source software by choosing to launch recovery.gov on Drupal. Their choice of a free, open source platform over any proprietary system is as hopeful and promising as the purpose of the website they built, which is to lend transparency to the spending of the $800 billion dollar economic stimulus money. We should be happy both that the U.S. government is embracing open software, and that it is promoting Open Data…”

via Where Open Source, Open Data and government meet | Dries Buytaert.

Cyber power
Semantic Technology
Tech

Second, 3rd, 4th…Nth Life: Project Wonderland

Want to build your own virtual world like Second Life? Maybe you need it inside your organization and would prefer to keep it private and customize the heck out of it?  Project Wonderland just might be the ticket.

“Project Wonderland is a 100% Java and open source toolkit for creating collaborative 3D virtual worlds. Within those worlds, users can communicate with high-fidelity, immersive audio, share live desktop applications and documents and conduct real business. Wonderland is completely extensible; developers and graphic artists can extend its functionality to create entire new worlds and new features in existing worlds.”

I just spent the last half hour watching a set of videos that provide a technical overview of Project Wonderland. I must say I’m impressed. In its current state (version 0.4) it looks fully functional and ready to support real-time 3D avatar-based interaction, complete with a feature-rich audio bridge system. The back end is provided by the open source MMO game engine Project Darkstar.

The system looks good enough today to use for small group collaborative applications and the future 0.5 version promises enhanced scalability and better graphics and avatar support.

social software
virtual worlds

Twitter to WordPress mojo, and can tweets feed the semantic web?

Paul’s plea for help integrating Twitter with WordPress has got my hacker juices flowing again, and I spent some time today fiddling with this integration challenge. It is the end of the day and I have something working here. The first thing to sort out was what our goals are. So first I noodled around different questions like “why tweet in the first place?” and “if you are tweeting your blog post, is the tweet the same as the title of the post?” and so on. My answer is simple: I’ve decided to start using status updates on social networks (tweets) in ways that might help others when I find something of potential value. And I came to the conclusion, by the way, that a tweet about a blog post isn’t, or shouldn’t necessarily be, the same as the title of the post. One is the title; the other (the tweet) is a teaser to get someone to the post.

I think the most interesting outcome for me was the realization that the global tweet stream might actually have some hidden potential value for the semantic web. What I once thought was a fairly useless communal stream of consciousness, the endless stream of Twitter messages about silly things and foolishness, might actually contain hidden semantic gems.

It turns out there is a common subset of Twitter messages that follow the same pattern of semantic metadata. For example, many messages are commonly in the form of: “Mike Axelrod is going to the Rochester Museum and Science Center today http://www.rmsc.org/“ This Twitter message actually contains some fairly decent semantic data. It’s in a nice concise triple form (subject, predicate, object); it’s just not quite consumable by machines yet (that is to say, not in RDF or OWL). Hint, hint, folks, we could have something interesting here. As it stands, if we just filter for all the tweets that have URLs we might have some interesting semantic data to mine. Tweets are so short that if they do contain a URL, typically it will be just one. This guarantees us a semantic triple every time: the sender, the comment and the URL.
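Here is a minimal sketch in Python of that filtering idea, assuming we already have the sender and text of each tweet in hand (no Twitter API calls are shown). The `tweet_to_triple` function, the URL pattern, and the `example_user` handle are all hypothetical names made up for this illustration, and the resulting tuple is a loose (sender, comment, URL) triple rather than real RDF.

```python
import re

# Deliberately simple URL matcher; good enough for a sketch.
URL_PATTERN = re.compile(r"https?://\S+")


def tweet_to_triple(sender, text):
    """Return (sender, comment, url) if the tweet has exactly one URL, else None."""
    urls = URL_PATTERN.findall(text)
    if len(urls) != 1:
        return None  # no URL, or too many to know which one is the object
    # Remove the URL and tidy up whitespace to leave just the comment.
    comment = " ".join(URL_PATTERN.sub(" ", text).split())
    return (sender, comment, urls[0])


tweet = "Going to the Rochester Museum and Science Center today http://www.rmsc.org/"
print(tweet_to_triple("example_user", tweet))
```

Turning the comment into a proper predicate that a machine can reason over is the hard part that remains; this only shows how cheaply the raw material can be collected.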

Ok, on to the second part of this post. So how do I actually fit this whole tweeting, twittering foolishness into my lifestyle? Well, here’s what I came up with so far. First of all, I only want one point of entry for my tweets. One client, one ping to rule them all. So naturally I’m gravitating to ping.fm. This tool allows me to route a single status update (tweet) to all the social networks I use. This would be LinkedIn, Facebook, and (it seems) Twitter now. Next, I want a better client experience than what a web page can offer. It’s just not conducive to the way I work to have to navigate to a URL when I want to post an update. Additionally, some of the clients have a very concise format and play well on my desktop. And finally, I want control over where tweets go: sometimes to LinkedIn, sometimes to Facebook, sometimes to both and sometimes to my WordPress blog. ping.fm gives me this fine-grained behavior with one exception. To route tweets to my WordPress blog I had to install the ping.fm WordPress plugin, which I’m happy to say works like a charm. (<–look over there in my sidebar and you’ll see it.)

Now there is only one missing piece of the puzzle. Of all the front ends to ping.fm I tried, none had all the features I wanted. Twitterrific is wonderfully simple but only works with Twitter. TweetDeck works well but only gives me the choice to send to Twitter or Facebook. Twhirl is getting much closer to what I need and has almost everything except one critical feature: the ability to easily pick which ping.fm group I want to send to. The last one I tried today does this! It’s called MePing, but alas it does almost nothing else compared to the others I tried. (It’s very beta.) Well, I’m not giving up, and I’m sure very quickly one of these clients will fit my full set of requests. I have the feeling we are in a horse race right now, and if you are reading this post 6 months from now many of the above-mentioned clients will have everything you need (or will have dropped out of the market).

There is plenty more to explore with Twitter integration into websites. I certainly won’t be spending a great deal of time looking at all of this, but I do believe there may be some value to discover, if applied correctly.

Cyber power
Tech
social software

ZipcodeZoo

ZipcodeZoo is big, really big. But with ZipcodeZoo you can also get local. By entering your zip code you learn about species of plants and animals that live near you. After I entered my zip code I was quick to learn that at least 374 species that live near my home in Fairport, NY are declining in population. ZipcodeZoo is also a good example of how we can combine ecological data with online mapping tools. Clicking on the “distribution tab” of a species-specific page in ZipcodeZoo reveals numerous maps including, in some cases, embedded Google Earth renderings of actual sightings. Virtual globes such as Google Earth and tools like Google Maps are great because they provide a common browsing experience for the expert and novice alike, and can be repurposed to support almost any kind of geographically related data.

Here’s how big ZipcodeZoo was today:

“This site is big. As of Tuesday, April 28, 2009, this site is home to 2,607,667 web pages describing 1,295,353 animals, 1,067,966 plants, 193,843 fungi, 17,577 chromista, 16,063 protozoa, 16,113 bacteria, and 459 viruses. Pages contain 277,365 photos taken by 1,447 photographers, 1,471 sound recordings, and definitions of 223,189 terms. 85,429 Large photos can be zoomed and panned.

We have gathered a total of 127,715,647 field observations from 28,481 data sets and 1,547 data providers which show latitude and longitude, from which we have generated 254,338 State Maps, 1,430,347 Country Maps, 450,700 Google maps showing up to 200 sightings each, and 450,700 Google Earth maps showing all sightings. Click on a pinpoint on one of these maps, and you’ll learn more about that observation.”

Environment
Tech

University presidents committed to a better climate

Being affiliated with RIT, I applaud President Destler’s (and many other university presidents’) commitment to improving our global climate. You can catch the signing at 10:30 a.m. tomorrow in the Al Davis Room, Student Alumni Union. Kudos to the RIT Student Environmental Action League for sponsoring the event. The thing I like about this commitment is that it proposes real action. Just take a look at the list. (Follow the link.)

“We, the undersigned presidents and chancellors of colleges and universities, are deeply concerned about the unprecedented scale and speed of global warming and its potential for large-scale, adverse health, social, economic and ecological effects. We recognize the scientific consensus that global warming is real and is largely being caused by humans. We further recognize the need to reduce the global emission of greenhouse gases by 80% by mid-century at the latest, in order to avert the worst impacts of global warming and to reestablish the more stable climatic conditions that have made human progress over the last 10,000 years possible.”  more…

Environment

Playing for Change

Playing for Change.  Moving.  Watch this and understand.

Cyber power
