New Tech Post
Fresh from the center of the universe comes the New Tech Post. Content includes a focused mix of posts covering business, mobile, social media and technology. Thanks for the tip, John; I’ll be adding this to my RSS feed reader shortly.
Mike Axelrod’s blog
If you happen to be attending the Enterprise Data World conference this March, I’d recommend this session by John Biderman and Cameron McLean. John and Cam are sure to tell an interesting tale of how they’ve wrangled Semantic MediaWiki into harvesting a world of metadata from one of their legacy relational databases and made it organized, accessible, annotatable, and, well… in a word, wikified.
“We were given an imperative from our business leaders to provide a friendlier and more collaborative front end on our metadata…
… This in turn led us to MediaWiki and the rich and diverse array of add-ons developed by the Open Source community, particularly its semantic extensions.”
I was happy to play a small part in this project last summer. I suppose I was more of a kibitzer than anything else, but it was enjoyable to solve a few problems for the team as well as get an inside look into this very interesting project.
It gives me great pleasure to announce the new WXXI website, launched this week. It has been an intense development schedule but loads of fun. The web development team at WXXI has been a great crew to work with. And I have found myself diving deeper into Drupal than I ever imagined. I also found myself sharpening my CSS and JavaScript skills along the way, not to mention a bit more PHP and MySQL as well.
The fun part of architecting this new site has been the rapid design cycle enabled by Drupal and all its myriad modules and theming capabilities. Got an idea in the morning? You can see it implemented by the afternoon. Or maybe even in a few minutes, depending on the situation. Drupal (or I should say Drupal and all its companion modules) succeeds because the granularity in component construction is at the web design pattern level. If a good idea for the web has been thought of, there is probably a Drupal module to support it.
Tomorrow afternoon I will be talking about Content Management and Social Media for the Insurance Industry. This talk is part of the Jumpstart series offered by Earley & Associates. I’ll talk about some of the current troubles of traditional content management solutions and offer up some ideas on how wikis in the enterprise can help. I’ll also provide a brief introduction to Semantic MediaWiki (SMW). (For those of you who may have seen my other presentations on this topic, this will be a shortened version of the same material.)
Looking ahead, I’ve been thinking about how to offer more services around semantic wikis and what potential customers might want when engaging a consultant in this area. It’s an area worth pondering because building applications with SMW changes the game of software development a bit. For one thing, it’s a lot faster to develop SMW applications, and it’s a bit easier. However, it still requires good old-fashioned thinking. It also requires combining skill sets from a variety of traditional areas in software development. In a phrase, I could say “traditional software engineering skills still apply”.
What do I mean by this? Well here’s a quick list off the top of my head of the skills I’ve been using and sometimes redefining (or at least refining) as I build or help others build these semantic applications in a wiki environment.
Does this approach work? So far it seems to. The only other thing I think I can add at the moment is that when developing Semantic MediaWiki applications there seems to be a recurring need for what I can only call “applied cleverness”. What do I mean by this? Well, it’s hard to say without being specific, but let’s just say that the rising tide of new web 2.0/3.0 tools and techniques available today is continually inspiring me to combine these tools in new ways that (I believe) have never been tried before. There is a tremendous amount of “invention” going on today in this industry. And it’s a heck of a lot of fun to be a part of it. But now I’m wandering into a new topic, so I’ll stop here and save this for another post yet to come.
Some may be thinking that the emerging semantic web may be “just a new representation of data” (RDF, OWL, etc.). But I think it’s a lot more. It’s also, and possibly more importantly, a representation of data where none has existed before. I think we sometimes need to be careful not to lose sight of the obvious, especially in academic discussions (which often fling the obvious far into the weeds). The obvious being that vast multitudes of information are pouring onto the web every day and every minute in a form which is unstructured and difficult to reuse, discover, and make sense of.
OK, so as I understand it, the real value of the emerging semantic web (once fully realized) will be the new ability to share, harvest, read, process, and “reason over” the world’s published knowledge programmatically. The alternative is, well, rather disconcerting. How many monkeys can you muster to sit and read the web and make sense of it to get what you need today? Not many, I’ll suggest; monkeys are not cheap and they eat a lot. And yet that is just what businesses are doing today. The number of “knowledge workers” in the modern business office goes up and up each year. Where do all these people get their information in the “information age”? The answer: first scan the web, then maybe pick up the phone. Honestly, think about it: when you need an answer, where does your hand go first? The door knob? The phone? Or the mouse?
Is the new paradigm still new?
Here’s the rapidly emerging issue: the new paradigm of gathering knowledge (and thus power) from the web is also the old paradigm that must become obsolete. It is the old paradigm in the sense that burning up valuable “human eyeball time” scanning for information on the web is just not very efficient. We need a better way.
The consensus on this need shows itself in many ways. The sheer mountainous popularity of Google is (obvious again) a vote in favor of the validity of this need. In this Google Tech Talk, Doug Lenat from Cycorp points out the limitations of Google; just try to ask Google a question like “Which is taller the Space Needle or the Eiffel tower?” Doug goes on to describe how (in 2006) Google returns no reasonable answer. I tried it today and the first hit led me to an answer. Or did it?
Did Google really answer the question? Sadly no, we only get lucky (or think we did) because someone created a forum post with the same question in the title, so we get a match. Guess what: if the forum post did not really resolve to an answer, we still fail. How many times has this happened to you? (And by the way, how willing are you to trust an answer from a forum post?)
Good answers are hard to find
So you still can’t really get a reliable answer without some manual work of your own, sorting out bits and pieces from the search results and then doing some verification (or some math). It should be dirt simple for computers to answer these sorts of questions in a reliable fashion. But today it is not.
One step closer is Wolfram Alpha. Doug Lenat is bullish on Wolfram Alpha and so am I. I played around with it the other day and had surprisingly good results. This tool has great potential and sports a reasonable amount of DWIM (Do what I mean, not what I say). However, today I asked the Space Needle/Eiffel Tower question there and lo and behold it’s… not understanding me. I type in the exact phrase “Which is taller the space needle or the Eiffel Tower?” and I get back “Wolfram Alpha isn’t sure what to do with your input.” OK, but I’m willing to try again… so I enter “Space Needle Eiffel Tower”. Aha, I get results: a very nicely formatted page with facts and maps about each tower, complete with height values. Very cool.
But the current limitation is that Wolfram Alpha did not understand English grammar well enough. Optimistically speaking, this should be doable today; we can write programs that parse simple sentence structures in question format. It’s not always guaranteed to work perfectly, but I think the folks at Wolfram Alpha should be able to do better, and I’m sure they will.
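To make the “simple sentence structures” point concrete, here is a minimal Python sketch of a parser for exactly one question shape, “Which is taller the X or the Y?”. It is not how Wolfram Alpha or Google work; the pattern, the function name answer_taller, and the tiny hand-entered height table (approximate values in meters) are all assumptions made up for illustration.

```python
import re

# Approximate heights in meters, entered by hand for illustration only.
# A real system would pull these from a structured data source.
HEIGHTS_M = {
    "space needle": 184,
    "eiffel tower": 324,
}

# Matches "Which is taller [the] X or [the] Y[?]", ignoring case.
QUESTION = re.compile(
    r"^which is taller,?\s+(?:the\s+)?(?P<a>.+?)\s+or\s+(?:the\s+)?(?P<b>.+?)\??$",
    re.IGNORECASE,
)


def answer_taller(question: str) -> str:
    """Answer a 'which is taller' question using the hand-made table."""
    match = QUESTION.match(question.strip())
    if match is None:
        return "Not sure what to do with that input."
    a = match.group("a").lower()
    b = match.group("b").lower()
    if a not in HEIGHTS_M or b not in HEIGHTS_M:
        return "No height data for one of those."
    if HEIGHTS_M[a] == HEIGHTS_M[b]:
        return "They are the same height."
    taller = a if HEIGHTS_M[a] > HEIGHTS_M[b] else b
    return f"The {taller.title()} is taller."


if __name__ == "__main__":
    print(answer_taller("Which is taller the space needle or the Eiffel Tower?"))
```

The grammar part really is the easy bit here. The hard part is the table: the sketch only works because someone typed the heights in, which is the gap that structured, machine-readable data on the web is meant to fill.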
The ultimate answering machines cometh
Meanwhile, similar but different solutions are on the way. Google has announced Google Squared. TechCrunch is quick to claim it will “Crush Wolfram Alpha“. And Paul (The Content Guy) is quick to remind us that we should all calm down, because Google Squared is not the same as Wolfram Alpha and therefore should not be compared head to head and presented as an either/or choice. Indeed I agree, it isn’t the same. And as those annoying optimists are so quick to say: “It’s all good”. Anyway, I don’t think “really smart” search engines are the end of this story. What is the end game for sites like Ask.com, Wolfram Alpha and Google Squared? Sites whose motivation, whose reason for being, is helping people get to an answer to a question. What might the ideal “well of web wisdom” look like? Well, here’s what I want; imagine this if you will: a web site that can answer my questions and:
Sound unrealistic? I don’t think so. This is not utopian at all at this point. No, not at all utopian. And once we have these kinds of resources, the cost of hunting around for information will be greatly reduced, saving us time, money and monkeys, and we may just find we have more time to pick up the phone again and call a friend, or even (heaven forbid) reach for the door.
Fabien Tiburce has just posted a simple to understand example of how/why microformats (rich snippets) can be used. What I love about this example is that it’s simply three illustrative pictures. Now come on, we all love pictures, admit it. Nice work Fabien!
OK, this is it, I’m calling it. In my opinion we are now at a tipping point that hails the large-scale emergence of the next generation of the web, the “semantic web”. (OK, call it web 3.0 if it helps.)
The big event that marks this on the calendar (this last Tuesday) is Google’s announcement of support for microformats and RDFa, which they call rich snippets. While the support of both formats is interesting, I believe the support for RDFa may be of somewhat more significance. To be clear, I really think this is the “beginning” of a significant transition period that may take a year or so. I have observed that major sea changes of this nature don’t happen overnight. For example, the web was technically fully functional in the 92/93 time frame, but it wasn’t until 94/95 that the wave of popularity started to swell. I believe it will take a year or two for the tide to fully change the course of the web semantically as well, but it will. You can’t stop the tide short of blowing up the moon. And in this case the moon is Google. By semantic change I mean that a vast majority of “actively developed” web sites (in a mere two-year time span, approximately) will contain a rich variety of structured data that is both readable by humans and discoverable by machines.
So how does all this happen in such a big way? The key to understanding this level of change is human behavior, not technology. The technology is simple, and it turns out that’s almost a requirement for major change. Making fire is simple, the wheel is simple, bicycles are simple, and telephones, believe it or not, are pretty simple (got some tin cans?). And so it goes that the basics of the web, HTML and HTTP, are pretty simple, and so are RDFa and microformats.
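As a rough illustration of “readable by humans and discoverable by machines”, here is a minimal Python sketch that pulls property/value pairs out of a fragment of RDFa-style markup. The sample markup, the ex: prefix, the example.org vocabulary URL, and the names PropertyExtractor and extract_properties are hypothetical; this is a toy that only handles flat, un-nested elements, not a conforming RDFa parser and not what Google does.

```python
from html.parser import HTMLParser

# Hypothetical RDFa-style fragment. A browser shows an ordinary sentence,
# while the "property" attributes label the same text for machines.
SAMPLE = """
<div xmlns:ex="http://example.org/vocab#" typeof="ex:Person">
  <span property="ex:name">Jane Doe</span> works as an
  <span property="ex:title">editor</span>.
</div>
"""


class PropertyExtractor(HTMLParser):
    """Collects the text of every element that carries a property attribute."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # property name of the element being read

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop:
            self._current = prop
            self.properties[prop] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current = None


def extract_properties(html: str) -> dict:
    """Return a dict of property name to stripped text content."""
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return {name: text.strip() for name, text in parser.properties.items()}


if __name__ == "__main__":
    print(extract_properties(SAMPLE))
```

The same page serves both audiences: a person reads “Jane Doe works as an editor.” and a program gets a name and a title without guessing.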
It’s all about human behavior
So what’s the behavior game here? Well, it turns out that the behavioral needs also have to be simple for mass adoption. We all get cold (fire), we all have to move heavy stuff (wheels) and we all need to get places (bicycles). Well, on the web there is a simple need: we all need to get noticed. The need for search engine optimization (SEO) will drive this change. In this case it is the behavior of the people who publish on the web (and their tools) that is the change agent. It’s really that simple. The sequence is surprisingly predictable.
The first generation will be (and probably already has been) hand coding rich snippets into web pages. Next will come the changes in the tools (actually already underway). Web publication tools with built-in functionality to easily insert rich snippets will make it painless and invite technical and non-technical publishers alike. SEO marketing wonks will then start to talk about <deep voice>“how important it is”</deep voice> to add rich snippets, and finally it will become just the way pages are authored and not even mentioned much. When was the last time you heard any web authors talk about inserting metadata keywords in their work? You don’t hear about that much, but you can bet all the good developers follow this best practice regularly. It’s just “how it’s done”. In some ways it’s all very reminiscent of how RSS took off.
As all this happens, the back-end applications start to blossom and provide very large, rich data sets for new semantically enabled search and discovery tools, which in turn feeds back to drive more people to populate their web sites with semantically rich data. It’s a positive feedback loop that ramps up in a big way. Oh, and don’t forget the VCs begin to take notice as well, pumping more dollars and energy into the system. Google may lead the way in terms of popular drive, but you can be sure that a whole host of others have been and will be finding ways to leverage the semantically enabled web for other purposes.
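To give a feel for the tool-assisted step, here is a minimal Python sketch of the kind of helper a publishing tool might offer: the author supplies plain fields and the helper emits hCard microformat markup. The class names vcard, fn, org and url come from the hCard microformat; the function name render_hcard and the sample values are made up for illustration, and no particular CMS or WordPress plugin is being described.

```python
from html import escape


def render_hcard(name: str, org: str, url: str) -> str:
    """Return a small hCard snippet for a person, with values HTML-escaped."""
    return (
        '<div class="vcard">'
        f'<a class="url fn" href="{escape(url, quote=True)}">{escape(name)}</a>, '
        f'<span class="org">{escape(org)}</span>'
        "</div>"
    )


if __name__ == "__main__":
    # Sample values are invented; the author never has to type a class name.
    print(render_hcard("Jane Doe", "Example & Co", "http://example.org/"))
```

Once the markup is generated rather than typed, adding it costs the author nothing, which is the point at which it stops being talked about and becomes just “how it’s done”.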
Is it really that simple?
And so the behavior changes. Or does it? Well, now comes the messy part. Change is messy, and right off the bat there will be naysayers, and thankfully so. Because the first implementation will not be perfect, there will be much thrashing of ideas and opinions and gnashing of teeth, and this is how stuff gets done on the Internet (and where all the fun is). Bloggers will complain; for example, already today I read reasonable critiques of Google’s intentions, but then again, read their readers’ responses: the perceived problems are probably not insurmountable. Then later the big players start making public statements this way or that, and so it goes. The key thing is that the developer dialogue continues to unfold, course corrections are made, and stuff gets better. Huzzah!
And for me, well, I’ll be watching this closely, and of course my site should support RDFa now, shouldn’t it? It already has a bunch of RDF metadata (see my previous post from last year). I suppose now I must venture on and try out some of the RDFa and microformat solutions for WordPress right here on this blog. Hello? (tap tap) Google? Can you hear me now?
“The Obama administration recently excited the world of open source software by choosing to launch recovery.gov on Drupal. Their choice of a free, open source platform over any proprietary system is as hopeful and promising as the purpose of the website they built, which is to lend transparency to the spending of the $800 billion dollar economic stimulus money. We should be happy both that the U.S. government is embracing open software, and that it is promoting Open Data…”
via Where Open Source, Open Data and government meet | Dries Buytaert.
Today I gave a 10-minute presentation as a panel speaker for Part 5 of the Semantic Wiki mini-series. These are very quick presentations, and it didn’t seem like I could say a lot in such a short time. Regardless, I felt all the presentations given by the other panelists were very good, and taken as a whole I feel the experience would be valuable for anyone interested in developing semantic wikis. My slides, as well as the presentations for all the other speakers, are up on the wiki page for this session. The series of virtual events has one more session, which will dovetail into the face-to-face workshop: “Social Semantic Web: Where Web 2.0 Meets Web 3.0” at the AAAI Spring Symposium, March 23-25, 2009 at Stanford, California.
“This Semantic Wiki mini-series, co-organized by FZI Karlsruhe, Mayo Clinic, Ontolog, RPI Tetherless World Constellation and Salzburg Research, Austria, represents a collaborative effort between members from academia, research, software engineering, semantic web and ontology communities. The 6-month mini-series intends to bring together developers, administrators and users of semantic wikis, and provide a platform where they can conveniently share ideas and insights. Through a series of (mainly virtual) talks, panel discussions, online discourse and even face-to-face meetings, participants will survey the state-of-the-art in semantic wiki technology and get exposure to exemplary use cases and applications. Together, they will study trends, challenges and the outlook for semantic wikis, and explore opportunities for collaboration in this very promising technology, approach or philosophy that people has labeled “semantic wiki.”
Agropedia is a social software site for agriculture. A first glance reveals blogs, forums, wiki and semantic web enabled repositories for knowledge about potatoes and peas!
From the home page:
agropedia is an agriculture knowledge repository of universal meta models and localized content for a variety of users with appropriate interfaces built in collaborative mode in multiple languages. agropedia aims to develop a comprehensive digital content framework, platform, and tools in support of agricultural extension and outreach. In other words, it aspires to be a one stop shop for any information, pedagogic or practical knowledge related to extension services in Indian agriculture – an audiovisual encyclopedia, to enchant, educate and transform the process of digital content creation and organization completely.
Hmmm…. Well, I like the part about “enchanting”. However, it’s still very rough around the edges. I also noticed they are using CMap in a collaborative mode to create the ontologies. Browsing the graphic representation of the knowledge map for rice reveals the difficult challenge of displaying large ontology models. It’s just difficult to navigate, but it does work. Of course my 24-inch iMac display makes it easier; I’m sure it’s a significantly different experience on an average display. On the other hand, large displays are becoming more and more common. One of our students showed up in the lab the other day with his 24-inch iMac in tow as his “portable computer”. I believe he had one of these (or something like it).
Problem-o-pedia
One thing I just noticed is that apparently there is more than one “agropedia” on the net. This may be a sign that the fillintheblankpedia convention is a worn-out way to name a site. Having used this convention myself (to name “Excellupedia”), I can testify that we had some qualms about it but went ahead with it anyway. An interesting argument against this naming convention is that these sites can become (are) so much more than encyclopedias about this or that… It’s possible that with sites like this one we are building a new sort of thing that we don’t really have a good generic name for yet. (socialsoftwarewikiizedforumblogthingamijig just won’t do.) Or are we struggling with the possibility that we may be redefining what an encyclopedia is? Are we? Perhaps.