IBM Mashup Summit

I'm at the IBM Mashup Summit in San Francisco today.  As we are within the bowels of a large enterprise, there is a process to follow to get wifi access, so I'm offline.

Pete made an interesting point about the currency and quality of data that reminded me of a post by

I emphasized that there are some areas you don't want to automate, such as merging revision conflicts, because people are better than algorithms for many things, and suggested other service providers borrow from some elements of wiki design like revision history.

I shared our experience with open source application licensing.  From the conversation, I think people understood the need for a different license for open source web applications compared to infrastructure.  But it also was clear to me that I've not communicated our current status, as someone in the know asked if we were "Open Source."  Nobody owns the term, and anyone can use it in their own way, but there is a significant role for the OSI in accrediting projects as OSI Certified.  Socialtext is almost six months into the process of getting its license OSI Certified; we don't claim we are yet, but we do rightly say we are a commercial open source provider.  We are about to submit a third revision of our license, so I'll write more later, but if the process concludes in the negative, we will choose a different OSI license.  Not because it will suit our needs, in fact it decidedly will not, but because of the role we want to play in the community.  We'll see what the other 15 MPL+Attribution projects do.  But attribution is an important issue for mashups, and people here seemed to be in favor of it.

Stephan from Kapow Technologies sees the stack as Mashup Builders like QEDwiki, Teqlo and Excel, and Mashup Enablers like Kapow and RSSBus.  Because we don't have the UDDI and WSDL of the web services world, we need service discovery through a central service repository and builder-specific repositories.  How do I find the data I need and get it into the format I need?  Within the enterprise, users want to be able to get to data without involving IT.  An example of this is IBM's Mashup Hub, and while more service descriptors are needed, people just want to grab two values off of different sites (using Kapow's web scraping) and put them together in Excel or SocialCalc.  Services need to communicate through WS-* (he assumes SOAP is what legacy speaks; someone pointed out that at the MySQL conference nobody knew about SOAP, and he countered that people in Europe don't know REST), REST, RSS/Atom feeds, the Atom Publishing Protocol and APIs, with the data accessible over HTTP and HTTPS.  His suggested solution: define microformats to describe each type of service, define a simple way to inform Builders of the existence of services, and define a simple way for Enablers to request service information from central repositories.
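The "grab two values off of different sites and put them together" pattern above can be sketched in a few lines of Python.  This is a minimal illustration with made-up feed contents and field names standing in for documents a real mashup would fetch over HTTP from two services; the `temperature` and `attendance` elements are hypothetical, not part of any real feed schema.

```python
# Sketch of the mashup pattern: pull one value from each of two feeds
# and join them into a single row. The RSS snippets below are inline
# stand-ins for documents that would be fetched over HTTP.
import xml.etree.ElementTree as ET

# Stand-in for an item fetched from a hypothetical weather feed.
WEATHER_RSS = """<rss><channel><item>
  <title>San Francisco</title>
  <temperature>61</temperature>
</item></channel></rss>"""

# Stand-in for an item fetched from a hypothetical events feed.
EVENTS_RSS = """<rss><channel><item>
  <title>San Francisco</title>
  <attendance>240</attendance>
</item></channel></rss>"""

def first_value(rss_text, tag):
    """Return the text of the first <tag> element in an RSS document."""
    root = ET.fromstring(rss_text)
    node = root.find(f".//{tag}")
    return node.text if node is not None else None

def mash(weather_rss, events_rss):
    """Join two single-item feeds on their shared <title> into one row."""
    return {
        "city": first_value(weather_rss, "title"),
        "temperature_f": int(first_value(weather_rss, "temperature")),
        "attendance": int(first_value(events_rss, "attendance")),
    }

row = mash(WEATHER_RSS, EVENTS_RSS)
print(row)
```

The hard part, per the discussion above, isn't this join; it's discovering what feeds exist and what their fields mean, which is what the proposed service-describing microformats and repositories would address.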

At a certain point the notion came up of a market of services that people could purchase on a granular, billable basis.  I suggested starting from the opposite side, encouraging the commons.  More specifically, this group could go to Creative Commons and try to host a directory of CC-licensed APIs.  We also discussed availability, and I pointed out that in other industries we would start with conversations about standardizing SLAs.

Paul Raymond is in the commercial division of AccuWeather, which provides weather info to 106 million Americans each day.  Their primary asset is their brand; they copyright much of their material and want to syndicate under control.  Web scraping creates new business models for them, even if it is just linking back.  They co-brand over 20k affiliate sites, provide a number of mapping web services and work with other mapping services, have a number of widgets and more.  Other business models: subscription and fixed pricing that is secure and authenticated, or CPM-based with control over content, campaign, source and cost.  Their basic approach is to let people hack upon it, but largely encourage marketing attribution in return.

I had to leave before the afternoon sessions by SnapLogic, Jeff Nolan, Reuters and Mashery.  We still haven't really talked about security, or the marketing thereof, which is the elephant in the room. It will be interesting to see if a common roadmap emerges.

A guy from the EPA was asked about the politicizing of data.  He shared how there is a law under which you can dispute the bias or accuracy of data and gain resolution.  He told the story of how a US satellite over the South Pole started picking up anomalies in ozone levels; scientists believed it was impossible, so they normalized the data they syndicated.  It wasn't until British scientists used balloons to find the unreported change that they opened up the logs and corrected the feed.

Data is political and when you have so much change it is the politics, as much as the technology, that needs to be worked out by the community.

UPDATE: More coverage from
