Cool ideas
Smarter data, not smarter software
Tim O'Reilly's blog entry commenting on Edd Dumbill's thoughts on the semantic web mirrors things I've been thinking about for the last 5 years:
As Jeff Bezos likes to say when exhorting his staff to innovate, "It's still day one." We're in the stone age of computing.
Ding -- I keep saying the same thing about MusicBrainz. When considering the full potential of the semantic web and the role that MusicBrainz can play in it, we've only just started. MB is crawling its way out of infancy right now.
This idea of making the data smarter is absolutely central. I have been speaking about this myself for some time. As we move to a network-based software platform, where applications don't live on the local machine but are distributed between rich client front ends and huge database back ends, "open source" alone won't really solve our problems. It's open data we're going to be fighting about.
Open data -- indeed. I tried to argue this point with Richard Stallman 3 years ago, but he refused to see that data is fundamentally different from code. I suggested that we come up with an OpenData license, but he stubbornly refused and told me to use the GFDL for MusicBrainz. The GFDL is a piece of crap, and I am glad that Creative Commons has come up with a set of licenses that are suitable for open datasets.
I believe that the whole open source revolution will also happen with open data. I hope to see a point in time where people will prefer to use open data systems, as opposed to closed and proprietary data systems. Metcalfe's law of networks also applies to data -- the value of a dataset grows roughly with the square of the number of items that are connected in it. Having data islands (or silos, as Edd calls them) is bad for the data, and having a meta-dataset to connect them all vastly increases the value of the combined dataset.
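To put rough numbers on the Metcalfe argument, here is a minimal sketch in Python. It assumes the simplest reading of the law, that the value of a dataset is proportional to the square of the number of items that can be linked to each other. The function name and the silo sizes are made up purely for illustration.

```python
def metcalfe_value(n_items: int) -> int:
    """Toy value of a connected dataset, assumed proportional to n^2."""
    return n_items ** 2


# Hypothetical silo sizes (number of records in each isolated dataset).
silos = [1000, 2000, 3000]

# Each silo on its own: the values simply add up.
isolated_value = sum(metcalfe_value(n) for n in silos)

# The same records joined through one meta-dataset: one big network.
connected_value = metcalfe_value(sum(silos))

print(isolated_value)   # 14000000
print(connected_value)  # 36000000
```

Under this toy model the same records are worth more than twice as much once they are linked, which is the whole case against silos in one line of arithmetic.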
This is one of the roles for MusicBrainz, and I am glad to see that people are starting to talk about this concept.
Posted by Mayhem at March 4, 2004 01:05 PM
Be sure to keep an eye on the "dataset is copyrightable" thing that's going through Congress. It's mostly being pushed by LexisNexis and a few others, and opposed by Google, Yahoo, and others.
Huge potential chilling effect, even if you only consider the metadata of "artist/album name/track names".
I am watching that:
http://mayhem-chaos.net/blog/archives/000453.html
And this is a serious situation. What if Gracenote thinks they own the copyright on a CD title? Would MB, Muze and AMG then be in violation of that copyright? Methinks this bill is a load of shit and may very well have some serious drawbacks for OpenData.