Thursday, June 25, 2009

Berners-Lee on government data: just do it!

There are two philosophies for putting data on the web. The top-down one is to make a corporate or national plan: get committees of all the interested parties together and agree on a consistent set of terms (an ontology) into which everything fits. This takes so long that it is often never finished, and in any case it does not achieve corporate or national consensus in the end. The other method, the one experience recommends, is to do it bottom-up. A top-level mandate is extremely valuable, but grass-roots action is essential. Put the data up where it is; join it together later.

A wise and cautious step is to make a thorough inventory of all the data you have, and figure out which dataset is going to be most cost-effective to put up as linked data. However, the survey may take longer than just doing it. So, take some data.

A really important rule when considering which data could be put on the web is not to threaten or disturb the systems and the people who are currently responsible for that data. It often takes years of negotiation to put together a given set of data. The people involved may be very invested in it. Social as well as technical systems have been set up around it. So you leave the existing system undisturbed and find a way of extracting the data from it using existing export or conversion facilities. You add a thin shim to adapt the existing system to the standard.
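As a rough illustration of what such a thin shim might look like, here is a minimal sketch that takes a CSV export from an existing system and turns each row into N-Triples, one simple linked-data serialization. Nothing here comes from the original post: the example.org base URIs, the column names, the sample rows, and the helper names csv_to_ntriples and escape_literal are all hypothetical, and a real conversion would map columns onto an agreed vocabulary rather than invent one per column.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespaces; a real deployment would choose its own URIs.
BASE = "http://example.org/dataset/"
VOCAB = "http://example.org/vocab/"


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(csv_text: str, id_column: str) -> list[str]:
    """Emit one triple per non-empty column of each CSV row.

    The row's id_column value becomes the subject URI; every other
    column name becomes a predicate in the hypothetical VOCAB namespace.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{quote(row[id_column], safe='')}>"
        for column, value in row.items():
            # Skip the id itself, overflow fields (key None), and empty cells.
            if column is None or column == id_column or not value:
                continue
            predicate = f"<{VOCAB}{quote(column, safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


# Stand-in for the output of the existing system's export facility.
EXPORT = "id,name,budget\n42,Parks Department,1200000\n43,Libraries,\n"

for line in csv_to_ntriples(EXPORT, "id"):
    print(line)
```

The point of the sketch is that the source system is never touched: the shim only reads what the system already knows how to export, so the people and processes behind the data carry on as before.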

More for governments in "Putting government data online".

