

Does the Open Database License need CC-style Modules?

Published in OpenStreetMap  |  3 Comments


In the OpenStreetMap community there is a known problem with the applicability of Creative Commons licensing to geographic data. The CC licenses are truly meant for creative works and not for the creation and aggregation of factual data.

To address this, the OpenStreetMap Foundation Board has pursued the development of a more applicable and friendly Open Database License (ODbL). The goal of this license is to make clear what legal protection applies to the geographic data gathered, and how that data can be combined with other data or used to derive new works.

It is a sign of maturity that an open-data project like OpenStreetMap is dealing with the legal issues that surround an otherwise grassroots, crowd-sourced community. Parallels occurred with the development of the GPL and BSD licenses in open-source software and with Creative Commons in user-generated media. Open data is following in the footsteps of open source, from grassroots “hackers” to disrupting some industries and redirecting others.

You spilled your legal all over my data

However, one difference between these two examples is a valuable guide to how the ODbL and similar licenses should progress. Software licenses are currently a myriad of acronyms and terms: BSD, MIT, GPL, Affero, Apache, and more. SourceForge has a deprecated license guide and in fact offers at least 72 license options.

The result is confusion for both software producers and consumers. Which licenses enforce which restrictions and usages? How can I bring together software under different licenses, and how miscible are they? Even after a decade of mainstream open source, experts still ask for advice on these questions.

A working model

Creative Commons headed this off through some nice modularization of the most popular options. Through clear naming, definitions, and iconography, users can understand concepts that would otherwise be encased in unapproachable legal contracts, such as Non-commercial, Attribution, Share-Alike, and No-Derivatives, and can straightforwardly choose which modules to apply to their work.
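To make the module idea concrete, here is a minimal sketch of how picking modules maps to a license name. It is only an illustration: the function names `license_code` and `all_license_codes` are made up for this post, and it assumes Attribution is always on and that Share-Alike and No-Derivatives cannot be combined, which matches the six standard Creative Commons licenses.

```python
from itertools import product


def license_code(non_commercial=False, share_alike=False, no_derivatives=False):
    """Compose a short code such as 'CC BY-NC-SA' from the chosen modules.

    Hypothetical helper for illustration; Attribution (BY) is assumed
    to be part of every license.
    """
    if share_alike and no_derivatives:
        # Share-Alike governs how derivatives are licensed, so it makes
        # no sense together with a ban on derivatives.
        raise ValueError("Share-Alike and No-Derivatives cannot be combined")
    parts = ["BY"]
    if non_commercial:
        parts.append("NC")
    if share_alike:
        parts.append("SA")
    if no_derivatives:
        parts.append("ND")
    return "CC " + "-".join(parts)


def all_license_codes():
    """Enumerate every valid module combination."""
    codes = []
    for nc, sa, nd in product([False, True], repeat=3):
        if sa and nd:
            continue  # skip the one invalid pairing
        codes.append(license_code(nc, sa, nd))
    return codes
```

For instance, `license_code(non_commercial=True, share_alike=True)` gives `CC BY-NC-SA`, and `all_license_codes()` lists six licenses in total. A handful of switches is all a user has to reason about.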

This results in easy, lightweight sharing, encouraging people both to contribute to public repositories and to make use of these works. Simple, well-understood licenses also enable simple tools: one example is Flickr’s search filter, which makes it easy to find Creative Commons-only images for use in third-party materials and presentations. It’s even possible to visualize and determine how you can mix together content released under various modules.

The overall result is that the license has become popular and encourages both sharing and use of shared media – effectively ending the future of traditional stock photography.

Open Data Modules

My point in highlighting Creative Commons is to look at how simple mechanisms can make the licensing of information more effective. The ODbL’s primary purpose is making it clear how to produce and use OpenStreetMap data, but in doing so it addresses the growing need to easily define how the true underlying strands of the web will be shared. You can read a draft version of the ODbL.

The opportunity is to lead the charge on clear, understandable data licenses that citizens can take to their governments to demand that data be released under these terms. There would be no need for click-throughs of unique terms of service or agreements, just easily shareable data that magnifies the power of any available datasources.

One counter-point to the pre-defined modules is that users who want variations can “select, modify, or delete” sections as necessary. This is definitely not an option, as it will create unclear and probably invalid licenses. In addition, these variations and spin-offs will be unvetted and untrusted. By handling the majority of cases under one common umbrella, the validity and attractiveness of a standard license make it harder for any organization to claim it wouldn’t work for them.

I posed this question to the OpenStreetMap Legal mailing list hoping to spark a discussion with the various people currently involved with the license. So far the feedback on the benefits of module-based licenses has, surprisingly, been negative. Even after a new license is drafted, OpenStreetMap has a long road ahead in convincing its very large community to switch licenses. I hope that effort does not negatively impact the organization but instead illuminates the need for clear licenses from the start of any open data collection project.

With GeoCommons, we are spending a lot of resources gathering, annotating, and sharing out open data sources. Our metadata catalog is shared under a Creative Commons Attribution, Share-Alike license. And nominally all the data we bring in is somehow open, under different monikers. But right now it is very difficult to share out the terms of these licenses, so the onus is upon the user to properly use each dataset. With our goal of making geospatial data easy to use for non-experts, we have a very high interest in making geodata licenses as easy to understand as the licenses on photographs or articles are under Creative Commons.

An open question to open data licenses

The question here is whether the module concept of Creative Commons is an effective mechanism that should be applied to the Open Data License. The goal is to make it so easy for anyone to share information that it would take more effort not to do so. The hope is that consumers will so strongly prefer this type of easily shared information over datasets under various and unclear licenses that those other sources will conform to best practices.

What do you think is the best path?


Responses

  1. Tom Chance says:

    November 24th, 2008 at 6:00 pm (#)

    As somebody who was involved with Creative Commons from the very early days, I have to say this would be a really bad idea for data like OpenStreetMap’s.

    Different licenses are incompatible, and CC only got away with this because (a) there isn’t really all that much mixing together of art works so little risk of license conflicts and (b) where conflicts did arise most people are pretty easy going.

    With geodata, similar to software, there’s a much stronger imperative to mix together data sets. If they were incompatible that would undermine one of the key benefits of OSM, especially to small businesses.

    Choice isn’t an end unto itself. With CC it made sense simply because their licenses cover a massive variety of use cases, from stock photography through music to literature. They have totally different methods of creation, different traditional financial models and different potentials for transformation down the line. I don’t see a similar case for geodata, just some people not wanting to play by the rules for their own gain rather than joining in a community.

  2. Richard Fairhurst says:

    November 24th, 2008 at 7:12 pm (#)

    Disregarding attribution (which most people want, or at least, won’t refuse), CC has essentially three modules:

    - non-commercial (CC-NC)
    - no derivatives (CC-ND)
    - ‘share-alike’ (CC-SA)

    ND is just silly for a database – what would be the point? And I may be missing something, but there doesn’t seem to be any clamour, at all, for NC on databases. I can think of one edge case where it might have been welcomed but that was arguably for a not very well thought out initiative. (Never mind that the massive resources required to host an OSM-like dataset may make it difficult to be strictly NC.)

    So you’re left with share-alike and “the absence of share-alike”, i.e. PD/attribution only: two licences. GPL and BSD for databases. Happily, two such licences already exist in the ODbL “family”: the ODbL itself, and the Open Data Commons Public Domain Dedication & Licence (PDDL).

    CC’s branding of their different and usually incompatible licences is essentially clever marketing; good for encouraging adoption of “Creative Commons” in the widest sense, but one which has had fundamentalists frothing at the mouth for ages (http://www.fsf.org/licensing/licenses/index_html#which-cc).

    It actually masks one of CC’s biggest failings, which is that the same licence can have wildly varying effects. For example, CC-BY-SA is largely indistinguishable from CC-BY for photos – 95% of uses count as a Collective Work. Contrast with the all-devouring Derivative Work when applied to data, as per OSM.

    Very aptly, my iTunes has just chosen to play the Richard Stallman vs Rick Astley mashup and is busy exhorting me to “join us now and share the software”.

  3. Brett says:

    November 30th, 2009 at 5:24 pm (#)

    Richard,
    I think you underestimate the need for NC. For our local government, NC is critical. Not to keep people from making money off the data, but to prevent people from using old datasets. Basically, NC gets slapped on any dataset that is given out one time. If you want to lift the NC, you have to create an agreement to regularly obtain dataset updates; or to think of it another way, we would want a commercial use license on the dataset that expires after a certain period of time, with the current dataset having a fresh commercial use license.
    NC also gets placed on certain datasets on which we cannot be absolved of legal responsibility (e.g. NFIP maps or surveys by county surveyors) if the dataset is used commercially. If we allow commercial use, we are liable for damages arising from inaccuracies in the data. So, we bar commercial use. Ultimately, this makes the dataset “use at your own risk” rather than “non-commercial”.
    ND has a very specific use too: to prevent the passing on of incomplete datasets. A real common problem is for a cadastral reseller to drop certain parcels, roads, and rights of way from the dataset to compact down the size of the dataset. Nice for them, an expensive nightmare for the department who takes the phone call about a missing parcel or right of way. Obviously, ND is an inapt way of going about this though, as compared to a license that would require the dataset to be passed on in full or as a contiguous layer.