Subject: Re: License Question
From: Shawn Garbett <>
Date: Wed, 15 Dec 2004 08:27:24 -0800 (PST)

--- "Michael R. Bernstein" <> wrote:
> On Tue, 2004-12-14 at 13:59, Shawn Garbett wrote:
> > 2) The client medical record data collected in the
> > system must be made available for research
> purposes in
> > anonymized format. If adoption of the system
> occurs on
> > a broader scale, then a much larger volume of
> > consistently collected data will be available for
> > research purposes. The goal is that this research
> > leads to better treatment protocols for mental
> > illness.
> > 
> > I'm not familiar with any licenses that require
> data
> > sharing.
> None of the OSI-approved licenses have this
> requirement, as it
> constitutes a restriction on use.
> I suggest you use some other way to encourage
> data-sharing, such as
> conditioning access to the shared pool of anonymized
> data on
> contributions to it. If the system includes a
> built-in method of
> periodically exporting anonymized data, this should
> be a relatively easy
> sell.

Well, the plan is that the anonymizer and export will
be part of the available software. There are already 3
other companies expressing deep interest, and they
wholeheartedly agreed on data and code sharing. One
company has even considered contributing developers
(so that detailed knowledge of the system comes
directly back in-house).

What's going on right now is that there is a "research
protocol" from a major university (which I cannot name),
but it requires buying (and I'm talking lots of $$$) into
their program. I raised a stink about the fact that
research protocols are supposed to be open and
repeatable as part of their fundamental nature. The
university made no comment and ignored my repeated
requests for a description of their research protocol
and data format.

The only other format commonly available is HL7, which
requires buying a license to use as well (cheap in the
grand scheme of things, but nevertheless
proprietary). So one thing that the product in the
commons would hopefully establish is an open, common
data format for mental health data. Open participation
will help develop the data model towards what's
applicable in a broader sense.

I understand that restrictions on use are counter to
the idea of open source. However, the goal of open
source is the free exchange of information to build
community resources. Required data sharing is not
fundamentally different from required code sharing.
Basic lambda calculus makes no distinction between
data and computation. Nor does assembly language, hence
the large number of buffer overflow exploits. The idea
of extending open source with the concept of "open data"
makes great sense in areas where humanity could
benefit from study of the data, specifically areas of
science and medicine.

There is a push towards ownership of data in the
scientific community just like there is a push toward
code ownership in the information system industry.
Note the genome race was between publicly funded labs
and a private lab. The private lab wanted ownership
of the genomic data by deriving it first. Fortunately
it lost. What we're talking about is pushing back,
hopefully generating enough momentum to foster change
in medical circles.

So what I'm proposing is the concept of an open
source/open data license. One is still free to use the
software however one chooses, as long as certain pieces
of data are made available (anonymized, for HIPAA
compliance) to all users of the software. Note: that's
not to everyone, just those who have accepted the
license and joined the community. National and state
laws can supersede the data sharing requirements, so
the license will also need a clause "except where
prohibited by law."

Shawn Garbett

