Dear all,
Over the years, I have spoken at/with many of you who are sympathetic to
the idea of independent freely accessible repositories of primary source
legal data, but who have said: "But I cannot do anything. I have no
expertise in electronic repositories! I have not the budget! I must
submit to the mercy of Thomson/West! We must all submit!"
But we all know in our hearts that we do not want to submit.
Apropos of this, I am inviting you, members of the law library
community, to participate in an exercise in "crowd source" editing. To
ensure quality, it is open only to members of the law librarian community.
If you are interested, read on. If, after reading on, you are still
interested, please e-mail me privately, and I will provide you with a
login, password, and instructions.
THE PROJECT
1. Background: Last Spring, Carl Malamud and Public.Resource.org made
available the text of the US Reports, F.2d (from 1950 to end), and F.3d (to
July 2007). This data was obtained with grant funds from Fastcase,
Inc. It is in HTML, and Fastcase did a wonderful job with tagging
important header information, like caption, date of decision, court,
etc. Public.Resource.org then made convenient archive files of these
documents and made them available to anyone who thought they could do
something useful with them. There have been several takers, including
Rutgers - Camden, altlaw.org (Columbia Law project), CALI (for their
e-Langdell project), and several others. You can see it or get a copy
of your own here: http://bulk.resource.org/courts.gov/c/
You can search the Rutgers - Camden edition here:
http://lawlibrary.rutgers.edu/resource.org/search.shtml
Altlaw: http://www.altlaw.org
Justia.com: http://cases.justia.com/us-court-of-appeals/
Fastcase: http://www.fastcase.com
This is a pretty big collection: 780,000+ documents. So when I say
that I found 2,000 or so errors while collecting the metadata from the
documents, I think we can all agree that Fastcase did pretty well.
Working with the data, we have now gotten the errors on the Circuit
court decisions down to under 500 documents.
Finally, in early August, many of the people who are working on this
data, and on similar projects, got together in Chicago and are now
actively working on ways to pool our resources in developing quality
control systems, updating, and authentication. Information about that
conference is here: http://igotf.org
2. The Crowd Source: These 500 documents need to be looked at, and
useful information gathered, by people. To do this, I have set up a web
page that will make this pretty easy. The page displays a case on one
side, and a fill-out form on the other. For the most part, cut and
paste the data from the document to the form and click "go". There is
also a space for a summary. A short sentence or two on the main topic
will do nicely.
3. The Payoff: The collected metadata will be used to allow fielded
searching of this entire collection, to assist with the authentication
and maintenance of the collection, and to assist with interoperability
issues between the repositories that have been established and those
that will be established. It will be distributed freely to
all participants along with the bulk of the metadata repository that is
already available for download. (Metadata archive files suitable for
upload into MySQL are available here:
http://lawlibrary.rutgers.edu/resource.org/arch/f2d-metadata.tgz
http://lawlibrary.rutgers.edu/resource.org/arch/f3d-metadata.tgz
http://lawlibrary.rutgers.edu/resource.org/arch/scotus-metadata.tgz )
As you can see, this is not just for me. Every repository will get a
copy to use. Every user will be able to make use of the data. And, if
a lot of you sign up, no one will have to do very much to get it.
So, if you have computer disk space and want your own copy of the
Federal reports and US Reports with robust metadata, for free, you can
have it, along with a community of institutions to cooperate with to
ensure quality and maintenance.
If you can't run your own repository, but want to see a distributed, free
system become established and want to do your part, contact me and
contribute.
John
--
_________________________________________
John P. Joergensen
Librarian II/Assoc. Prof.
Rutgers University School of Law - Camden
jjoerg@camden.rutgers.edu
_________________________________________
This archive was generated by hypermail 2b29 : Thu Aug 21 2008 - 11:18:23 PDT