[LAW-LIB:56439] Law Librarian Crowdsource!

From: John P. Joergensen (jjoerg@camden.rutgers.edu)
Date: Thu Aug 21 2008 - 11:15:23 PDT

    Dear all,

    Over the years, I have spoken at/with many of you who are sympathetic to
    the idea of independent freely accessible repositories of primary source
    legal data, but who have said: "But I cannot do anything. I have no
    expertise in electronic repositories! I have not the budget! I must
    submit to the mercy of Thomson/West! We must all submit!"

    But we all know in our hearts that we do not want to submit.

    Apropos of this, I am inviting you, members of the law library
    community, to participate in an exercise in "crowd source" editing. It
    is open only to members of the law librarian community to ensure quality.

    If you are interested, read on. If, after reading on, you are still
    interested, please e-mail me privately, and I will provide you with a
    login, password, and instructions.

    THE PROJECT

    1. Background: Last spring, Carl Malamud and Public.Resource.org made
    available the text of the US Reports, F.2d (from 1950 to end) and F.3d (to
    July, 2007). This data was obtained with grant funds from Fastcase,
    Inc. It is in HTML, and Fastcase did a wonderful job with tagging
    important header information, like caption, date of decision, court,
    etc. Public.Resource.org then made convenient archive files of these
    documents and made them available to anyone who thought they could do
    something useful with them. There have been several takers, including
    Rutgers - Camden, altlaw.org (Columbia Law project), CALI (for their
    e-Langdell project), and several others. You can see it or get a copy
    of your own here: http://bulk.resource.org/courts.gov/c/

    You can search the Rutgers - Camden edition here:
    http://lawlibrary.rutgers.edu/resource.org/search.shtml

    Altlaw: http://www.altlaw.org
    Justia.com: http://cases.justia.com/us-court-of-appeals/
    Fastcase: http://www.fastcase.com

    This is a pretty big collection: 780,000+ documents. So when I say
    that I found 2,000 or so errors while collecting the metadata from the
    documents, I think we can all agree that Fastcase did pretty well.

    Working with the data, we have now gotten the errors on the Circuit
    court decisions down to under 500 documents.

    Finally, in early August, many of the people who are working on this
    data and on similar projects got together in Chicago, and we are
    actively working on ways to pool our resources in developing quality
    control systems, updating, and authentication. Information about that
    conference is here: http://igotf.org

    2. The Crowd Source: These 500 documents need to be looked at, and
    useful information gathered, by people. To do this, I have set up a web
    page that will make this pretty easy. The page displays a case on one
    side, and a fill-out form on the other. For the most part, you simply cut
    and paste the data from the document into the form and click "go". There is
    also a space for a summary. A short sentence or two on the main topic
    will do nicely.

    3. THE PAYOFF: The collected metadata will be used to allow fielded
    searching of this entire collection, to assist with the
    authentication and maintenance of the collection, and to assist with
    interoperability between the repositories that have been
    established and will be established. It will be distributed freely to
    all participants along with the bulk of the metadata repository that is
    already available for download. Metadata archive files suitable for
    upload into MySQL are available here:
    http://lawlibrary.rutgers.edu/resource.org/arch/f2d-metadata.tgz
    http://lawlibrary.rutgers.edu/resource.org/arch/f3d-metadata.tgz
    http://lawlibrary.rutgers.edu/resource.org/arch/scotus-metadata.tgz

    As you can see, this is not just for me. Every repository will get a
    copy to use. Every user will be able to make use of the data. And, if
    a lot of you sign up, no one will have to do very much to get it.

    So, if you have computer disk space and want your own copy of the
    Federal reports and US Reports with robust metadata, for free, you can
    have it, along with a community of institutions to cooperate with to
    ensure quality and maintenance.

    If you can't do your own repository, but want to see a distributed, free
    system become established and want to do your part, contact me and
    contribute.

    John

    -- 
    _________________________________________
    John P. Joergensen
    Librarian II/Assoc. Prof.
    Rutgers University School of Law - Camden
    jjoerg@camden.rutgers.edu
    ________________________________________
    



    This archive was generated by hypermail 2b29 : Thu Aug 21 2008 - 11:18:23 PDT