Re: Accuracy assessment and spatial autocorrelation

From: Ed Laurent (elaurent@EcologyFund.net)
Date: Fri May 07 2004 - 11:08:45 PDT


    Hi Jonathan,

    I agree with the similar response that spatial autocorrelation is not really a concern for the question you are asking. However, it seems you need to define your question a little better. Are you happy predicting that a pixel next to your signature pixel is classified with an accuracy of x? To me that doesn't really mean that much, especially for vegetation, where the spectral signature you are using is directly determined by your classification variable (the shrub species). For indirect classification (e.g., birds) it's a little more impressive, because two contiguous, correctly classified presence pixels may not have the same vegetation (e.g., the bird uses two different types of vegetation).

    If I were a reviewer for the manuscript you submit from this work, I would want to know how well the classification worked for pixels outside of your signature plots.
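
    One way to get at that is to hold out whole plots rather than individual pixels. The following is only a rough Python sketch, using the numbers from the original question (1000 pixels from 10 plots, 60/40 split); the function names and the made-up plot IDs are assumptions for illustration, not anything from the actual data set.

```python
import random

def split_by_pixel(n_pixels, train_fraction=0.6, seed=0):
    """Random pixel-level split: train and test pixels can share a plot."""
    idx = list(range(n_pixels))
    random.Random(seed).shuffle(idx)
    n_train = round(train_fraction * n_pixels)
    return idx[:n_train], idx[n_train:]

def split_by_plot(pixel_plot_ids, train_fraction=0.6, seed=0):
    """Plot-level split: every pixel of a plot goes to the same side."""
    plots = sorted(set(pixel_plot_ids))
    random.Random(seed).shuffle(plots)
    n_train = round(train_fraction * len(plots))
    train_plots = set(plots[:n_train])
    train_idx = [i for i, p in enumerate(pixel_plot_ids) if p in train_plots]
    test_idx = [i for i, p in enumerate(pixel_plot_ids) if p not in train_plots]
    return train_idx, test_idx

# Hypothetical data: 1000 pixels of one class, 100 from each of 10 plots.
plot_ids = [i // 100 for i in range(1000)]

pix_train, pix_test = split_by_pixel(len(plot_ids))
plot_train, plot_test = split_by_plot(plot_ids)

# Plots that contribute pixels to BOTH the training and the test set.
shared_pixelwise = {plot_ids[i] for i in pix_train} & {plot_ids[i] for i in pix_test}
shared_plotwise = {plot_ids[i] for i in plot_train} & {plot_ids[i] for i in plot_test}

print("plots shared, pixel-level split:", len(shared_pixelwise))
print("plots shared, plot-level split:", len(shared_plotwise))
```

    With the plot-level split, the test pixels all come from plots the classifier never saw, so the accuracy says something about pixels outside the signature plots.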

    Another thing to keep in mind is to use kappa or some other measure of accuracy besides percent correctly classified (PCC), especially if you have an unequal number of signature pixels per class. If your classes are really lopsided (e.g., 200 absence and 20 presence), you can achieve a very high PCC without really doing any better than a random assignment of classes to pixels.
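
    To make the lopsided case concrete, here is a small Python sketch. It assumes Cohen's kappa as the kappa statistic and uses a made-up "classifier" that labels every pixel as absence; the function names are only for illustration.

```python
from collections import Counter

def pcc(truth, pred):
    """Proportion of pixels whose predicted class matches the reference class."""
    return sum(t == p for t, p in zip(truth, pred)) / len(truth)

def cohen_kappa(truth, pred):
    """Cohen's kappa: agreement beyond what the class totals alone would give."""
    n = len(truth)
    observed = pcc(truth, pred)
    truth_counts = Counter(truth)
    pred_counts = Counter(pred)
    # Chance agreement expected from the row and column totals.
    expected = sum(truth_counts[c] * pred_counts[c] for c in truth_counts) / (n * n)
    if expected == 1.0:
        return 0.0  # only one class present; kappa is undefined, report 0
    return (observed - expected) / (1.0 - expected)

# The lopsided example: 200 absence and 20 presence reference pixels.
truth = ["absence"] * 200 + ["presence"] * 20
# A "classifier" that ignores the imagery and always says absence.
always_absent = ["absence"] * 220

print("PCC:  ", round(pcc(truth, always_absent), 3))
print("kappa:", round(cohen_kappa(truth, always_absent), 3))
```

    Here PCC comes out at 200/220 (about 0.91) while kappa is 0, because all of the agreement is what you would expect from the class totals alone.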

    If you want, you can write me off list at laurente@msu.edu to discuss this more. I'm also working on a beta version of some software for this type of classification that you might be interested in.

    Also, I can send you a copy of the following proceedings paper that might be of interest:

    Laurent, E.J., J.P. LeBouton, M.B. Walters and J. Liu. 2002. Integrating human, satellite and avian perspectives of the landscape for analysis of forest bird distribution patterns. In D. Chamberlain and A. Wilson (eds.) Avian Landscape Ecology: Pure and Applied Issues in the Large-Scale Ecology of Birds. Proceedings of the 11th Annual IALE(UK) Conference. Colin Cross Printers Ltd., Garstang, Great Britain.

    -Ed

    ********************************************
    Edward J. Laurent
    Ph.D. Candidate, Landscape/Wildlife Ecology
    Department of Fisheries and Wildlife
    13 Natural Resources Building
    Michigan State University
    East Lansing, MI 48824-1222
    Ph: (517) 353-5468
    Fax: (517) 432-1699
    http://www.msu.edu/user/laurente
    ********************************************

    --- Jonathan Greenberg <greenberg@ucdavis.edu> wrote:
    Remote Sensors:

        A colleague (who shall remain unnamed... we will refer to him as
    Solomon D.) and I are having a lively discussion about training/test data
    with remote sensing, and I was hoping to get some additional feedback on this
    problem. We created a species map with maximum likelihood (using 1m IKONOS
    imagery), and here's how we created training data (and how we are
    approaching, in one case, the testing):

        We have mostly USFS plot data with a known center location and plot
    boundary, and that has cover values for each species we are after in our
    classification. We chose pixels from plots with a high percentage of a
    single species that are readily identifiable as the species in question
    (e.g., if we know a plot only contains red fir trees, we manually choose each
    pixel belonging to a tree within the boundary of the plot). This, of
    course, is not an optimal way of doing this -- in theory we should have
    collected individual species in the field, but this was our curse with the
    data we had.

        Ok, so now we have a bunch of pixels per class, taken from a limited
    number of plots (e.g., we may have 1000 red fir pixels, but we took them from
    10 plots). The question is: is it "legitimate" to subdivide the 1000
    pixels into two randomly chosen training and test groups (say 60% train and
    40% test), use the 60% to create the map, and validate it with the
    remaining 40%? Or do we have a spatial autocorrelation problem because,
    while we have 1000s of pixels, the training and test pixels are all
    right next to each other in the 10 plots?

        In my mind the issue is muddled, because we are training based on color.
    Does the color (within a class) have a strong enough spatial pattern
    to warrant a very different training/test setup (e.g., taking the pixels from
    6/10 plots for training and 4/10 for testing)? Thoughts?

    --j

    -- 
    Jonathan Greenberg
    Graduate Group in Ecology, U.C. Davis
    http://www.cstars.ucdavis.edu/~jongreen
    http://www.cstars.ucdavis.edu
    AIM: jgrn307 or jgrn3007
    MSN: jgrn307@msn.com or jgrn3007@msn.com
    



