Statistic Operations

n-lagomarsini edited this page Jun 24, 2014 · 1 revision

This page describes the implementation of new statistical operations similar to those of JAI. The main difference between the two groups of operations lies in their organization: the JAI operations are all subclasses of the JAI StatisticsOpImage class, whereas the new operations are usually executed by the same image operator, which is configured with different Statistics objects.

WORKFLOW:

  • creation of the Statistics objects used for calculating some image statistics like mean, sum, variance.
  • creation of the StatisticsOpImage, descriptor and RenderedImageFactory classes for using Statistics objects on the image data.
  • development of some tests on the previously described classes.

The JaiTools statistical operators could have been used instead of creating a new module, but they were not adopted for the following reasons:

  • the NoData handling is different. In the Jai-Ext project, NoData values are checked against a single Range of NoData, while in the JaiTools project they are checked against a list of Ranges, with a different performance overhead.
  • the Range classes used for the NoData are not the same; the Jai-Ext Range class performs better than the JaiTools Range class.
  • the Jai-Ext Range class is used by all the modules inside the project, so it has been kept in this module for compatibility.
  • the JaiTools statistical operations have no support for multithreading.

The module jt-stats contains the following classes:

  • StatisticsOpImage.java : this class is used for computing the provided statistics on every image tile.
  • StatisticsDescriptor.java : this class provides a description of the previous operation and indicates which parameters should be passed.
  • StatisticsRIF.java : this class is a RenderedImageFactory called by the JAI.create() method.
  • SimpleStatsOpImage.java : this subclass of the StatisticsOpImage.java class is used for calculating statistics that do not need to store the pixel data in an array.
  • ComplexStatsOpImage.java : this subclass of the StatisticsOpImage.java class is used for calculating statistics that store the pixel data in an array.
  • Statistics.java : this abstract class defines the methods for calculating the statistics.
  • StatsFactory.java : factory class for creating the various Statistics objects.
  • MeanSum.java : subclass of the Statistics.java class for finding the sum or the mean of the image pixels.
  • Max.java : subclass of the Statistics.java class for finding the maximum value of the image pixels.
  • Min.java : subclass of the Statistics.java class for finding the minimum value of the image pixels.
  • VarianceSTD.java : subclass of the Statistics.java class for finding the variance or the standard deviation of the image pixels.
  • HistogramMode.java : subclass of the Statistics.java class for calculating the Histogram or the Mode of the image pixels.
  • Median.java : subclass of the Statistics.java class for finding the Median of the image pixels.

The StatisticsOpImage class is an extension of the JAI OpImage class and is used for calculating various kinds of statistics on the entire image:

Simple Statistics

  • Mean
  • Sum
  • Max
  • Min
  • Variance
  • Standard Deviation

Complex Statistics (Note: they need additional parameters at creation time)

  • Histogram
  • Mode
  • Median

The computation differs between the two subclasses of the StatisticsOpImage class. In the SimpleStatsOpImage class, the statistics are computed for every tile and then accumulated into a global container by calling the accumulateStats() method. For every tile a temporary statistic container is created and updated for every pixel with the addSampleNaN() or addSampleNoNaN() method (the first should be used for Double or Float data). When the operation on the tile is finished, the stored statistics are added to the global container. This last step is guarded by the "synchronized" keyword to keep the class thread-safe.

In the ComplexStatsOpImage class, the calculation is performed by creating a single global container which stores the pixel values. These values are added by calling the addSample* method, which performs the selected operation in a thread-safe way. Once all the image pixels have been stored, the selected statistics are calculated on the stored values.
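The per-tile accumulation used for simple statistics can be sketched as follows. This is only an illustration under stated assumptions: MeanSketch and meanOfTiles() are hypothetical names, tiles are modelled as plain double arrays, and a fixed thread pool stands in for the JAI tile scheduler. It is not the jt-stats implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a simple statistic container (not the jt-stats MeanSum class).
final class MeanSketch {
    private double sum;
    private long count;

    // Updates a tile-local container; each local container is used by one thread only.
    void addSample(double value) {
        sum += value;
        count++;
    }

    // Merges a finished tile-local container into this (global) container.
    synchronized void accumulateStats(MeanSketch tileStats) {
        sum += tileStats.sum;
        count += tileStats.count;
    }

    synchronized double getResult() {
        return count == 0 ? Double.NaN : sum / count;
    }

    // Computes the mean over all tiles, submitting one task per tile.
    static double meanOfTiles(double[][] tiles, int threads) {
        MeanSketch global = new MeanSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (double[] tile : tiles) {
                pending.add(pool.submit(() -> {
                    MeanSketch local = new MeanSketch();
                    for (double v : tile) {
                        local.addSample(v);
                    }
                    // The only synchronized step: one merge per tile.
                    global.accumulateStats(local);
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for the tile and propagate any failure
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("tile computation failed", e);
        } finally {
            pool.shutdown();
        }
        return global.getResult();
    }
}
```

Because each thread works on its own container, the lock is taken once per tile rather than once per pixel, which is the reason for keeping a temporary container per tile.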

These features have been created to take advantage of the multithreading inside JAI: the statistical calculations can run in parallel and return the final result in less time than a single-threaded statistical calculation. An optional NoData Range or ROI object is taken into account by skipping the statistics update when the selected pixel is outside the ROI or is NoData. The result of the computation is returned by the getProperty() method. The types of computation to perform are defined by passing an array of StatsType (an enum inside the Statistics class). Note that simple statistics cannot be requested together with complex ones, because the multithreading is handled in two different ways. The user must also select the bands on which the computations are performed by passing an array of band indexes.
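The ROI and NoData check can be pictured with the minimal sketch below. All names are hypothetical: the ROI is reduced to a boolean mask, the NoData Range to a closed interval, and NaN samples are treated as NoData, which is an assumption of this sketch rather than something the module is known to do.

```java
// Hypothetical illustration of skipping pixels outside the ROI or inside the NoData interval.
final class MaskedSumSketch {

    // Sums the samples that are inside the ROI mask and are not NoData.
    static double sum(double[] samples, boolean[] insideRoi,
                      double noDataMin, double noDataMax) {
        double sum = 0;
        for (int i = 0; i < samples.length; i++) {
            double v = samples[i];
            // Assumption of this sketch: NaN counts as NoData.
            boolean isNoData = Double.isNaN(v) || (v >= noDataMin && v <= noDataMax);
            if (insideRoi[i] && !isNoData) {
                sum += v; // the statistic is updated only for valid pixels
            }
        }
        return sum;
    }
}
```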

The StatisticsDescriptor describes the functionality and the input parameters of the StatisticsOpImage. The static method create() returns a new instance of the SimpleStatsOpImage or the ComplexStatsOpImage by calling the JAI.create() method. This call invokes the RenderedImageFactory associated with the selected operation, in this case StatisticsRIF.

The StatsFactory class is used for creating all the subclasses of the Statistics abstract class. The easiest way to create a new statistic object is to call the createXXXObject() method, where XXX stands for the type of statistic to calculate; alternatively, the object can be created by passing the index of the selected statistic, as defined by StatsType. For the complex statistics, additional parameters must be set.

All the subclasses of the Statistics class must implement the addSample() method for updating the statistics, the accumulateStats() method for aggregating the results from different tiles (only for simple statistics), the clearStatistic() method for clearing the temporary results of the computation, and the getResult() method, which returns the result of the selected statistic.
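As a rough picture of this contract, the sketch below defines a hypothetical abstract class with the four methods named above and one subclass that tracks the maximum. The class names and the method signatures are assumptions made for illustration; the real Statistics class may declare them differently.

```java
// Hypothetical contract mirroring the four methods named above (signatures are assumed).
abstract class StatSketch {
    abstract void addSample(double sample);          // update with one pixel value
    abstract void accumulateStats(StatSketch other); // merge the result of another tile
    abstract void clearStatistic();                  // reset the temporary results
    abstract Object getResult();                     // return the computed statistic
}

// Example subclass: running maximum.
final class MaxSketch extends StatSketch {
    private double max = Double.NEGATIVE_INFINITY;

    @Override
    void addSample(double sample) {
        if (sample > max) {
            max = sample;
        }
    }

    @Override
    void accumulateStats(StatSketch other) {
        if (!(other instanceof MaxSketch)) {
            throw new IllegalArgumentException("cannot merge a different statistic type");
        }
        // Merging two maxima is the same as adding the other maximum as a sample.
        addSample(((MaxSketch) other).max);
    }

    @Override
    void clearStatistic() {
        max = Double.NEGATIVE_INFINITY;
    }

    @Override
    Object getResult() {
        return max; // boxed as Double
    }
}
```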

The following simplified code illustrates how the statistical operations work:

// s[x][y] = pixel value of the source.
// globalStatObj = global statistic container.
// statObj = temporary statistic container.
// insideROI = boolean indicating that the value is inside ROI.
// validData = boolean indicating that the value is not a No Data.
// numTiles,srcHeight,srcWidth = source image tiles number, height, width.

// Code for a simple statistical operation, for example Mean:

Statistics globalStatObj = StatsFactory.createMeanObject();

for(int t = 0; t<numTiles;t++){
    Statistics statObj = StatsFactory.createMeanObject();
    for(int y = 0; y<srcHeight;y++){
        for(int x = 0; x<srcWidth;x++){
            if(insideROI && validData){
               statObj.addSample(s[x][y],validData);            
            }
        }
    }
    // Synchronized operation
    globalStatObj.accumulateStats(statObj);
}
double mean = (Double)globalStatObj.getResult();

// Code for a complex statistical operation, for example Histogram:

Statistics globalStatObj = StatsFactory.createHistogramObject(maxBound, minBound, numBins);

for(int t = 0; t<numTiles;t++){    
    for(int y = 0; y<srcHeight;y++){
        for(int x = 0; x<srcWidth;x++){
            if(insideROI && validData){
               // Thread-safe operation
               globalStatObj.addSample(s[x][y],validData);            
            }
        }
    }
}

double[] histogram = (double[])globalStatObj.getResult();
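For readers who want to see what a thread-safe addSample() on a shared container might look like, here is a minimal histogram sketch. HistogramSketch is a hypothetical class: the half-open interval [minBound, maxBound), the equal-width bins and the use of AtomicLongArray are assumptions of this example and are not taken from HistogramMode.java.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical shared histogram container that several threads can update at once.
final class HistogramSketch {
    private final double min;
    private final double max;
    private final AtomicLongArray bins;

    HistogramSketch(double minBound, double maxBound, int numBins) {
        if (numBins <= 0 || !(maxBound > minBound)) {
            throw new IllegalArgumentException("invalid histogram bounds or bin count");
        }
        this.min = minBound;
        this.max = maxBound;
        this.bins = new AtomicLongArray(numBins);
    }

    // Thread-safe: the bin counter is incremented atomically.
    void addSample(double value) {
        if (Double.isNaN(value) || value < min || value >= max) {
            return; // samples outside [min, max) are ignored in this sketch
        }
        int index = (int) ((value - min) / (max - min) * bins.length());
        if (index == bins.length()) {
            index--; // guard against floating-point rounding at the upper edge
        }
        bins.incrementAndGet(index);
    }

    // Returns a snapshot of the bin counts.
    long[] getResult() {
        long[] result = new long[bins.length()];
        for (int i = 0; i < result.length; i++) {
            result[i] = bins.get(i);
        }
        return result;
    }
}
```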

Some tests are provided for checking that the operations are performed correctly:

  • StatisticsTest.java
  • CompleteStatsTest.java
  • ComparisonTest.java

The first test checks the statistic results returned by all the subclasses of the Statistics object. The statistics are pre-calculated and then compared (with a fixed tolerance) with the results of the Statistics objects. This test also verifies that the other methods, such as accumulateStats() or clearStatistics(), behave as expected.

The second class is used for testing the StatisticsOpImage subclasses operations. In this class all the statistics of an image are calculated by calling the getProperty() method. These statistics are calculated in various contexts: with and without No Data, with and without ROI and, if ROI is used, with and without ROI RasterAccessor.

The third class estimates the mean calculation time of the StatisticsDescriptor (only for the Mean, Extrema and Histogram operations) and of the JAI MeanDescriptor, ExtremaDescriptor and HistogramDescriptor. The test repeatedly calls the getProperty() method, once per cycle, and records the calculation time. At the end of every cycle the JAI TileCache is flushed so that all the image tiles must be recalculated. The average, maximum and minimum computation times are computed only over the later iterations: the first N iterations are discarded because of the Java HotSpot compilation. The number of iterations to measure and the number to discard can be set by passing, respectively, these two Integer parameters to the JVM: JAI.Ext.BenchmarkCycles and JAI.Ext.NotBenchmarkCycles. To select which of the two descriptors is tested, the JAI.Ext.OldDescriptor JVM parameter must be set to true or false (true for the old descriptor and false for the new one); the JVM parameter JAI.Ext.TestSelector takes a value from 0 to 5 and indicates the image data type; the statistic to compute must be indicated with the JAI.Ext.Statistic parameter. If the native acceleration should be used, the JAI.Ext.Acceleration parameter must be set to true. Finally, if the user wants to add the NoData or ROI control, the JAI.Ext.RangeUsed and the JAI.Ext.ROIUsed parameters, respectively, must be set to true. The computation times are printed to the screen at the end of the process.
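The warm-up and measurement cycles described above can be sketched as follows. BenchmarkSketch and its methods are hypothetical, and the default values used when the JVM parameters are missing (1 measured cycle, 0 discarded cycles) are assumptions of this example. The parameters would be passed on the command line in the usual way, for instance -DJAI.Ext.BenchmarkCycles=50.

```java
// Hypothetical timing loop: discards the warm-up cycles, then reports mean, min and max.
final class BenchmarkSketch {

    // Returns {mean, min, max} in nanoseconds, computed over the measured cycles only.
    static double[] run(Runnable operation, int warmupCycles, int measuredCycles) {
        if (measuredCycles <= 0 || warmupCycles < 0) {
            throw new IllegalArgumentException("invalid cycle counts");
        }
        long total = 0;
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (int i = 0; i < warmupCycles + measuredCycles; i++) {
            long start = System.nanoTime();
            operation.run();
            long elapsed = System.nanoTime() - start;
            if (i < warmupCycles) {
                continue; // warm-up cycle: executed but not recorded
            }
            total += elapsed;
            min = Math.min(min, elapsed);
            max = Math.max(max, elapsed);
        }
        return new double[] { (double) total / measuredCycles, min, max };
    }

    // Reads the two Integer JVM parameters named in the text; the defaults are assumptions.
    static double[] runFromSystemProperties(Runnable operation) {
        int measured = Integer.getInteger("JAI.Ext.BenchmarkCycles", 1);
        int warmup = Integer.getInteger("JAI.Ext.NotBenchmarkCycles", 0);
        return run(operation, warmup, measured);
    }
}
```

In the real test the operation would be the getProperty() call followed by the tile cache flush; here it is left as an arbitrary Runnable.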