
Supervised classification of hyperspectral images is a challenging task because of the high dimensionality of a pixel signature. Conventional classifiers require large training data sets; however, in practice only limited numbers of labeled pixels are available due to the complexity and cost involved in sample collection. It is essential to have a method that can reduce such high-dimensional data to a lower dimensional feature space without loss of useful information. For classification purposes, it is useful if such a method takes into account the nature of the underlying signal when extracting the lower dimensional feature vector. The lifting framework provides the required flexibility. This article proposes an adaptive lifting wavelet transform to extract lower dimensional feature vectors for the classification of hyperspectral signatures. The proposed adaptive update step allows the decomposition filter to adapt to the input signal so as to retain the desired characteristics of the signal. A three-layer feed forward neural network is used as a supervised classifier to classify the extracted features. The effectiveness of the proposed method is tested on two hyperspectral data sets (HYDICE and ROSIS sensors). The performance of the proposed method is compared with a first generation discrete wavelet transform (DWT) based feature extraction method and with previous studies that use the same data. The indices used for measuring performance are overall classification accuracy and the Kappa value. The experimental results show that the proposed adaptive lifting scheme (ALS) achieves excellent results with a small training set.

Hyperspectral data set consists of hundreds of images corresponding to different wavelengths for the same area on the surface of the Earth [

A feature reduction approach based on feature extraction projects the original feature space onto a lower dimensional space by transformation. Recently, wavelet-based multiresolution analysis has been a widely used feature-extraction method in signal processing [

Due to its fixed filter bank structure, a wavelet does not always capture all the transient features of the input signal. This may result in lower classification accuracy. The conventional convolution-based implementation of the DWT has high computational and memory cost. For classification purpose, it will be useful to have a multiresolution tool that takes into account the nature of the underlying signal. The lifting framework proposed by Wim Sweldens provides the required flexibility [

This article proposes the adaptive lifting wavelet transform to extract the lower dimensional feature vectors for classification purposes. The lifting framework allows the decomposition filter to adapt to the input signal so as to improve or leave intact the desired characteristics of the signal. Most adaptive lifting schemes proposed by researchers have been used for image compression and denoising applications: adaptive predict [

In the context of remotely sensed images, most of the existing studies propose lifting scheme for lossless coding and for denoising. The multiplicative speckles in synthetic aperture radar (SAR) images were reduced by wavelet transform based on lifting scheme [

Thus, no literature was found on hyperspectral image classification within lifting framework. In this article, we propose the adaptive update operator that retains the transient features of the hyperspectral signature that helps to improve the classification. The rest of the article is organized as follows. The proposed adaptive update lifting scheme is introduced in Section 2. The experiments and results are presented in Section 3 and Section 4, respectively. Finally, conclusions follow in Section 5.

As the application in hand is to produce the feature vectors from hyperspectral signatures for classification, the objective is to retain the distinctive information of the input signature. The spectral signatures of different land cover classes are shown in

It divides the hyperspectral signature (underlying input signal) x[n] into even samples x_e[n] = x[2n] and odd samples x_o[n] = x[2n + 1], indexed by n.

The predict step estimates each odd sample from its two neighbouring even samples and keeps the prediction error:

d[n] = x_o[n] - (x_e[n] + x_e[n + 1]) / 2

This step produces the detail coefficients d[n]. In time domain, the predictor is exact wherever the signature varies linearly, so the detail coefficients are near zero in smooth regions. At discontinuities of the signature, however, the fixed prediction fails. Therefore, the given fixed prediction results in large differences and hence large detail coefficients, thereby retaining the distinct information about discontinuities.

Because of subsampling at the first stage, the even samples x_e[n] alone may lose the transient information of the original signal. The update step compensates by smoothing the even samples with the neighbouring detail coefficients, but only where the signal is locally smooth:

a[n] = x_e[n] + (d[n - 1] + d[n]) / 4, if |d[n]| <= T
a[n] = x_e[n], otherwise

Here, T is the threshold that controls the adaptation. In time domain, this update averages neighbouring samples within smooth regions of the signature. As |d[n]| exceeds T only near discontinuities, the even samples at those locations are passed to the approximate signal unchanged.
Because of the above adaptive update function, the approximate signal retains the discontinuity information. Thus the proposed lifting scheme takes the average of neighboring samples in the smooth regions of the signal and does not modify the signal at edges to retain the distinctive information useful for classification.
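The behaviour described above can be sketched in code. This is a minimal illustration assuming a linear (average-of-neighbours) predictor and a threshold gate on both neighbouring detail coefficients; it is not the paper's exact operators or threshold computation:

```python
import numpy as np

def adaptive_lifting_step(x, T):
    """One level of a lifting decomposition with a threshold-gated update.

    Predict: each odd sample is predicted as the average of its two even
    neighbours; the residual forms the detail coefficients.
    Update: an even sample is smoothed with neighbouring details only when
    both local detail magnitudes stay below the threshold T, so sharp
    transitions (edges in the signature) pass through unchanged.
    """
    even, odd = x[0::2].astype(float), x[1::2].astype(float)

    # Predict step: linear prediction from the two neighbouring even samples.
    right = np.roll(even, -1)
    right[-1] = even[-1]                 # symmetric extension at the boundary
    detail = odd - 0.5 * (even + right)

    # Adaptive update step: smooth only where the signal is locally smooth.
    left = np.roll(detail, 1)
    left[0] = detail[0]
    smooth = (np.abs(detail) <= T) & (np.abs(left) <= T)
    approx = even + np.where(smooth, 0.25 * (left + detail), 0.0)
    return approx, detail

# A toy signature that is smooth except for one jump (a hypothetical "edge"):
sig = np.array([1, 1, 1, 1, 10, 10, 10, 10], dtype=float)
approx, detail = adaptive_lifting_step(sig, T=1.0)
```

With this gating, the approximate signal keeps the jump between the two plateaus intact instead of blurring it, which is exactly the property the classifier benefits from.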

To study the effectiveness of the proposed adaptive update lifting scheme for classification, experiments are done on the following hyperspectral data sets. These images are used as is, without any preprocessing.

Washington DC Mall data set is collected by the Hyperspectral Digital Imagery Collection Experiment (HYDICE) system, consisting of 191 spectral channels in the region of the visible and infrared spectrum [

University of Pavia data set consists of Pavia University scenes acquired by the Reflective Optics System Imaging Spectrometer (ROSIS) sensor [

In this experiment, a single hidden layer feed forward neural network is used for classification of the feature vectors. After training by the back propagation algorithm, the network is used as a classifier to classify the whole image. The number of input nodes is determined by the dimension of the transform-based feature vector. The number of output nodes is equal to the number of classes in the image. The number of hidden layer nodes is set equal to the square root of the product of the number of input layer nodes and output layer nodes [
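As a quick illustration of this sizing rule (the feature-vector and class counts below are illustrative, not taken from the text):

```python
import math

def hidden_nodes(n_inputs, n_outputs):
    # Rule stated above: square root of (input nodes * output nodes),
    # rounded to the nearest whole number of hidden nodes.
    return round(math.sqrt(n_inputs * n_outputs))

# Hypothetical example: a 12-dimensional feature vector with 7 classes
# gives round(sqrt(84)) = 9 hidden nodes; 24 features with 9 classes
# gives round(sqrt(216)) = 15.
```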

As limited numbers of labeled pixels are practically available because of the complexity and cost involved in sample collection, we kept the size of the training and test set small. The training and test pixels for both data sets are obtained with the help of the labeled field map available from the data source. In both cases, the training and test data sets are mutually exclusive and randomly selected.

Let X^{S} represent the matrix containing the hyperspectral signatures of the training pixels belonging to land cover class S identified in the image. Thus, X consists of the training samples of each class. The proposed one-dimensional adaptive lifting wavelet transform is applied to each signature, kept row-wise in matrix X, up to the desired level. Thus, in the case of the DC Mall data set, the original hyperspectral signature of dimension [1, 192] is converted into a wavelet-based feature vector of dimension [1, 12] after decomposition up to level 4. For the Pavia University data, after two-level decomposition, the feature vector dimension is reduced from [1, 96] to [1, 24]. Since the approximate signal obtained by the proposed ALS retains the useful features, the approximate coefficients are used to construct the wavelet-based feature vector.
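The dimension arithmetic follows from each decomposition level halving the signal length; a one-line helper makes the two cases above explicit:

```python
def feature_dim(n_bands, levels):
    """Length of the approximate-coefficient vector after `levels`
    dyadic decompositions, each of which halves the signal length."""
    return n_bands // (2 ** levels)

# DC Mall: 192-sample signature, 4 levels -> 12 features.
# University of Pavia: 96-sample signature, 2 levels -> 24 features.
```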

Using the adaptive wavelet-based feature vectors, the single hidden layer feed forward neural network is trained by the backpropagation algorithm. Training is stopped upon either achieving the desired test-sample accuracy or reaching the maximum number of iterations. On completion of training, the trained network is employed as a classifier to assign each pixel's hyperspectral signature to one of the land cover classes as follows.

1) Decompose the original hyperspectral signature of a pixel up to the desired level using the proposed adaptive lifting wavelet transform.

2) Use only the approximate coefficients as the transform-domain feature vector fed into the trained network.

3) Assign the pixel the class of the output node having the highest value.

4) Repeat these steps for all image pixels to generate the classification map.
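Steps 1-4 can be sketched as a simple loop; `decompose` and `net_forward` are hypothetical callables standing in for the adaptive lifting transform and the trained network's forward pass, not names from the original text:

```python
import numpy as np

def classify_image(pixels, decompose, net_forward):
    """Apply the trained classifier to every pixel signature.

    `decompose` maps a signature to its approximate coefficients
    (steps 1-2); `net_forward` returns one score per class (step 3).
    """
    labels = np.empty(len(pixels), dtype=int)
    for i, sig in enumerate(pixels):
        features = decompose(sig)           # steps 1-2: transform-domain features
        scores = net_forward(features)      # step 3: output-node activations
        labels[i] = int(np.argmax(scores))  # class of the highest output node
    return labels                           # step 4: the classification map

# Dummy stand-ins to show the flow: identity "transform", identity "network".
demo_labels = classify_image(np.array([[0.1, 0.9], [0.8, 0.2]]),
                             lambda s: s, lambda f: f)
```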

The performance is evaluated in terms of overall accuracy and kappa value, which are calculated as follows.

1) Overall accuracy (OA) is defined as ratio of the number of correctly classified samples to the total number of samples. It is computed from the confusion matrix as follows.

OA = (1/n) Σ_{i=1}^{C} n_{ii}, where n_{ij} is the element of the confusion matrix and denotes the number of samples of the j^{th} class that are classified into the i^{th} class.

2) Kappa coefficient (K) is computed from the confusion matrix. It is based on the difference between the actual agreement and the chance agreement (row and column totals) [

K = (n Σ_{i=1}^{C} n_{ii} − Σ_{i=1}^{C} n_{i+} n_{+i}) / (n^{2} − Σ_{i=1}^{C} n_{i+} n_{+i}), where n denotes the total number of test samples and C denotes the number of classes of the given data set. Also, n_{i+} denotes the sum of the elements of the i^{th} row and n_{+j} denotes the sum of the elements of column j.
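Both indices can be computed directly from the confusion matrix; the following is a sketch using an illustrative 2-class matrix (not data from the experiments):

```python
import numpy as np

def overall_accuracy(cm):
    # OA: correctly classified samples (diagonal) over total samples.
    return np.trace(cm) / cm.sum()

def kappa(cm):
    # K: observed agreement corrected by the chance agreement implied
    # by the row and column totals of the confusion matrix.
    n = cm.sum()
    observed = np.trace(cm) / n
    chance = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / n**2
    return (observed - chance) / (1 - chance)

# Illustrative 2-class confusion matrix: rows = true, columns = predicted.
cm = np.array([[45, 5],
               [10, 40]])
```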

The performance of the proposed method is compared with the first generation DWT-based feature extraction method. The results of the experiments on both data sets are reported in

From

For the University scene, it is observed from

It is noted that the overall accuracy and kappa value are lower for the University scene compared to DC Mall. This is due to the class distribution of the University data set: even the labeled training samples cannot represent the class distribution over the whole region. For both data sets, the overall classification accuracies for the classes that are represented by only a few training samples are high. For instance, in the case of the University scene, 99% accuracy was obtained for the Metal sheet class.

| Feature extraction | Features used | DC Mall K | DC Mall OA (%) | DC Mall Mean | DC Mall SD | University K | University OA (%) | University Mean | University SD |
|---|---|---|---|---|---|---|---|---|---|
| Proposed ALS | Approximate | 0.96 | 97.2 | 95.43 | 2.35 | 0.81 | 83.4 | 82.8 | 0.64 |
| First generation DWT: Haar | Detail | 0.88 | 90.1 | 90 | 0.15 | 0.66 | 69 | 65.2 | 2.49 |
| First generation DWT: Db4 | Detail | 0.93 | 95.23 | 91.42 | 3.76 | 0.65 | 67.7 | 58 | 3.7 |

| Algorithm | OA (%) | K |
|---|---|---|
| Proposed ALS | 97.2 | 0.96 |
| NWFE + Gaussian classifier | 92.00 | - |
| NWFE + 2NN | 87 | - |

| Algorithm | OA (%) | K |
|---|---|---|
| Proposed ALS | 83.40 | 0.81 |
| Morphology & SVM | 81.01 | 0.75 |
| PCA & EMP | 85.22 | 0.80 |
| SVM & majority voting | 85.42 | 0.81 |

This paper presents a new adaptive update scheme to extract lower dimensional feature vectors for the classification of hyperspectral signatures. The proposed approach preserves the distinctive features of hyperspectral signatures in the approximate signal, which leads to an improvement in the classification result. Using the proposed approach, each class identified in both images is well separated. In fact, with the proposed approach, we are able to accurately distinguish classes having similar spectral reflectance, for example, Roof and Road in the case of DC Mall and Meadows and Trees for the University of Pavia image. Another important aspect is that it gives better results even with a very small training set. This paper demonstrates that our method is robust for the classification of hyperspectral signatures with respect to overall accuracy, kappa value, and training data size.

The authors would like to thank David Landgrebe and Paolo Gamba for providing the data set.