
Coherent Spectral Feature Extraction using Symmetric Autoencoders

Hyperspectral data, invaluable for environmental and resource studies, often exhibit significant spectral variability due to environmental conditions, material properties, and sensor characteristics. This variability complicates land-cover classification and analysis, especially in real-world scenarios where training and test datasets are geographically disjoint. To address these challenges, we propose using the Symmetric Autoencoder (SymAE), an architecture designed to disentangle class-invariant "coherent" features from variability-causing "nuisance" features through a purely data-driven approach. By leveraging permutation invariance and stochastic regularization, SymAE avoids the need for handcrafted noise priors or complex radiative transfer modeling. Furthermore, SymAE can generate virtual spectra by manipulating latent representations, offering unique insights into spectral variability. Experiments show that the extracted coherent features enable state-of-the-art classification performance and enhance leading spectral-spatial classification methods, particularly in challenging disjoint scenarios.

The problem with spectral variability

Spectral variability in hyperspectral data arises from intrinsic factors like material properties and surface characteristics, as well as extrinsic factors such as environmental conditions, atmospheric effects, and sensor-target geometry. These factors often interact in complex, nonlinear ways, amplifying inconsistencies in observed spectra and confounding classifiers, thereby hindering accurate land-cover classification and material identification.

H3.png

Demonstration of spectral variability within the road class. (a) False-color image of an urban area with samples from three road segments highlighted: clear road 1 (red), cloud-covered road (orange), and clear road 2 (sky-blue). (b) Normalized reflectance spectra of the highlighted road segments across different spectral bands, illustrating the intra-class variability due to different conditions. This variability complicates precise identification of surface features.

To address these challenges and improve land cover identification performance, we propose using an autoencoder architecture, called the Symmetric Autoencoder (SymAE). Our method focuses on extracting class-invariant spectral features, which we call coherent features, disentangled from features representing variability within classes, which we term nuisance features, in its latent space.

 

Our approach is motivated by the following premise:

For a given spectral class, there exists a subset of spectral characteristics that remain coherent despite various sources of spectral variability, including intrinsic, extrinsic, and environmental factors. Isolating these coherent features could enhance spectral classification, as they are potentially more robust to spectral variability.

The Architecture

image.png

The architecture of the symmetric autoencoder (SymAE) disentangles coherent features from features representing variability in its latent space. The coherent features are assumed to be consistent across the pixels in a group. They propagate through the network via solid black arrows, extracted by a symmetric function (CEnc mean) that is invariant to pixel ordering. Colored arrows indicate the propagation of pixel-specific nuisance effects, processed by the nuisance encoder (NEnc). Dropout masks are applied to the nuisance features to introduce stochastic regularization. The decoder (Dec) combines coherent and nuisance features to reconstruct pixel spectra.

The encoders of SymAE decompose each spectrum into two components: a coherent code and a nuisance code. We can generate a virtual spectrum by applying the decoder to the coherent code from one pixel and the nuisance code from another.
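The architecture described above can be sketched in a few lines of numpy. This is a minimal illustration, not the trained model: the real SymAE uses deep encoder/decoder networks, and the dimensions and random linear maps below are placeholders. What it does show is the key structural idea, a mean-pooled (permutation-invariant) coherent code shared by a group, per-pixel nuisance codes with dropout, and a decoder that combines the two.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): 64 spectral bands,
# 8-dim coherent code, 8-dim nuisance code.
BANDS, C_DIM, N_DIM = 64, 8, 8

# Random linear maps stand in for the trained encoder/decoder networks.
W_cenc = rng.normal(size=(BANDS, C_DIM))
W_nenc = rng.normal(size=(BANDS, N_DIM))
W_dec = rng.normal(size=(C_DIM + N_DIM, BANDS))

def symae_forward(group, drop_p=0.5):
    """Encode a group of same-class pixel spectra (shape: pixels x bands).

    The coherent code is pooled with a mean over the group, so it is
    invariant to pixel ordering; the nuisance code stays pixel-specific
    and is stochastically masked (dropout) during training.
    """
    coherent = (group @ W_cenc).mean(axis=0)   # symmetric (CEnc mean) pooling
    nuisance = group @ W_nenc                  # per-pixel nuisance codes (NEnc)
    mask = rng.random(nuisance.shape) > drop_p # stochastic dropout mask
    nuisance = nuisance * mask
    # The decoder (Dec) sees the shared coherent code next to each nuisance code.
    latent = np.hstack([np.tile(coherent, (len(group), 1)), nuisance])
    return latent @ W_dec                      # reconstructed spectra

group = rng.random((5, BANDS))                 # 5 pixels from one class
recon = symae_forward(group)
print(recon.shape)                             # (5, 64)
```

Because the coherent code is a mean over the group, shuffling the pixels leaves it unchanged; that is the symmetry the name refers to.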

An example from the Kennedy Space Center dataset

SymAE lets us create virtual versions of real spectra, so we can see how pixels would appear under different conditions.

redatum.png

(a) The Oak Hammock vegetation class exhibits variability in its spectral signatures within the Kennedy Space Center dataset. This variation potentially arises from interfering factors like atmospheric or ground-based variations. (b) To overcome these interfering factors and isolate vegetation-specific characteristics, the raw spectra undergo a process called redatuming. This process essentially transforms the spectra, creating a common set of nuisance effects shared across all spectra. Redatuming enhances spectral uniformity within the Oak Hammock class while preserving the unique reflectance features that distinguish it from other classes.

Redatuming:

Generation of virtual spectra with uniform nuisance features

This is achieved by extracting the nuisance code from a single reference pixel and combining it with the coherent code of every other pixel to generate virtual spectra.
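The redatuming step can be sketched as follows. As before, this is a toy stand-in for the trained networks (the dimensions and linear maps are illustrative); the point is the recombination: every pixel keeps its own coherent code, but all pixels share the one nuisance code taken from the reference pixel.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-ins for the trained SymAE components (dimensions are
# illustrative, not from the paper).
BANDS, C_DIM, N_DIM = 64, 8, 8
W_cenc = rng.normal(size=(BANDS, C_DIM))
W_nenc = rng.normal(size=(BANDS, N_DIM))
W_dec = rng.normal(size=(C_DIM + N_DIM, BANDS))

def redatum(pixels, reference):
    """Redatum every pixel onto the nuisance state of `reference`.

    Each virtual spectrum combines a pixel's own coherent code with the
    single nuisance code extracted from the reference pixel, so all
    outputs share one common set of nuisance effects.
    """
    coherent = pixels @ W_cenc            # per-pixel coherent codes
    ref_nuisance = reference @ W_nenc     # one shared nuisance code
    latent = np.hstack([coherent,
                        np.tile(ref_nuisance, (len(pixels), 1))])
    return latent @ W_dec                 # virtual spectra

pixels = rng.random((10, BANDS))          # ten raw pixel spectra
reference = rng.random(BANDS)             # the chosen reference pixel
virtual = redatum(pixels, reference)
print(virtual.shape)                      # (10, 64)
```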

Ribbon plots demonstrate how redatuming reduces variability within classes. Each ribbon represents the spectral distribution of a specific class, with the central line showing the average spectrum and the ribbon's half width representing the standard deviation. (a) Displays training set spectra from four distinct classes. (b) Shows the corresponding redatumed counterparts, where pixels from the same class largely overlap, illustrating reduced variability. (c) Presents the reference pixel used for redatuming. (d)-(f) Follow the same pattern for test set upland vegetation classes. While the reduction in variability is less pronounced than with the training set, redatumed pixels show a noticeable decrease in spread. (g)-(i) Repeat the analysis for wetland classes.

ribbon.png
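The ribbon statistics in the figure above (per-class mean spectrum and per-band standard deviation) can be computed directly; a small sketch on synthetic data, with hypothetical pixel counts and band counts:

```python
import numpy as np

def ribbon_stats(spectra, labels):
    """Per-class mean spectrum (the ribbon's central line) and per-band
    standard deviation (the ribbon's half width) across pixels."""
    out = {}
    for c in np.unique(labels):
        cls = spectra[labels == c]
        out[c] = (cls.mean(axis=0), cls.std(axis=0))
    return out

rng = np.random.default_rng(5)
spectra = rng.random((30, 8))     # 30 pixels, 8 bands (toy data)
labels = np.repeat([0, 1, 2], 10) # three classes of 10 pixels each
stats = ribbon_stats(spectra, labels)
mean0, std0 = stats[0]
print(mean0.shape, std0.shape)    # (8,) (8,)
```

A shrinking ribbon half width after redatuming is exactly the reduced intra-class spread the figure illustrates.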

Here's a video demo that further illustrates redatuming:

Comparison to Denoising Autoencoders

Redatuming is DISTINCT from denoising!

Comparative analysis of DAE and SymAE applied to test data. (a) Raw spectra from two land-cover classes in the Kennedy Space Center scene. (b) The DAE tends to smooth the spectral data, yet notable within-group variations remain. (c) Redatuming by SymAE outperforms DAE denoising in mitigating intra-class variance; note, however, that redatumed spectra may differ significantly from the original raw spectra. (d) The reference Salt Marsh pixel used for redatuming, along with the pixels from the respective ground-truth classes that are closest, in the L1 sense, to the redatumed spectra.

In essence, the SymAE-generated virtual images are not completely denoised; they still retain residual nuisance effects originating from the reference pixels. Nonetheless, due to the uniformity of nuisance features across the entire image, the relative distinctions among redatumed pixels can prove useful for subsequent image-processing tasks.
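The "closest in the L1 sense" matching used in panel (d) above is a simple nearest-neighbor search under the sum-of-absolute-differences distance. A minimal sketch on synthetic spectra (the sizes and data are illustrative):

```python
import numpy as np

def nearest_l1(redatumed, candidates):
    """For each redatumed spectrum, return the index of the candidate
    spectrum closest in the L1 (sum of absolute differences) sense."""
    # Pairwise L1 distances, shape (n_redatumed, n_candidates).
    d = np.abs(redatumed[:, None, :] - candidates[None, :, :]).sum(axis=2)
    return d.argmin(axis=1)

rng = np.random.default_rng(2)
candidates = rng.random((50, 64))   # hypothetical ground-truth pixel spectra
# Redatumed spectra simulated as slightly perturbed copies of two candidates.
redatumed = candidates[[3, 17]] + 0.01 * rng.random((2, 64))
print(nearest_l1(redatumed, candidates))   # [ 3 17]
```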

Redatuming improves classification 

Virtual images, generated through the redatuming process, contribute to enhanced pixel classification accuracy.

K-Nearest Neighbors (KNN) pixel classification results on the KSC scene. (a) Ground-truth map of the KSC scene, serving as the baseline. (b) KNN classification on the raw image, yielding an overall accuracy of 81.6% on the test-set ground truth. (c) Classification on a virtual image with uniformized nuisance, achieving an elevated average overall accuracy of 92.8%.
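Why uniformizing the nuisance helps a distance-based classifier can be seen on a toy problem. The sketch below is not the KSC experiment: the data are synthetic, redatuming is simulated by removing a per-pixel offset, and the KNN is a bare numpy implementation. It only illustrates the mechanism: a large shared-across-bands nuisance shift scrambles nearest-neighbor distances on raw spectra, while the nuisance-free version classifies cleanly.

```python
import numpy as np

def knn_predict(train_x, train_y, test_x, k=3):
    """Classify each test spectrum by majority vote among its k nearest
    training spectra (Euclidean distance)."""
    d = np.linalg.norm(test_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return np.array([np.bincount(v).argmax() for v in train_y[nearest]])

rng = np.random.default_rng(3)
base = np.array([np.zeros(16), np.ones(16)])      # two class templates, 16 bands
labels = rng.integers(0, 2, size=200)
nuisance = rng.normal(scale=2.0, size=(200, 1))   # large per-pixel offset
raw = base[labels] + nuisance                     # raw spectra
# "Redatumed": nuisance uniformized, only small residual variation remains.
redatumed = base[labels] + 0.05 * rng.normal(size=(200, 16))

tr, te = np.arange(100), np.arange(100, 200)      # train/test split
acc_raw = (knn_predict(raw[tr], labels[tr], raw[te]) == labels[te]).mean()
acc_red = (knn_predict(redatumed[tr], labels[tr], redatumed[te]) == labels[te]).mean()
print(acc_raw, acc_red)   # accuracy on redatumed spectra is much higher
```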

Using coherent code further enhances discrimination

SymAE allows for clustering pixels based on the coherent code, i.e., the output of the coherent encoder (CEnc), which is not affected by atmospheric variations and other nuisance effects.

p2.png

SymAE extracts coherent features that enhance class separability, particularly for spectrally similar classes.
(a, d) Raw spectra of spectrally similar classes.
(b, e) These classes are difficult to separate in 2D raw-spectra space.
(c, f) Classes with subtle differences in their raw spectra are more easily discriminated in the latent coherent code space.
The most significant improvement in the K-means clustering experiment is observed for classes with subtle differences, such as CP Hammock and CP/Oak Hammock, depicted in (d), (e), and (f).
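The K-means experiment above can be mimicked on synthetic data. In this sketch (plain Lloyd's algorithm; the "coherent" and "raw" features are fabricated, not SymAE outputs), two classes that overlap in raw feature space because a strong nuisance direction dominates become trivially separable once that direction is removed, which is the effect the coherent code achieves.

```python
import numpy as np

def kmeans(x, k, iters=50, seed=0):
    """Plain Lloyd's algorithm: returns a cluster label for each row of x."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(x[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels

def purity(pred, true):
    """Fraction correct under the best 2-cluster-to-2-class assignment."""
    acc = (pred == true).mean()
    return max(acc, 1 - acc)

rng = np.random.default_rng(4)
true = np.repeat([0, 1], 100)
# Two tight class blobs in a 2-D "coherent" space...
coherent = np.where(true[:, None] == 0, -1.0, 1.0) + 0.1 * rng.normal(size=(200, 2))
# ...plus a dominant nuisance axis that swamps the "raw" space.
nuisance = rng.normal(scale=5.0, size=(200, 1))
raw = np.hstack([coherent, nuisance])

print(purity(kmeans(raw, 2), true), purity(kmeans(coherent, 2), true))
```

K-means on the raw features splits along the nuisance axis (purity near chance), while clustering the coherent features recovers the classes almost perfectly.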

Here's a video demo showing improved separation between classes in latent reflectance code space:

Note: The code referred to as 'reflectance code' in this animation is the coherent code; the animation was recorded before this terminology was updated in the paper.

For more details, you can read the paper

© 2025 Archisman Bhattacharjee. All rights reserved.
