Hindawi Publishing Corporation
EURASIP Journal on Advances in Signal Processing
Volume 2010, Article ID 465612, 9 pages
doi:10.1155/2010/465612
Research Article
Polarimetric SAR Image Classification Using Multifeatures
Combination and Extremely Randomized Clustering Forests
Tongyuan Zou,1 Wen Yang,1,2 Dengxin Dai,1 and Hong Sun1
1 Signal Processing Lab, School of Electronic Information, Wuhan University, Wuhan 430079, China
2 Laboratoire Jean Kuntzmann, CNRS-INRIA, Grenoble University, 51 rue des Mathématiques, 38041 Grenoble, France
Correspondence should be addressed to Wen Yang, [email protected]
Received 31 May 2009; Revised 4 October 2009; Accepted 21 October 2009
Academic Editor: Carlos Lopez-Martinez
Copyright © 2010 Tongyuan Zou et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Terrain classification using polarimetric SAR imagery has been a very active research field in recent years. Although many
features have been proposed and many classifiers have been employed, few works compare these features and their
combinations with different classifiers. In this paper, we first evaluate and compare different features for classifying polarimetric
SAR imagery. Then, we propose two strategies for feature combination: manual selection according to heuristic rules and automatic
combination based on a simple but efficient criterion. Finally, we introduce extremely randomized clustering forests (ERCFs) to
Cloude decomposition and Wishart distribution. In [6],
Pottier and Lee further improved this algorithm by including
anisotropy to double the number of classes. In [7], Lee et al.
proposed an unsupervised terrain and land-use classification
algorithm based on Freeman and Durden decomposition
[8]. Unlike other algorithms that classify pixels statistically
and ignore their scattering characteristics, this algorithm not
only uses a statistical classifier but also preserves the purity
of dominant polarimetric scattering properties. Yamaguchi
et al. [9] proposed a four-component scattering model
based on Freeman’s three-component model. The fourth
component is the helix scattering component, which often
appears in complex urban areas but disappears in almost
all natural distributed scenarios.
PolSAR image classification using advanced machine
learning and pattern recognition methods has grown rapidly
in recent years. In 1991, Pottier et al. [10] first
introduced Neural Networks (NNs) to PolSAR image
Table 1: Polarimetric parameters considered in this work.

Feature [ref]                                     Expression
Amplitude of HH-VV correlation coeff. [22, 23]    [...]
[...]                                             [...] $|S_{VV}|^2 / |S_{HH}|^2$
Cross-polarized ratio in dB [25]                  $10 \cdot \log\left(|S_{HV}|^2 / |S_{HH}|^2\right)$
Ratio HV/VV in dB [22]                            $10 \cdot \log$ [...]
[...]                                             [...] $\sigma^0_{HV} / (\sigma^0_{HH} + \sigma^0_{VV})$ [...] $S_{HV}S^*_{HV} / (S_{HH}S^*_{HH} + S_{VV}S^*_{VV})$
classification. In 1999, Hellmann [11] combined
fuzzy logic with a Neural Network classifier; Fukuda et al. [12]
applied the Support Vector Machine (SVM) to land cover
classification with higher accuracy. In 2007, She et al. [13]
introduced Adaboost for PolSAR image classification; com-
the common polarimetric features are investigated, and
the two feature combination strategies are given. In
Section 3, the recently proposed ERCFs algorithm is analyzed.
The experimental results and performance evaluation
are described in Section 4, and we conclude the paper in
Section 5.
2. Polarimetric Feature Extraction and Combination
2.1. Polarimetric Feature Descriptors. PolSAR is sensitive to
the orientation and characteristics of targets and thus yields
many new polarimetric signatures, which produce a more
informative description of the scattering behavior of the
imaged area. We can simply divide the polarimetric features
into two categories: features based on the original
data and its simple transforms, and features based on
target decomposition theorems.
The features of the first category in this work mainly include
the Sinclair scattering matrix, the covariance matrix, the
coherence matrix, and several polarimetric parameters. The
classical 2 × 2 Sinclair scattering matrix S can be obtained
through the construction of system vectors [20]:
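$$S = \begin{pmatrix} S_{HH} & S_{HV} \\ S_{VH} & S_{VV} \end{pmatrix}. \qquad (1)$$

[...]

$$k_p = \frac{1}{\sqrt{2}} \begin{bmatrix} S_{HH} + S_{VV} \\ S_{HH} - S_{VV} \\ 2S_{HV} \end{bmatrix}, \quad \Omega_l = \begin{bmatrix} S_{HH} \\ \sqrt{2}\,S_{HV} \\ S_{VV} \end{bmatrix}, \quad [T] = k_p \cdot k_p^{*T}, \quad [C] = \Omega_l \cdot \Omega_l^{*T}, \qquad (2)$$

where $*$ and $T$ represent the complex conjugate and the
matrix transpose operations, respectively.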
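As a minimal sketch of equation (2), the following Python snippet builds the Pauli vector k_p and the standard lexicographic vector Ω_l for a single pixel and forms the single-look matrices [T] and [C]. Reciprocity (S_HV = S_VH) is assumed. The function names and the sample scattering values are illustrative assumptions, and the multilook averaging that real data would need is left out.

```python
import math


def target_vectors(s_hh, s_hv, s_vv):
    """Return (k_p, omega_l) for one pixel, assuming S_HV = S_VH."""
    r2 = math.sqrt(2.0)
    k_p = [(s_hh + s_vv) / r2, (s_hh - s_vv) / r2, r2 * s_hv]  # Pauli basis
    omega_l = [s_hh, r2 * s_hv, s_vv]                          # lexicographic basis
    return k_p, omega_l


def outer_conj(v):
    """Outer product v . v^{*T} as a 3x3 nested list (single look, no averaging)."""
    return [[a * b.conjugate() for b in v] for a in v]


def trace(m):
    """Sum of the diagonal elements of a square nested list."""
    return sum(m[i][i] for i in range(len(m)))


# Hypothetical scattering coefficients for one pixel.
s_hh, s_hv, s_vv = 1 + 2j, 0.5 - 0.5j, -1 + 1j
k_p, omega_l = target_vectors(s_hh, s_hv, s_vv)
T = outer_conj(k_p)       # coherency matrix [T]
C = outer_conj(omega_l)   # covariance matrix [C]

# Both matrices carry the same total power (span).
span = abs(s_hh) ** 2 + 2 * abs(s_hv) ** 2 + abs(s_vv) ** 2
```

Both traces equal the span, which gives a quick consistency check on the two vectorizations.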
When analyzing polarimetric SAR data, a number of
parameters also have a useful physical interpretation.
Table 1 lists the parameters considered in this
study: amplitude of HH-VV correlation coefficient, HH-VV
phase difference, copolarized ratio in dB, cross-polarized
ratio in dB, ratio HV/VV in dB, copolarization ratio, and
depolarization ratio [21].
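Two of these parameters can be sketched in a few lines of Python. The cross-polarized ratio follows the expression in Table 1. The depolarization ratio is written in the common form σ⁰_HV / (σ⁰_HH + σ⁰_VV), which is an assumption here because the table row is incomplete. Base-10 logarithms are assumed for the dB values, and the function names are hypothetical.

```python
import math


def cross_pol_ratio_db(s_hh, s_hv):
    """Cross-polarized ratio in dB: 10 * log10(|S_HV|^2 / |S_HH|^2)."""
    return 10.0 * math.log10(abs(s_hv) ** 2 / abs(s_hh) ** 2)


def depolarization_ratio(s_hh, s_hv, s_vv):
    """Assumed form: |S_HV|^2 / (|S_HH|^2 + |S_VV|^2), evaluated per pixel."""
    return abs(s_hv) ** 2 / (abs(s_hh) ** 2 + abs(s_vv) ** 2)
```

In practice the powers would be averaged over a neighbourhood before the ratios are taken; the per-pixel form above only illustrates the arithmetic.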
Polarimetric target decomposition theorems can be used
for target classification or recognition. The first target
decomposition theorem was formalized by Huynen based on
the work of Chandrasekhar on light scattering by small
anisotropic particles [26]. Since then, many
other decomposition methods have been proposed. In 1996, Cloude
and Pottier [27] gave a complete summary of these different
target decomposition methods. Recently, several
new target decomposition methods have been proposed
[9, 28, 29]. In the following, we focus on five
target decomposition theorems.
(1) Pauli Decomposition. The Pauli decomposition is
a rather simple decomposition and yet it contains a [...]
$\alpha = (S_{HH} + S_{VV})/\sqrt{2}$, $\beta = (S_{HH} - S_{VV})/\sqrt{2}$, and $\gamma = \sqrt{2}\,S_{HV}$.
(2) Krogager Decomposition. The Krogager decomposition [30]
is an alternative that factorizes the scattering
matrix as the combination of the responses of a
sphere, a diplane, and a helix; it presents the following
formulation in the circular polarization basis (r, l):
$S_{(r,l)}$ [...] $|S_{rl}|$; if $|S_{rr}| > |S_{ll}|$, $k_d^+ = |S_{ll}|$, $k_h^+ = |S_{rr}| - |S_{ll}|$, and the helix component presents a left
sense. On the contrary, when $|S_{ll}| > |S_{rr}|$, $k_d^- =$ [...]
$$\cdots \begin{bmatrix} f_s\beta^2 + f_d|\alpha|^2 + \dfrac{3f_v}{8} & 0 & f_s\beta + f_d\alpha + \cdots \\ \cdots & \cdots & \cdots \end{bmatrix}. \qquad (5)$$
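We can then estimate the scattering powers $P_s$, $P_d$, and $P_v$,
which correspond to surface, double bounce, and volume scattering,
respectively:

$$P_s = f_s\,(1 + \beta^2), \quad \cdots$$

[...]

$$H = -\sum_{i=1}^{3} p_i \log_3 p_i, \qquad p_i = \frac{\lambda_i}{\sum_{k=1}^{3} \lambda_k}, \qquad A = \frac{\lambda_2 - \lambda_3}{\lambda_2 + \lambda_3}$$

[...]

$$[T] = \begin{bmatrix} 2A_0 & C - jD & H + jG \\ C + jD & B_0 + B & E + jF \\ H - jG & E - jF & B_0 - B \end{bmatrix}. \qquad (8)$$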
The set of nine independent parameters of this
particular parametrization allows a physical interpre-
tation of the target.
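For the Cloude-Pottier parameters mentioned above, the entropy H and anisotropy A follow directly from the three eigenvalues of the coherency matrix. The sketch below assumes the eigenvalues have already been computed by some eigen-solver (not shown) and uses the standard definitions H = −Σ p_i log₃ p_i and A = (λ₂ − λ₃)/(λ₂ + λ₃). The function name and the zero-division guard are assumptions made for illustration.

```python
import math


def entropy_anisotropy(eigenvalues):
    """Return (H, A) from the three non-negative eigenvalues of [T]."""
    lam = sorted(eigenvalues, reverse=True)   # lambda_1 >= lambda_2 >= lambda_3
    total = sum(lam)
    probs = [value / total for value in lam]  # pseudo-probabilities p_i
    # Terms with p_i = 0 contribute nothing to the entropy.
    entropy = -sum(p * math.log(p, 3) for p in probs if p > 0)
    denom = lam[1] + lam[2]
    anisotropy = (lam[1] - lam[2]) / denom if denom > 0 else 0.0
    return entropy, anisotropy
```

Equal eigenvalues give H = 1 (fully random scattering), while a single nonzero eigenvalue gives H = 0 (one deterministic mechanism).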
On the whole, the investigated typical polarimetric
features include
(i) $F_1$: amplitudes of the upper triangular matrix elements of S;
(ii) $F_2$: amplitudes of the upper triangular matrix elements of C;
(iii) $F_3$: amplitudes of the upper triangular matrix elements of T;
(iv) $F_4$ [...]
(viii) $F_8$: the three parameters H-α-A of the Cloude-Pottier decomposition;
(ix) $F_9$: the nine parameters of the Huynen decomposition.
2.2. Multifeatures Combination. Recent studies [14, 31,
32] concluded that employing multiple features and different
combinations can be very useful for PolSAR image classification.
Usually, there is no unique set of features for PolSAR
image classification. Fortunately, there are several common
strategies for feature selection [33]. Some of them give only
a ranking of features; others can directly select proper
features for classification. One typical choice is the Fisher
score, which is simple and generally quite effective. However,
it does not reveal mutual information among features [34].
In this study we present two simple strategies to implement
the combination of different polarimetric features: one is
by manual selection following certain heuristic rules, and
the other is automatic combination with a newly proposed
measure.
(1) Heuristic Feature Combination.
The heuristic feature combination strategy uses the
following rules.
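(i) Feature types are selected separately within each of the two
feature categories.
(ii) In each category, the selected feature types should [...]

[...]

$$\mathrm{Dep}_i = \frac{1}{\dfrac{1}{N-1}\sum_{j \neq i} \mathrm{corrcoef}\left(\vec{P}_i, \vec{P}_j\right)}. \qquad (9)$$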
$\vec{P}_i$ is the terrain classification accuracy of the $i$th feature
type in the feature type pool, and corrcoef(·) is the correlation
coefficient.
$\mathrm{Dep}_i$ is actually the reciprocal of the average cross-correlation
coefficient of the $i$th feature type, and it can
represent the average coupling of the $i$th feature type and
the other feature types. We assume that these two metrics are
independent, as done in feature combination, and then the
selection metric of the $i$th feature type can be defined as

$$R_i = \mathrm{Dep}_i \cdot A_i$$
[...] with single feature type f_i
Output: a certain combination S = {f_1, f_2, ..., f_M}
- Compute the selection metric R = {r_1, r_2, ..., r_N},
  where r_i is the metric of the ith feature type;
- S = empty set
do
  - Find the corresponding index i of the maximum of R
  if add_to_pool(f_i, S) returns true
    - select f_i [...]
[...] − P_s) > T
  return true;
else
  return false;
Algorithm 1: The pseudocode of automatic feature combining.
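The automatic combination can be sketched in Python as follows. This is one reading of the description above, not the authors' implementation: Dep_i is taken as the reciprocal of the mean correlation between the per-class accuracy vector of feature type i and those of the other feature types, A_i as the mean accuracy of feature type i, and a feature type is kept only if it raises the accuracy of the current pool by more than the threshold T. The `evaluate` callback (which would train and test a classifier on a feature subset), the handling of rejected features, and all numbers in the example are hypothetical.

```python
import math


def corrcoef(x, y):
    """Pearson correlation of two equal-length sequences (non-constant inputs)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def selection_metric(acc_vectors):
    """R_i = Dep_i * A_i for every feature type (assumes a nonzero mean correlation)."""
    n = len(acc_vectors)
    metric = []
    for i, p_i in enumerate(acc_vectors):
        mean_corr = sum(corrcoef(p_i, p_j)
                        for j, p_j in enumerate(acc_vectors) if j != i) / (n - 1)
        dep_i = 1.0 / mean_corr        # reciprocal of the average correlation
        a_i = sum(p_i) / len(p_i)      # average accuracy of feature type i
        metric.append(dep_i * a_i)
    return metric


def auto_combine(acc_vectors, evaluate, threshold):
    """Greedy selection in decreasing order of R_i with an improvement threshold."""
    metric = selection_metric(acc_vectors)
    order = sorted(range(len(metric)), key=lambda i: metric[i], reverse=True)
    selected, best = [], 0.0
    for i in order:
        accuracy = evaluate(selected + [i])   # accuracy with the candidate added
        if accuracy - best > threshold:       # keep only if it helps enough
            selected.append(i)
            best = accuracy
    return selected, metric


# Toy per-class accuracies for three feature types (illustrative numbers only).
toy_acc = [[0.5, 0.6, 0.7], [0.6, 0.7, 0.8], [0.5, 0.7, 0.6]]
toy_scores = {(2,): 0.60, (1, 2): 0.70, (0, 1, 2): 0.705}
selected, metric = auto_combine(
    toy_acc, lambda subset: toy_scores[tuple(sorted(subset))], threshold=0.01)
```

With these toy numbers the third feature type has the highest metric and is tried first; the first feature type is rejected because it improves the pool accuracy by less than the threshold.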
3. Extremely Randomized Clustering Forests
The goal of this section is to describe a fast and effective
classifier, Extremely Randomized Clustering Forests (ERCFs),
which are ensembles of randomly created clustering trees.
Such ensemble methods can improve an existing learning
algorithm by combining the predictions of several models.
The ERCFs algorithm provides much faster training and
testing than state-of-the-art classifiers, with comparable
accuracy.
The traditional Random Forests (RFs) algorithm was
first introduced to the machine learning community by
Breiman [17] as an enhancement of Tree Bagging. It is a
combination of tree classifiers such that each classifier
depends on the value of a random vector sampled independently
and with the same distribution for all classifiers in the
forest, and each tree casts a unit vote for the most popular
class at the input. To build a tree, it uses a bootstrap replica
of the learning sample and the CART algorithm (without
pruning), together with the modification used in the Random
Subspace method. At each test node, the optimal split is
derived by searching a random subset of size K of the candidate
attributes (selected without replacement from the candidate
attributes). An RF contains N trees, where N can be any value.
[...] = Pick_a_random_split(S^{i_t});
  - split S according to s_i, and calculate the score;
until (score ≥ S_min) or (tries ≥ T_max)
return the split s* that achieved the highest score;
end if

Pick_a_random_split(S^{i_t})
Input: an attribute S^{i_t}
Output: a split s_i
  - Let s_min and s [...]
method. However, most of these techniques make only small
perturbations in the search for the optimal split during tree
growing, and they are still far from building totally random
trees [18].
Compared with RF, ERCFs [18] consist of
many extremely randomized trees, which randomly
pick attributes and cut thresholds at each node. The tree
growing algorithm of ERCFs is shown in Algorithm 2. The
main differences between ERCFs and RF are that ERCFs split
nodes by choosing cut-points fully at random and that they
use the whole learning sample (rather than a bootstrap
replica) to grow the trees. At each node, the Extremely
Randomized Clustering Trees splitting procedure proceeds recursively
until further subdivision is impossible, and the resulting
node is scored over the surviving points by using the
Shannon entropy as suggested in [18]. For a sample $S$ and
a split $s_i$, this measure is given by
$$\mathrm{Score}(s_i, S) = \frac{2\,I_C^{s_i}(S)}{H_{s_i}(S) + H_C(S)}$$

[...] The parameters $S_{\min}$, $T_{\max}$, and $n_{\min}$ have different effects:
$S_{\min}$ determines the balance of the grown tree; $T_{\max}$ determines
the strength of the attribute selection process, and it
denotes the number of random splits screened at each node
to develop. In the extreme, for $T_{\max} = 1$, the splits (attributes
and cut-points) are chosen in a way that is totally independent of
the output variable. On the other extreme, when $T_{\max} = N_s$,
the attribute choice is not explicitly randomized anymore,
and the randomization effect acts only through the choice
of cut-points. $n_{\min}$ is the strength of averaging output noise.
Larger values of $n_{\min}$ lead to smaller trees, higher bias,
and smaller variance. In the following experiments, we set
$n$ [...]
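The role of S_min and T_max in the node-splitting loop can be illustrated with a short Python sketch. It assumes the normalized Shannon score 2·I/(H_split + H_class) of [18], draws the attribute uniformly with replacement and the cut-point uniformly between the attribute's minimum and maximum, and returns the best split seen. The function names and data layout are assumptions, and the recursion over child nodes and the n_min stopping rule are omitted.

```python
import math
import random
from collections import Counter


def entropy(items):
    """Shannon entropy (base 2) of a list of hashable items."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def split_score(values, labels, threshold):
    """Normalized score 2*I(C; split) / (H_split + H_class), in [0, 1]."""
    sides = [v < threshold for v in values]
    h_class, h_split = entropy(labels), entropy(sides)
    if h_class + h_split == 0:
        return 0.0
    n = len(labels)
    # Conditional entropy of the class given the side of the split.
    h_cond = 0.0
    for side in (True, False):
        part = [lab for lab, s in zip(labels, sides) if s == side]
        if part:
            h_cond += len(part) / n * entropy(part)
    return 2.0 * (h_class - h_cond) / (h_split + h_class)


def split_a_node(samples, labels, s_min, t_max, rng):
    """Try random splits until score >= s_min or t_max tries; return the best."""
    best, tries = None, 0
    while True:
        tries += 1
        attr = rng.randrange(len(samples[0]))           # random attribute
        column = [row[attr] for row in samples]
        threshold = rng.uniform(min(column), max(column))  # random cut-point
        score = split_score(column, labels, threshold)
        if best is None or score > best[0]:
            best = (score, attr, threshold)
        if score >= s_min or tries >= t_max:
            return best
```

With t_max = 1 the function returns the first random split regardless of the labels, which matches the fully random extreme described above; larger t_max values make the choice increasingly label-driven.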
and the Land Use Land Cover (LULC) ground truth image
(USGS) are used for feature analysis and comparison. The
selected PolSAR image has 1236 × 1070 pixels with 8
looks and 30 m × 30 m resolution. According to the LULC
image data, the land cover mainly includes four classes:
water, wetland, woodland, and farmland. Only these
four classes are considered in training and testing; the pixels
of other classes are ignored. The classification accuracy
on each terrain type is used to evaluate the different feature
types.
4.2. Evaluation of Single Polarimetric Descriptor. We first
represent PolSAR images as rectangular grids of patches at a
single scale, with a block size of 12 × 12 and an overlap step of 6.
In the training stage, 500 patches of each class are selected
as training data. Then, all the features are normalized to
[0, 1] by their corresponding maximum and minimum values
across the image. We finally use the KNN and SVM classifiers
to evaluate each single polarimetric feature. KNN is a simple
nonparametric classifier. It selects the K nearest neighbours of the test patch
within the training patches. Then it assigns to the new patch
the label of the category which is most represented within
Table 2: Classification accuracies (%) of single polarimetric descriptors using KNN and SVM classifiers.

Feature (dim)   Classifier   Water   Wetland   Woodland   Farmland   Ave. acc.
$F_7$ (3)       KNN          86.3    63        69         71.9       72.5
                SVM          87.6    53.4      77.7       74.1       73.2
$F_8$ (3)       KNN          71.3    61.9      66.6       67.1       66.7
                SVM          77.8    56.3      75.8       73.3       70.8
$F_9$ (9)       KNN          87.5    60.6      66.29      72.9       71.8
                SVM          88.6    61.9      73.8       76.4       75.2
Table 3: Classification performances (%) of KNN and SVM with the selected feature set and all features.

Classifier   Features            Water   Wetland   Woodland   Farmland   Ave. acc.
KNN          Selected features   87.6    67.1      69.9       74.1       74.7
             All features        86.1    67.7      68.1       74.8       74.2
SVM          Selected features   [...]
[...]

       $F_1$   $F_2$   $F_3$   $F_4$   $F_5$   $F_6$   $F_7$   $F_8$   $F_9$
KNN    1.39    1.6     1.03    1.27    0.67    0.69    0.75    0.69    0.75
SVM    0.76    0.77    0.73    0.73    0.76    0.72    0.75    0.74    0.78
Table 5: Classification performances (%) of SVM and ERCFs with Pset1, Pset2, and Pset3.

Classifier   Features   Water   Wetland   Woodland   Farmland   Ave. acc.
SVM          Pset1      88.3    63.0      73.9       78.3       75.9
             Pset2      89.4    67.5      72.2       81.2       77.6
             Pset3      91.5    69.2      75.9       80.4       79.3
ERCFs        Pset1      89.3    64.2      74.5       78.7       76.7
             Pset2      89.8    69.1      72.3       80.9       78.0
             Pset3      91.5    69.6      76.4       80.9       79.6
Table 6: Time consumption of SVM and ERCFs.

Classifier   Training time (s)   Testing time (s)
SVM          986.35              22.97
ERCFs        22.95               0.44
From Table 2, some conclusions can be drawn.
Figure 1: (a) ALOS PALSAR polarimetric SAR data of Washington County, North Carolina (1236 × 1070 pixels, R: HH, G: HV, B: VV). (b)
The corresponding Land Use Land Cover (LULC) ground truth. (c) Classification result using ML. (d) Classification result using SVM. (e)
Classification result using ERCFs.
Figure 2: The quantitative comparison of different classifiers (ML, KNN, SVM, and ERC-forests) with
features Pset3. [Bar chart: accuracy from 0 to 1 for Water, Wetland, Woodland, Farmland, and the average accuracy.]
{$F_1$, [...]
The features with a higher selection metric have higher priority
to be selected, and a feature is finally selected only if it
improves the classification accuracy obtained with the already
selected features by more than a predefined threshold. According to the
selection metric in Table 2 and automatic feature combining
as shown in Table 3, if the threshold $T = 0.5$, automatic
combination yields the same feature combination as the
heuristic feature combination.
In the following experiment, some intermediate feature
combination states are selected to illustrate that the feature
combination strategy can improve the classification performance
step by step. The intermediate feature combination
states include the following.
Pset1: select 1 feature type in the first category and 1 feature
type in the second category; the combined features
include $F_2$ and $F_9$.
Pset2: select 2 feature types in the first category and 1 feature
type in the second category; the combined features
include $F_1$, $F_2$, and $F_9$.
[...] recommend automatic combining since it is more flexible.
When mapping the patch-level classification result to pixel
level, we apply a smoothing postprocessing method based
on the patch-level posteriors (the soft probability output
of the ERCFs or SVM classifier) [38]. We first assign each
pixel a posterior label probability by linearly interpolating
the four adjacent patch-level posteriors, which produces smooth
probability maps. Then we apply a Potts model Markov
Random Field (MRF) smoothing process using graph cut
optimization [39] on the pixel labels to obtain the final
classification result. The classification results of the ML classifier
based on the Wishart distribution, SVM, and ERCFs are shown
in Figure 1.
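The interpolation step can be sketched in Python as below. The sketch assumes that patch posteriors sit on a regular grid whose node (r, c) corresponds to pixel (r·step, c·step), that the grid has at least two rows and two columns, and that pixels beyond the last node are clamped to it. The function names and this layout are assumptions, and the Potts MRF graph-cut smoothing is not included.

```python
def interpolate_posteriors(patch_post, step, height, width):
    """Bilinearly interpolate patch-level class posteriors to every pixel.

    patch_post[r][c] is a list of class probabilities for grid node (r, c).
    """
    rows, cols = len(patch_post), len(patch_post[0])
    n_classes = len(patch_post[0][0])
    pixel_post = []
    for y in range(height):
        fy = min(y / step, rows - 1)       # fractional grid row, clamped
        r0 = min(int(fy), rows - 2)        # upper neighbour row
        wy = fy - r0                       # weight of the lower neighbour row
        row = []
        for x in range(width):
            fx = min(x / step, cols - 1)
            c0 = min(int(fx), cols - 2)
            wx = fx - c0
            corners = (
                ((1 - wy) * (1 - wx), patch_post[r0][c0]),
                ((1 - wy) * wx, patch_post[r0][c0 + 1]),
                (wy * (1 - wx), patch_post[r0 + 1][c0]),
                (wy * wx, patch_post[r0 + 1][c0 + 1]),
            )
            # The four weights sum to 1, so the result is still a distribution.
            row.append([sum(w * p[k] for w, p in corners) for k in range(n_classes)])
        pixel_post.append(row)
    return pixel_post


def argmax_labels(pixel_post):
    """Most probable class index per pixel (before any MRF smoothing)."""
    return [[max(range(len(p)), key=p.__getitem__) for p in row] for row in pixel_post]
```

The resulting probability maps would then serve as the unary terms of the MRF smoothing step.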
Figure 2 is a quantitative comparison of the results based
on the LULC ground truth. It shows that ERCFs
achieve slightly better classification accuracy than SVM, and
that both perform much better than the traditional ML
classifier based on the complex Wishart distribution.
In addition, ERCFs require less computational time
than the SVM classifier, as shown in Table 6: training is
about 43 times faster (22.95 s versus 986.35 s) and testing
about 52 times faster (0.44 s versus 22.97 s). The SVM training
time includes the time for searching
the optimal parameters with a 10 × 10 grid search. The ERCFs
include 20 extremely randomized clustering trees, and we selected 50
attributes at each node split.
5. Conclusion
We addressed the problem of classifying PolSAR images with
multifeatures combination and the ERCFs classifier. The work
started by testing the widely used polarimetric descriptors
for classification, and then considered two strategies for
pp. 68–78, 1997.
[5] J. S. Lee, M. R. Grunes, T. L. Ainsworth, L. J. Du, D.
L. Schuler, and S. R. Cloude, “Unsupervised classification
using polarimetric decomposition and the complex Wishart
classifier,” IEEE Transactions on Geoscience and Remote Sensing,
vol. 37, no. 5, pp. 2249–2258, 1999.
[6] E. Pottier and J. S. Lee, “Unsupervised classification scheme
of PolSAR images based on the complex Wishart distribution
and the H/A/α polarimetric decomposition theorem,” in
Proceedings of the 3rd European Conference on Synthetic
Aperture Radar (EUSAR ’00), Munich, Germany, May 2000.
[7] J. S. Lee, M. R. Grunes, E. Pottier, and L. Ferro-Famil,
“Unsupervised terrain classification preserving polarimetric
scattering characteristics,” IEEE Transactions on Geoscience and
Remote Sensing, vol. 42, no. 4, pp. 722–731, 2004.
[8] A. Freeman and S. Durden, “A three-component scattering
model for polarimetric SAR data,” IEEE Transactions on
Geoscience and Remote Sensing, vol. 36, no. 3, pp. 963–973,
1998.
[9] Y. Yamaguchi, T. Moriyama, M. Ishido, and H. Yamada, “Four-
component scattering model for polarimetric SAR image
decomposition,” IEEE Transactions on Geoscience and Remote
Sensing, vol. 43, no. 8, pp. 1699–1706, 2005.
[10] E. Pottier and J. Saillard, “On radar polarization target decom-
position theorems with application to target classification by
using network method,” in Proceedings of the International
Conference on Antennas and Propagation (ICAP ’91), pp. 265–
268, York, UK, April 1991.
[11] M. Hellmann, G. Jaeger, E. Kraetzschmar, and M. Habermeyer,
[19] F. Moosmann, E. Nowak, and F. Jurie, “Randomized clustering
forests for image classification,” IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 30, no. 9, pp. 1632–
1646, 2008.
[20] R. Touzi, S. Goze, T. Le Toan, A. Lopes, and E. Mougin,
“Polarimetric discriminators for SAR images,” IEEE Transac-
tions on Geoscience and Remote Sensing, vol. 30, no. 5, pp. 973–
980, 1992.
[21] M. Molinier, J. Laaksonen, Y. Rauste, and T. Häme, “Detecting
changes in polarimetric SAR data with content-based
image retrieval,” in Proceedings of the IEEE International
Geoscience and Remote Sensing Symposium (IGARSS ’07), pp.
2390–2393, Barcelona, Spain, July 2007.
[22] S. Quegan, T. Le Toan, H. Skriver, J. Gomez-Dans, M. C.
Gonzalez-Sampedro, and D. H. Hoekman, “Crop classifica-
tion with multi temporal polarimetric SAR data,” in Proceed-
ings of the 1st Workshop on Applications of SAR Polarimetry and
Polarimetric Interferometry (POLinSAR ’03), Frascati, Italy,
January 2003, (ESA SP-529).
[23] H. Skriver, W. Dierking, P. Gudmandsen, et al., “Applications
of synthetic aperture radar polarimetry,” in Proceedings of the
1st Workshop on Applications of SAR Polarimetry and Polari-
metric Interferometry (POLinSAR ’03), pp. 11–16, Frascati,
Italy, January 2003, (ESA SP-529).
[24] W. Dierking, H. Skriver, and P. Gudmandsen, “SAR polarime-
try for sea ice classification,” in Proceedings of the 1st Workshop
on Applications of SAR Polarimetry and Polarimetric Interfer-
ometry (POLinSAR ’03), pp. 109–118, Frascati, Italy, January
IEEE International Geoscience and Remote Sensing Symposium
(IGARSS ’06), pp. 493–496, Denver, Colo, USA, August 2006.
[32] J. Chen, Y. Chen, and J. Yang, “A novel supervised classification
scheme based on Adaboost for Polarimetric SAR Signal
Processing,” in Proceedings of the 9th International Conference
on Signal Processing (ICSP ’08), pp. 2400–2403, Beijing, China,
October 2008.
[33] A. L. Blum and P. Langley, “Selection of relevant features and
examples in machine learning,” Artificial Intelligence, vol. 97,
no. 1-2, pp. 245–271, 1997.
[34] Y. W. Chen and C. J. Lin, “Combining SVMs with various
feature selection strategies,” in Feature Extraction, Foundations
and Applications, Springer, Berlin, Germany, 2006.
[35] F. Schroff, A. Criminisi, and A. Zisserman, “Object class
segmentation using random forests,” in Proceedings of the 19th
British Machine Vision Conference (BMVC ’08), Leeds, UK,
September 2008.
[36] J. M. Keller, M. R. Gray, and J. A. Givens Jr., “A fuzzy K-nearest
neighbor algorithm,” IEEE Transactions on Systems, Man, and
Cybernetics, vol. 15, no. 4, pp. 580–585, 1985.
[37] C. C. Chang and C. J. Lin, “LIBSVM: a library for support vector
machines,” Software, 2001, http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[38] W. Yang, T. Y. Zou, D. X. Dai, and Y. M. Shuai, “Supervised
land-cover classification of TerraSAR-X imagery over urban
areas using extremely randomized forest,” in Proceedings of
the Joint Urban Remote Sensing Event (JURSE ’09), Shanghai,
China, May 2009.
[39] Y. Boykov, O. Veksler, and R. Zabih, “Fast approximate energy