1. Introduction
High-resolution synthetic aperture radar (SAR) systems have been widely used to monitor various regions for civilian and military purposes. In particular, because radar can monitor vast areas regardless of the time of day or weather conditions [1], a SAR system is very effective for surveilling targets of interest, such as tanks and transporter erector launchers (TELs), within a short time. However, SAR images produce numerous false alarms owing to natural and cultural clutter in addition to the targets of interest. Recently, a neural-network-based approach has been proposed for distinguishing only the objects of interest in wide-area SAR images [2]. However, training a neural network requires a large number of SAR images of the objects of interest, and such images are expensive and time-consuming to acquire compared with optical images, which are relatively easy to obtain. Therefore, it is challenging to obtain sufficient SAR training images of the targets of interest in real-life situations, which calls for a new paradigm for detecting the targets of interest in SAR images.
To discriminate targets from SAR images with heavy clutter responses, it is essential to remove false alarms owing to noise and speckle, as well as to discriminate the targets of interest from natural and artificial clutter. In general, the target detection process for SAR images consists of speckle reduction, constant false-alarm rate (CFAR) detection, clustering, and discrimination. Various techniques have been devised for preprocessing (i.e., speckle reduction, CFAR detection, and clustering) prior to the discriminator. In the discriminator, various features have been proposed and evaluated in terms of the detection performance of targets of interest from a SAR image with a reduced number of false alarms after preprocessing.
The speckle phenomenon causes many irregular pixels with high-intensity fluctuations, owing to interference between multiple scatterers in a single resolution cell. Strong speckles in a SAR image can produce many false alarms for conventional detectors that base their identifications on intensity [3]. Thus, speckle reduction must precede the detection of target pixels, and so various filters have been devised to reduce the speckles in SAR images. Local filters, such as the mean filter, median filter, Lee filter, and enhanced Lee filter, adjust the value of the target pixels by referring to the values of the neighboring pixels within a short calculation time [4,5]. In addition, a nonlocal mean filter or deep learning-based method enables precise speckle reduction by considering all pixels in the entire SAR image [6]. After speckle reduction, the pixels of the potential target can be detected. Scatterers on the desired targets in the SAR image are assumed to be stronger than those in background clutter scenes. Thus, these target pixels can be detected by the CFAR detector, based on pixel intensity. To date, various CFAR detectors have been proposed, including fixed threshold (FT), cell-averaging (CA), and ordered statistic (OS) CFAR, which can be easily extended to two dimensions for SAR images [7,8].
The detected pixels must be clustered into individual objects to calculate the features for discriminating targets from clutter. Therefore, the clustering process aims to construct a single cluster for each target. Various clustering algorithms have been developed, such as the K-nearest neighbor (KNN) and K-means algorithms. However, using them for SAR target detection is impractical because the number of clusters (i.e., targets) must be determined a priori [9,10]. Because the number of targets of interest is generally unknown in a real-life situation, the clustering stage should be able to identify the desired target clusters, even without such information. Hence, a suitable approach for target detection in SAR images may involve mean-shift algorithms or density-based spatial clustering of applications with noise (DBSCAN) [11,12].
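The appeal of a density-based method can be seen in a minimal sketch. The pure-Python DBSCAN below is an illustrative implementation only, not the one used in this work; the toy pixel coordinates and the `eps` and `min_pts` values are assumptions chosen for the example. No cluster count is supplied, yet the two dense groups of detected pixels are recovered and the isolated pixel is labeled as noise.

```python
from collections import deque

NOISE = -1

def dbscan(points, eps, min_pts):
    """Label each 2D point with a cluster index, or NOISE (-1)."""
    labels = [None] * len(points)          # None means "not visited yet"

    def neighbors(i):
        xi, yi = points[i]
        return [j for j, (x, y) in enumerate(points)
                if (x - xi) ** 2 + (y - yi) ** 2 <= eps * eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:           # not a core point
            labels[i] = NOISE
            continue
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:         # former noise becomes a border point
                labels[j] = cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_pts:      # expand only through core points
                queue.extend(reach)
        cluster += 1
    return labels

# Toy "detected pixels": two 3 x 3 blobs and one isolated pixel (assumed data).
blob_a = [(x, y) for x in range(3) for y in range(3)]
blob_b = [(x + 10, y + 10) for x in range(3) for y in range(3)]
pixels = blob_a + blob_b + [(30, 30)]

labels = dbscan(pixels, eps=1.5, min_pts=4)
n_clusters = len({lab for lab in labels if lab != NOISE})
print(n_clusters, labels)
```

The number of clusters is an output here rather than an input, which is the property the clustering stage needs when the number of targets is unknown.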
To determine whether the potential target clusters identified by a clustering algorithm originate from the desired targets of interest or from false alarms due to natural or artificial clutter, the discrimination stage should exploit features suited to this task. Features are measurable properties that can separate targets and clutter in the feature space. In other words, a useful target-discriminating feature should take similar values among targets and clearly different values between targets and clutter. However, the features of artificial clutter are often similar to those of the targets, making them difficult to distinguish. To overcome this problem, several features with a good discrimination capability between targets and clutter have been introduced at the MIT Lincoln Laboratory [13]. In [14], some additional features based on the projected length (PL) of a target were also presented for the moving and stationary target acquisition and recognition (MSTAR) dataset. Even though all these features can be effective in terms of target detection, their performance fluctuates significantly depending on the specifications and configuration of the SAR system, such as the resolution and look angle. Therefore, a detection framework tailored to the images of a specific SAR system is required.
In this study, we propose a two-stage detection framework to ensure efficient and high detection performance in TerraSAR-X (TSX) images. We analyzed the performance of the features for distinguishing targets and clutter in TSX images and selected the features suitable for target discrimination. Moreover, we propose an efficient two-stage target discrimination scheme for clustered pixels in TSX images. In the proposed scheme, speckle reduction based on the Lee filter and pixel detection based on FT-CFAR are performed first, followed by building clusters of targets in the SAR image using the DBSCAN algorithm. Although the SAR image is preprocessed in this way, many clusters remain owing to false alarms related to the remaining speckles, background noise, and clutter. Therefore, a discrimination stage is required to reduce the number of false alarms. The proposed discrimination stage consists of two sequential steps: a coarse discrimination step (CDS) and a fine discrimination step (FDS). The CDS quickly finds the desired clusters corresponding to the targets of interest and removes the majority of the false-alarm clusters. Then, the FDS is performed only on the clusters that passed the CDS, based on the selected features with good discrimination performance. In the FDS, feature generation based on the Karhunen–Loève (KL) transform is adopted to maximize the discriminatory performance in the feature space [15].
The remainder of this paper is organized as follows. In Section 2, the proposed two-stage target-discrimination scheme is presented. In Section 3, the experimental results are provided using real TSX images, and they are analyzed in terms of detection performance. Finally, Section 4 and Section 5 present the discussion and conclusions, respectively.
2. Proposed Method
The overall process of the proposed method is illustrated in Figure 1. The preprocessing stage aims to form clusters of target candidates as quickly as possible and consists of three steps: speckle reduction, pixel detection, and clustering. Any speckle reduction filter is applicable; the Lee filter [4], which does not require much computation, was used in this study. Two-dimensional (2D) CFAR methods, such as CA-CFAR and OS-CFAR, are generally used to find the peaks in SAR images. However, these 2D CFAR methods are relatively time-consuming, and many targets are irregularly distributed in the image of interest. Therefore, applying a fixed threshold value to the entire scene is the most efficient solution.
$T_{\text{CFAR}}$, the threshold for the FT-CFAR method, is obtained based on the Rayleigh distribution, as follows [15]:

$$T_{\text{CFAR}} = \sqrt{-\frac{4}{\pi}\ln P_{FA}}\; E(I)$$

where $P_{FA}$ is the false alarm probability and $E(I)$ is the average magnitude of the image. In addition, because DBSCAN does not require the number of clusters in advance, it is suitable as a clustering technique [12].
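As a rough check of how a fixed threshold relates to the false alarm probability, the sketch below assumes that clutter magnitudes follow a Rayleigh distribution with scale $\sigma$, for which $P(I > T) = \exp(-T^2 / 2\sigma^2)$ and $E(I) = \sigma\sqrt{\pi/2}$. Solving for $T$ gives $T = E(I)\sqrt{-(4/\pi)\ln P_{FA}}$. The function names, the simulated clutter, and the chosen $P_{FA}$ are assumptions for illustration, not values from this study.

```python
import math
import random

def ft_cfar_threshold(magnitudes, pfa):
    """Fixed threshold for Rayleigh-distributed magnitudes with mean E(I)."""
    if not 0.0 < pfa < 1.0:
        raise ValueError("pfa must lie strictly between 0 and 1")
    mean_magnitude = sum(magnitudes) / len(magnitudes)
    return mean_magnitude * math.sqrt(-4.0 / math.pi * math.log(pfa))

def detect(magnitudes, pfa):
    """Flag every sample whose magnitude exceeds the single scene-wide threshold."""
    threshold = ft_cfar_threshold(magnitudes, pfa)
    return [m > threshold for m in magnitudes]

# Simulated clutter-only scene: Rayleigh samples drawn by inverse transform.
rng = random.Random(0)
sigma = 2.0                                   # assumed clutter scale
clutter = [sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random()))
           for _ in range(100_000)]

pfa = 1e-2                                    # assumed design value
flags = detect(clutter, pfa)
empirical_pfa = sum(flags) / len(flags)
print(f"designed PFA = {pfa}, empirical PFA = {empirical_pfa:.4f}")
```

Because the threshold depends only on the scene average, it is computed once per image, which is the source of the speed advantage over sliding-window CFAR detectors.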
After the preprocessing stage, the proposed target detection scheme consists of the CDS and the FDS. As shown in Figure 1, the targets of interest are represented by large clusters, whereas speckle and natural clutter clusters, consisting of a small number of pixels, exist throughout the scene. Therefore, in the first step of the target detection scheme, the mass feature, i.e., the number of pixels constituting a cluster, is used to filter the false alarms from the clusters. The mass threshold, $T_{\text{mass}}$, is determined by considering the resolution of the image and the size of the target of interest, and clusters consisting of fewer pixels than $T_{\text{mass}}$ are excluded from the target candidates:

$$T_{\text{mass}} = \rho \, \frac{l_h \, l_v}{R_r \, R_a}$$

where $\rho$ is the expected ratio of detected pixels in the target of interest, $l_h$ and $l_v$ are the horizontal and vertical lengths of the target, respectively, and $R_r$ and $R_a$ are the range and azimuth resolutions of the SAR image, respectively. For example, if the smallest target of interest is a D7 Caterpillar bulldozer (2.4 m × 4.1 m), the image resolution is 0.5 m × 0.2 m, and $\rho$ is 0.3, then $T_{\text{mass}} = 0.3 \times (2.4 \times 4.1)/(0.5 \times 0.2) \approx 29.5$. Because the mass of a cluster is obtained without a separate calculation process, the amount of computation required to measure the features of the incorrectly detected clusters can be significantly reduced.
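The worked example can be reproduced in a few lines. The sketch below assumes that the mass threshold is the expected detected-pixel ratio multiplied by the number of resolution cells covered by the target footprint, which matches the stated value of 29.5. The function names and the list of cluster masses are hypothetical.

```python
def mass_threshold(ratio, l_h, l_v, r_range, r_azimuth):
    """Expected number of detected pixels on the smallest target of interest."""
    cells_on_target = (l_h * l_v) / (r_range * r_azimuth)
    return ratio * cells_on_target

def coarse_discrimination(cluster_masses, t_mass):
    """Return the indices of clusters that are not smaller than the threshold."""
    return [i for i, mass in enumerate(cluster_masses) if mass >= t_mass]

# Values from the text: 2.4 m x 4.1 m target, 0.5 m x 0.2 m resolution, ratio 0.3.
t_mass = mass_threshold(ratio=0.3, l_h=2.4, l_v=4.1, r_range=0.5, r_azimuth=0.2)

# Hypothetical pixel counts of clusters produced by the clustering step.
cluster_masses = [3, 12, 45, 130, 29]
kept = coarse_discrimination(cluster_masses, t_mass)
print(round(t_mass, 2), kept)
```

Only the pixel count of each cluster is needed, so this step runs before any of the more expensive features are computed.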
Even though speckle reduction is performed before detecting target pixels using the CFAR detector, many pixels are incorrectly detected because of the natural clutter. Therefore, when the detected pixels are clustered using DBSCAN, not only target clusters but also numerous clusters of false detections are formed. Because the clutter has different scattering characteristics and shapes from the targets, these clusters of false alarms can be distinguished using certain features. According to [13,14], various features for distinguishing the target of interest from the clutter have been studied, and the features considered in this study are standard deviation (STD), weighted-rank fill-ratio (WRFR), fractal dimension, mass, diameter, normalized rotational inertia, max CFAR, mean CFAR, percentage of bright CFAR, count, minimum projected length (MINPL), maximum PL (MAXPL), contrast of PL (CPL), average of min and max PL (AMMPL), average of PL (APL), error between the reference and PL (ERPL), squared error between the reference and PL (SERPL), energy of PL in the frequency domain (EPLF), squared energy of PL in the frequency domain (SEPLF), average of detected pixels (ADP), sum of detected pixels (SDP), and standard deviation of detected pixels (STDDP).
Clusters that pass the CDS, i.e., those not smaller than $T_{\text{mass}}$, are further discriminated in the FDS. In this step, the targets of interest and the clutter are discriminated based on the features introduced in [13,14]. However, the distribution of each feature differs according to the characteristics of the image, so the features that effectively discriminate the targets in TSX images must be selected. To evaluate the discriminating performance of each feature, the overlap between the distribution of feature values for targets and the distribution of feature values for clutter is measured, as shown in Figure 2. The smaller the overlapped area between the two distributions, the better the distinguishing performance. The overlap $l$ can be obtained as follows:
where
is the feature value of class
j, and
c is a feature value in which the probabilities of the two distributions are equal.
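As a rough illustration of the overlap measure, the sketch below estimates the area shared by the two feature distributions as $\int \min\{p_{\text{target}}(x),\, p_{\text{clutter}}(x)\}\,dx$. For two single-peaked densities that cross once at $c$, this is the sum of the two tail areas on either side of $c$. This reading of the measure, the histogram estimator, the bin count, and the function names are all assumptions for illustration.

```python
def histogram_density(values, lo, hi, bins):
    """Normalized histogram (a density estimate) of values on [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        k = min(int((v - lo) / width), bins - 1)   # put v == hi in the last bin
        counts[k] += 1
    return [c / (len(values) * width) for c in counts], width

def overlap(target_values, clutter_values, bins=50):
    """Shared area of two empirical feature distributions, between 0 and 1."""
    lo = min(min(target_values), min(clutter_values))
    hi = max(max(target_values), max(clutter_values))
    if hi == lo:                                   # all samples are identical
        return 1.0
    p_target, width = histogram_density(target_values, lo, hi, bins)
    p_clutter, _ = histogram_density(clutter_values, lo, hi, bins)
    return sum(min(a, b) for a, b in zip(p_target, p_clutter)) * width

# Hypothetical feature values: a smaller overlap means a better feature.
well_separated = overlap([0.0, 1.0, 2.0], [10.0, 11.0, 12.0])
identical = overlap([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
print(well_separated, identical)
```

Ranking the candidate features by this value gives one simple way to pick those with the least target-clutter ambiguity, although the estimate is sensitive to the bin count when few samples are available.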
When multiple features are used, the separation ability of the features deteriorates if there is a dependency between them. In this situation, the KL transform is used to improve the separation performance by selecting independent features and removing the redundancy between the selected features [16,17]. For the feature set $x$, which was extracted from the training data, the feature set $y$, transformed to be mutually uncorrelated, is expressed as follows:

$$y = A^{T} x$$

where $A$ is an $N \times N$ transformation matrix. In addition, the correlation matrix of $y$, $R_y$, is given by:

$$R_y = E\left[y\,y^{T}\right] = A^{T} R_x A$$

where $R_x$ is the correlation matrix of $x$. Because $R_x$ is a symmetrical matrix, if the columns of matrix $A$ are chosen as the orthonormal eigenvectors of $R_x$, $a_i$, $i = 0, 1, \ldots, N-1$, then the resulting $R_y$ is:

$$R_y = A^{T} R_x A = \Lambda$$

where $\Lambda$ is a diagonal matrix with the eigenvalues $\lambda_i$, corresponding to $a_i$, as its diagonal elements. Additionally, to exclude redundant features, the normalized sum of the top $r$ eigenvalues, $E_r$, is calculated as in [18]:

$$E_r = \frac{\sum_{i=0}^{r-1} \lambda_i}{\sum_{i=0}^{N-1} \lambda_i}$$
Then, the number of features utilized, $\hat{r}$, is determined as follows:

$$\hat{r} = \min\left\{\, r : E_r \geq \eta \,\right\}$$

where $\eta$ is a constant between zero and one. Then, a generated feature set $y_r$, composed of the features corresponding to the largest $\hat{r}$ eigenvalues, can be obtained. The feature values extracted from the detected clusters, $x_{\text{test}}$, can also be transformed by $A$ to $y_{\text{test}}$ on the same axes as $y$:

$$y_{\text{test}} = A^{T} x_{\text{test}}$$

In addition, $y_{\text{test},r}$, which is composed of the features corresponding to the largest $\hat{r}$ eigenvalues in $y_{\text{test}}$, exists in the same feature space as $y_{\text{test}}$, allowing the discrimination of the target and clutter to be performed using a classifier. The process of performing the KL transform and obtaining $y_{\text{test},r}$ is described in detail in Figure 3.
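The feature generation step can be sketched end to end under a few assumptions. The code below builds the correlation matrix of a small training set, diagonalizes it, keeps the smallest number of leading eigenvalues whose normalized sum reaches a chosen fraction, and projects a test vector onto the retained axes. The cyclic Jacobi eigensolver is included only to keep the example dependency-free (in practice a linear-algebra library's symmetric eigensolver would replace it), and the training data, the 0.95 fraction, and all function names are hypothetical.

```python
import math

def correlation_matrix(samples):
    """R = E[x x^T], estimated as the average outer product of the samples."""
    n, d = len(samples), len(samples[0])
    return [[sum(x[i] * x[j] for x in samples) / n for j in range(d)]
            for i in range(d)]

def jacobi_eigh(sym, max_sweeps=50, tol=1e-20):
    """Eigenvalues and eigenvectors (as columns) of a symmetric matrix."""
    n = len(sym)
    a = [row[:] for row in sym]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for m in (a, v):               # right-multiply by the rotation
                    for k in range(n):
                        mp, mq = m[k][p], m[k][q]
                        m[k][p], m[k][q] = c * mp - s * mq, s * mp + c * mq
                for k in range(n):             # left-multiply a by its transpose
                    ap, aq = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * ap - s * aq, s * ap + c * aq
    return [a[i][i] for i in range(n)], v

def kl_basis(train, fraction):
    """Eigenvector basis sorted by eigenvalue, and the number of axes to keep."""
    eigvals, v = jacobi_eigh(correlation_matrix(train))
    order = sorted(range(len(eigvals)), key=lambda i: eigvals[i], reverse=True)
    lam = [eigvals[i] for i in order]
    basis = [[row[i] for i in order] for row in v]   # columns a_0, a_1, ...
    total, partial, keep = sum(lam), 0.0, len(lam)
    for r, value in enumerate(lam, start=1):
        partial += value
        if partial / total >= fraction:              # normalized sum E_r
            keep = r
            break
    return basis, lam, keep

def transform(x, basis, keep):
    """First `keep` components of A^T x."""
    return [sum(basis[k][i] * x[k] for k in range(len(x))) for i in range(keep)]

# Hypothetical training features: the second is twice the first (redundant).
train = [[1.0, 2.0, 0.1], [2.0, 4.0, -0.1], [3.0, 6.0, 0.1], [4.0, 8.0, -0.1]]
basis, lam, keep = kl_basis(train, fraction=0.95)

y_full = [transform(x, basis, len(lam)) for x in train]
r_y = correlation_matrix(y_full)                     # should be nearly diagonal

x_test = [2.5, 5.0, 0.0]
y_test_kept = transform(x_test, basis, keep)
print(keep, y_test_kept)
```

With the redundant second feature, a single axis already carries almost all of the normalized eigenvalue sum, so only one generated feature is kept; the same basis is reused for the test vector so that training and test features share one set of axes.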