Hindawi Publishing Corporation
EURASIP Journal on Advances in Signal Processing
Volume 2011, Article ID 170927, 13 pages
doi:10.1155/2011/170927
Research Article
Synthesis of an Optimal Wavelet Based on
Auditory Perception Criterion
Abhijit Karmakar,1 Arun Kumar,2 and R. K. Patney3
1 Integrated Circuit Design Group, Central Electronics Engineering Research Institute/Council of Scientific and Industrial Research, Pilani 333031, India
2 Centre for Applied Research in Electronics, Indian Institute of Technology Delhi, New Delhi 110016, India
3 Department of Electrical Engineering, Indian Institute of Technology Delhi, New Delhi 110016, India
Correspondence should be addressed to Abhijit Karmakar, [email protected]
Received 2 July 2010; Revised 3 November 2010; Accepted 4 February 2011
Academic Editor: Antonio Napolitano
Copyright © 2011 Abhijit Karmakar et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.
A method is proposed for synthesizing an optimal wavelet based on auditory perception criterion for dyadic filter bank
implementation. The design method of this perceptually optimized wavelet is based on the critical band (CB) structure and
the temporal resolution of human auditory system (HAS). The construction of this compactly supported wavelet is done by
designing the corresponding optimal FIR quadrature mirror filter (QMF). At first, the wavelet packet (WP) tree is obtained that
matches optimally with the CB structure of HAS. The error in passband energy of the CB channel filters is minimized with respect

The next important thing in these WP-based speech and
audio applications is the choice of suitable wavelet and its
synthesis. A systematic framework for obtaining orthogonal
wavelet was developed by Mallat [6]. Daubechies gave a
construction technique for obtaining compactly supported
wavelets with arbitrarily high regularity [7]. The regularity
of a wavelet is an important consideration
for some applications, but its importance is unknown for
many other applications [8]. It is evident that appropriate
design of the wavelet based on the perceptual frequency scale and
temporal resolution of the human auditory system is of interest.
In the literature, we do find methods of designing mother
wavelet based on the perceptual frequency scale of human
auditory system, such as in [9, 10] for continuous wavelet
transform. These methods do not provide the requisite filter
bank structure for the dyadic multiresolution analysis.
In this paper, we propose a design method for
synthesizing an optimal mother wavelet for auditory perception-based
dyadic filter bank implementation. The design
method optimally exploits the CB structure and temporal
resolution of the human auditory system. The compactly supported
wavelet is synthesized by designing the corresponding optimal
wavelet-generating FIR quadrature mirror filter (QMF).
The approach followed for
the construction of this wavelet is to first obtain the WP tree
which closely mimics the CB structure of the human auditory
system. This is followed by obtaining the error in passband
energy of the CB channel filter responses with respect to the
case where the QMFs in the WP tree are replaced by the

$$z = F(f) = 13\arctan\left(\frac{0.76 \times f}{10^{3}}\right) + 3.5\arctan\left(\left(\frac{10^{-3} f}{7.5}\right)^{2}\right), \tag{1}$$
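As a small illustration of (1), the following Python sketch evaluates the Hz-to-Bark mapping F(f) and inverts it numerically. The function names `hz_to_bark` and `bark_to_hz`, the bisection inverse, and the default upper limit of 8000 Hz (half of the 16 kHz sampling rate used in the paper) are assumptions made for this example, not part of the original method.

```python
import math


def hz_to_bark(f_hz):
    """Critical band rate z = F(f) in Bark for a frequency f in Hz, following (1)."""
    return (13.0 * math.atan(0.76 * f_hz / 1e3)
            + 3.5 * math.atan((f_hz / 7.5e3) ** 2))


def bark_to_hz(z, f_max=8000.0, tol=1e-6):
    """Numerically invert F by bisection; F is increasing on [0, f_max].

    The bisection and the default f_max are illustrative choices.
    """
    lo, hi = 0.0, f_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if hz_to_bark(mid) < z:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for f in (100.0, 1000.0, 4000.0, 8000.0):
        print(f, round(hz_to_bark(f), 3))
```

The inverse is useful wherever a quantity defined on the Hz axis has to be re-expressed on the Bark axis, which is how the critical bandwidth is handled later in the section.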
$B(f)$
components in each channel are minimized with respect to
the constraints of QMF. The multiple-objective constrained
global optimization problem is converted into a single-objective
constrained global optimization problem by taking
a suitably weighted average of the energy error terms,
denoted as the performance measure of optimization.
The optimization problem is reformulated into an
unconstrained optimization problem by expressing the QMF
constraints in the lattice QMF domain [14, 15]. In the lattice
QMF domain, the performance measure is expressed in
terms of Givens rotations [14, 15], which absorb the QMF
constraints of the optimization problem. Using the 2π-periodicity
of Givens rotations, the problem is converted into
a bounded value optimization problem [16]. The solution of
the global optimization problem is obtained using multilevel
coordinate search (MCS) [17]. Using the cascade algorithm
(also known as the successive approximation algorithm)
[18], the desired wavelet is synthesized. The support of
the wavelet is selected in accordance with the temporal
resolution of the human ear [19]: it is chosen so that the
time duration of the wavelet is less
than the temporal resolution of the human auditory system.
Thus, the wavelet synthesized as above is optimal with
respect to the critical band structure and temporal resolution
of the human auditory system. The design process of the
wavelet is elaborated for the case of sampling frequency
$f_s$ = 16 kHz.
3. The Optimal WP Tree Based on CB Structure

$$) + \sum_{k=-\infty}^{\infty} \sum_{j=0}^{J-1} d_j[k]\, \psi_{j,k}(t), \tag{3}$$
where $\phi_{j,k}(t)$ and $\psi_{j,k}(t)$ are the two-dimensional families of
functions generated from the scaling function $\phi(t)$ and the
wavelet $\psi(t)$ as
$\phi_{j,k}(t$

j[k] are the approximation and detail coefficients
of the DWT at scale $j$.
The scaling function and the wavelet are recursively
defined as:
$$\phi(t) = \sqrt{2} \sum_{k=-\infty}^{\infty} h[k]\, \phi(2t - k), \tag{5}$$
$\psi(t) =$
[k] in (3) can be obtained by passing the
approximation coefficients of the next higher scale, $c_{j+1}[k]$,
through the filters $h[k]$ and $g[k]$ and downsampling by a factor of
two for $j = 0, 1, 2, \ldots, J - 1$. The filters $h[k]$ and $g[k]$ form
a quadrature mirror filter (QMF) pair [18]. In the Fourier
transform domain they are related by
$\left|H(e^{j\omega})\right|^2 + \left|G(e^{j\omega})\right|^2$
k h[K − k]. (9)
After the signal is processed by the tree-structured
analysis filter bank, the inverse process of interpolation and
filtering can be used to reconstruct the signal. The perfect
reconstruction of a signal can be achieved using a realizable
orthogonal filter bank [14, 20]. The perfect reconstruction
lowpass and highpass synthesis QMF pair, $h_1[k]$ and $g_1[k]$, is
related to the analysis filters by [20]
$h_1[k] = h[K - k],$
$g_1[$

[k] $\nu_n(2t - k)$,
$\nu_{2n+1}(t) = \sqrt{2} \sum_{k=-\infty}^{\infty} g[k]\, \nu_n(2t - k)$
. (12)
Denoting the space formed by the basis $\nu_{n,j,k}(t)$ by $W_{n,j}$, the
signal $s(t)$ limited to a scale $J$, that is, $s(t) \in W_{0,J}$, can be
decomposed in a manner similar to (3), as follows:
$s(t) = \sum_{k=-\infty}^{\infty} \sum_{j=0}^{J-1} \sum_{n \subseteq I_j} d_{n,j}[k$
p as the depth of decomposition given by $p = J - j$. The
input signal sampled at Nyquist rate is taken as the scaling
coefficients at the $J$th scale. As the signal is decomposed
through all the levels, the depth of decomposition varies from
$p = 0$ to $J$. The bandwidth available at decomposition
depth $p$ is given by
$$\Delta f_{WP}(p) = \frac{f_s}{2^{p+1}}, \tag{14}$$
where $f_s$ is the sampling frequency of the input signal.
For a dyadic WP tree with maximum depth of decomposition
$L$, minimum depth of decomposition $M$, and
$n_p$ number of terminal nodes at decomposition depth $p$,
Figure 1: Illustration of WP bandwidths and number of terminat-

l(p) ≤ f ≤ $f_h(p)$, where
$f_h(p) = \sum_{m=p}^{L} n_m \Delta f_{WP}(m)$, $f_l(p) = \sum_{m=p+1}^{L+1} n_m \Delta f_{WP}(m)$,
$n_{L+1} = 0$, and $M \le p \le L$. Here, f

$B(F(f)) = B(F^{-1}(z)) = B(f)$. (16)
At the $p$th decomposition depth, the integral squared error
in critical bandwidth in the Bark domain can be obtained as
$q_e(p) = \int^{z_h}$
(p)). The total error $Q_E$, in quantizing $B(z)$, for the complete
frequency range $0 \le f \le f_s/2$, can be given by
$Q_E = \sum_{p=M}^{L} q_e(p)$. Substituting the
expression of $q_e(p)$ and replacing $z$ by $F(f)$ in the expression
of $Q_E$, we obtain
$Q_E = \sum_{p=M}^{L} \int^{f_h}$
criterion for obtaining the optimal WP tree is to minimize
the cost function $Q_E$, that is,
$\left(L^{opt}, M^{opt}, n_M^{opt}, n_{M+1}^{opt}, \ldots, n_L^{opt}\right) = \arg\min_{(L, M, n_M, n_{M+1}, \ldots, n_L)} \{Q_E$
= 8. Thus, the
minimum depth of decomposition is $M = 3$, the maximum
depth of decomposition is $L = 6$, and the total number of
decomposition depths is $L - M + 1 = 4$. Figure 2 compares
the WP tree with Zwicker’s critical band structure. For this
case, the signal can be decomposed as in (13) as follows:

$$\int |s(t)|^2\, dt = \sum_{k} \sum_{n=0,1,3,2,6,7,5,4} \left|d_{n,0}[k]\right|^2 + \sum_{k} \sum_{n=6,7,5,4} \left|d_{n,3}[k]\right|^2. \tag{20}$$
In (20), $n$ denotes the position of the WPT coefficients
in the WP tree and assumes the appropriate values at the
various scales such that the frequency bands are ordered in
an ascending manner for the WPT. Note that $n$ is not
in ascending order with respect to the band-ordered WPT
coefficients at the various scales. This is because,
in a dyadic filter bank implementation, when a highpass
region is decomposed by a QMF bank, the highpass and
lowpass frequency regions swap with each other [22].
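The index order 0, 1, 3, 2, 6, 7, 5, 4 that appears in (20) is the binary-reflected Gray code sequence, which is the usual way to express the lowpass/highpass swap described above. The sketch below reproduces that order and also lists the band edges of the 21-band tree for a 16 kHz sampling rate. The per-depth band counts (8, 6, 3, and 4 terminal nodes at depths 6, 5, 4, and 3) are read off the grouping given in the Figure 12 caption; the function names and the dictionary layout are assumptions made for this example.

```python
def gray_code(i):
    """Binary-reflected Gray code of i: the node index n of the i-th band
    when the bands at one depth are listed in ascending frequency."""
    return i ^ (i >> 1)


def band_edges(fs, terminals):
    """Band edges (f_low, f_high) in Hz, in ascending frequency.

    terminals maps a decomposition depth p to its number of terminal nodes
    n_p. Each band at depth p has width fs / 2**(p + 1), as in (14), and the
    deepest (narrowest) bands occupy the lowest frequencies.
    """
    edges = []
    f = 0.0
    for p in sorted(terminals, reverse=True):
        width = fs / 2 ** (p + 1)
        for _ in range(terminals[p]):
            edges.append((f, f + width))
            f += width
    return edges


# Band counts per depth, taken from the Figure 12 grouping (assumption of
# this sketch: CB 1-8 at depth 6, CB 9-14 at 5, CB 15-17 at 4, CB 18-21 at 3).
TERMINALS = {6: 8, 5: 6, 4: 3, 3: 4}

if __name__ == "__main__":
    print([gray_code(i) for i in range(8)])
    for band, (lo, hi) in enumerate(band_edges(16000.0, TERMINALS), start=1):
        print(f"CB {band:2d}: {lo:7.1f} to {hi:7.1f} Hz")
```

With these counts the bands tile the full range from 0 Hz to half the sampling rate, which is a convenient sanity check on any candidate tree.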
4. Design Procedure of the Perceptually

[Figure 2 panels: (a) critical bandwidth (Hz) against center frequency (Hz) for Zwicker’s model and the optimal WP tree; second panel: critical band rate (z) against center frequency (Hz).]

following a decimator M is equivalent to $A(z^M)$ preceding
the same decimator. Using the noble identity, the nontree
filter structure is obtained for the optimal WP tree. As an
illustration, the nontree filter structure corresponding to
Figure 3 is shown in Figure 5. In this figure, $H_i(z)$ represents
the equivalent filtering at the $i$th critical band, and the
combined decimators follow them. Note that $i$ in $H_i(z)$
denotes the critical band numbers in ascending order of
center frequencies of the respective bands.
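The noble identity used above can be checked numerically, and the same upsampling step gives the equivalent channel filter of a tree branch as a product of the form F_1(z) F_2(z^2) F_3(z^4) and so on. The following Python sketch is illustrative only: the helper names are assumptions, and the lowpass taps `H6` are the length-6 decomposition coefficients h[n] from Table 1.

```python
def convolve(a, b):
    """Full linear convolution of two coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def upsample_taps(taps, factor):
    """Coefficients of A(z**factor): insert factor - 1 zeros between taps."""
    out = [0.0] * ((len(taps) - 1) * factor + 1)
    out[::factor] = taps
    return out


def equivalent_filter(stages):
    """Taps of F_1(z) F_2(z^2) F_3(z^4) ... for a branch of the tree.

    stages[m] holds the filter (lowpass or highpass taps) used at depth m + 1.
    """
    eq = [1.0]
    for m, taps in enumerate(stages):
        eq = convolve(eq, upsample_taps(taps, 2 ** m))
    return eq


# Length-6 lowpass decomposition filter h[n] as listed in Table 1.
H6 = [0.49985, 0.73350, 0.38225, -0.14563, -0.17497, 0.11923]

if __name__ == "__main__":
    # Noble identity: decimate by M then filter with A(z)
    # equals filter with A(z^M) then decimate by M.
    x = [float(v) for v in range(1, 12)]
    a, M = [1.0, 2.0, 3.0], 2
    lhs = convolve(x[::M], a)
    rhs = convolve(x, upsample_taps(a, M))[::M]
    print(lhs == rhs)

    # Six lowpass stages in a row: the lowest critical band channel.
    print(len(equivalent_filter([H6] * 6)))
```

Working with the equivalent filter makes it possible to evaluate each channel response directly, without simulating the tree stage by stage.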
The lower and upper passband edges of $H_i(e^{j\omega})$ are
denoted as $\omega l_i$ and $\omega h_i$, respectively, and can be expressed
as
ωl
i
=



f
s

f
h
(
L+1
−m
)
+
f
h
(
L +1
− m
)
− f
l
(
L +1
− m
)
n
L+1−m
i

,
(22)
where i is the index of critical bands in ascending order as

The integral squared error in passband energy of the
individual CB channel filters with respect to the ideal case
can be expressed as
E
i
=2

ωh
i
ωl
i




H
i
Ideal

e





2








H
i
Ideal

e





2


= 1,
2

ωh
i
ωl
i



H
i

e

Figure 3: The filter bank implementation of the WP tree for $f_s$ = 16 kHz.



. (25)
Also, $E_i \ge 0$. The optimal lowpass QMF is defined as the
solution of the following multi-objective function:
$$h_{opt}[n] = \arg\min_{\{h[n]\}} \{E_1, E_2, E_3, \ldots, E_N\}, \tag{26}$$
where $h[n]$ represents all possible wavelet-defining, lowpass
QMFs.
This multi-objective optimization problem can be simplified,
however, if we convert it into a conventional single-objective
optimization problem using the average of suitably
weighted objective functions as the performance measure of
optimization. We have used the weighting function of the
−3
f − 3.3

2


10
−3

10
−3
f

4
,
(27)
where $W_{dB}(f)$ is the weighting in dB scale as a function
of frequency $f$ in Hz [23]. The OME weighting function
$W_{dB}(f)$ is shown as a function of the frequency $f$ in Figure 7.
Now, the single objective performance measure is
obtained as
P
=
1
N
N

N

i=1

1 − 2

ωh
i
ωl
i



H
i

e





2



w
(
i
)

and $G(z)$ is written in a matrix form as [14]
$\begin{bmatrix} H(z) \\ G(z) \end{bmatrix} = R_J \Lambda(z^2) R_{J-1} \Lambda(z^2) \cdots \Lambda$
Equivalent channel filters shown in Figure 5 include:
$H_1(z) = H(z)H(z^2)H(z^4)H(z^8)H(z^{16})H(z^{32})$
$H_2(z) = H(z)H(z^2)H(z^4)H(z^8)H(z^{16})G(z^{32})$
$H_6(z) = H(z)H(z^2)H(z^4)G(z^8)G(z^{16})G(z^{32})$
$H_{10}(z) = H(z)H(z^2)G(z^4)G(z^8)G(z^{16})$
$H_{11}(z) = H(z)H(z^2)G(z^4)H(z^8)G(z^{16})$
$H_{16}(z) = H(z)G(z^2)H(z^4)G(z^8)$
$H_{17}(z) = H(z)G(z^2)H(z^4)H(z^8)$
$H_{18}(z) = G(z)G(z^2)H(z^4)$
Figure 5: Equivalent nontree filter structure of Figure 3. The input signal ($f_s$ = 16 kHz) feeds the channel filters for CB 1 to CB 21, each followed by its combined decimator.
In (32), $J$ relates to the QMF length $M$ via $M = 2J + 2$, and
$R_m$, $0 \le m \le J$, is a $2 \times 2$ unitary matrix (i.e., $R_m^T R_m = I$) and
is expressed as
$R_m =$
$\begin{bmatrix} 1 & 0 \\ 0 & z^{-2} \end{bmatrix}$. (34)
The filter pair $H(z)$ and $G(z)$ must also satisfy the
additional constraint that $H(z)$ is lowpass or, equivalently,
$G(z)$ is highpass
$H(1) = 1$. (35)
The constraint of (35) on $h[n]$ can be transformed to Givens
rotations by evaluating (32) for $z = 1$, that is, $\omega = 0$, and
obtained as
$$\theta_J + \theta_{J-1} + \cdots + \theta_0 = -\frac{\pi}{4}. \tag{36}$$
Thus, any one of the θ

Figure 6: Magnitude squared frequency response and passband edges of the CB channel filters for the ideal case.
Figure 7: Outer and middle ear transfer function $W_{dB}(f)$; weighting (dB) against frequency (Hz).
···

, $\theta_1^{opt}, \ldots, \theta_{J-1}^{opt}) = \arg\min_{\theta_0, \theta_1, \ldots, \theta_{J-1}} P(\theta_0, \theta_1, \ldots, \theta_{J-1}, -\frac{\pi}{4}$
J−1


$R_m(\theta_m) = R_m(\theta_m + 2\pi)$. (38)
Thus, rather than searching over an unbounded parameter space,
the optimal solution of (37) can be obtained by using
a bounded value global optimization program where the
bounds on $\theta_m$ are $0 \le \theta_m \le 2\pi$, $0 \le m \le J - 1$.
The solution of the optimization problem in
(37) is obtained using multilevel coordinate search (MCS),
a “bound constrained global optimization” program [16,
17]. A bound constrained global optimization problem can be
formalized as
min f(x
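Columns 2 to 5: QMF of length = 6; columns 6 to 9: QMF of length = 8. In each group, h[n] and g[n] are the decomposition filters and h1[n] and g1[n] are the reconstruction filters.

| n | h[n] | g[n] | h1[n] | g1[n] | h[n] | g[n] | h1[n] | g1[n] |
|---|------|------|-------|-------|------|------|-------|-------|
| 0 | 0.49985 | 0.11923 | 0.11923 | −0.49985 | 0.4083 | −0.0501 | −0.0501 | −0.4083 |
| 1 | 0.73350 | 0.17497 | −0.17497 | 0.73350 | 0.7608 | −0.0934 | 0.0934 | 0.7608 |
| 2 | 0.38225 | −0.14563 | −0.14563 | −0.38225 | 0.4296 | 0.0631 | 0.0631 | −0.4296 |
| 3 | −0.14563 | −0.38225 | 0.38225 | −0.14563 | −0.0668 | 0.2241 | −0.2241 | −0.0668 |
| 4 | −0.17497 | 0.73350 | 0.73350 | 0.17497 | −0.2241 | −0.0668 | −0.0668 | 0.2241 |
| 5 | 0.11923 | −0.49985 | 0.49985 | 0.11923 | 0.0631 | −0.4296 | 0.4296 | 0.0631 |
| 6 |  |  |  |  | 0.0934 | 0.7608 | 0.7608 | −0.0934 |
| 7 |  |  |  |  | −0.0501 | −0.4083 | 0.4083 | −0.0501 |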
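The tabulated coefficients can be cross-checked numerically. The sketch below starts from the tabulated lowpass decomposition filters only, derives the other three filters with the alternating-flip relation g[k] = (−1)^k h[K − k] and plain time reversal for the reconstruction pair (the relations the tabulated values follow, with K the filter order), and tests the orthonormality conditions to within table rounding. The names `H6`, `H8`, and the helper functions are choices made for this example.

```python
# Lowpass decomposition filters h[n] as tabulated (lengths 6 and 8).
H6 = [0.49985, 0.73350, 0.38225, -0.14563, -0.17497, 0.11923]
H8 = [0.4083, 0.7608, 0.4296, -0.0668, -0.2241, 0.0631, 0.0934, -0.0501]


def analysis_highpass(h):
    """g[k] = (-1)**k * h[K - k], with K = len(h) - 1."""
    K = len(h) - 1
    return [(-1) ** k * h[K - k] for k in range(K + 1)]


def synthesis_pair(h, g):
    """Reconstruction filters as time-reversed analysis filters."""
    return h[::-1], g[::-1]


def double_shift_products(h):
    """Inner products of h with its shifts by 0, 2, 4, ...; ideally [1, 0, 0, ...]."""
    return [sum(h[k] * h[k + 2 * n] for k in range(len(h) - 2 * n))
            for n in range(len(h) // 2)]


if __name__ == "__main__":
    for h in (H6, H8):
        g = analysis_highpass(h)
        h1, g1 = synthesis_pair(h, g)
        print("g  =", g)
        print("h1 =", h1)
        print("g1 =", g1)
        print("sum(h) =", round(sum(h), 4))
        print("double-shift products:",
              [round(p, 4) for p in double_shift_products(h)])
```

Small deviations from exactly 1 and 0 in the printed products are expected, since the table lists the coefficients to only four or five decimal places.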
Multilevel Coordinate Search (MCS) algorithm as proposed
in [16, 17]. The MCS method combines both global search
and local search into one unified framework via multilevel
] $\phi_k(2t - n)$, (40)
where $M$ is the support of the scaling function, $k$ is the
iteration number, and $\phi_{k+1}(t)$ denotes the $(k+1)$th iterate of the
scaling function, with $\phi_0(t)$ being the initial value of the iteration.
From the scaling function, the wavelet $\psi(t)$ is obtained by
using the two-scale equation for the wavelet as given in (6).
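The cascade iteration can be sketched on a dyadic grid as follows. This is a minimal illustration, assuming the iteration starts from a unit box function and that each iterate is represented by its samples on a grid of spacing 2^-k; the function names are hypothetical. The filter `H6` is the length-6 lowpass decomposition filter from Table 1, and the highpass filter is derived from it by the alternating flip.

```python
import math


def convolve(a, b):
    """Full linear convolution of two coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def upsample_taps(taps, factor):
    """Insert factor - 1 zeros between consecutive taps."""
    out = [0.0] * ((len(taps) - 1) * factor + 1)
    out[::factor] = taps
    return out


def cascade(h, iterations):
    """Samples of the scaling-function iterate on a grid of spacing 2**-iterations.

    Implements phi_{k+1}(t) = sqrt(2) * sum_n h[n] * phi_k(2t - n),
    starting from phi_0 = unit box on [0, 1) (an assumed initial value).
    """
    phi = [1.0]
    for k in range(iterations):
        kernel = [math.sqrt(2.0) * c for c in upsample_taps(h, 2 ** k)]
        phi = convolve(kernel, phi)
    return phi


def wavelet_from_scaling(g, phi, iterations):
    """psi(t) = sqrt(2) * sum_n g[n] * phi(2t - n), on spacing 2**-(iterations + 1)."""
    kernel = [math.sqrt(2.0) * c for c in upsample_taps(g, 2 ** iterations)]
    return convolve(kernel, phi)


H6 = [0.49985, 0.73350, 0.38225, -0.14563, -0.17497, 0.11923]
G6 = [(-1) ** k * H6[5 - k] for k in range(6)]

if __name__ == "__main__":
    K = 6
    phi = cascade(H6, K)
    psi = wavelet_from_scaling(G6, phi, K)
    print(len(phi), len(psi))
    print("integral of phi ~", sum(phi) / 2 ** K)
    print("integral of psi ~", sum(psi) / 2 ** (K + 1))
```

A handful of iterations is usually enough for plotting; the approximate integrals printed at the end give a quick check that the scaling function integrates to about one and the wavelet to about zero.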
The order of the FIR filter, and in turn the support of
the wavelet, is chosen with regard to the temporal
resolution of the human auditory system. The time duration
of a wavelet $\psi(t)$ is defined by [25]
Δt
=


−∞
(
t
− t


ψ
(
t
)


2
dt


−∞


ψ
(
t
)


2
dt
. (42)
Here, $t_0$ is the first moment of the wavelet and provides the
measure of where $\psi(t)$ is centered along the time axis. The
time duration of the wavelet, $\Delta t$ in (41), is the root mean square
(RMS) measure of duration and gives the spread of the wavelet
in time. This definition of time-duration gives a measure of
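For sampled data, the first moment and the RMS duration can be estimated with simple sums. The sketch below assumes the usual energy-weighted definitions, that is, the center is the integral of t|ψ(t)|² divided by the integral of |ψ(t)|², and the duration is the square root of the energy-weighted second central moment; the function name `rms_duration` and the uniform sampling are assumptions of this example. It is checked on a unit box, for which the duration is 1/√12.

```python
import math


def rms_duration(samples, dt):
    """Return (t0, delta_t) for a uniformly sampled waveform.

    t0 is the energy-weighted mean time and delta_t the energy-weighted RMS
    spread around t0. Sample i is taken to lie at time i * dt.
    """
    energy = [abs(v) ** 2 for v in samples]
    total = sum(energy)
    t0 = sum(i * dt * e for i, e in enumerate(energy)) / total
    second = sum((i * dt - t0) ** 2 * e for i, e in enumerate(energy)) / total
    return t0, math.sqrt(second)


if __name__ == "__main__":
    n = 1000
    box = [1.0] * n                      # unit box on [0, 1)
    t0, delta_t = rms_duration(box, 1.0 / n)
    print(t0, delta_t, 1.0 / math.sqrt(12.0))
```

To express the result in seconds for a wavelet produced by the cascade algorithm, the grid spacing would have to be scaled by the sampling period at the relevant decomposition depth; that scaling is left out of this sketch.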

m values, $0 \le \theta_m \le 2\pi$, $0 \le m \le J - 1$, $J = (M - 2)/2$ for the desired support length of $M$,
the coefficients of the QMF pair are obtained using (32). The
perfect reconstruction QMF pair is obtained from (10). The
filter coefficients of decomposition and synthesis QMF bank
are shown in Table 1 for filter lengths $M = 6$ and 8. In Figures
10(a) and 11(a), the magnitude squared frequency responses
[Figure 10: magnitude squared frequency response against normalized frequency (×π rad/sample), panels (a) and (b).]
Figure 11: The perceptually optimized wavelet and Daubechies wavelet of length M = 8; (a) the perceptually optimized wavelet, (b) Daubechies wavelet.
of the optimal lowpass QMF and the corresponding mother
wavelet are shown for the case of $M = 8$. In Figures 10(b)
and 11(b), the magnitude squared frequency responses of the
Daubechies QMF and the corresponding wavelet are shown
for the same value of $M$ for comparison.
The magnitude squared frequency responses $|H_i(e^{j\omega})|^2$
of the CB channel filters are shown in Figure 12 for the
optimal QMF with length $M = 8$. The responses are
grouped in Figures 12(a) to 12(d) according to the depth
of decomposition of the CB channel filters. The RMS time-duration
of this optimal wavelet is found to be 2.8 ms.
The perceptually optimized wavelet is compared with
Daubechies wavelet, Symlet, and Coiflet in terms of the
energy error in CB channel impulse response. In Figure 13,
we show the energy error $E_i$
Figure 12: Magnitude squared frequency response of the CB channel filters with perceptually optimized QMF of length M = 8, plotted against normalized frequency (×π rad/sample); (a) critical bands 1 to 8, (b) critical bands 9 to 14, (c) critical bands 15 to 17, (d) critical bands 18 to 21.
from (25) by using Daubechies QMF, Symlet QMF, Coiflet
QMF, and the perceptually motivated QMF of length $M = 6$.
Figure 14 shows the energy error $E_i$ for Daubechies QMF,
Symlet, and the perceptually motivated QMF for $M = 8$. The
advantage of the perceptually optimized wavelet is clearly
visible in Figures 13 and 14 as a reduction
in energy error in the CB channel filter impulse responses in
comparison to the other wavelets.
Further, it is noticed from Figures 13 and 14 that the
energy errors in the CB channel filters for $M = 8$ are less
than those for $M = 6$. This is because of the better frequency
selectivity of the CB filters for the case with higher support

Figure 13: Comparison of CB channel filter passband energy error, $E_i$, in the critical bands with the perceptually optimized wavelet, Daubechies wavelet, Symlet and Coiflet of support length $M = 6$. Horizontal axis: critical bands; vertical axis: error in passband energy of the $i$th CB filter, $E_i$. Curves: (a) optimal wavelet based on auditory perception, (b) Daubechies wavelet and Symlet, (c) Coiflet.

126–137, 1999.
[3] M. D. Swanson, B. Zhu, A. H. Tewfik, and L. Boney, “Robust audio watermarking using perceptual masking,” Signal Processing, vol. 66, no. 3, pp. 337–355, 1998.
[4] D. Sinha and A. H. Tewfik, “Low bit rate transparent audio compression using adapted wavelets,” IEEE Transactions on Signal Processing, vol. 41, no. 12, pp. 3463–3479, 1993.
[5] B. Carnero and A. Drygajlo, “Perceptual speech coding and enhancement using frame-synchronized fast wavelet packet transform algorithms,” IEEE Transactions on Signal Processing, vol. 47, no. 6, pp. 1622–1635, 1999.
[6] S. G. Mallat, “A theory for multiresolution signal decomposition: the wavelet representation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 674–693, 1989.
[7] I. Daubechies, “Orthonormal bases of compactly supported wavelets,” Communications on Pure and Applied Mathematics, vol. 41, no. 7, pp. 909–996, 1988.
[8] O. Rioul and M. Vetterli, “Wavelets and signal processing,” IEEE Signal Processing Magazine, vol. 8, no. 4, pp. 14–38, 1991.
[9] I. Pinter, “Perceptual wavelet-representation of speech signals and its application to speech enhancement,” Computer Speech and Language, vol. 10, no. 1, pp. 1–22, 1996.
[10] J. Yao and Y. T. Zhang, “Bionic wavelet transform: a new time-frequency method based on an auditory model,” IEEE Transactions on Biomedical Engineering, vol. 48, no. 8, pp. 856–863, 2001.
[11] E. Zwicker and H. Fastl, Psychoacoustics: Facts and Models, Springer, New York, NY, USA, 2nd edition, 1999.
[12] E. Zwicker and E. Terhardt, “Analytical expression for critical-

[22] D. B. Percival and A. T. Walden, Wavelet Methods for Time Series Analysis, Cambridge University Press, Cambridge, UK, 2000.
[23] ITU-R Recommendation BS.1387, “Method for objective measurements of perceived audio quality,” 1998.
[24] J. M. Morris and V. Akunuri, “Minimum duration orthonormal wavelets,” Optical Engineering, vol. 35, no. 7, pp. 2079–2087, 1996.
[25] R. M. Rao and A. S. Bopardikar, Wavelet Transforms: Introduction to Theory and Applications, Addison-Wesley, Reading, Mass, USA, 1998.

