Object Recognition
Using the Adaptive Boosting Algorithm
Original Authors: Paul Viola & Michael Jones
Presenter: Nguyễn Đăng Bình
(moving or acting with great speed)
(to increase the strength or value of something)
Outline
Introduction
The Boost algorithm for classifier learning
Selection
Conclusion
What have we done?
A machine learning approach to visual object detection and recognition
Capable of processing images extremely rapidly
Achieving
Speed up the feature evaluation
Discard the background regions of the image
Working only with a single grey scale image
A demonstration on face detection
A frontal face detection system
The detector runs at 15 frames per second without resorting to image differencing or skin color detection
Image differencing in video sequences
384 x 288 images on a 700 MHz Pentium III
The broad practical applications for an extremely fast face detector
Cascaded detection process
The sub-windows are processed by a sequence of classifiers, each slightly more complex than the last
If any classifier rejects the sub-window, no further processing is performed
The process is essentially that of a degenerate decision tree
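The early-exit behaviour of the cascade can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: `evaluate_cascade` and the toy threshold stages are hypothetical, and a sub-window is reduced to a single number so that the control flow stays visible.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical stage type: maps a sub-window (here just a score) to pass/reject.
Stage = Callable[[float], bool]


def evaluate_cascade(stages: Sequence[Stage], window: float) -> Tuple[bool, int]:
    """Return (accepted, number_of_stages_evaluated) for one sub-window."""
    for index, stage in enumerate(stages, start=1):
        if not stage(window):
            # The first rejection ends processing; later stages never run.
            return False, index
    return True, len(stages)


# Toy stages with increasingly strict thresholds (made-up values).
stages = [lambda w: w > 0.1, lambda w: w > 0.5, lambda w: w > 0.9]

rejected_early = evaluate_cascade(stages, 0.05)   # fails stage 1
rejected_late = evaluate_cascade(stages, 0.7)     # fails stage 3
accepted = evaluate_cascade(stages, 0.95)         # passes every stage
print(rejected_early, rejected_late, accepted)
```

Most sub-windows in a real image are background, so the cheap first stages do most of the work and the expensive later stages run only on the few windows that survive.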
Our object detection framework
Original Image
Integral Image
Used to compute features rapidly at many scales
Haar Basis Functions
The Haar basis functions, as used by Papageorgiou et al. [9]
Three kinds of features
Feature Selection
Two-Rectangle Feature
The difference between the sums of the pixels within two rectangular regions
The regions have the same size and shape and are horizontally or vertically adjacent
The base resolution is 24x24
The exhaustive set of rectangle features is large, over 180,000.
Three-Rectangle Feature
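Integral Image
An intermediate representation for rapidly computing the rectangle features
$ii(x, y) = \sum_{x' \le x,\; y' \le y} i(x', y')$
where $ii(x, y)$ is the integral image and $i(x, y)$ is the original image
It can be computed in one pass over the image with the recurrences
$s(x, y) = s(x, y - 1) + i(x, y)$
$ii(x, y) = ii(x - 1, y) + s(x, y)$
where $s(x, y)$ is a cumulative sum along $y$, with $s(x, -1) = 0$ and $ii(-1, y) = 0$
Example integral image $ii$:
1 3 8
4 10 21
11 25 45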
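The 3x3 table of ii values on this slide can be reproduced with a short Python sketch of the one-pass recurrences $s(x, y) = s(x, y - 1) + i(x, y)$ and $ii(x, y) = ii(x - 1, y) + s(x, y)$. The input image below is an assumption: it does not appear on the slide and was chosen so that its integral image matches the table. The function name `integral_image` is likewise illustrative.

```python
def integral_image(image):
    """Build ii with s(x, y) = s(x, y-1) + i(x, y) and ii(x, y) = ii(x-1, y) + s(x, y)."""
    height, width = len(image), len(image[0])
    ii = [[0] * width for _ in range(height)]
    for x in range(width):
        s = 0  # cumulative sum down column x; corresponds to s(x, -1) = 0
        for y in range(height):
            s += image[y][x]
            left = ii[y][x - 1] if x > 0 else 0  # corresponds to ii(-1, y) = 0
            ii[y][x] = left + s
    return ii


# Assumed original image (not from the slide), indexed as image[y][x].
image = [[1, 2, 5],
         [3, 4, 6],
         [7, 8, 9]]
ii = integral_image(image)
for row in ii:
    print(row)
```

Each entry of `ii` is the sum of all pixels above and to the left of that position, inclusive.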
Calculating any rectangle sum with the integral image
The integral image values at the four corner points are:
1: A
2: A + B
3: A + C
4: A + B + C + D
Rectangle Sum
D = 4 - 3 - 2 + 1
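The four-lookup rule D = 4 - 3 - 2 + 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the integral image is padded with a leading row and column of zeros (a convenience, not something stated on the slides), the sample image is made up, and the sign convention of `two_rect_feature` (right rectangle minus left) is arbitrary.

```python
def padded_integral_image(image):
    """ii[y][x] = sum of image rows < y and columns < x; row 0 and column 0 are zeros."""
    height, width = len(image), len(image[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            ii[y + 1][x + 1] = image[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x]
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of one rectangle from four lookups: D = 4 - 3 - 2 + 1."""
    p1 = ii[top][left]                    # point 1: A
    p2 = ii[top][left + width]            # point 2: A + B
    p3 = ii[top + height][left]           # point 3: A + C
    p4 = ii[top + height][left + width]   # point 4: A + B + C + D
    return p4 - p3 - p2 + p1


def two_rect_feature(ii, top, left, height, width):
    """Difference between two horizontally adjacent rectangles of equal size."""
    left_sum = rect_sum(ii, top, left, height, width)
    right_sum = rect_sum(ii, top, left + width, height, width)
    return right_sum - left_sum


# Made-up sample image, indexed as image[y][x].
image = [[1, 2, 5],
         [3, 4, 6],
         [7, 8, 9]]
ii = padded_integral_image(image)
bottom_right = rect_sum(ii, 1, 1, 2, 2)       # 4 + 6 + 8 + 9
feature = two_rect_feature(ii, 0, 0, 3, 1)    # (2 + 4 + 8) - (1 + 3 + 7)
print(bottom_right, feature)
```

The cost of `rect_sum` does not depend on the rectangle size, which is what makes evaluating features at many scales cheap.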
AdaBoost learning algorithm
Used to perform the feature selection task
[Figure: learning and classification; the learners are combined into the final strong classifier]
The Boost algorithm for classifier learning
Step 1: Given example images $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, where $y_i = 1$ for a positive image and $y_i = 0$ for a negative image
Step 2: Initialize the weights $w_{1,i} = \frac{1}{2m}, \frac{1}{2l}$ for $y_i = 0, 1$, where $m$ and $l$ are the numbers of negatives and positives
Normalize the weights: $w_{t,i} \leftarrow \frac{w_{t,i}}{\sum_{j=1}^{n} w_{t,j}}$, so that $w_t$ is a probability distribution
Update the weights: $w_{t+1,i} = w_{t,i}\,\beta_t^{1 - e_i}$, that is, $w_{t,i}\,\beta_t$ if $x_i$ is classified correctly ($e_i = 0$) and $w_{t,i}$ otherwise ($e_i = 1$)
where $\beta_t = \frac{\varepsilon_t}{1 - \varepsilon_t}$
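One round of the weight bookkeeping (initialize, normalize, then scale correctly classified examples by $\beta_t = \varepsilon_t / (1 - \varepsilon_t)$) can be sketched in Python. The function names and the four-example data set are hypothetical, and the sketch assumes $0 < \varepsilon_t < 1$ so that $\beta_t$ is finite and positive.

```python
def init_weights(labels):
    """w_{1,i} = 1/(2m) for negatives (y = 0) and 1/(2l) for positives (y = 1)."""
    n_pos = sum(labels)              # l on the slide
    n_neg = len(labels) - n_pos      # m on the slide
    return [1 / (2 * n_pos) if y == 1 else 1 / (2 * n_neg) for y in labels]


def normalize(weights):
    """Rescale the weights so they sum to 1 (a probability distribution)."""
    total = sum(weights)
    return [w / total for w in weights]


def update_weights(weights, predictions, labels):
    """Scale correctly classified examples by beta; leave misses unchanged."""
    weights = normalize(weights)
    error = sum(w for w, h, y in zip(weights, predictions, labels) if h != y)
    beta = error / (1 - error)       # assumes 0 < error < 1
    updated = [w * beta if h == y else w
               for w, h, y in zip(weights, predictions, labels)]
    return updated, error, beta


# Made-up example: two positives, two negatives, one positive is missed.
labels = [1, 1, 0, 0]
predictions = [1, 0, 0, 0]
weights = init_weights(labels)
updated, error, beta = update_weights(weights, predictions, labels)
renormalized = normalize(updated)
print(error, beta, renormalized)
```

After renormalization the single missed example carries half of the total weight, so the next weak classifier is pushed to get it right.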
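Weak learner constructor (illustrated)
Training set: examples with weights $w_1, w_2, \ldots, w_n$
Each feature $f_j$ gives a weak classifier $h_j$ with weighted error $\varepsilon_j = \sum_i w_i \, |h_j(x_i) - y_i|$
Errors: the classifier with the minimum error $\varepsilon_{\min}$ is chosen among the candidates $h_1, h_2, h_3, \ldots$ (about 180,000)
Update the weights
Each weight $w_i$ depends on whether example $i$ was a miss or correct: for correctly classified examples $w_{t+1,i} = w_{t,i}\,\frac{\varepsilon_t}{1 - \varepsilon_t}$, while missed examples keep their weight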
Training the weak learner
$h_j(x) = 1$ if $p_j f_j(x) < p_j \theta_j$, and $0$ otherwise
where $f_j$ is a feature, $\theta_j$ is a threshold, and $p_j$ is a parity indicating the direction of the inequality sign
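A weak learner of the form $h_j(x) = 1$ if $p_j f_j(x) < p_j \theta_j$ can be trained by searching over thresholds and parities. The Python sketch below uses a brute-force search for clarity; it is an assumption about how the search could be done, not the method used by the authors, and `train_stump` and the sample data are hypothetical.

```python
def train_stump(values, labels, weights):
    """Find (error, theta, parity) minimising sum_i w_i * |h(x_i) - y_i|,
    where h(x) = 1 if parity * f(x) < parity * theta, else 0."""
    best = None
    observed = sorted(set(values))
    # Candidate thresholds: every observed value plus one above the maximum.
    thresholds = observed + [observed[-1] + 1]
    for theta in thresholds:
        for parity in (1, -1):
            error = 0.0
            for f, y, w in zip(values, labels, weights):
                h = 1 if parity * f < parity * theta else 0
                error += w * abs(h - y)
            if best is None or error < best[0]:
                best = (error, theta, parity)
    return best


# Made-up feature values for four examples; small values belong to positives.
values = [1.0, 2.0, 3.0, 4.0]
labels = [1, 1, 0, 0]
weights = [0.25, 0.25, 0.25, 0.25]
error, theta, parity = train_stump(values, labels, weights)
print(error, theta, parity)
```

The search is quadratic in the number of examples; sorting the examples once and sweeping the threshold would give the same result faster.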
Selecting the weak classifiers
In each round $t$, the weak classifier $h_t$ with the lowest weighted error $\varepsilon_j = \sum_i w_i \, |h_j(x_i) - y_i|$ is selected, with $\beta_t = \frac{\varepsilon_t}{1 - \varepsilon_t}$
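Putting the pieces together, boosting as feature selection can be sketched in Python: each round picks the single feature whose threshold classifier has the lowest weighted error, then reweights the examples. Several parts are assumptions rather than content from the slides: the function names, the tiny data set, the clamping of the error away from 0 and 1, and the final weighted vote with $\alpha_t = \log(1 / \beta_t)$, which follows the usual AdaBoost formulation.

```python
import math


def best_stump(column, labels, weights):
    """Brute-force threshold/parity search for one feature column."""
    best = (float("inf"), 0.0, 1)
    observed = sorted(set(column))
    for theta in observed + [observed[-1] + 1]:
        for parity in (1, -1):
            error = sum(w for f, y, w in zip(column, labels, weights)
                        if (1 if parity * f < parity * theta else 0) != y)
            if error < best[0]:
                best = (error, theta, parity)
    return best


def adaboost(features, labels, rounds):
    """features[j][i] is the value of feature j on example i."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    weights = [1 / (2 * n_pos) if y == 1 else 1 / (2 * n_neg) for y in labels]
    chosen = []
    for _ in range(rounds):
        total = sum(weights)
        weights = [w / total for w in weights]
        # Select the feature whose stump has the lowest weighted error.
        error, theta, parity, j = min(
            best_stump(col, labels, weights) + (idx,)
            for idx, col in enumerate(features))
        error = min(max(error, 1e-10), 1 - 1e-10)  # assumed guard: keeps beta finite
        beta = error / (1 - error)
        chosen.append((math.log(1 / beta), j, theta, parity))
        # Correctly classified examples are scaled by beta; misses are unchanged.
        weights = [w * beta if (1 if parity * f < parity * theta else 0) == y else w
                   for f, y, w in zip(features[j], labels, weights)]
    return chosen


def strong_classify(chosen, x):
    """Weighted vote of the selected stumps; x[j] is the value of feature j."""
    score = sum(alpha for alpha, j, theta, p in chosen if p * x[j] < p * theta)
    return 1 if score >= 0.5 * sum(alpha for alpha, _, _, _ in chosen) else 0


# Made-up data: feature 0 separates the classes, feature 1 does not.
features = [[1, 2, 3, 4], [5, 1, 4, 2]]
labels = [1, 1, 0, 0]
chosen = adaboost(features, labels, rounds=2)
predictions = [strong_classify(chosen, [features[0][i], features[1][i]])
               for i in range(len(labels))]
print(chosen[0][1], predictions)
```

With roughly 180,000 candidate features, the `min` over all features is the expensive step, and it is also the point at which boosting acts as a feature selector.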
[Figure: the cascade. Each stage is an AdaBoost learner built from weak classifiers $h_1, h_2, \ldots$ (Stage 2 uses $h_1, h_2, \ldots, h_{10}$; Stage 3 uses more). A sub-window that passes a stage moves on to the next one; otherwise it is rejected.]
Reject as many negatives as possible while minimizing the false negatives
100% Detection Rate
50% False Positive
A tremendously difficult problem
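A short Python sketch shows why a per-stage target of 100% detection with 50% false positives is useful. It assumes that the per-stage rates multiply along the cascade; the function name `cascade_rates` and the choice of ten identical stages are illustrative only.

```python
def cascade_rates(stage_rates):
    """stage_rates: list of (detection_rate, false_positive_rate), one pair per stage."""
    detection, false_positive = 1.0, 1.0
    for d, f in stage_rates:
        detection *= d
        false_positive *= f
    return detection, false_positive


# Ten hypothetical stages, each with 100% detection and 50% false positives.
overall_detection, overall_false_positive = cascade_rates([(1.0, 0.5)] * 10)
print(overall_detection, overall_false_positive)
```

Under this assumption, ten such stages keep every face while passing fewer than one in a thousand background windows.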
Result
A 38-layer cascaded classifier was trained to detect frontal upright faces
Training set:
Faces: 4916 + their vertical mirror images = 9832 images
Non-face sub-windows: 10,000 (size = 24x24)
Result
Speed of the final Detector
Image Processing
Scanning the Detector
Integration of Multiple Detectors
Processes a 384 x 288 image in about .067 seconds (using a starting scale of 1.25 and a step size of 1.5)
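Scanning the detector over an image can be sketched in Python as a generator of sub-windows. The starting scale of 1.25, the step size of 1.5, the 24x24 base window and the 384 x 288 image size come from the slides. The rest is assumed for illustration: the factor of 1.25 between successive scales, the rounding of the window size and shift, and the name `scan_windows`.

```python
def scan_windows(image_width, image_height, base=24, start_scale=1.25,
                 scale_factor=1.25, step=1.5):
    """Yield (x, y, size) for every sub-window position at every scale."""
    scale = start_scale
    while True:
        size = int(round(base * scale))
        if size > image_width or size > image_height:
            break
        # Assumed rule: the shift grows with the scale of the detector.
        shift = max(1, int(round(scale * step)))
        for y in range(0, image_height - size + 1, shift):
            for x in range(0, image_width - size + 1, shift):
                yield x, y, size
        scale *= scale_factor  # assumed ratio between successive scales


windows = list(scan_windows(384, 288))
print(windows[0], len(windows))
```

Each yielded window would be passed through the cascade, and because most windows are rejected by the first stages, the full scan stays fast.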