3-D object detection and recognition to assist visually impaired people in daily activities

HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

LE VAN HUNG

3-D OBJECT DETECTIONS AND
RECOGNITIONS: ASSISTING VISUALLY
IMPAIRED PEOPLE

Major: Computer Science
Code: 9480101

DOCTORAL DISSERTATION OF
COMPUTER SCIENCE

SUPERVISORS:
1. Dr. Vu Hai
2. Assoc. Prof. Dr. Nguyen Thi Thuy

Hanoi − 2018


ACKNOWLEDGEMENT
This dissertation was written during my doctoral course at the International Research
Institute Multimedia, Information, Communication and Applications (MICA), Hanoi
University of Science and Technology (HUST). It is my great pleasure to thank all the
people who supported me in completing this work.
First, I would like to express my sincere gratitude to my advisors, Dr. Hai Vu
and Assoc. Prof. Dr. Thi Thuy Nguyen, for their continuous support, patience,
motivation, and immense knowledge. Their guidance helped me throughout the research
and the writing of this dissertation. I could not imagine better advisors and mentors for my
Ph.D. study.
Besides my advisors, I would like to thank Assoc. Prof. Dr. Thi-Lan Le,
Assoc. Prof. Dr. Thanh-Hai Tran and the members of the Computer Vision Department at
MICA Institute. These colleagues assisted me a great deal in my research and
co-authored the published papers. Moreover, attending scientific
conferences has always been a great experience for me, and I received many useful
comments there.
During my PhD course, I received much support from the Management
Board of MICA Institute. My sincere thanks go to Prof. Yen Ngoc Pham, Prof. Eric
Castelli and Dr. Son Viet Nguyen, who gave me the opportunity to join research

SYMBOLS

LIST OF TABLES

LIST OF FIGURES

1 LITERATURE REVIEW
1.1 Aided-systems for supporting visually impaired people
1.1.1 Aided-systems for navigation services
1.1.2 Aided-systems for obstacle detection
1.1.3 Aided-systems for locating the interested objects in scenes
1.1.4 Discussions
1.2 3-D object detection, recognition from a point cloud data
1.2.1 Appearance-based methods
1.2.1.1 Discussion
1.2.2 Geometry-based methods
1.2.3 Datasets for 3-D object recognition
1.2.4 Discussions
1.3 Fitting primitive shapes
1.3.1 Linear fitting algorithms
1.3.2 Robust estimation algorithms
1.3.3 RANdom SAmple Consensus (RANSAC) and its variations
1.3.4 Discussions

2.2.3.3 Table plane detection and extraction
2.2.4 Experimental results
2.2.4.1 Experimental setup and dataset collection
2.2.4.2 Table plane detection evaluation method
2.2.4.3 Results
2.3 Separating the interested objects on the table plane
2.3.1 Coordinate system transformation
2.3.2 Separating table plane and the interested objects
2.3.3 Discussions
3.1.5 Discussions
3.2 Fitting objects using the context and geometrical constraints
3.2.1 The proposed method of finding objects using the context and geometrical constraints
3.2.1.1 Model verification using contextual constraints
3.2.2 Experimental results of finding objects using the context and geometrical constraints
3.2.2.1 Descriptions of the datasets for evaluation
3.2.2.2 Evaluation measurements
3.2.2.3 Results of finding objects using the context and geometrical constraints
3.2.3 Discussions

4.1.3.2 Combination of Clustering objects and Viewpoint Features Histogram, GCSAC for estimating 3-D full object models (CVFGS)
4.1.3.3 Combination of Deep Learning based and GCSAC for estimating 3-D full object models (DLGS)
4.1.4 Experiments
4.1.4.1 Data collection
4.1.4.2 Evaluation method
4.1.4.3 Setup parameters in the evaluations
4.1.4.4 Evaluation results
4.1.5 Discussions
4.2 Deploying an aided-system for visually impaired people
4.2.1 Environment and material setup for the evaluation
4.2.2 Pre-built script
4.2.3 Performances of the real system
4.2.3.1 Evaluation of finding 3-D objects
4.2.4 Evaluation of usability and discussion
5 CONCLUSION AND FUTURE WORKS
5.1 Conclusion
5.2 Future works
Bibliography

PUBLICATIONS

[...]  [...]     False Negative
5      FP        False Positive
6      FPFH      Fast Point Feature Histogram
7      fps       frames per second
8      GCSAC     Geometrical Constraint SAmple Consensus
9      GPS       [...]
[...]  [...]     Jaccard Index
15     KDES      Kernel DEScriptors
16     KNN       K Nearest Neighbors
17     LBP       Local Binary Patterns
18     LMNN      Large Margin Nearest Neighbor
19     LMS       [...]
[...]  [...]     Maximum Likelihood Estimation SAmple Consensus
25     MS        MicroSoft
26     MSAC      M-estimator SAmple Consensus
27     MSI       Modified Plessey
28     MSS       Minimal Sample Set
29     NAPSAC    [...]
34     OPENCV    OPEN source Computer Vision Library
35     PC        Personal Computer
36     PCA       Principal Component Analysis
37     PCL       Point Cloud Library
38     PROSAC    PROgressive SAmple Consensus
44     SDK       Software Development Kit
45     SHOT      Signature of Histograms of OrienTations
46     SIFT      Scale-Invariant Feature Transform
47     SQ        SuperQuadric
48     SURF      Speeded Up Robust Features
54     URL       Uniform Resource Locator
55     USAC      A Universal Framework for Random SAmple Consensus
56     VFH       Viewpoint Feature Histogram
57     VIP       Visually Impaired Person
57     VIPs      Visually Impaired People

results.

Table 3.3 Experimental results on the ’second cylinder’ dataset. The experiments were repeated 20 times, then the errors were averaged.

Table 3.4 The average evaluation results on the ’second sphere’ and ’second cone’ datasets. The experiments on the real datasets were repeated 20 times for statistically representative results.

Table 3.5 Average results of the evaluation measurements using GCSAC and MLESAC on three datasets. The fitting procedures were repeated 50 times for statistical evaluations.

Table 4.1 The average results of detecting spherical objects at the two stages.

Table 4.2 The average results of detecting the cylindrical objects at the first stage in both the first and second datasets.

Table 4.3 The average results of detecting the cylindrical objects at the second stage in both the first and second datasets.

Table 4.4 The average processing time of detecting cylindrical objects in both the first and second datasets.

feature based method [53].

Figure 1.2 Illustration of primitive shape extraction from the point cloud [144].

Figure 1.3 Illustration of the least squares process.

Figure 1.4 Line representation in image space and in Hough space [126].

Figure 1.5 Illustration of line estimation by the RANSAC algorithm.

Figure 1.6 Diagram of RANSAC-based algorithms.

Figure 2.6 (a) Computing the depth value of the center pixel based on its neighborhood (within a 3 × 3 pixel window); (b) down-sampling of the depth image.

Figure 2.7 Illustration of estimating the normal vector of a set of points in 3-D space. (a) A set of points; (b) estimation of the normal vector of a black point; (c) selection of two points for estimating a plane; (d) the normal vector of a black point.

Figure 2.8 Illustration of the point cloud segmentation process.

Figure 2.9 Example of plane segmentation: (a) color image of the scene; (b) plane segmentation result with PROSAC in our publication; (c) plane segmentation result with the organized point cloud.

Figure 2.16 Detailed results for each scene of the three plane detection methods on our dataset: (a) using the first evaluation measure; (b) the second evaluation measure; and (c) the third evaluation measure.

Figure 2.17 Illustration of the floor plane being segmented into multiple planes.

Figure 2.18 Results of table detection on our dataset (first two rows) and the dataset in [117] (bottom two rows). The table plane is delimited by the red boundary in the image and by green points in the point cloud. The red arrow is the normal vector of the detected table.

Figure 2.19 Top row: an example detection that counts as a true detection under the first two evaluation measures and as a false detection under the third evaluation measure: (a) color image; (b) point cloud of the scene; (c) the overlap area between the 2-D contour of the detected table plane and the table plane ground truth. Bottom row: an example of a case missed by our method: (a) color image; (b) point cloud of the scene. After down-sampling, the number of points belonging to the table is 276, which is lower than our threshold.

Figure 3.3 Geometrical parameters of a cylindrical object. (a)-(c) Explanation of the geometrical analysis to estimate a cylindrical object. (d)-(e) Illustration of the geometrical constraints applied in GCSAC. (f) Result of the estimated cylinder from a point cloud. Blue points are outliers, red points are inliers.

Figure 3.4 (a) Setting geometrical parameters for estimating a cylindrical object from a point cloud as described above. (b) The estimated cylinder (green one) from an inlier p1 and an outlier p2. As shown, it is an incorrect estimation. (c) Normal vectors n1 and n2* on the plane π are specified.

Figure 3.5 Estimating parameters of a sphere from 3-D points. Red points are inlier points. In this figure, p1 and p2 are the two samples selected for estimating a sphere (two gray points); they are outlier points. Therefore, the estimated sphere has a wrong centroid and radius (see the green sphere in the bottom-left panel).

Figure 3.6 Estimating parameters of a cone from 3-D points using the geometrical analysis proposed in [131]; (a) Point cloud with three samples

Figure 3.11 The average number of iterations of GCSAC and MLESAC on the synthesized dataset. The experiments were repeated 50 times for statistically representative results.

Figure 3.12 Decomposition of the residual density distribution: inlier (blue) and outlier (red) density distributions of a synthesized point cloud with 50 inliers. (a) Noise is drawn from a uniform distribution. (b) Noise is drawn from a Gaussian distribution with µ = 0, σ = 1.5. In each sub-figure, the left panel shows the distribution along one axis (e.g., the x-axis) and the right panel shows the corresponding point cloud.

Figure 3.13 An illustration of GCSAC at the k-th iteration when estimating a coffee mug in the second dataset. Left: the fitting result with a random MSS. Middle: the fitting result where the random samples are updated by applying the geometrical constraints. Right: the current best model.

Figure 3.14 The best estimated model using GCSAC (a) and MLESAC (b) on a synthesized point cloud with 50% inliers. In each sub-figure, two different viewpoints are given.

Figure 3.20 Illustration of cone fitting by GCSAC and some RANSAC variations on the synthesized datasets, which have a 15% inlier ratio. Red points are inlier points, blue points are outlier points. The estimated cones are the green cones.

Figure 3.21 Fitting results of some instances collected from the real datasets. (a) A coffee mug; (b) a toy ball; (c) a cone object. In each sub-figure, the left panel is the RGB image for reference and the right panel is the fitting result. Ground truths are marked as red points; the estimated objects are marked as green points.

Figure 3.22 Illustrations of a correct (a) and an incorrect (b) estimation without using the verification scheme. In each sub-figure: left panel: point cloud data; middle panel: the normal vector of each point; right panel: the estimated model.

Figure 3.23 (a) The histogram of the deviation angle with the x-axis (1, 0, 0) of a real dataset in the bottom panel of Fig. 3.22; (b) the histogram of the deviation angle with the x-axis (1, 0, 0) of a generated cylinder dataset in the top panel of Fig. 3.22.

Figure 3.30 Angle errors Ea of the fitting results using GCSAC with and without the context's constraint.

Figure 3.31 Fitting results extracted from the video of scene 1 of the first dataset.

Figure 4.1 Top panel: the procedures of the PSM method. Bottom panel: the result of each step.

Figure 4.2 A result of object clustering using the method of Qingming et al. [110]. (a) RGB image; (b) the result of object clustering projected to the image space.

Figure 4.3 (a) Presentation of the neighbors of Pq [123]. (b) Estimating the parameters of the PFH descriptor [124] as in Eq. 4.2.

Figure 4.4 Illustration of the training phase of the CVFGS method.

Figure 4.11 Illustration of a laptop mounted on the VIPs.

Figure 4.12 Illustration of the types of objects and scenes in our dataset and the published dataset [68].

Figure 4.13 Illustration of the size of the table.

Figure 4.14 Illustration of the evaluation of spherical object detection. The left column is the result of detecting spherical objects on the RGB image by YOLO CNN. The middle column is the spherical object estimated from the point cloud of the object detected in the left column. The right column is the result of spherical object detection when the estimated sphere is projected to the RGB image.

Figure 4.15 Illustration of computing the deviation angle between the normal vector of the table plane yt and the estimated cylinder axis γc.

Figure 4.25 Illustration of the system set up on the VIPs.
Figure 4.26 Illustration of the VIPs coming to the sharing-room or kitchen to find the spherical or cylindrical objects on the table.
Figure 4.27 Illustration of the trajectory of the visually impaired.
Figure 4.28 Illustration of object detection in the RGB images.
Figure 4.29 Full trajectory of a volunteer in the experiments. First row: scenes collected by a surveillance camera; the frame ID is noted above each panel. Second row: results of YOLO detection on the RGB images. Last row: results of the model fitting, with the object's descriptions (e.g., position, radius). There are five spherical-like objects in this scene.
Figure 4.30 Illustration of scenes and objects when VIPs move in the real environment of the sharing-room and kitchen.
Figure 4.31 Computation on the occluded data. Rg is the area of the object; Rs is the visible area.
Figure 4.32 Illustration of results on the occluded data.
Figure 4.33 A wrong detection of the table plane.
Figure 4.34 A wrong estimation of the spherical object.
Figure 4.35 Illustration of using GCSAC to estimate the spherical object on the missing data.
Figure 4.36 Illustration of the inclined cylinder.
Figure 5.1 Illustration of the problems solved in our dissertation [93], [131].

complex objects are out of the scope of the dissertation. In addition, we observe that
prior knowledge about the current scene (for example, a cup normally stands on
the table), contextual information (for example, walls in the scene are perpendicular to the
table plane) and the limited size/height of the queried objects would be valuable cues for
improving the system performance.


Figure 1 Illustration of a real scenario: a VIP comes to the kitchen and gives a
query: "Where is a coffee cup?" on the table. Left panel: a Kinect mounted on
the person's chest. Right panel: the developed system is built on a laptop PC.
More generally, we realize that the queried objects could be located by simplifying them to geometric shapes: planar segments (boxes), cylinders (coffee mugs, soda cans),
spheres (balls) and cones, rather than by utilizing conventional 3-D features. Following these
ideas, a pipeline for the work "3-D Object Detection and Recognition for Assisting Visually Impaired People" is proposed in this dissertation. The proposed framework consists
of several tasks, including: (1) separating the queried objects from a table plane; (2)
detecting candidates of the interested objects using appearance features; and (3) estimating a model of the queried object from a 3-D point cloud. Instead of matching
the queried objects to 3-D models as conventional learning-based approaches do, this
research work focuses on constructing a simplified geometrical model of the queried object from an unstructured point cloud collected by an RGB and depth sensor,
wherein the last step plays the most important role.
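Task (1), separating the queried objects from a table plane, can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the plane parameters are assumed to be already known in the form normal · p + d = 0, the normal is assumed to point from the table toward the objects, and the function name `split_by_plane` and the thickness tolerance `tol` are hypothetical.

```python
import numpy as np

def split_by_plane(points, normal, d, tol=0.01):
    """Split an (N, 3) cloud into table points and points above the table.

    The plane is normal . p + d = 0 (normal assumed to point toward the
    objects). `tol` is a hypothetical table thickness in metres.
    """
    normal = np.asarray(normal, dtype=float)
    scale = np.linalg.norm(normal)
    # Signed distance of every point to the plane, in metres.
    signed = (points @ normal + d) / scale
    on_table = np.abs(signed) <= tol   # points lying on the table surface
    above = signed > tol               # candidate object points
    return points[on_table], points[above]

# Toy cloud: two table points, one object point, one point below the table.
pts = np.array([
    [0.0, 0.0, 0.000],
    [0.1, 0.2, 0.005],
    [0.1, 0.1, 0.080],
    [0.0, 0.3, -0.200],
])
table_pts, object_pts = split_by_plane(pts, normal=(0, 0, 1), d=0.0)
```

Only the points above the plane are passed on to the later steps in this sketch; points below the table are discarded as background.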

Objective
In this dissertation, we aim to propose a robust 3-D object detection and recognition system. As a feasible solution for deploying a real application, the proposed framework
should be simple, robust and friendly to the VIPs. However, it is necessary to note
that there are critical issues that might affect the performance of the proposed system. In particular: (1) objects are queried in a complex scene where
clutter and occlusion may appear; (2) the collected data are noisy; and (3) the
computational cost is high due to the huge number of points in a point cloud. Although
a number of relevant works on 3-D object detection and recognition have been
attempted in the literature for a long time, in this study we will not attempt to solve these issues
separately. Instead, we aim to generate a unified solution. To this end, the

around the table. This is to collect the data of the environment.
– A MS Kinect sensor captures RGB and Depth images at a normal frame
rate (from 10 to 30 fps) [95] with an image resolution of 640×480 pixels for
both image types. An acceleration vector is also obtained with each frame from the Kinect. Because the MS Kinect collects the images at
10 to 30 fps, it fits well with the slow movements of the VIPs
(∼ 1 m/s). Although image data collected via a wearable sensor can be
affected by the subject's movement (for example, image blur and vibrations) in practical situations, there are no specific requirements for collecting the image
data. For instance, VIPs are not required to stand still before the image data are
collected.
– Every queried object needs to be placed in the visible area of the MS Kinect
sensor, which covers a distance of 0.8 to 4 m and an angle of 30° around
the center axis of the MS Kinect sensor. Therefore, the distance
from the VIPs to the table is also constrained to about 0.8 to 4 m.
• Interested (or queried) objects are assumed to have simple geometrical structures.
For instance, coffee mugs, bowls, jars, bottles, etc. have a cylindrical shape, whereas
balls have a spherical shape and boxes have a cubic shape. They are idealized
and labeled. The interaction module between a VIP and the system has not been
developed in the dissertation.
• Once a VIP wants to query an object on the table, he/she should stand in front
of the table. This ensures that the current scene is in the visible area of the MS
Kinect sensor while the VIP can move around the table. The proposed system computes and
returns the object's information such as position, size and orientation. Sending
such information to the senses (e.g., as audible information, on a Braille screen, or by
vibration) is out of the scope of this dissertation.
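The sensor constraints above (640×480 depth frames, objects between 0.8 and 4 m from the sensor) can be made concrete with a standard pinhole back-projection from a depth image to a point cloud. The sketch below is only an illustration under stated assumptions: the intrinsics `FX`, `FY`, `CX`, `CY` are placeholder values rather than calibration results from the dissertation, the function name `depth_to_points` is hypothetical, and the range limit is applied to the depth along the optical axis.

```python
import numpy as np

# Assumed pinhole intrinsics for a 640x480 depth image (placeholders only).
FX, FY = 525.0, 525.0
CX, CY = 319.5, 239.5

def depth_to_points(depth_m, z_min=0.8, z_max=4.0):
    """Back-project a depth image in metres to an (N, 3) point cloud.

    Pixels whose depth falls outside [z_min, z_max] are dropped, which
    mirrors the 0.8 to 4 m working range stated in the text.
    """
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    valid = (depth_m >= z_min) & (depth_m <= z_max)
    z = depth_m[valid]
    x = (u[valid] - CX) * z / FX
    y = (v[valid] - CY) * z / FY
    return np.column_stack((x, y, z))

# Synthetic frame: a flat surface at 2 m with missing and out-of-range rows.
depth = np.full((480, 640), 2.0)
depth[:10, :] = 0.0    # missing readings
depth[-10:, :] = 5.0   # farther than the working range
cloud = depth_to_points(depth)
```

In this synthetic frame, 460 of the 480 rows survive the range filter, so the resulting cloud has 460 × 640 points.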

relevant task in 2-D image.

Contributions
Throughout the dissertation, the main objectives are addressed by a unified
solution. We achieve the following contributions:
• Contribution 1: We propose a new robust estimator called GCSAC (Geometrical
Constraints SAmple Consensus) for estimating primitive shapes from the
point cloud of the objects. Different from conventional RANSAC
(RANdom SAmple Consensus) algorithms, GCSAC selects uncontaminated (so-called
qualified or good) samples from a set of data points using geometrical
constraints. Moreover, GCSAC is extended with contextual constraints
to validate the results of the model estimation.
• Contribution 2: We present a comparative study of three different approaches
for recognizing 3-D objects in a complex scene. The best one
is a combination of a deep-learning based technique and the proposed robust estimator (GCSAC). This method takes advantage of recent advances in object detection using
a neural network on RGB images and utilizes the proposed GCSAC to estimate
the full 3-D models of the queried objects.
• Contribution 3: We successfully deployed a system using the proposed methods for
detecting 3-D primitive shape objects in a lab-based environment. The system
combines the table plane detection technique and the proposed method of 3-D
object detection and estimation. It achieves fast computation for both tasks of
locating and describing the objects. As a result, it fully supports the VIPs in
grasping the queried objects.
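For orientation, the conventional RANSAC baseline that Contribution 1 contrasts with GCSAC can be sketched for a spherical object as below. This is plain RANSAC with purely random minimal samples, not GCSAC: the geometrical constraints that GCSAC uses to pick good samples are not specified in this excerpt and are not reproduced here. The function names, the inlier tolerance `tol`, the iteration count and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_sphere(pts):
    """Algebraic least-squares sphere through >= 4 points: (center, radius).

    Uses |p|^2 = 2 c.p + k with k = r^2 - |c|^2, which is linear in (c, k).
    """
    A = np.column_stack((2.0 * pts, np.ones(len(pts))))
    b = np.sum(pts ** 2, axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center, k = sol[:3], sol[3]
    return center, np.sqrt(max(k + center @ center, 0.0))

def ransac_sphere(points, n_iters=200, tol=0.01, seed=0):
    """Plain RANSAC: random 4-point samples, keep the largest consensus set."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        sample = points[rng.choice(len(points), size=4, replace=False)]
        center, radius = fit_sphere(sample)
        residual = np.abs(np.linalg.norm(points - center, axis=1) - radius)
        inliers = residual < tol
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 4:
        raise ValueError("no consensus set with at least 4 points found")
    center, radius = fit_sphere(points[best])  # refit on all inliers
    return center, radius, best

# Synthetic scene: 200 points on a 10 cm ball plus 100 clutter points.
rng = np.random.default_rng(1)
true_center = np.array([0.1, -0.2, 1.5])
dirs = rng.normal(size=(200, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
ball = true_center + 0.1 * dirs
clutter = true_center + rng.uniform(-0.5, 0.5, size=(100, 3))
cloud = np.vstack((ball, clutter))
center, radius, inliers = ransac_sphere(cloud)
```

Because every minimal sample here is drawn uniformly at random, many iterations are spent on samples that contain outliers; selecting uncontaminated samples is the part of the loop that the text says GCSAC addresses.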

Figure 3 A general framework for detecting the 3-D queried objects on the table for the
VIPs. [Diagram labels: Candidates; Fitting 3-D objects]

General framework and dissertation outline
In this dissertation, we propose a unified framework for detecting the queried 3-D
objects on the table to support the VIPs in an indoor environment. The proposed
framework consists of three main phases, as illustrated in Fig. 3. The first phase is
a pre-processing step. It consists of building the point cloud from the
RGB and Depth images and detecting the table plane in order to separate the interested
objects from the current scene. The second phase labels the object candidates
on the RGB images. The third phase estimates a full model from the point cloud
specified by the first and second phases; here the 3-D objects are
estimated by a new robust estimator, GCSAC, for the full geometrical models.
Utilizing this framework, we deploy a real application. The application is evaluated
in different scenarios, including datasets collected in lab environments and public
datasets. The research work in the dissertation is organized into six
chapters as follows:
• Introduction: This chapter describes the main motivations and objectives of the
study. We also present the research context, constraints and
challenges that we meet and address in the dissertation. Additionally, the general
framework and main contributions of the dissertation are presented.
• Chapter 1: A Literature Review: This chapter mainly surveys existing aided
systems for the VIPs. In particular, the related techniques for developing an
aided system are discussed. We also present the relevant works on estimation
