VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF ENGINEERING AND TECHNOLOGY

DO THUY DUONG

RESEARCH AND APPLY EVOLUTIONARY
COMPUTATION TECHNIQUES ON
AUTOMATIC TEXT SUMMARIZATION

MASTER THESIS IN INFORMATION TECHNOLOGY

HANOI - 2015


Field: Information technology
Major: Software Engineering
Code:
Date:


Acknowledgements
I am heartily thankful to my supervisor, Prof. Nguyen Xuan Hoai, whose
encouragement, guidance and support from the initial to the final stage
enabled me to develop an understanding of the topic.
I would like to express my gratitude to the teachers at the University of
Engineering and Technology, Vietnam National University, Hanoi, for helping
me to gain a large body of knowledge during my two years of study.
Lastly, I offer my regards and blessings to my friends and my family, who have
always encouraged me so that I could finish this challenging research.


Contents
Declaration of authorship
Acknowledgements
Contents
List of figures
List of tables
1. Chapter 1
Introduction
Types of text summarization
2.1.3. Methodologies for automatic text summarization
2.2. Evolutionary computation
2.3. Differential evolution (DE)
2.4. Conclusion
Chapter 3
Automatic text summarization using differential evolution algorithm
3.1. Automatic text summarization using differential evolution (DE)
3.1.1. Document collection representation
3.1.2.
Chapter 4
Conclusion and future work
4.1. Contributions
4.2. Future work
5. Reference


List of figures
Figure 2.1. A typical summarization system
Figure 2.2. A summarizer highlights all sentences included in an extractive summary
Figure 2.3. An example of the abstract summary
Figure 2.4. Multi-document summarization
Figure 2.5. The general scheme of an Evolutionary Algorithm in pseudo-code
Figure 2.6. General scheme of evolutionary algorithms
Figure 2.7. Correlation between number of generations and best fitness in population
Figure 2.8. Steps of differential evolution algorithm

List of tables
Table 3.6. Parameter settings of the second experiment
Table 3.7. Summary lengths of some document collections in DUC2004 using [MultiDE] method
Table 3.8. Summary lengths of some document collections in DUC2007 using [MultiDE] method
Table 3.9. F-Values of three evaluation measures of method [MultiDE] on DUC2004 and DUC2007


1. Chapter 1
Introduction
Automatic text summarization is the task of identifying the important content
of one or more documents and presenting it in condensed form. This is a very
challenging problem that draws on many scientific areas, such as artificial
intelligence, statistics and linguistics. Much research has been conducted
worldwide since the 1950s, producing systems such as SUMMARIST, SweSUM, MEAD
and SUMMON. Nevertheless, the problem remains open and attracts more and more
attention.
In this thesis, we study some evolutionary computation techniques and then
apply the differential evolution algorithm to a practical problem: automatic
text summarization, in particular multi-document summarization. Moreover, we
attempt to handle the constraint on summary length, which has not been dealt
with effectively in these stochastic population-based methods.
1.1. Motivation
Evolutionary computation techniques use different algorithms to evolve a
population of individuals over a certain number of generations. Operations
such as mutation, crossover and selection are applied to the population to
produce new offspring, which then compete with each other and

future research directions in this field.
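
To make the evolutionary loop described in the Motivation section more concrete (a population that undergoes mutation, crossover and selection over a number of generations), the following is a minimal sketch of a basic differential evolution scheme, commonly referred to as DE/rand/1/bin. It is not the method developed in this thesis: the function names, the parameter values (`pop_size`, `f`, `cr`, `generations`), the bound clamping and the toy `sphere` objective are all assumptions chosen for illustration, and the objective is minimised here purely for simplicity.

```python
import random


def differential_evolution(fitness, dim, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=200, seed=0):
    """Minimise `fitness` with a basic DE/rand/1/bin loop (illustrative only)."""
    rng = random.Random(seed)
    low, high = bounds
    # Initial population: random vectors inside the search bounds.
    pop = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(pop_size)]
    scores = [fitness(x) for x in pop]

    for _ in range(generations):
        next_pop, next_scores = list(pop), list(scores)
        for i in range(pop_size):
            # Mutation: combine three distinct individuals, none of them i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            mutant = [pop[a][k] + f * (pop[b][k] - pop[c][k]) for k in range(dim)]
            # Assumed repair step: clamp the mutant back into the bounds.
            mutant = [min(max(v, low), high) for v in mutant]
            # Crossover: mix target and mutant; j_rand guarantees that at
            # least one component is taken from the mutant.
            j_rand = rng.randrange(dim)
            trial = [mutant[k] if (rng.random() < cr or k == j_rand) else pop[i][k]
                     for k in range(dim)]
            # Selection: the trial replaces the target only if it is no worse.
            trial_score = fitness(trial)
            if trial_score <= scores[i]:
                next_pop[i], next_scores[i] = trial, trial_score
        pop, scores = next_pop, next_scores

    best = min(range(pop_size), key=lambda i: scores[i])
    return pop[best], scores[best]


def sphere(x):
    """Toy objective (an assumption): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_score = differential_evolution(sphere, dim=5, bounds=(-5.0, 5.0))
```

In a summarization setting, the individual would instead encode a candidate summary and the fitness would score its quality, but those design choices depend on the representation and are not shown in this sketch.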



