Tài liệu Computational Intelligence In Manufacturing Handbook P19 doc - Pdf 86

Khoo, Li-Pheng et al "RClass*: A Prototype Rough-Set and Genetic Algorithms Enhanced
Multi-Concept Classification System for Manufacturing Diagnosis"
Computational Intelligence in Manufacturing Handbook
Edited by Jun Wang et al
Boca Raton: CRC Press LLC,2001

©2001 CRC Press LLC

19

RClass

*

: A Prototype
Rough-Set and Genetic
Algorithms Enhanced
Multi-Concept
Classification System
for Manufacturing

Diagnosis

19.1 Introduction

19.2 Basic Notions

19.3 A Prototype Multi-Concept Classiﬁcation System

19.4 Validation of RClass

Li-Pheng Khoo

Nanyang Technological University

Lian-Yin Zhai

Nanyang Technological University

©2001 CRC Press LLC

Many theories, techniques, and algorithms have been developed to deal with the analysis of imprecise
or inconsistent data in recent years. The most successful ones are fuzzy set theory and Dempster–Shafer
theory of evidence. On the other hand,

rough set

theory, which was introduced by Pawlak [1982] in the
early 1980s, is a new mathematical tool that can be employed to handle uncertainty and vagueness.
Basically, rough set handles inconsistent information using two approximations, namely the upper and
lower approximations. Such a technique is different from fuzzy set theory or Dempster–Shafer theory of
evidence. Furthermore, rough set theory focuses on the discovery of patterns in inconsistent data sets
obtained from information sources [Slowinski and Stefanowski, 1989; Pawlak, 1996] and can be used as
the basis to perform formal reasoning under uncertainty, machine learning, and rule discovery [Ziarko,
1994; Pawlak, 1984; Yao et al., 1997]. Compared to other approaches in handling uncertainty, rough set
theory has its unique advantages [Pawlak, 1996, 1997]. It does not require any preliminary or additional
information about the empirical training data such as probability distribution in statistics; the basic
probability assignment in the Dempster–Shafer theory of evidence; or grades of membership in fuzzy
set theory [Pawlak et al., 1995]. Besides, rough set theory is more justiﬁed in situations where the set of
empirical or experimental data is too small to employ standard statistical method [Pawlak, 1991].
In less than two decades, rough set theory has rapidly established itself in many real-life applications

, a prototype multi-concept classiﬁcation system for manufacturing diagnosis.
RClass

*

is based on a hybrid technique that combines the strengths of rough set, genetic algorithms, and
Boolean algebra. In the following sections, the basic notions of rough set theory and GAs are presented.
Details of RClass

*

, its validation, and a case study using the prototype system are also described.

19.2 Basic Notions

19.2.1 Rough Set Theory

Large amounts of applications of rough set theory have proven its robustness in dealing with uncertainty
and vagueness, and many researchers attempted to combine it with other inductive learning techniques
to achieve better results. Yasdi [1995] combined rough set theory with neural network to deal with
learning from imprecise training data. Khoo et al. [1999] developed RClass

*

, a prototype system based
on rough sets and a decision-tree learning methodology, and the predecessor of RClass

*

=

〈
U, Q, V,

ρ
〉

©2001 CRC Press LLC

where

U

is the

universe

which contains a ﬁnite set of objects, Q

:

U
×

Q

→

V

is the information function such that

ρ

(

x, q

)

∈

for every

q

and

v

∈

V

q

is called a

descriptor

in

S.

Table 19.1 shows a typical information system used for rough set analysis with

x

i

s
(

1, 2

) denoting the

condition attributes

; and

d

representing the

decision attribute

. As a result,

q

i

s and

d

form the set of attributes,

Q

.
More speciﬁcally,

Indiscernibility

is one of the most important concepts in rough set theory. It is caused by imprecise
information about the observed objects. The

indiscernibility relation

(

R

) is an equivalence relation on
the set

U

and can be deﬁned in the following manner:

If

x, y
∈
U

.

Mathematically, it can be expressed as follows
For example, using the information system given in Table 19.1, objects

x

5

and

x

7

are indiscernible by
the set of attributes

P

= {

q

1

2

111

x

3

121

x

4

000

x

5

010

x

6

021

x

{}
=
{}
=
{}{ }{}
{}
12 10
12
12
01 012 01
,
,,
,, ,,,,,,.
;
; and
ρ
xq
11
1,
()
=
{}
xPy
ˆ
if for
ρρ
xq yq q P,, .
()
=
()

=

Q

,
these

Q

-elementary sets are known as the

atoms

in

S

. In an information system,

concepts

can be represented
by the

decision

-elementary sets. For example, using the information system depicted in Table 19.1, the
{

4
,x
5
,x
6
,x
7
,x
8
,x
10
}for
ρ
(x,
q
1
) = {0}
Atoms
A
1
= {x
1
, x
9
} A
2
= {x
2
} A
3

1
,x
4
,x
5
,x
8
,x
9
,x
10
} ⇒ Class = 0 (d = 0)
C
2
= {x
2
,x
3
,x
6
,x
7
} ⇒ Class = 1 (d = 1)
Table 19.1 shows that objects
x
5

and
x
7

where U/R represents the set of all atoms in the approximation space (U, R). Elements belonging only
to the upper approximation compose the boundary region (BN
R
) or the doubtful area. Mathematically, a
boundary region can be expressed as
A boundary region contains a set of objects that cannot be certainly classiﬁed as belonging to or not
belonging to concept C by a set of attributes, R. Such a concept, C, is called a rough set. In other words,
rough sets are sets having non-empty boundary regions.
ρρ
xqq xqq
512 712
10,, ,, ,.
()
=
()
=
{}
ˆ
P
RC Y U RY C
()
=∈ ⊆
{}
U /.:
RC Y U RY C
()
=∈ ∩≠∅
{}
U / :
BN C R C R C

, and objects
x
6
and x
8
, exist in the data set. These conﬂicts cause the objects to be indiscernible and constitute doubtful
areas, which are denoted by BN
R
(0) or BN
R
(1), respectively (Figure 19.1). The lower approximation of
concept “0” is given by object set {x
1
,x
4
,x
9
,x
10
}, which forms the certain training data set of concept “0.”
On the other hand, the upper approximation is represented by object set {x
1
,x
4
,x
5
,x
6
,x
7

RC xxxxxxxx
1 145678910
()
=
{}
,,,,,,, .
BN C R C R C x x x x
R 1 1 1 5678
()
=
() ()
=
{}
–,,,.
RC x x
RC xxxxxx
BNC RC RC xxxx
R
223
2 235678
2225678
()
=
{}
()
=
{}
()
=
() ()

7
2
3
10
Generation of a random population of
chromosomes
Computation of the fitness of individual
chromosome
Selection of chromosomes with good
fitness
Reproduction of next generation of
chromosomes/population
No
Yes
End
Start
Limit on number of
generation reached?
©2001 CRC Press LLC
the next generation of chromosomes. More speciﬁcally, each chromosome is evaluated according to a
given performance criterion or ﬁtness function, and assigned a ﬁtness score. Using the ﬁtness value attained
by each chromosome, good chromosomes are selected to undergo reproduction. Reproduction involves
the creation of offspring using two operators namely crossover and mutation (Figure 19.3). By randomly
selecting a common crossover site on two parent chromosomes, two new chromosomes are produced.
During the process of reproduction, mutation may take place. For example, the binary value of bit 2 in
Figure 19.3 has been changed from 0 to 1. The above process of ﬁtness evaluation, chromosome selection,
and reproduction of next generation of chromosomes continues for a predetermined number of gener-
ations or until an acceptable performance level is reached.
19.3 A Prototype Multi-Concept Classiﬁcation System
19.3.1 Twin-Concept and Multi-Concept Classiﬁcation

New chromosome 2
1 0 0001
Before Mutation
1 1 0001
After Mutation
Crossover site
Before Crossover After Crossover
Crossover
Mutation
⇒
⇒

Nhờ tải bản gốc

Tài liệu, ebook tham khảo khác

Tài liệu Computational Intelligence In Manufacturing Handbook P19 doc - Pdf 86

Tài liệu, ebook tham khảo khác

Học thêm