Decision-theoretic rough sets
Decision-theoretic rough sets (DTRS) is a probabilistic extension of rough set classification. First created in 1990 by Dr. Yiyu Yao, the extension makes use of loss functions to derive $\alpha$ and $\beta$ region parameters. Like rough sets, the lower and upper approximations of a set are used.

Conditional Risk

Using the Bayesian decision procedure, the decision-theoretic rough set (DTRS) approach allows for minimum-risk decision making based on observed evidence. Let $A = \{a_1, \ldots, a_m\}$ be a finite set of $m$ possible actions and let $\Omega = \{w_1, \ldots, w_s\}$ be a finite set of $s$ states. $P(w_j \mid [x])$ is calculated as the conditional probability of an object $x$ being in state $w_j$ given the object description $[x]$. $\lambda(a_i \mid w_j)$ denotes the loss, or cost, for performing action $a_i$ when the state is $w_j$. The expected loss (conditional risk) associated with taking action $a_i$ is given by:

$R(a_i \mid [x]) = \sum_{j=1}^{s} \lambda(a_i \mid w_j) \, P(w_j \mid [x])$

Object classification with the approximation operators can be fitted into the Bayesian decision framework. The set of actions is given by $\{a_P, a_N, a_B\}$, where $a_P$, $a_N$, and $a_B$ represent the three actions in classifying an object into POS($A$), NEG($A$), and BND($A$) respectively. To indicate whether an element is in $A$ or not in $A$, the set of states is given by $\Omega = \{A, A^c\}$. Let $\lambda(a_\diamond \mid A)$ denote the loss incurred by taking action $a_\diamond$ when an object belongs to $A$, and let $\lambda(a_\diamond \mid A^c)$ denote the loss incurred by taking the same action when the object belongs to $A^c$.
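
As a minimal sketch of this computation, the Python fragment below evaluates the conditional risk of a single action over two states; the function name and the numeric probabilities and losses are hypothetical and serve only to illustrate the formula above.

  def conditional_risk(losses, probabilities):
      # Expected loss R(a_i | [x]): sum over states of lambda(a_i | w_j) * P(w_j | [x]).
      return sum(loss * p for loss, p in zip(losses, probabilities))

  # Two states: the object is in A (index 0) or in its complement A^c (index 1).
  p = [0.7, 0.3]             # hypothetical P(A | [x]) and P(A^c | [x])
  loss_accept = [0.0, 4.0]   # hypothetical losses lambda(a_P | A) and lambda(a_P | A^c)
  print(conditional_risk(loss_accept, p))   # 0.0*0.7 + 4.0*0.3 = 1.2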

Loss Functions

Let $\lambda_{PP}$ denote the loss function for classifying an object in $A$ into the POS region, $\lambda_{BP}$ denote the loss function for classifying an object in $A$ into the BND region, and let $\lambda_{NP}$ denote the loss function for classifying an object in $A$ into the NEG region. A loss function $\lambda_{\diamond N}$ denotes the loss of classifying an object that does not belong to $A$ into the region specified by $\diamond$.

The expected loss $R(a_\diamond \mid [x])$ associated with taking each of the individual actions can be expressed as:

$R(a_P \mid [x]) = \lambda_{PP} P(A \mid [x]) + \lambda_{PN} P(A^c \mid [x])$,

$R(a_N \mid [x]) = \lambda_{NP} P(A \mid [x]) + \lambda_{NN} P(A^c \mid [x])$,

$R(a_B \mid [x]) = \lambda_{BP} P(A \mid [x]) + \lambda_{BN} P(A^c \mid [x])$,

where $\lambda_{\diamond P} = \lambda(a_\diamond \mid A)$, $\lambda_{\diamond N} = \lambda(a_\diamond \mid A^c)$, and $\diamond = P$, $N$, or $B$.
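
A short sketch of these three expressions, assuming hypothetical loss values and a hypothetical $P(A \mid [x])$; the dictionary keys simply name the subscripts of the loss functions above.

  def expected_losses(p_A, lam):
      # R(a_P | [x]), R(a_N | [x]) and R(a_B | [x]) from P(A | [x]) and the six loss values.
      p_Ac = 1.0 - p_A
      return {
          "P": lam["PP"] * p_A + lam["PN"] * p_Ac,
          "N": lam["NP"] * p_A + lam["NN"] * p_Ac,
          "B": lam["BP"] * p_A + lam["BN"] * p_Ac,
      }

  # Hypothetical loss values for illustration only.
  lam = {"PP": 0.0, "BP": 1.0, "NP": 4.0, "NN": 0.0, "BN": 1.0, "PN": 4.0}
  print(expected_losses(0.7, lam))   # roughly P = 1.2, N = 2.8, B = 1.0, so a_B has minimum risk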

Minimum Risk Decision Rules

If we consider the loss functions $\lambda_{PP} \le \lambda_{BP} < \lambda_{NP}$ and $\lambda_{NN} \le \lambda_{BN} < \lambda_{PN}$, the following decision rules are formulated (P, N, B):
  • P: If $P(A \mid [x]) \ge \gamma$ and $P(A \mid [x]) \ge \alpha$, decide POS($A$);
  • N: If $P(A \mid [x]) \le \beta$ and $P(A \mid [x]) \le \gamma$, decide NEG($A$);
  • B: If $\beta \le P(A \mid [x]) \le \alpha$, decide BND($A$);


where,

$\alpha = \dfrac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{PN} - \lambda_{BN}) + (\lambda_{BP} - \lambda_{PP})}$,

$\gamma = \dfrac{\lambda_{PN} - \lambda_{NN}}{(\lambda_{PN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{PP})}$,

$\beta = \dfrac{\lambda_{BN} - \lambda_{NN}}{(\lambda_{BN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{BP})}$.

The $\alpha$, $\beta$, and $\gamma$ values define the three different regions, giving us an associated risk for classifying an object. When $\alpha > \beta$, we get $\alpha > \gamma > \beta$ and can simplify (P, N, B) into (P1, N1, B1):
  • P1: If $P(A \mid [x]) \ge \alpha$, decide POS($A$);
  • N1: If $P(A \mid [x]) \le \beta$, decide NEG($A$);
  • B1: If $\beta < P(A \mid [x]) < \alpha$, decide BND($A$).


When $\alpha = \beta$, we can simplify the rules (P-B) into (P2-B2), which divide the regions based solely on $\alpha$:
  • P2: If $P(A \mid [x]) > \alpha$, decide POS($A$);
  • N2: If $P(A \mid [x]) < \alpha$, decide NEG($A$);
  • B2: If $P(A \mid [x]) = \alpha$, decide BND($A$).
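
The sketch below, reusing the hypothetical loss values from the earlier example, derives $\alpha$, $\beta$, and $\gamma$ from the loss functions and applies rules (P1, N1, B1); it is an illustration of the rules above, not a reference implementation.

  def thresholds(lam):
      # Alpha, beta and gamma computed from the six loss values.
      alpha = (lam["PN"] - lam["BN"]) / ((lam["PN"] - lam["BN"]) + (lam["BP"] - lam["PP"]))
      gamma = (lam["PN"] - lam["NN"]) / ((lam["PN"] - lam["NN"]) + (lam["NP"] - lam["PP"]))
      beta = (lam["BN"] - lam["NN"]) / ((lam["BN"] - lam["NN"]) + (lam["NP"] - lam["BP"]))
      return alpha, beta, gamma

  def classify(p_A, alpha, beta):
      # Three-way decision using rules (P1, N1, B1); assumes alpha > beta.
      if p_A >= alpha:
          return "POS"
      if p_A <= beta:
          return "NEG"
      return "BND"

  # Hypothetical loss values, as in the earlier sketch.
  lam = {"PP": 0.0, "BP": 1.0, "NP": 4.0, "NN": 0.0, "BN": 1.0, "PN": 4.0}
  alpha, beta, gamma = thresholds(lam)   # 0.75, 0.25, 0.5
  print(classify(0.7, alpha, beta))      # BND, since 0.25 < 0.7 < 0.75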


Data mining, feature selection, information retrieval, and classification are just some of the applications in which the DTRS approach has been successfully used.
