**Decision tree learning** is one of the predictive modelling approaches used in statistics, data mining and **machine learning**. It uses a decision tree (as a predictive model) to go from observations about an item (represented in the branches) to conclusions about the item's target value (represented in the leaves).
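As a minimal sketch of this idea in Python (assuming scikit-learn is available; the tiny weather-style dataset below is invented purely for illustration):

```python
# Fit a decision tree, then follow its branches from observations
# to a target value at a leaf. The toy data here is hypothetical.
from sklearn.tree import DecisionTreeClassifier

# Each row: [outlook (0=sunny, 1=overcast, 2=rain), windy (0/1)]
X = [[0, 0], [0, 1], [1, 0], [2, 0], [2, 1]]
y = [0, 0, 1, 1, 0]          # target: play tennis? (0=no, 1=yes)

clf = DecisionTreeClassifier(criterion="entropy")  # split by information gain
clf.fit(X, y)
print(clf.predict([[1, 1]]))  # walk the tree for a new observation
```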

**Principled Criterion**

Selection of an attribute to test at each node: choosing the most useful attribute for classifying examples.

Information gain

- measures how well a given attribute separates the training examples according to their target classification.

- This measure is used to select among the candidate attributes at each step while growing the tree.

- Gain is a measure of how much we can reduce uncertainty (its value lies between 0 and 1).

**Entropy:-**

A measure of:

- uncertainty

- purity

- information content

Information theory: an optimal-length code assigns -log2(p) bits to a message having probability p. For example, a message with probability 0.5 costs -log2(0.5) = 1 bit.

S is a sample of training examples:

- p+ is the proportion of positive examples in S

- p- is the proportion of negative examples in S

Entropy of S: the average optimal number of bits needed to encode the certainty/uncertainty about S.


Entropy measures the impurity of S

Entropy(S) = -p+ log2(p+) - p- log2(p-)

The entropy is 0 if the outcome is "certain".

The entropy is maximum if we have no knowledge of the system (i.e., all outcomes are equally likely).
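As a small sketch of this definition in plain Python (the helper name `entropy` is mine; it takes the positive/negative counts of a sample):

```python
from math import log2

def entropy(pos, neg):
    """Entropy of a sample with `pos` positive and `neg` negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        p = count / total
        if p > 0:                  # 0 * log2(0) is treated as 0
            result -= p * log2(p)  # each outcome costs -log2(p) bits
    return result

print(entropy(5, 5))   # 1.0 -> no knowledge, 50/50 split
print(entropy(10, 0))  # 0.0 -> outcome is certain
```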

**Information Gain**

Gain(S,A): the expected reduction in entropy due to partitioning S on attribute A:

Gain(S,A) = Entropy(S) - Σv (|Sv|/|S|) Entropy(Sv)

where the sum runs over the values v of A and Sv is the subset of S for which A has value v.

For example, let S = [29+,35-] (so Entropy(S) ≈ 0.99), and suppose attribute A1 splits S into [21+,5-] and [8+,30-], while A2 splits S into [18+,33-] and [11+,2-]:

Entropy([21+,5-]) = 0.71

Entropy([8+,30-]) = 0.74

Gain(S,A1) = Entropy(S) - 26/64 * Entropy([21+,5-]) - 38/64 * Entropy([8+,30-]) = 0.27

Entropy([18+,33-]) = 0.94

Entropy([11+,2-]) = 0.62

Gain(S,A2) = Entropy(S) - 51/64 * Entropy([18+,33-]) - 13/64 * Entropy([11+,2-]) = 0.12
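Using the `entropy` helper sketched above, these two gains can be checked directly:

```python
e_S = entropy(29, 35)   # entropy of the full sample S = [29+,35-], ~0.99

gain_A1 = e_S - 26/64 * entropy(21, 5) - 38/64 * entropy(8, 30)
gain_A2 = e_S - 51/64 * entropy(18, 33) - 13/64 * entropy(11, 2)

print(round(gain_A1, 2))  # 0.27
print(round(gain_A2, 2))  # 0.12
```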

**Training Example:-**

The running example is the standard PlayTennis dataset: 14 training examples described by the attributes Outlook, Temperature, Humidity and Wind, with the target PlayTennis (yes/no).

**Selecting the Next Attribute**

The information gain values for the 4 attributes are:

Gain(S, Outlook) = 0.247

Gain(S, Humidity) = 0.151

Gain(S, Wind) = 0.048

Gain(S, Temperature) = 0.029

where S denotes the collection of training examples. Outlook has the highest gain, so it is selected as the attribute to test at the root node.
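These values can be reproduced with the `entropy` helper above; for example for Outlook, assuming the usual PlayTennis class counts (S = [9+,5-]; Sunny = [2+,3-], Overcast = [4+,0-], Rain = [3+,2-]):

```python
e_S = entropy(9, 5)  # entropy of the 14 PlayTennis examples, ~0.94
gain_outlook = (e_S
                - 5/14 * entropy(2, 3)   # Sunny
                - 4/14 * entropy(4, 0)   # Overcast (pure node)
                - 5/14 * entropy(3, 2))  # Rain
print(round(gain_outlook, 3))            # 0.247
```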

**Splitting Rule: GINI Index**

GINI Index

- A measure of node impurity:

GINI(t) = 1 - Σj [p(j|t)]²

where p(j|t) is the relative frequency of class j at node t. GINI is 0 for a pure node and largest when the classes are equally mixed.
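A minimal sketch of this measure (the helper name `gini` is mine; it takes the class counts at a node):

```python
def gini(counts):
    """GINI index of a node given its class counts, e.g. gini([9, 5])."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

print(gini([5, 5]))   # 0.5 -> maximally impure two-class node
print(gini([10, 0]))  # 0.0 -> pure node
```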

**Splitting Based on Continuous Attributes**

**Continuous Attribute - Binary Split**

For a continuous attribute:

- Partition the continuous values of attribute A into a discrete set of intervals.

- Create a new Boolean attribute Ac that is true when A < c, for some threshold c.

- Consider all possible splits and find the best cut, as in the sketch below.
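Here is a sketch of that threshold search for one numeric attribute, reusing the `entropy` helper from above (the function name `best_cut` and the example values are mine; candidate cuts are taken as midpoints between adjacent sorted values):

```python
def best_cut(values, labels):
    """Threshold c on a numeric attribute maximizing information gain
    for the binary split A < c vs. A >= c. Labels are 0/1."""
    n = len(values)
    e_S = entropy(labels.count(1), labels.count(0))
    pairs = sorted(zip(values, labels))
    best_gain, best_c = 0.0, None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue                             # no cut between equal values
        c = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint candidate
        left = [lab for v, lab in pairs if v < c]
        right = [lab for v, lab in pairs if v >= c]
        gain = (e_S
                - len(left) / n * entropy(left.count(1), left.count(0))
                - len(right) / n * entropy(right.count(1), right.count(0)))
        if gain > best_gain:
            best_gain, best_c = gain, c
    return best_c, best_gain

# Illustrative temperature-style values: the best cut falls between 48 and 60
print(best_cut([40, 48, 60, 72, 80, 90], [0, 0, 1, 1, 1, 0]))  # (54.0, ~0.46)
```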
