Trending Technology Machine Learning, Artificial Intelligent, Block Chain, IoT, DevOps, Data Science

Sunday, 10 June 2018

Evaluation and Cross-Validation

Experimental Evaluation of Learning Algorithms

Evaluating the performance of learning systems is important because:
  - Learning systems are usually designed to predict the class of "future" unlabeled data points.
Typical choices for performance evaluation:
  - Error
  - Accuracy
  - Precision/Recall
Typical choices for sampling methods:
  - Train/Test Sets
  - K-Fold Cross-validation
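The performance measures listed above can be computed directly from predicted and observed labels. The following is a minimal sketch for a binary problem; the function names and the toy label lists are made up for illustration, and precision or recall is returned as 0.0 when its denominator is zero (a convention chosen here, not a rule).

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive=1):
    """Return error, accuracy, precision and recall as a dictionary."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    return {
        "error": 1 - accuracy,  # fraction of misclassified examples
        "accuracy": accuracy,   # fraction of correctly classified examples
        # Assumed convention: report 0.0 when the denominator is zero.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical observed and predicted labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

Error and accuracy always sum to 1, while precision and recall look only at the positive class, which makes them more informative when the classes are imbalanced.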

Evaluating predictions

Suppose we want to predict the value of a target feature on example x:
  - y is the observed value of the target feature on example x.
  - ŷ is the predicted value of the target feature on example x.
  - How is the error measured?

Sample Error and True Error

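The sample error of hypothesis f with respect to target function c and data sample S is the fraction of examples in S that f misclassifies:
       error_S(f) = \frac{1}{n} \sum_{x \in S} \delta(f(x), c(x))
where n is the number of examples in S, and \delta(f(x), c(x)) is 1 if f(x) \neq c(x) and 0 otherwise.
The true error (denoted error_D(f)) of hypothesis f with respect to target function c and distribution D is the probability that f will misclassify an instance drawn at random according to D:
          error_D(f) = \Pr_{x \in D}[f(x) \neq c(x)]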
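As a rough illustration of the sample error defined above, the sketch below counts the examples on which a hypothesis f disagrees with a target function c. Both functions and the sample S are hypothetical toy choices, not taken from the post.

```python
def sample_error(f, c, sample):
    """Fraction of examples in `sample` on which f disagrees with c."""
    disagreements = sum(1 for x in sample if f(x) != c(x))
    return disagreements / len(sample)


def c(x):
    """Hypothetical target function: an integer is positive when x >= 5."""
    return x >= 5


def f(x):
    """Hypothetical learned hypothesis: its threshold is slightly off."""
    return x >= 7


S = list(range(10))  # toy data sample: the integers 0..9
err = sample_error(f, c, S)
print(err)  # f and c disagree on x = 5 and x = 6, i.e. 2 of 10 examples
```

The true error cannot be computed this way because the distribution D is usually unknown; the sample error is only an estimate of it.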

Difficulties in evaluating hypotheses with limited data

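Bias in the estimate: The sample error measured on the training examples is an optimistically biased estimate of the true error, because the hypothesis was chosen to fit those examples.
  - ==> test the hypothesis on an independent test set.
We divide the examples into:
  - Training examples that are used to train the learner
  - Test examples that are used to evaluate the learner
Variance in the estimate: Even on an independent test set, the measured error can differ from the true error. The smaller the test set, the greater the expected variance.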
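A minimal sketch of both points: a random split into training and test examples, and the spread of the error estimate as a function of test set size. The helper names, the 30% test fraction and the error rate of 0.2 are assumptions made for illustration. The spread formula treats the test examples as independent draws, so the number of mistakes follows a binomial distribution.

```python
import math
import random


def train_test_split(examples, test_fraction=0.3, seed=0):
    """Shuffle a copy of the examples and split it into (train, test)."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]


def error_std(error_rate, n_test):
    """Standard deviation of an error estimate measured on n_test
    independent test examples (binomial assumption)."""
    return math.sqrt(error_rate * (1 - error_rate) / n_test)


examples = list(range(100))  # placeholder examples
train, test = train_test_split(examples)
print(len(train), len(test))

# Assumed true error rate of 0.2: the spread shrinks as the test set grows.
for n in (25, 100, 400):
    print(n, round(error_std(0.2, n), 3))
```

Quadrupling the test set only halves the standard deviation, which is why small test sets give noisy estimates.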

Validation Set



K-fold cross validation
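
In k-fold cross-validation the data is partitioned into k folds; each fold is used once as the test set while the remaining k-1 folds are used for training, and the k error estimates are averaged. The sketch below is one possible way to write this in plain Python. The fold construction, the `majority_learner` toy learner and the data are all hypothetical, and the data is assumed to be already shuffled.

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(examples, labels, train_fn, k=5):
    """Average test error over k folds; train_fn returns a predictor."""
    errors = []
    for test_idx in k_fold_indices(len(examples), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(examples)) if i not in held_out]
        model = train_fn([examples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        wrong = sum(1 for i in test_idx if model(examples[i]) != labels[i])
        errors.append(wrong / len(test_idx))
    return sum(errors) / k


def majority_learner(train_x, train_y):
    """Hypothetical toy learner: always predict the most common label."""
    majority = max(set(train_y), key=train_y.count)
    return lambda x: majority


# Toy data, assumed to be already shuffled.
xs = list(range(10))
ys = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
cv_error = cross_validate(xs, ys, majority_learner, k=5)
print(cv_error)
```

Every example is used for testing exactly once, so the estimate makes fuller use of limited data than a single train/test split, at the cost of training k times.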





Trade-off

In machine learning, there is a trade-off between:
  - complex hypotheses that fit the training data well
  - simpler hypotheses that may generalize better.
As the amount of training data increases, the generalization error generally tends to decrease.
