**Hypothesis Space :-**

The space of all **hypotheses** that can, in principle, be output by a learning algorithm.

We can think of a **supervised learning machine** as a device that explores a "**hypothesis space**".

- Each setting of the parameters in the machine is a different hypothesis about the function that maps input vectors to output vectors.
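The idea above can be sketched in a few lines: a minimal toy family of 1-D threshold classifiers, where each value of the parameter t selects a different hypothesis (the family and names are illustrative assumptions, not a specific algorithm from the notes).

```python
# Each parameter setting of the "machine" is one hypothesis.
# Here the hypothesis space is the family of threshold classifiers
# h_t(x) = 1 if x >= t else 0, indexed by the parameter t.

def make_hypothesis(t):
    """Return the hypothesis (a function) selected by parameter t."""
    return lambda x: 1 if x >= t else 0

# Exploring the hypothesis space = trying different parameter settings.
hypothesis_space = [make_hypothesis(t) for t in (0.0, 0.5, 1.0)]

h = hypothesis_space[1]   # the hypothesis with threshold t = 0.5
print(h(0.3), h(0.7))     # → 0 1
```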

**Representation :-**

**Terminology**

*Example (x, y) :* Instance x with label y = f(x).

*Training Data S :* Collection of examples observed by the learning algorithm.

*Instance Space X :* Set of all possible objects describable by features.

*Concept c :* Subset of objects from X (c is unknown).

*Target Function f :* Maps each instance x ∈ X to a target label y ∈ Y.

*Hypothesis h :* Classifier; a function that approximates f.

*Hypothesis Space H :* Set of functions we allow for approximating f. The set of hypotheses that can be produced can be restricted further by specifying a language bias.

*Input :* Training set S ⊆ X

*Output :* A hypothesis h ∈ H

**Inductive Bias**

We need to make assumptions:

- Experience alone doesn't allow us to draw conclusions about unseen data instances.

Two types of bias :

- Restriction : limit the hypothesis space

- Preference : impose an ordering on the hypothesis space
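The two kinds of bias can be sketched on a toy hypothesis space of interval classifiers h(x) = 1 iff lo <= x <= hi (the grid, data, and names here are illustrative assumptions):

```python
data = [(0.2, 0), (0.5, 1), (0.9, 0)]  # (x, y) training examples

def consistent(lo, hi):
    """Does the interval classifier [lo, hi] agree with every example?"""
    return all((lo <= x <= hi) == bool(y) for x, y in data)

# Restriction bias: only consider intervals with endpoints on a coarse grid,
# which limits the hypothesis space before learning starts.
grid = [i / 4 for i in range(5)]  # 0.0, 0.25, 0.5, 0.75, 1.0
candidates = [(lo, hi) for lo in grid for hi in grid
              if lo <= hi and consistent(lo, hi)]

# Preference bias: among the consistent hypotheses, impose an ordering
# (here: prefer the narrowest interval) and pick the minimum.
best = min(candidates, key=lambda h: h[1] - h[0])
print(best)  # → (0.5, 0.5)
```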

**Inductive learning**

**Inductive Learning :** Inducing a general function from training examples.

- Construct a hypothesis h to agree with c on the training examples.

- A hypothesis is consistent if it agrees with all training examples.

- A hypothesis is said to generalize well if it correctly predicts the value of y for novel examples.
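The consistency test above is a one-liner; a minimal sketch with a toy parity rule (the data and hypothesis are illustrative assumptions):

```python
def is_consistent(h, examples):
    """A hypothesis is consistent if it agrees with every training example."""
    return all(h(x) == y for x, y in examples)

train = [(1, 1), (2, 0), (3, 1)]  # toy (x, y) pairs: label = parity of x
h = lambda x: x % 2               # hypothesis: predict the parity of x

print(is_consistent(h, train))    # → True (agrees with all training examples)
print(h(5))                       # → 1 (correct prediction on a novel example)
```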

**Inductive Learning is an Ill-Posed Problem :**

Unless we see all possible examples, the data is not sufficient for an inductive learning algorithm to find a unique solution.

**Inductive Learning Hypothesis**

Any hypothesis h found to approximate the target function c well over a sufficiently large set of training examples D will also approximate the target function well over other unobserved examples.

**Learning as Refining the Hypothesis Space**

Concept learning is the task of searching a hypothesis space of possible representations, looking for the representation(s) that best fit the data, given the bias.

The tendency to prefer one hypothesis over another is called **bias**.

Given a representation, data, and a bias, the problem of learning can be reduced to one of search.

**Occam's Razor**

A classical example of inductive bias: the simplest consistent hypothesis about the target function is actually the best.
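A sketch of Occam's razor as a selection rule: among the hypotheses consistent with the data, pick the simplest, where "simplicity" is measured here by a rule count (a stand-in complexity measure; the hypotheses and data are illustrative assumptions):

```python
data = [(1, 1), (3, 1), (4, 0)]

# Each entry: (description, complexity, classifier).
hypotheses = [
    ("x is odd",    1, lambda x: x % 2),                     # one general rule
    ("x in {1, 3}", 2, lambda x: 1 if x in (1, 3) else 0),   # memorizes cases
]

# Keep only hypotheses that agree with every example...
consistent = [(name, size, h) for name, size, h in hypotheses
              if all(h(x) == y for x, y in data)]

# ...then prefer the simplest one.
name, size, h = min(consistent, key=lambda t: t[1])
print(name)  # → x is odd
```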

**Some more Types of Inductive Bias**

**Minimum description length :** When forming a hypothesis, attempt to minimize the length of the description of the hypothesis.

**Maximum margin :** When drawing a boundary between two classes, attempt to maximize the width of the boundary (SVM).
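The minimum-description-length idea can be sketched as a two-part code: score a hypothesis by the bits needed to describe it plus the bits needed to describe the data's exceptions to it (the bit counts below are illustrative stand-ins for real code lengths):

```python
def description_length(hypothesis_bits, exceptions, bits_per_exception=8):
    """Two-part MDL score: cost of the hypothesis + cost of its exceptions."""
    return hypothesis_bits + len(exceptions) * bits_per_exception

# A short rule with one exception vs. a long rule explaining everything.
simple_rule = description_length(hypothesis_bits=16, exceptions=[42])
complex_rule = description_length(hypothesis_bits=96, exceptions=[])

print(min(simple_rule, complex_rule))  # → 24: MDL prefers the smaller total
```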

**Important issues in Machine Learning**

- What are good hypothesis spaces ?

- Algorithms that work with the hypothesis spaces

- How to optimize accuracy over future data points (overfitting)

- How can we have confidence in the result ? (How much training data is needed ? - a statistical question)

- Are some learning problems computationally intractable ?

**Generalization**

Components of generalization error

**- Bias** : how much the average model over all training sets differs from the true model. Error due to inaccurate assumptions/simplifications made by the model.

**- Variance** : how much models estimated from different training sets differ from each other.
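These two components can be measured directly by resampling: below, a minimal sketch that estimates the bias and variance of two toy estimators of the true model y = 2x at a query point (all numbers, noise levels, and estimators are illustrative assumptions):

```python
import random
random.seed(0)

true = lambda x: 2 * x
x0 = 1.0  # query point where we measure the error

preds_a, preds_b = [], []
for _ in range(10_000):
    # One noisy training example per training set.
    y = true(x0) + random.gauss(0, 0.5)
    preds_a.append(1.0)  # estimator A: always predict 1 (rigid -> biased)
    preds_b.append(y)    # estimator B: fit the single example (noisy -> high variance)

def bias_and_variance(preds):
    mean = sum(preds) / len(preds)
    bias = mean - true(x0)                             # avg model vs. true model
    var = sum((p - mean) ** 2 for p in preds) / len(preds)  # spread across sets
    return bias, var

print(bias_and_variance(preds_a))  # large bias, zero variance
print(bias_and_variance(preds_b))  # near-zero bias, nonzero variance
```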

**Underfitting and Overfitting**

**Underfitting**: model is too "simple" to represent all the relevant class characteristics

- High bias and low variance

- High training error and high test error

**Overfitting**: model is too "complex" and fits irrelevant characteristics (noise) in the data

- Low bias and high variance

- Low training error and high test error
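The two failure modes above can be seen in a toy experiment: a model that ignores the input (underfitting) versus one that memorizes the noisy training set (overfitting), compared on train and test error (the rule, noise level, and models are illustrative assumptions):

```python
import random
random.seed(1)

def sample(n):
    """True rule: y = 1 if x > 0.5, with 20% label noise."""
    data = []
    for _ in range(n):
        x = random.random()
        y = int(x > 0.5)
        if random.random() < 0.2:
            y = 1 - y
        data.append((x, y))
    return data

train, test = sample(200), sample(200)

def error(h, data):
    return sum(h(x) != y for x, y in data) / len(data)

underfit = lambda x: 0                   # too simple: ignores x entirely
memorized = dict(train)
overfit = lambda x: memorized.get(x, 0)  # too complex: memorizes noise

print(error(underfit, train), error(underfit, test))  # high train, high test
print(error(overfit, train), error(overfit, test))    # low train, high test
```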
