Introduction to Experimentation and Active Learning in Data Science

by Irawen on 09:01 in Data Science

Introduction

Data science and analytics need data (not to mention big data)
What if you don't have data?
Creating Data and analyzing it (sometimes rolled into the same grand problem statement)
Online vs. offline contexts for creating data
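The online context is covered in Reinforcement Learning
For the offline context, we will discuss Design of Experiments (DOE) and Active Learning
There is a critical difference between observational data and offline experimental data in DOE: experimental data comes from input settings the experimenter chooses deliberately, while observational data is recorded as it happens to occur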

Experimental Thinking

The operation of a system can be conceptualized as a combination of inputs which, when used together, result in outputs.

Formal experimentation involves systematic, purposeful changes to the input variables in an attempt to gain knowledge about the system and/or find the ideal settings that result in the best output.

Design of Experiments

The problems with adaptive One-Factor-At-a-Time (aOFAT)

The discrete case
The alternative is orthogonal arrays, illustrated through the full factorial design.
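To make the contrast concrete, here is a minimal sketch of a two-level full factorial design next to an adaptive OFAT search. The response function, its coefficients, and the helper names (full_factorial, aofat, response) are hypothetical and chosen only for illustration. The response includes a strong A×B interaction, which is one situation where changing a single factor at a time can settle on a poor setting.

```python
from itertools import product

def full_factorial(levels_per_factor):
    """Return every combination of factor levels as a list of tuples."""
    return list(product(*levels_per_factor))

def aofat(response, start, levels_per_factor):
    """Adaptive OFAT: change one factor at a time and keep a change
    only if it improves the observed response (maximisation)."""
    best = list(start)
    best_y = response(tuple(best))
    runs = 1
    for i, levels in enumerate(levels_per_factor):
        for level in levels:
            if level == best[i]:
                continue  # this setting has already been run
            trial = best.copy()
            trial[i] = level
            y = response(tuple(trial))
            runs += 1
            if y > best_y:
                best, best_y = trial, y
    return tuple(best), best_y, runs

def response(x):
    """Hypothetical noise-free system with a strong A*B interaction."""
    a, b, c = x
    return 2 * a * b + 0.5 * a + 0.5 * b + c

levels = [(-1, 1)] * 3                      # three factors, coded -1 / +1
design = full_factorial(levels)             # 2**3 = 8 runs
best_full = max(((x, response(x)) for x in design), key=lambda r: r[1])
aofat_result = aofat(response, start=(-1, -1, -1), levels_per_factor=levels)

print(len(design), best_full, aofat_result)
```

In this toy setup, aOFAT uses 4 runs and stops at (-1, -1, 1) with a response of 2.0, while the 8-run full factorial finds (1, 1, 1) with a response of 4.0. Moving A or B alone from the starting point looks worse, so the joint improvement is never tried.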

Analysing Designed Experiments

Classical Analysis
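As a rough sketch, and assuming that classical analysis here means estimating main effects by averaging over a two-level design, the calculation could look like the following. The data are hypothetical (generated from y = 3A + 2B - C with no noise), and main_effects and classical_choice are illustrative names, not part of any library.

```python
from itertools import product

def main_effects(runs):
    """runs: list of (levels, response) pairs with levels coded -1 / +1.
    Main effect of a factor = mean response at +1 minus mean response at -1."""
    n_factors = len(runs[0][0])
    effects = []
    for i in range(n_factors):
        high = [y for x, y in runs if x[i] == 1]
        low = [y for x, y in runs if x[i] == -1]
        effects.append(sum(high) / len(high) - sum(low) / len(low))
    return effects

def classical_choice(runs):
    """For each factor, pick the level favoured by the sign of its main effect."""
    return tuple(1 if e > 0 else -1 for e in main_effects(runs))

# Hypothetical noise-free data on a 2**3 full factorial
runs = [(x, 3 * x[0] + 2 * x[1] - x[2]) for x in product((-1, 1), repeat=3)]

effects = main_effects(runs)
choice = classical_choice(runs)
print(effects, choice)
```

Because every effect estimate averages over half of the runs, random error in individual runs tends to cancel out, which is the usual argument for this style of analysis when measurements are noisy.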

The Take-The-Best Heuristic

- TTB would have selected A = 1, B = 1, C = 1
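A minimal sketch of the idea follows. It assumes that TTB in this context means adopting the settings of the single run with the best observed response. The four-run design and the response values below are hypothetical and are not the data behind the result quoted above; take_the_best is an illustrative name.

```python
def take_the_best(runs):
    """Return the factor settings of the run with the highest observed response."""
    best_levels, _ = max(runs, key=lambda run: run[1])
    return best_levels

# Hypothetical four-run orthogonal array for factors A, B, C (levels 1 and 2)
runs = [
    ((1, 1, 1), 7.4),
    ((1, 2, 2), 9.1),
    ((2, 1, 2), 6.8),
    ((2, 2, 1), 5.2),
]

selected = take_the_best(runs)
print(selected)
```

The choice rests on one observation per candidate, so a single noisy run can change the answer. No averaging is involved.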

Where would we use Classical?

- High error/noise environments can be handled

Where would we use TTB?

- Ultra low error/noise environments

The statistical way

- Use a supervised learning technique (stepwise regression is popular)

Sequential Experimentation and Active Learning

Sequential Experimentation
Active Learning as semi-supervised learning or optimal experimental design
Strategies in Active Learning:

- Uncertainty sampling
- Query by committee
- Expected model change
- Expected error reduction and variance reduction
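The first two strategies can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not a full active learning loop: the logistic model, the committee of threshold classifiers, and the pool of unlabeled points are all hypothetical. Uncertainty is measured with binary entropy, and committee disagreement with vote entropy.

```python
import math
from collections import Counter

def binary_entropy(p):
    """Entropy (in bits) of a Bernoulli distribution with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def uncertainty_sampling(pool, predict_proba):
    """Query the unlabeled point whose predicted probability is most uncertain."""
    return max(pool, key=lambda x: binary_entropy(predict_proba(x)))

def vote_entropy(votes):
    """Entropy of the committee's label votes for one point."""
    n = len(votes)
    return -sum((c / n) * math.log2(c / n) for c in Counter(votes).values())

def query_by_committee(pool, committee):
    """Query the unlabeled point on which the committee disagrees most."""
    return max(pool, key=lambda x: vote_entropy([member(x) for member in committee]))

# Hypothetical current model: logistic curve with its decision boundary at x = 2
def predict_proba(x):
    return 1.0 / (1.0 + math.exp(-(x - 2.0)))

# Hypothetical committee: threshold classifiers that disagree on the boundary
committee = [lambda x, t=t: x > t for t in (1.5, 2.5, 3.0, 4.5)]

pool = [0.0, 1.0, 1.9, 2.7, 5.0]  # unlabeled candidate points

next_by_uncertainty = uncertainty_sampling(pool, predict_proba)
next_by_committee = query_by_committee(pool, committee)
print(next_by_uncertainty, next_by_committee)
```

In both cases the learner asks for the label of the point it expects to be most informative, then retrains and repeats. The two strategies can pick different points from the same pool, as they do here.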
