Logistic Regression in Machine Learning

Name derived from the Logit Transformation

Differences from OLS

Used for predicting the outcome of a binary dependent variable (DV), e.g. yes or no,
i.e., the DV has to be a nominal variable restricted to only 2 states

           - Customer will pay/default on loan (Credit Risk)
           - Customer will respond/ignore the offer (Marketing Response)
           - Customer will churn/stay loyal (Telecom etc.)

Uses a Logit transformation of the probability of the DV, so that a linear regression model can be fitted on the log-odds scale.
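
One common way to write this transformation is sketched below. The symbols are notation assumed here, not taken from these notes: p is the probability that the event occurs, x_1 ... x_k are the independent variables, and beta_0 ... beta_k are the coefficients to be estimated.

```latex
\operatorname{logit}(p) = \ln\frac{p}{1-p} = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k
\qquad\Longleftrightarrow\qquad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}
```

The left-hand form is linear in the independent variables; the right-hand form shows that the implied probability always stays between 0 and 1.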

Example - Hours Studied vs. Passing

- We can study how the probability of passing changes with the hours studied, using the joint probability distribution.
- Can we train a regression model on this relationship?
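
A minimal sketch of how such a model could be trained, assuming a small made-up dataset of hours studied and pass/fail results. The data, the function names, and the choice of plain gradient ascent are illustrative assumptions; in practice a statistics library would normally do the fitting.

```python
import math

# Hypothetical data: hours studied and whether the student passed (1) or failed (0).
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
passed = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]


def sigmoid(z):
    """Map a log-odds value z to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, n_iter=5000):
    """Estimate intercept b0 and slope b1 by gradient ascent on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        # Gradient of the log-likelihood: (observed - predicted), weighted by the input.
        g0 = sum(y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)) / n
        g1 = sum((y - sigmoid(b0 + b1 * x)) * x for x, y in zip(xs, ys)) / n
        b0 += learning_rate * g0
        b1 += learning_rate * g1
    return b0, b1


b0, b1 = fit_logistic(hours, passed)


def prob_pass(h):
    """Predicted probability of passing after h hours of study."""
    return sigmoid(b0 + b1 * h)


for h in (1, 3, 5):
    print(f"{h} hours -> P(pass) = {prob_pass(h):.2f}")
```

With this data the fitted slope is positive, so the predicted probability of passing rises with the hours studied while never leaving the 0 to 1 range.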

Logistic Regression - Concepts

Model the PROBABILITY of an event, rather than a measure

- Probabilities range from 0 to 1 (Min probability of any event is 0, max is 1)
- Need to create a dependent variable expressed as a probability, which requires a transformation of the binary nominal variable in the dataset.

LOGIT transformation used to create the dependent variable, hence the name Logistic Regression.
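
The mapping from probability to odds to logit can be sketched numerically as follows. The function names are illustrative. Note that the raw 0/1 values have no finite logit, so the transformation acts on a probability strictly between 0 and 1.

```python
import math


def logit(p):
    """Log-odds of a probability p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(z):
    """Convert a log-odds value z back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


for p in (0.1, 0.5, 0.9):
    odds = p / (1.0 - p)
    z = logit(p)
    print(f"p={p:.1f}  odds={odds:.3f}  logit={z:+.3f}  back to p={inverse_logit(z):.1f}")
```

Probabilities below 0.5 give negative logits, 0.5 gives a logit of 0, and probabilities above 0.5 give positive logits, so the bounded 0 to 1 range is stretched over the whole number line.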

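Most assumptions of OLS regression still apply (independent observations, little multicollinearity, and linearity, here on the log-odds scale), although normally distributed, constant-variance errors do not hold for a binary DV. Deviations are tolerated to a large extent, because the end result in most cases requires only a rank order.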
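
As a sketch of what "only a rank order" means in practice: when the goal is to sort cases from most to least likely, ranking by log-odds and ranking by probability give the same order, because the inverse logit is monotonic. The customer labels and scores below are made up for illustration.

```python
import math

# Hypothetical log-odds scores for five customers from some fitted model.
log_odds = {"A": -2.0, "B": 0.3, "C": 1.7, "D": -0.4, "E": 0.9}


def inverse_logit(z):
    """Convert a log-odds value z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


probabilities = {name: inverse_logit(z) for name, z in log_odds.items()}

# Sort customers from highest to lowest score on each scale.
rank_by_log_odds = sorted(log_odds, key=log_odds.get, reverse=True)
rank_by_probability = sorted(probabilities, key=probabilities.get, reverse=True)

print(rank_by_log_odds)
print(rank_by_probability)
```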

Logistic Regression produces a predicted probability, which is converted to a binary result and used to predict the outcome of a categorical dependent variable. So the outcome should be discrete/categorical, such as: