valatwork/statistics

Statistics: theory, examples and exercises

Table of contents

1 - Introduction

1.0 - Terminology and basic concepts

2 - Descriptive Statistics

2.1 - Measures of Central Tendency

2.1.1 - Mean

2.1.2 - Median

2.1.3 - Midrange

2.1.4 - Mode

2.2 - Measures of Shape

2.2.1 - Skewness

2.2.2 - Kurtosis

2.3 - Measures of Dispersion

2.3.1 - Measures of Dispersion - Intro

2.3.2 - Sample vs Population

2.3.3 - Range

2.3.4 - Variance

2.3.5 - Standard Deviation

2.3.6 - Scaling and shifting

3 - Probability

3.1 - Introduction to Probability

3.2 - Set Operations

3.3 - (Some) Rules of Probability

3.4 - Combinatorics

3.5 - Probability Exercises

4 - Probability Distributions

4.1 - Introduction to Probability Distributions

4.2 - Covariance

4.3 - Correlation

4.4 - Probability Mass Function and Probability Density Function

4.5 - Expected Value

4.6 - Marginal Probability

4.7 - Uniform Distribution

4.8 - Normal Distribution

4.9 - Central Limit Theorem

4.10 - Exponential and Laplace Distributions

4.11 - Bernoulli, Binomial, Multinomial Distributions

4.12 - Poisson Distribution

5 - Inferential Statistics

5.1 - Introduction

5.1.1 - Entropy

5.1.2 - Statistics and Machine Learning

5.2 - Hypothesis Testing

5.2.1 - Hypothesis Testing

5.2.2 - Z-Score

5.2.3 - P-Value

5.2.4 - Single Sample t-tests

5.2.5 - Independent t-tests

5.2.6 - Paired t-tests

5.2.7 - Confidence Intervals

5.2.8 - ANOVA

5.2.9 - Correlation (expanded): Pearson's

6 - Regression and Machine Learning

6.1 - Supervised Learning

6.1.1 - Independent vs Dependent variables

6.1.2 - Linear Regression

6.1.3 - Ordinary Least Squares (OLS)

6.1.4 - Multiple Linear Regression

6.1.5 - Cost Function, Gradient Descent, Residuals

6.1.6 - Polynomial Regression

6.1.7 - Regularization, Feature Scaling, Cross Validation

6.1.8 - Ridge Regression, Lasso Regression, Elastic Net

6.1.9 - Feature Engineering

6.1.10 - Cross Validation and Grid Search

6.1.11 - Logistic Regression

6.1.12 - k-Nearest Neighbors (kNN)

6.1.13 - Support Vector Machines (SVM) and Support Vector Regression (SVR)

6.1.14 - Decision Trees

6.1.15 - Random Forests

6.1.16 - Boosting Methods

6.1.17 - Dimensionality Reduction

6.1.18 - Principal Component Analysis (PCA)

6.1.19 - Naive Bayes and Natural Language Processing

6.2 - Unsupervised Learning

6.2.1 - K-means clustering

6.2.2 - Hierarchical clustering

6.2.3 - DBSCAN

6.2.4 - Dimensionality Reduction

Appendices

Appendix A - Sources

Appendix X - Python Reference

Appendix Y - Subset Selection Theory

Appendix Z - Variable Types Examples

About

Putting together study material from various sources
