Mathematics

1. Calculus

1. The Fundamental Theorem of Calculus
   All across the field of machine learning, calculus is utilized. In the world of deep learning, optimization via backpropagation reigns supreme, and it is easy to end up either letting the computer handle all of the differentiation and integration for you, or forgetting the underlying intuition of what the equations mean in English.
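
   For reference, the theorem itself is short enough to state here: if $f$ is continuous on $[a, b]$ and $F$ is any antiderivative of $f$, then

   $$\int_a^b f(x)\,dx = F(b) - F(a)$$

   In English: accumulating the instantaneous rate of change of $F$ across an interval recovers the total change in $F$.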

2. Linear Algebra

1. Linear Algebra Introduction
   Linear algebra is frequently utilized in the implementation of machine learning algorithms, so it is very important to have an intuitive understanding of what it represents and how it is used. I recommend reading this alongside my numpy walkthrough in the math appendix.
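
   As a small taste of what this looks like in practice, here is a minimal numpy sketch (in the spirit of that walkthrough) of the basic objects and operations involved:

   ```python
   import numpy as np

   # Vectors and matrices are the basic objects of linear algebra
   v = np.array([1.0, 2.0])
   M = np.array([[3.0, 0.0],
                 [0.0, 0.5]])

   # Two operations that appear constantly in machine learning code:
   print(np.dot(v, v))  # the dot product -> 5.0
   print(M @ v)         # a matrix-vector product -> [3. 1.]
   ```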

2. Linear Combination, Linear Transformation, Dot Product
   As we progress in our understanding of the math surrounding machine learning, AI, and DS, there will be a host of linear algebra concepts that we are forced to reckon with. From PCA and its use of eigenvalues and eigenvectors, to neural networks' reliance on linear combinations and matrix multiplication, the list goes on and on. Having a very solid grasp of linear algebra is crucial to understanding how and why these algorithms work.
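
   To make that reliance concrete, here is a small numpy sketch showing that a matrix-vector product is exactly a linear combination of the matrix's columns:

   ```python
   import numpy as np

   A = np.array([[2.0, 0.0],
                 [1.0, 3.0]])
   v = np.array([4.0, 5.0])

   # The matrix-vector product...
   Av = A @ v

   # ...is the same linear combination of A's columns,
   # weighted by the entries of v
   combo = v[0] * A[:, 0] + v[1] * A[:, 1]

   print(Av)     # [ 8. 19.]
   print(combo)  # [ 8. 19.]
   ```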

3. The Determinant, Linear Systems Of Equations & Inverse Matrices
   Recall that when we discussed linear transformations, some stretch space out, while others squish it in:
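
   The determinant is the single number that quantifies this stretching and squishing; for a $2 \times 2$ matrix it gives the signed factor by which areas are scaled,

   $$\det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc$$

   and when it is zero the transformation squishes space into a lower dimension, which is exactly why such a matrix has no inverse.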

3. Probability

1. Introduction to Probability Theory
   This is a post that I have been excited to get to for over a year now. Probability theory plays an incredibly interesting and unique role in the study of machine learning and artificial intelligence techniques. It gives us a wonderful way of dealing with uncertainty, and shows up in everything from Hidden Markov Models and Bayesian Networks to Causal Path Analysis and Bayesian A/B testing, among many other areas.

2. Bayes Rule
   The main goal of this post is to dig a bit further into Bayes Rule, from a purely probabilistic perspective! Before we begin I do want to make one note: a great deal of the power of Bayes Rule comes in the form of Bayesian inference and Bayesian statistics, which can be found in the statistics section. I would recommend reading both of those posts as well if you are interested, since they demonstrate the application of Bayes Rule to real world problems. If you have caught the Bayesian bug at that point, then I recommend reading my posts on Bayesian A/B testing, found in the Machine Learning section.
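
   For reference, the rule itself: for events $A$ and $B$ with $P(B) > 0$,

   $$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

   Much of its power comes from reading it as a recipe for updating our belief in $A$ after observing $B$.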

3. Probability Inequalities
   Probability inequalities play a large role in determining an answer to the crucial question: Is learning feasible? Of course, in this context I am referring to statistical/machine learning. You may think to yourself: "Of course it is possible! We constantly hear about wonderful new algorithms and ML techniques being created, computer vision systems, natural language understanding virtual assistants, many of which are discussed in depth in this blog!" In this you are most certainly correct.
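
   A canonical example of such an inequality, stated here just to set the stage, is Hoeffding's inequality: if $\nu$ is the frequency of some event observed over $N$ independent samples and $\mu$ is its true probability, then

   $$P\big(|\nu - \mu| > \epsilon\big) \le 2e^{-2\epsilon^2 N}$$

   In other words, the chance that what we observe in a sample strays far from the truth decays exponentially as the sample grows, and this is precisely the kind of guarantee that makes learning feasible.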

4. Histogram vs. PDF vs. CDF
   This post is TODO.

4. Statistics

1. Introduction to Statistics
   I have been meaning to get to an introductory statistics post for quite some time now! Statistics plays an incredibly important role in modern day machine learning. For instance, while it is a far less "sexy" description, modern day machine learning can most often be reduced to variations of statistical learning, where a statistical model can be defined as follows:

2. History of the Gaussian Distribution
   If you have read any of my other posts, worked with statistics/probability, or done any sort of machine learning, there is a very good chance that you have come across the Gaussian distribution. The Gaussian distribution, also known as the normal distribution, has an incredibly large range of uses; we will not talk about them here, however. For that I recommend looking through my other notebooks, digging into the Central Limit Theorem, sampling, Gaussian Mixture Models, distributions in the social sciences, hypothesis testing, and so on.

3. Statistical Inference and Frequentist A/B Testing
   If you went through my Introduction to Statistics post, then you are finally arriving at the payoff for all of our work: statistical inference. Statistical inference is used in a wide variety of ways, but for now we will informally describe it as follows:

4. Non Parametric Hypothesis Testing: KS-Score
   Generally, in introductory statistics courses, students are taught hypothesis testing via Student's $t$ and chi-squared tests. This doesn't inherently pose a problem, but in practice these tests require many assumptions and conditions to be met, leaving many students memorizing a set of steps for a method that can only be applied under very specific conditions. If those conditions change, then the student is either lost or, even worse, left applying an inappropriate and even misleading method.
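
   As a taste of how simple the alternative can be, here is a minimal sketch of the two-sample Kolmogorov-Smirnov test using scipy, with synthetic data invented purely for illustration:

   ```python
   import numpy as np
   from scipy import stats

   rng = np.random.default_rng(42)

   # Two synthetic samples: same scale, slightly shifted locations
   a = rng.normal(loc=0.0, scale=1.0, size=500)
   b = rng.normal(loc=0.5, scale=1.0, size=500)

   # The two-sample KS test compares the empirical CDFs directly,
   # with no normality assumption required
   statistic, p_value = stats.ks_2samp(a, b)
   print(statistic, p_value)
   ```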

5. Limitations of Law of Large Numbers and Central Limit Theorem
   Let's dig into areas where the Central Limit Theorem and Law of Large Numbers break down.
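
   A classic example, sketched below with numpy, is the Cauchy distribution: it has no finite mean or variance, so sample means never settle down the way the Law of Large Numbers would ordinarily guarantee:

   ```python
   import numpy as np

   rng = np.random.default_rng(0)

   # The Cauchy distribution has no finite mean or variance, so the
   # usual LLN/CLT guarantees do not apply to its sample means
   for n in [100, 10_000, 1_000_000]:
       sample = rng.standard_cauchy(n)
       print(n, sample.mean())  # does not converge as n grows
   ```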

5. Information Theory

1. Cross Entropy and Maximum Likelihood Estimation
   If you have gone through any of my other walkthroughs on machine learning, particularly those on Logistic Regression, Neural Networks, Decision Trees, or Bayesian machine learning, you have definitely come across the concepts of Cross Entropy and Maximum Likelihood Estimation. Now, when discussed separately, these are relatively simple concepts to understand. However, during the creation of these notebooks, particularly the sections on logistic regression and neural networks (and the cost functions involved), I felt as though it was not clear why they were related in certain cases.
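
   To state the connection up front: for binary labels $y_i \in \{0, 1\}$ and predicted probabilities $\hat{y}_i$, maximizing the Bernoulli likelihood of the data is equivalent to minimizing the negative log likelihood,

   $$-\sum_{i=1}^{N} \Big[ y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i) \Big]$$

   which is exactly the binary cross entropy cost function used in logistic regression and neural networks.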

6. Functions

1. Composition of Functions
   This is a post that I have been excited to write for some time now. I realize that if you are reading this blog you most likely already have a good handle on what a function is, both in the contexts of mathematics and computer science. However, I recently saw just how shallow my own understanding was during my quest to understand the history of the normal distribution.
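
   As a quick refresher on the notation, the composition of two functions $f$ and $g$ is defined as

   $$(f \circ g)(x) = f(g(x))$$

   read as "apply $g$ first, then $f$."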

2. Exploring Inverse Functions, Exponentials, and Logarithms
   Throughout fields ranging from mathematics and computer science, to physics and engineering, to economics and biology, you will frequently run into the concepts of the exponential and the logarithm. You may have memorized these concepts years ago, committed the rules for manipulating them to memory, and then simply treated them with an axiomatic esteem; they are felt to be fundamental givens rather than derivable from lower level principles. As with many things, this generally presents no problem. For instance, if I asked you to simplify the following expression:

7. Probability Theory, The Logic of Science

1. Chapter 1 - Plausible Reasoning
   As we tread further into the twenty-first century, almost everyone is expected to memorize the mantra "we must make data-driven decisions" (well, at least most people in the technology space, and certainly data scientists). However, I want us to pause for a moment and think about what that really means.


© 2018 Nathaniel Dake