Motivation
It is important to understand some of the more traditional approaches to function estimation and classification before delving into the trendier topics of neural networks and decision trees. Many of these methods build on one another, and thus to truly become a MACHINE LEARNING MASTER, we've got to pay our dues. We will therefore start with the slightly less sexy topic of kernel density estimation.
Let $X$ be a random variable with a continuous cumulative distribution function (CDF) $F(x)$ and probability density function (PDF)

\[ f(x) = \frac{d}{dx} F(x). \]
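As a quick illustration (an example added here, not from the original post), take the exponential distribution with rate $\lambda$:

\[ F(x) = 1 - e^{-\lambda x}, \quad x \ge 0 \quad \Longrightarrow \quad f(x) = \frac{d}{dx} F(x) = \lambda e^{-\lambda x}. \]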
Our goal is to estimate $f(x)$ from a random sample $X_1, \ldots, X_n$. Estimation of $f(x)$ has a number of applications, including construction of the popular Naive Bayes classifier,

\[ \hat{\Pr}(C = c \mid X = x_0) = \frac{\hat{\pi}_c \hat{f}_{c}(x_0)}{\sum_{k=1}^{C} \hat{\pi}_{k} \hat{f}_{k}(x_0)}. \]
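Here $\hat{\pi}_c$ is the estimated prior probability of class $c$ and $\hat{f}_c$ is the estimated class-conditional density. To make the connection concrete, below is a minimal one-feature sketch in Python (not code from the original post): it plugs a Gaussian kernel density estimate of each $\hat{f}_c$ into the formula above, assuming a fixed bandwidth `h = 0.5` (an arbitrary choice; bandwidth selection is its own topic) and priors $\hat{\pi}_c$ estimated by class proportions.

```python
import numpy as np

def kde(x0, sample, h=0.5):
    """Gaussian kernel density estimate f_hat(x0) from a 1-D sample."""
    u = (x0 - sample) / h
    # Average the standard normal kernel over the sample, then rescale by h.
    return np.mean(np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)) / h

def naive_bayes_posterior(x0, samples_by_class, h=0.5):
    """Posterior Pr(C = c | X = x0) with class densities estimated by KDE.

    samples_by_class: dict mapping class label -> 1-D array of training x's.
    Priors pi_c are estimated by class proportions.
    """
    n_total = sum(len(s) for s in samples_by_class.values())
    # Numerator of the Naive Bayes formula: pi_c * f_c(x0) for each class.
    scores = {c: (len(s) / n_total) * kde(x0, np.asarray(s), h)
              for c, s in samples_by_class.items()}
    # Normalize by the denominator, the sum over all classes.
    z = sum(scores.values())
    return {c: v / z for c, v in scores.items()}

# Example: two classes with different means; classify the point x0 = 1.
rng = np.random.default_rng(0)
data = {0: rng.normal(0, 1, 200), 1: rng.normal(2, 1, 200)}
print(naive_bayes_posterior(1.0, data))
```

With the two class samples centered at 0 and 2, a query point at $x_0 = 1$ sits between them, so the posterior probabilities come out close to 50/50, exactly as the formula suggests.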
