Lecture 2
Supervised learning methods
Memory-based algorithms: K-nearest neighbors (K-NN)
Approaching supervised learning problems fairly and systematically
Training, testing, validation, and cross-validation
ROC curves and the c-index
Resampling: bootstrapping, boosting, bagging
Combining systems: mixing models and voting
Data preparation: component analysis
Brief introduction to statistical pattern recognition and Bayesian estimation
Memory-based algorithms
- Duda, Richard O., Hart, Peter E., and Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley.
- Sections 4.1-4.6: Nonparametric techniques, pp. 161-192.
- http://en.wikipedia.org/wiki/KNN
Training, testing, validation, and cross-validation
- Duda, Richard O., Hart, Peter E., and Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley.
- Section 9.6.2: Cross-validation, pp. 483-485.
ROC curves and the c-index
- http://en.wikipedia.org/wiki/Roc_curve
- Duda, Richard O., Hart, Peter E., and Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley.
- Section 2.8.3: Signal detection theory and operating characteristics, pp. 48-51.
Resampling: bootstrapping, boosting, bagging
- http://en.wikipedia.org/wiki/Resampling_%28statistics%29
- Duda, Richard O., Hart, Peter E., and Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley.
- Section 9.4: Resampling for estimating statistics, pp. 471-475.
- Section 9.5: Resampling for classifier design, pp. 475-482.
Mixing models and voting
- Duda, Richard O., Hart, Peter E., and Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley.
- Section 9.7: Combining classifiers, pp. 495-499.
- Carpenter, Gail A., Grossberg, Stephen, Markuzon, Natalya, Reynolds, John H., and Rosen, David B. (1992) Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps. IEEE Transactions on Neural Networks, 3, 698–713.
Component analysis
- Duda, Richard O., Hart, Peter E., and Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley.
- Section 3.8: Component analysis and discriminants, pp. 114-124.
Maximum-likelihood and Bayesian parameter estimation