Pattern recognition

For the William Gibson novel, see: Pattern Recognition (novel).

Pattern recognition (also known as classification, pattern classification or statistical classification) is a field within the area of machine learning and can be defined as "the act of taking in raw data and taking an action based on the category of the data" [1]. As such, it is a collection of methods for supervised learning.

Typical applications are automatic speech recognition, classification of text into several categories (e.g. spam/non-spam email messages), the automatic recognition of handwritten postal codes on postal envelopes, or the automatic recognition of images of human faces. The last three examples form the subtopic image analysis of pattern recognition that deals with digital images as input to pattern recognition systems.

Formally, the problem can be stated as follows: given training data $\{(\mathbf{x}_1, y_1),\dots,(\mathbf{x}_n, y_n)\}$, produce a classifier $h:\mathcal{X}\rightarrow\mathcal{Y}$ which maps an object $\mathbf{x} \in \mathcal{X}$ to its classification label $y \in \mathcal{Y}$. For example, if the problem is filtering spam, then $\mathbf{x}_i$ is some representation of an email and $y_i$ is either "Spam" or "Non-Spam".
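As an illustration of this formal setup, a classifier $h$ is simply a function from object representations to labels. The sketch below uses a made-up keyword rule standing in for a learned model; the word list and bag-of-words representation are purely illustrative, not a real spam filter:

```python
# Sketch of the formal setup: a classifier h maps an object x in X
# to a label y in Y. Here x is a bag-of-words dict for an email and
# Y = {"Spam", "Non-Spam"}. The keyword set is a hypothetical stand-in
# for a model learned from training data.

SPAM_WORDS = {"winner", "free", "prize"}

def h(x: dict) -> str:
    """Toy classifier: label an email 'Spam' if it contains any flagged word."""
    return "Spam" if any(w in x for w in SPAM_WORDS) else "Non-Spam"

print(h({"free": 2, "meeting": 1}))    # -> Spam
print(h({"meeting": 1, "agenda": 1}))  # -> Non-Spam
```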


Pattern recognition techniques

Pattern recognition is typically an intermediate step in a longer process. These steps generally are: acquisition of the data (image, sound, text, etc.) to be classified; preprocessing to remove noise or normalize the data in some way (image processing, stemming text, etc.); computing features; classification; and finally post-processing based upon the recognized class and the confidence level.
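The steps above can be sketched as a chain of functions. Every stage implementation here is a hypothetical placeholder (a toy keyword rule stands in for the classifier); the point is the shape of the pipeline, not any particular method:

```python
# Minimal sketch of the pipeline: acquisition -> preprocessing ->
# feature computation -> classification -> post-processing.

def acquire() -> str:
    return "  FREE prize!!  "           # raw data, e.g. an email body

def preprocess(raw: str) -> str:
    return raw.strip().lower()          # normalization / noise removal

def features(text: str) -> dict:
    words = text.split()
    return {w: words.count(w) for w in words}   # bag-of-words features

def classify(feats: dict) -> tuple:
    score = feats.get("free", 0)        # toy rule standing in for a model
    label = "Spam" if score > 0 else "Non-Spam"
    confidence = 0.9 if score > 0 else 0.6
    return label, confidence

def postprocess(label: str, confidence: float) -> str:
    # act on the recognized class and its confidence level
    return label if confidence >= 0.5 else "Uncertain"

result = postprocess(*classify(features(preprocess(acquire()))))
print(result)  # -> Spam
```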

Pattern recognition itself is primarily concerned with the classification step. In some cases, such as in neural networks, feature selection and extraction may also be partially or fully automated.

While there are many methods for classification, they all solve one of three related mathematical problems.

The first is to find a map from a feature space (which is typically a multi-dimensional vector space) to a set of labels. This is equivalent to partitioning the feature space into regions, then assigning a label to each region. Such algorithms (e.g., the nearest neighbour algorithm) typically do not yield confidence levels or class probabilities, unless post-processing is applied.
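A minimal sketch of this first approach, using the nearest neighbour algorithm on toy 2-D data: each query point receives the label of its closest training point, which implicitly partitions the feature space into labeled regions (the Voronoi cells of the training set).

```python
# Nearest-neighbour classification: map a feature vector to the label
# of the closest training point. Toy data, purely illustrative.
import math

train = [((0.0, 0.0), "A"), ((1.0, 1.0), "B"), ((0.2, 0.1), "A")]

def nearest_neighbour(x):
    # find the training point minimizing Euclidean distance to x
    return min(train, key=lambda pair: math.dist(x, pair[0]))[1]

print(nearest_neighbour((0.1, 0.1)))  # -> A
print(nearest_neighbour((0.9, 0.8)))  # -> B
```

Note that the output is a bare label: no probability or confidence accompanies it, as the paragraph above observes.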

The second problem is to consider classification as an estimation problem, where the goal is to estimate a function of the form

$P({\rm class}|{\vec x}) = f\left(\vec x;\vec \theta\right)$

where the feature vector input is $\vec x$, and the function f is typically parameterized by some parameters $\vec \theta$. In the Bayesian approach to this problem, instead of choosing a single parameter vector $\vec \theta$, the result is obtained by integrating over all possible values of $\vec \theta$, each weighted by how likely it is given the training data D:

$P({\rm class}|{\vec x}) = \int f\left(\vec x;\vec \theta\right)P(\vec \theta|D)\, d\vec \theta$
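This integral can be illustrated with a grid approximation. The sketch below assumes a one-dimensional logistic model $f(x;\theta) = 1/(1+e^{-\theta x})$, a uniform prior over a grid of $\theta$ values, and a tiny made-up training set; the weighted sum over the grid stands in for the integral over $P(\vec\theta|D)$:

```python
# Grid approximation of the Bayesian average over theta for a toy
# 1-D model f(x; theta) = sigmoid(theta * x). Uniform prior on a grid;
# data and grid are purely illustrative.
import math

def f(x, theta):
    return 1.0 / (1.0 + math.exp(-theta * x))

# training data D: (x, y) pairs, y = 1 for the positive class
D = [(2.0, 1), (1.5, 1), (-1.0, 0), (-2.0, 0)]

thetas = [i / 10 for i in range(0, 51)]   # grid over theta in [0, 5]

def likelihood(theta):
    # P(D|theta); with a uniform prior, P(theta|D) is proportional to this
    p = 1.0
    for x, y in D:
        q = f(x, theta)
        p *= q if y == 1 else (1.0 - q)
    return p

weights = [likelihood(t) for t in thetas]
Z = sum(weights)   # normalizer, so the weights sum to 1

def p_class_given_x(x):
    # the integral, approximated by a weighted sum over the theta grid
    return sum(w * f(x, t) for w, t in zip(weights, thetas)) / Z

print(p_class_given_x(1.0))    # high, near the positive examples
print(p_class_given_x(-1.0))   # low, near the negative examples
```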

The third approach is related to the second: estimate the class-conditional probabilities $P(\vec x|{\rm class})$ and then use Bayes' rule to produce the class probability as in the second problem.
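A minimal sketch of this generative route, assuming one-dimensional Gaussian class-conditional densities with hand-picked (purely illustrative) parameters and priors: Bayes' rule combines $P(\vec x|{\rm class})$ with the class priors to yield $P({\rm class}|\vec x)$.

```python
# Generative classification: model P(x|class) per class (here 1-D
# Gaussians with assumed parameters), then apply Bayes' rule to get
# P(class|x). All numbers are hypothetical.
import math

def gaussian(x, mu, sigma):
    # Gaussian probability density N(x; mu, sigma^2)
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

models = {"Spam": (5.0, 1.0), "Non-Spam": (1.0, 1.0)}   # (mean, std) per class
priors = {"Spam": 0.3, "Non-Spam": 0.7}                 # P(class)

def posterior(x):
    # Bayes' rule: P(class|x) = P(x|class) P(class) / P(x)
    joint = {c: gaussian(x, *models[c]) * priors[c] for c in models}
    Z = sum(joint.values())          # evidence P(x)
    return {c: joint[c] / Z for c in joint}

post = posterior(4.0)
print(max(post, key=post.get))       # -> Spam
```

Unlike the nearest-neighbour sketch, this yields full class probabilities rather than a bare label.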

Examples of classification algorithms include:

• Nearest neighbour algorithm
• Naive Bayes classifier
• Decision trees
• Neural networks
• Support vector machines

References

• Richard O. Duda, Peter E. Hart, David G. Stork (2001) Pattern Classification (2nd edition), Wiley, New York. ISBN 0471056693
• Dietrich Paulus and Joachim Hornegger (1998) Applied Pattern Recognition (2nd edition), Vieweg. ISBN 3-528-15558-2
• J. Schuermann (1996) Pattern Classification: A Unified View of Statistical and Neural Approaches, Wiley & Sons. ISBN 0471135348
