Categorization is the process in which ideas and objects are recognized, differentiated and understood. Categorization implies that objects are grouped into categories, usually for some specific purpose. Ideally, a category illuminates a relationship between the subjects and objects of knowledge. Categorization is fundamental in language, prediction, inference, decision making and in all kinds of environmental interaction.
There are many categorization theories and techniques. In a broader historical view, however, three general approaches to categorization may be identified:
The classical view
Classical categorization comes to us first from Plato, who, in his Statesman dialogue, introduces the approach of grouping objects based on their similar properties. This approach was further explored and systematized by Aristotle in his Categories treatise, where he analyzes the differences between classes and objects. Aristotle also applied the classical categorization scheme intensively in his approach to the classification of living beings (which uses the technique of applying successive narrowing questions such as "Is it an animal or vegetable?", "How many feet does it have?", "Does it have fur or feathers?", "Can it fly?"), thereby establishing the basis for natural taxonomy.
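The technique of successive narrowing questions can be sketched as a fixed chain of tests. The following is a minimal illustration only: the property names (is_animal, feet, covering, can_fly) and the resulting labels are hypothetical and do not reproduce Aristotle's actual classes.

```python
def classify(being):
    """Walk a fixed sequence of narrowing questions (illustrative only)."""
    # "Is it an animal or vegetable?"
    if not being["is_animal"]:
        return "plant"
    # "How many feet does it have?"
    if being["feet"] == 0:
        return "footless animal"
    # "Does it have fur or feathers?" and then "Can it fly?"
    if being["covering"] == "feathers":
        return "flying bird" if being["can_fly"] else "flightless bird"
    if being["covering"] == "fur":
        return "furred quadruped" if being["feet"] == 4 else "furred animal"
    return "other footed animal"


ostrich = {"is_animal": True, "feet": 2, "covering": "feathers", "can_fly": False}
print(classify(ostrich))  # flightless bird
```

Each question splits the remaining candidates further, so every being ends at exactly one leaf of the question sequence.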
The classical Aristotelian view claims that categories are discrete entities characterized by a set of properties which are shared by their members. In analytic philosophy, these properties are assumed to establish the conditions which are both necessary and sufficient to capture meaning.
According to the classical view, categories should be clearly defined, mutually exclusive and collectively exhaustive. This way, any entity of the given classification universe belongs unequivocally to one, and only one, of the proposed categories.
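As a rough sketch of these requirements, each category can be modelled as a predicate stating its necessary and sufficient conditions, and a classification can be checked for being mutually exclusive and collectively exhaustive over a given universe. The toy universe, the "covering" property and the three category definitions are assumptions made for illustration.

```python
from typing import Callable, Dict, List

Entity = Dict[str, str]

# Hypothetical definitions: one necessary-and-sufficient test per category.
CATEGORIES: Dict[str, Callable[[Entity], bool]] = {
    "bird": lambda e: e["covering"] == "feathers",
    "mammal": lambda e: e["covering"] == "fur",
    "fish": lambda e: e["covering"] == "scales",
}


def categorize(entity: Entity) -> List[str]:
    """Return every category whose defining condition the entity satisfies."""
    return [name for name, test in CATEGORIES.items() if test(entity)]


def is_classical_partition(universe: List[Entity]) -> bool:
    """True if each entity belongs to exactly one category."""
    return all(len(categorize(e)) == 1 for e in universe)


universe = [
    {"name": "sparrow", "covering": "feathers"},
    {"name": "otter", "covering": "fur"},
    {"name": "trout", "covering": "scales"},
]
print(is_classical_partition(universe))  # True

# An entity that meets no definition breaks collective exhaustiveness.
frog = {"name": "frog", "covering": "bare skin"}
print(is_classical_partition(universe + [frog]))  # False
```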
Conceptual clustering
Conceptual clustering is a modern variation of the classical approach, and derives from attempts to explain how knowledge is represented. In this approach, classes (clusters or entities) are generated by first formulating their conceptual descriptions and then classifying the entities according to the descriptions.
Conceptual clustering developed mainly during the 1980s, as a machine learning paradigm for unsupervised learning. It is distinguished from ordinary data clustering by generating a concept description for each generated category.
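What it means to attach a concept description to a category can be illustrated with a small sketch. It covers only the description step and is not a full conceptual clustering algorithm: the groups are assumed to be given and non-empty, and the description is simply the set of attribute values shared by all members. The attribute names and data are hypothetical.

```python
def describe(members):
    """Return the attribute-value pairs shared by every member of a group.

    Assumes the group is non-empty and each member is a dict of attributes.
    """
    first, *rest = members
    return {k: v for k, v in first.items() if all(m.get(k) == v for m in rest)}


# Hypothetical groups, assumed to have been formed already.
clusters = {
    "A": [
        {"legs": 2, "covering": "feathers", "flies": True},
        {"legs": 2, "covering": "feathers", "flies": False},
    ],
    "B": [
        {"legs": 4, "covering": "fur", "flies": False},
        {"legs": 4, "covering": "fur", "flies": False},
    ],
}

descriptions = {name: describe(members) for name, members in clusters.items()}
print(descriptions["A"])  # {'legs': 2, 'covering': 'feathers'}
```

The resulting description reads as a conjunction ("two legs and feathers") that can then be used to classify further entities.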
Categorization tasks in which category labels are provided to the learner for certain objects are referred to as supervised classification, supervised learning, or concept learning. Categorization tasks in which no labels are supplied are referred to as unsupervised classification, unsupervised learning, or data clustering. The task of supervised classification involves extracting information from the labeled examples that allows accurate prediction of class labels of future examples. This may involve the abstraction of a rule or concept relating observed object features to category labels, or it may not involve abstraction (e.g., exemplar models). The task of clustering involves recognizing inherent structure in a data set and grouping objects together by similarity into classes. It is thus a process of generating a classification structure.
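The contrast between the two tasks can be sketched in a few lines. The supervised part is an exemplar model in its simplest form (the label of the nearest stored example, with no abstracted rule). The unsupervised part is a greedy threshold grouping, chosen here for brevity rather than as a standard clustering algorithm; its result can depend on the order of the points. The data, labels and threshold are assumptions.

```python
import math


def exemplar_predict(labeled, x):
    """Supervised: return the label of the closest stored exemplar."""
    _, label = min(labeled, key=lambda pair: math.dist(pair[0], x))
    return label


def group_by_similarity(points, threshold):
    """Unsupervised: a point joins the first group that has a member within
    `threshold` of it, otherwise it starts a new group."""
    groups = []
    for p in points:
        for g in groups:
            if any(math.dist(p, q) <= threshold for q in g):
                g.append(p)
                break
        else:
            groups.append([p])
    return groups


# Hypothetical two-feature observations.
labeled = [
    ((1.0, 1.0), "small"),
    ((1.2, 0.9), "small"),
    ((5.0, 5.1), "large"),
    ((4.8, 5.3), "large"),
]
print(exemplar_predict(labeled, (4.5, 5.0)))  # large

unlabeled = [features for features, _ in labeled]
groups = group_by_similarity(unlabeled, threshold=1.0)
print(len(groups))  # 2
```

In the first call the labels guide prediction for a new object; in the second, the same observations are grouped without any labels, so the classification structure itself is what gets generated.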
Conceptual clustering is closely related to fuzzy set theory, in which objects may belong to one or more groups, in varying degrees of fitness.
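Membership in varying degrees can be illustrated with a triangular membership function, one common choice in fuzzy set theory. The three temperature sets and their breakpoints below are hypothetical values picked for the example.

```python
def triangular(x, left, peak, right):
    """Degree of membership in [0, 1]: 0 outside (left, right), 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets over temperature in degrees Celsius.
FUZZY_SETS = {
    "cold": (-10, 0, 15),
    "warm": (5, 20, 30),
    "hot": (25, 35, 50),
}


def memberships(x):
    """Return the degree to which x belongs to each fuzzy set."""
    return {name: triangular(x, *params) for name, params in FUZZY_SETS.items()}


m = memberships(10)
# At 10 degrees the value is partly "cold" and partly "warm", and not "hot".
print(m)
```

Unlike the classical view, a single value here belongs to two groups at once, each to a degree between 0 and 1.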
Prototype theory
Since the research by Eleanor Rosch and George Lakoff in the 1970s, categorization can also be viewed as the process of grouping things based on prototypes. On this view, the idea of necessary and sufficient conditions is almost never met in categories of naturally occurring things. It has also been suggested that categorization based on prototypes is the basis for human development, and that this learning relies on learning about the world via embodiment.
A cognitive approach accepts that natural categories are graded (they tend to be fuzzy at their boundaries) and inconsistent in the status of their constituent members.
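Graded membership can be sketched computationally by scoring items against a prototype. The sketch below makes two modelling assumptions that are not claims from the research cited above: the prototype is taken to be the average feature vector of known members, and typicality is defined as 1 / (1 + distance). The features (can fly, sings, relative body size) and their values are hypothetical.

```python
import math


def prototype(examples):
    """Average feature vector of known members, used as the prototype."""
    n = len(examples)
    return tuple(sum(column) / n for column in zip(*examples))


def typicality(item, proto):
    """Graded membership in (0, 1]: 1.0 at the prototype, lower further away."""
    return 1.0 / (1.0 + math.dist(item, proto))


# Hypothetical features: (can fly, sings, relative body size).
birds = {
    "robin": (1.0, 1.0, 0.2),
    "sparrow": (1.0, 1.0, 0.1),
    "penguin": (0.0, 0.0, 0.8),
}

bird_prototype = prototype(list(birds.values()))
scores = {name: typicality(features, bird_prototype) for name, features in birds.items()}

# The robin scores as a more typical bird than the penguin.
print(scores["robin"] > scores["penguin"])  # True
```

All three items count as birds, but to different degrees, which is the sense in which members of a natural category differ in status.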
Systems of categories are not objectively "out there" in the world but are rooted in people's experience. Conceptual categories are not identical for different cultures, or indeed, for every individual in the same culture.
Categories form part of a hierarchical structure when applied to such subjects as taxonomy in biological classification: the higher level is the life-form level, the middle level is the generic or genus level, and the lower level is the species level. These levels can be distinguished by certain traits that put an item in its distinctive category. But even these can be arbitrary and are subject to revision.
Categories at the middle level are perceptually and conceptually the most salient. The generic level of a category tends to elicit the most responses and richest images and seems to be the psychologically basic level. Typical taxonomies in zoology, for example, exhibit categorization at the embodied level, with similarities leading to formulation of "higher" categories, and differences leading to differentiation within categories.
Miscategorization
Miscategorization can be a logical fallacy in which diverse and dissimilar objects, concepts or entities are grouped together based upon illogical common denominators, or upon common denominators that virtually any concept, object or entity has in common. A common way miscategorization occurs is through an over-categorization of concepts, objects or entities, followed by miscategorization based upon overly similar variables that virtually all things have in common.
See also