A
bit is the basic unit of
information in
computing and
telecommunications; it is the maximum amount of information that can be stored by a device or other physical system that can normally exist in only two distinct
states. These may be the two stable positions of an
electrical switch, two distinct
voltage or
current levels allowed by a
circuit, two distinct levels of
light intensity, two directions of
magnetization or
polarization, etc.
In
computing, a bit can also be defined as a
variable or computed quantity that can have only two possible
values. These two values are often interpreted as
binary digits and are usually denoted by the Arabic
numerical digits
0 and
1. Indeed, the term "bit" is a contraction of
binary digit. The two values can also be interpreted as logical values (
true/
false,
yes/
no), algebraic
signs (
+/
−), activation states (
on/
off), or any other two-valued attribute. In several popular programming languages, numeric 0 is equivalent (or convertible) to logical
false, and 1 to
true. The correspondence between these values and the physical states of the underlying
storage or
device is a matter of convention, and different assignments may be used even within the same device or
program.
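As a small illustration of the 0/false and 1/true correspondence, and of the fact that the mapping from physical state to bit value is only a convention, consider the following Python sketch. Python is used here merely as one example language; the function `read_bit` and its `active_high` parameter are hypothetical names introduced for illustration.

```python
# In Python, bool is a subclass of int, so False and True behave as 0 and 1.
assert False == 0 and True == 1
assert int(True) == 1 and int(False) == 0
assert bool(0) is False and bool(1) is True


def read_bit(level: str, active_high: bool = True) -> int:
    """Return the bit value for a signal level ('high' or 'low').

    Hypothetical helper: the same physical level yields a different
    bit value depending on the convention chosen by the designer.
    """
    is_high = (level == "high")
    return int(is_high == active_high)


print(read_bit("high"))                     # 1 under the active-high convention
print(read_bit("high", active_high=False))  # 0 under the active-low convention
```

The same "high" level is read as 1 or as 0 depending solely on the chosen convention.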
In
information theory, one bit is typically defined as the uncertainty of a binary random variable that is 0 or 1 with equal probability, or the information that is gained when the value of such a variable becomes known.
In
quantum computing, a
quantum bit or
qubit is a
quantum system that can exist in
a superposition of the two bit values, "true" and "false".
The symbol for bit, as a unit of information, is "bit" or (lowercase) "b", the latter being recommended by the
IEEE 1541 Standard (2002).
History
The encoding of data by discrete bits was used in the
punched cards invented by
Basile Bouchon and
Jean-Baptiste Falcon (1725), developed by
Joseph Marie Jacquard (1804), and later adopted by
Semen Korsakov,
Charles Babbage,
Hermann Hollerith, and early computer manufacturers like
IBM. Another variant of that idea was the perforated
paper tape. In all those systems, the medium (card or tape) conceptually carried an array of hole positions; each position could be either punched through or not, thus potentially carrying one bit of information. The encoding of text by bits was also used in
Morse code (1844) and early digital communications machines such as
teletypes and
stock ticker machines (1870).
Ralph Hartley suggested the use of a logarithmic measure of information in 1928.
[Norman Abramson (1963), Information theory and coding. McGraw-Hill.] Claude E. Shannon first used the word
bit in his seminal 1948 paper
A Mathematical Theory of Communication. He attributed its origin to
John W. Tukey, who had written a Bell Labs memo on 9 January 1947 in which he contracted "binary digit" to simply "bit". Interestingly,
Vannevar Bush had written in 1936 of "bits of information" that could be stored on the
punch cards used in the mechanical computers of that time. The first programmable computer built by
Konrad Zuse used binary notation for numbers.
Representation
Transmission and processing
Bits can be implemented in many forms. In most modern computing devices, a bit is usually represented by an
electrical
voltage or
current pulse, or by the electrical state of a
flip-flop circuit. For devices using
positive logic, a digit value of 1 (true value or high) is represented by a positive voltage relative to the
electrical ground voltage (up to 5
volts in the case of
TTL designs), while a digit value of 0 (false value or low) is represented by 0 volts.
Storage
In the earliest non-electronic information processing devices, such as Jacquard's loom or Babbage's
Analytical Engine, a bit was often stored as the position of a mechanical lever or gear, or the presence or absence of a hole at a specific point of a paper card or tape. The first electrical devices for discrete logic (such as
elevator and
traffic light control circuits,
telephone switches, and Konrad Zuse's computer) represented bits as the states of
electrical relays which could be either "open" or "closed". When relays were replaced by
vacuum tubes, starting in the 1940s, computer builders experimented with a variety of storage methods, such as pressure pulses traveling down a
mercury delay line, charges stored on the inside surface of a
cathode-ray tube, or opaque spots printed on
glass discs by
photolithographic techniques.
In the 1950s and 1960s these methods were largely supplanted by
magnetic storage devices such as
magnetic core memory,
magnetic tapes,
drums, and
disks, where a bit was represented by the polarity of
magnetization of a certain area of a
ferromagnetic film. The same principle was later used in the
magnetic bubble memory developed in the 1980s, and is still found in various
magnetic strip items such as
metro tickets and some
credit cards.
In modern
semiconductor memory, such as
dynamic random-access memory or
flash memory, the two values of a bit may be represented by two levels of
electrical charge stored in a
capacitor. In
programmable logic arrays and certain types of
read-only memory, a bit may be represented by the presence or absence of a conducting path at a certain point of a circuit. In
optical discs, a bit is encoded as the presence or absence of a microscopic pit on a reflective surface. In
bar codes, bits are encoded as the thickness of printed black lines or the spacing between them.
Information capacity and information content
Information
capacity of a storage system is only an upper bound to the actual
quantity of information stored therein. If the two possible values of one bit of storage are not equally likely, that bit of storage will contain less than one bit of information. Indeed, if the value is completely predictable, then the reading of that value will provide no information at all (zero bits). If a computer file that uses
n bits of storage contains only
m <
n bits of information, then that information can in principle be encoded in about
m bits, at least on the average. This principle is the basis of
data compression technology. Sometimes the name
bit is used when discussing data storage while
shannon is used for the statistical bit.
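The gap between storage capacity and information content can be made concrete with the Shannon entropy of a single binary value. The sketch below is illustrative only: the function name `binary_entropy` and the example probability 0.1 are assumptions chosen for the demonstration, and the sequence is assumed to consist of independent bits.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a binary value that is 1 with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # a completely predictable value carries no information
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


print(binary_entropy(0.5))  # 1.0: equally likely values give one full bit
print(binary_entropy(1.0))  # 0.0: a predictable value gives zero bits

# n bits of storage whose values are independently 1 with probability 0.1
# hold only about m = n * H(0.1) bits of information.
n = 1000
m = n * binary_entropy(0.1)
print(round(m))  # 469
```

Under these assumptions, 1000 stored bits carry roughly 469 bits of information, which is the average size an ideal compressor could approach.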
Multiple bits
There are several
units of information which are defined as multiples of bits, such as
byte (8 bits),
kilobit (either 1000 or 2¹⁰ = 1024 bits),
megabyte (either 8,000,000 or 8×2²⁰ = 8,388,608 bits), etc.
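The decimal and binary interpretations of these multiples can be checked with a few lines of arithmetic. In this sketch the constant names are illustrative, and the decimal megabyte is assumed to mean 10⁶ bytes of 8 bits each.

```python
BITS_PER_BYTE = 8

# Kilobit: decimal (SI-style) versus binary interpretation.
kilobit_decimal = 10 ** 3
kilobit_binary = 2 ** 10

# Megabyte expressed in bits under both interpretations.
megabyte_decimal_bits = BITS_PER_BYTE * 10 ** 6
megabyte_binary_bits = BITS_PER_BYTE * 2 ** 20

print(kilobit_decimal, kilobit_binary)              # 1000 1024
print(megabyte_decimal_bits, megabyte_binary_bits)  # 8000000 8388608
```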
Computers usually manipulate bits in groups of a fixed size, conventionally named "
words". The number of bits in a word varies with the computer model, typically between 8 and 80 bits, or even more in some specialized machines.
The
International Electrotechnical Commission's standard
IEC 60027 specifies that the symbol for bit should be "bit", and this should be used in all multiples, such as "kbit" (for kilobit).
[National Institute of Standards and Technology (2008), Guide for the Use of the International System of Units.] However, the letter "b" (in lower case) is widely used too. The letter "B" (upper case) is both the standard and customary symbol for byte.
In telecommunications (including
computer networks), data transfer rates are usually measured in
bits per second (bit/s) or its multiples, such as kbit/s. (This unit is not to be confused with
baud.)
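Because transfer rates are quoted in bits per second while file sizes are usually quoted in bytes, a factor of 8 appears in any conversion. The following sketch assumes 8-bit bytes and ignores protocol overhead; the function name `transfer_time_seconds` is hypothetical.

```python
def transfer_time_seconds(size_bytes: int, rate_bit_per_s: float) -> float:
    """Idealized time to send size_bytes over a link of rate_bit_per_s.

    Assumes 8-bit bytes and no framing or protocol overhead.
    """
    return size_bytes * 8 / rate_bit_per_s


# A 1,000,000-byte file over an 8 Mbit/s (8,000,000 bit/s) link.
print(transfer_time_seconds(1_000_000, 8_000_000))  # 1.0
```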
Bit-based computing
Certain
bitwise computer
processor instructions (such as
bit set) operate at the level of manipulating bits rather than manipulating data interpreted as an aggregate of bits.
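The effect of such bit-level instructions can be imitated in a high-level language with shift and mask operators. The sketch below uses Python integers; the helper names are illustrative and are not the mnemonics of any particular processor. Bit positions are counted from 0 at the least significant bit.

```python
def set_bit(value: int, n: int) -> int:
    """Return value with bit n forced to 1."""
    return value | (1 << n)


def clear_bit(value: int, n: int) -> int:
    """Return value with bit n forced to 0."""
    return value & ~(1 << n)


def is_bit_set(value: int, n: int) -> bool:
    """Report whether bit n of value is 1."""
    return (value >> n) & 1 == 1


x = 0b0101
print(bin(set_bit(x, 1)))    # 0b111
print(bin(clear_bit(x, 0)))  # 0b100
print(is_bit_set(x, 2))      # True
```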
In the 1980s, when
bitmapped computer displays became popular, some computers provided specialized
bit block transfer instructions to set or copy the bits that corresponded to a given rectangular area on the screen.
In most computers and programming languages, when a bit within a group of bits such as a byte or word is to be referred to, it is usually specified by a number from 0 (not 1) upwards corresponding to its position within the byte or word. However, 0 can refer to either the
most significant bit or to the
least significant bit depending on the context, so the convention of use must be known.
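The two numbering conventions can be contrasted with a short sketch. The function names `bit_lsb0` and `bit_msb0` are illustrative, and the default width of 8 bits is an assumption made for the example.

```python
def bit_lsb0(value: int, index: int) -> int:
    """Bit at position index, where 0 denotes the least significant bit."""
    return (value >> index) & 1


def bit_msb0(value: int, index: int, width: int = 8) -> int:
    """Bit at position index, where 0 denotes the most significant bit
    of a width-bit group."""
    return (value >> (width - 1 - index)) & 1


byte = 0b1000_0000
print(bit_lsb0(byte, 0))  # 0: bit 0 is the rightmost bit
print(bit_msb0(byte, 0))  # 1: bit 0 is the leftmost bit
print(bit_lsb0(byte, 7))  # 1: the same physical bit under LSB-0 numbering
```

The same index selects different bits under the two conventions, which is why the convention in use must be stated.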
Other information units
Other units of information, sometimes used in information theory, include the
natural digit also called a
nat or
nit and defined as
log₂ e (≈ 1.443) bits, where
e is the
base of the natural logarithms; and the
decit,
ban, or
Hartley, defined as log₂ 10 (≈ 3.322) bits.
Conversely, one bit of information corresponds to about
ln 2 (≈ 0.693) nats, or log₁₀ 2 (≈ 0.301) Hartleys. Some authors also define a
binit as an arbitrary information unit equivalent to some fixed but unspecified number of bits.
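The conversion factors quoted above follow directly from changing the base of the logarithm, and can be verified numerically. The constant names in this sketch are illustrative.

```python
import math

# Bits per unit: change of base from e or 10 to base 2.
BITS_PER_NAT = math.log2(math.e)   # about 1.443
BITS_PER_HARTLEY = math.log2(10)   # about 3.322

# Units per bit: the reciprocals of the factors above.
NATS_PER_BIT = math.log(2)         # about 0.693
HARTLEYS_PER_BIT = math.log10(2)   # about 0.301

print(round(BITS_PER_NAT, 3), round(BITS_PER_HARTLEY, 3))  # 1.443 3.322
print(round(NATS_PER_BIT, 3), round(HARTLEYS_PER_BIT, 3))  # 0.693 0.301
```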
See also