The arXiv (pronounced "archive", as if the "X" were the Greek letter chi, χ) is an archive for electronic preprints of scientific papers in the fields of mathematics, physics, computer science, quantitative biology and statistics, which can be accessed via the World Wide Web. In many fields of mathematics and physics, almost all scientific papers are placed on the arXiv. arXiv.org has passed the half-million article milestone, with roughly five thousand new e-prints added every month.
History
The arXiv was originally developed by Paul Ginsparg and started in 1991 as a repository for preprints in physics; it later expanded to include astronomy, mathematics, computer science, nonlinear science, quantitative biology and, most recently, statistics. It soon became obvious that there was a demand for long-term preservation of preprints. The term e-print was adopted to describe the articles. Ginsparg was awarded a MacArthur Fellowship in 2002 for his establishment of arXiv.
It was originally hosted at the Los Alamos National Laboratory (at xxx.lanl.gov, hence its former name, the LANL preprint archive) and is now hosted and operated by Cornell University, with mirrors around the world. It changed its name and address to arXiv.org in 1999 for greater flexibility.
Its existence was one of the precipitating factors that led to the current revolution in scientific publishing, known as the open access movement, with the possibility of the eventual replacement of traditional scientific journals. Professional mathematicians and scientists regularly upload their papers to arXiv.org for worldwide access and sometimes for review before they are published in peer-reviewed journals.
The operation of arXiv is currently funded by Cornell University and by the National Science Foundation.
Peer review
Although the arXiv is not peer-reviewed, moderators for each area review the submissions and may recategorize any that are deemed off-topic. The lists of moderators for many sections of the arXiv are publicly available, but moderators for most of the physics sections remain unlisted.
Additionally, an "endorsement" system was introduced in January 2004 as part of an effort to ensure that content is relevant and of interest to current research in the specified disciplines. The new system has attracted its own share of criticism for allegedly restricting inquiry. Under the system, an author must first be endorsed. Endorsement either comes from another arXiv author who is an endorser or is automatic, depending on various evolving criteria, which are not publicly spelled out. Endorsers are not asked to review the paper for errors, but to check whether the paper is appropriate for the intended subject area. New authors from recognized academic institutions generally receive automatic endorsement, which in practice means that they do not need to deal with the endorsement system at all.
The lack of peer review, while a concern to some, is not considered a hindrance by those who use the arXiv. Many authors exercise care in what they post. A majority of the e-prints are also submitted to journals for publication, but some work, including some very influential papers, remains purely as e-prints and is never published in a peer-reviewed journal. A well-known example of the latter is an outline of a proof of Thurston's geometrization conjecture, including the Poincaré conjecture as a particular case, uploaded by Grigori Perelman in November 2002. Perelman appears content to forgo the traditional peer-reviewed journal process, stating "If anybody is interested in my way of solving the problem, it's all there [on the arXiv] - let them go and read about it."
While the arXiv does contain some dubious e-prints, such as those claiming to refute famous theorems or to prove famous conjectures such as Fermat's last theorem using only high school mathematics, they are "surprisingly rare". The arXiv generally re-classifies these works, e.g. in "General mathematics", rather than deleting them.
Nineteen scientists, including Nobel laureate Brian Josephson, have testified that some of their papers are not accepted and others are forcibly recategorized by the administrators of the arXiv, either because of the controversial nature of their work or because it is not canonical to string theory, in what they describe as intellectual censorship.
Submission process and file size limitations
Papers can be submitted in several formats, including LaTeX, PDF printed from a word processor other than TeX or LaTeX, and DOCX from MS Office. For LaTeX, all files needed to generate the article automatically must be submitted, in particular the LaTeX source and the files for all pictures. The submission is rejected by the arXiv software if generating the final PDF file fails, if any image file is too large, or if the total size of the submission (after compression) is too large. The size limits are fairly small and often force authors to convert images to achieve a smaller file size, e.g. by converting Encapsulated PostScript files to bitmaps and by reducing the resolution or image quality of JPEG files. This requires a fairly high level of computer literacy. Authors can also contact arXiv if they feel a large file size is justified for a submission with many images.
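The two size checks described above can be sketched as a pre-submission check that an author might run locally. This is a minimal illustration only, not arXiv's actual software. The function name `check_submission`, the byte limits, and the list of image suffixes are all hypothetical, since the article does not state the real thresholds. Compressing each file separately with gzip is likewise just an approximation of "total size after compression".

```python
import gzip

# Hypothetical limits: the article does not give arXiv's actual thresholds.
MAX_IMAGE_BYTES = 1_000_000
MAX_TOTAL_COMPRESSED_BYTES = 3_000_000

# Hypothetical set of suffixes treated as image files.
IMAGE_SUFFIXES = (".eps", ".ps", ".png", ".jpg", ".jpeg")


def check_submission(files,
                     max_image_bytes=MAX_IMAGE_BYTES,
                     max_total_compressed_bytes=MAX_TOTAL_COMPRESSED_BYTES):
    """Return a list of problems for a submission.

    `files` maps each file name to its raw contents (bytes).
    An empty list means both size checks passed.
    """
    problems = []
    total_compressed = 0
    for name, data in files.items():
        # Check 1: no single image file may exceed the per-image limit.
        if name.lower().endswith(IMAGE_SUFFIXES) and len(data) > max_image_bytes:
            problems.append(
                f"{name}: image is {len(data)} bytes (limit {max_image_bytes})"
            )
        # Approximate the compressed size by gzip-compressing each file.
        total_compressed += len(gzip.compress(data))
    # Check 2: the whole submission, after compression, must fit the total limit.
    if total_compressed > max_total_compressed_bytes:
        problems.append(
            f"total compressed size is {total_compressed} bytes "
            f"(limit {max_total_compressed_bytes})"
        )
    return problems
```

An author could call `check_submission` with the contents of the LaTeX source and figure files, and convert or downsample any image that is reported before uploading.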
Access
The standard access route is through the arXiv.org website or one of several mirrors. Several other interfaces and access routes have also been created by unaffiliated organisations. These include the University of California, Davis's front, a web portal that offers additional search functions and a more self-explanatory interface for arXiv.org, and is referred to by some mathematicians as (the) Front. A similar function is offered by eprintweb.org, launched in September 2006 by the Institute of Physics. Google Scholar and Windows Live Academic can also be used to search for items in arXiv. Finally, researchers can select sub-fields and receive daily e-mailings or RSS feeds of all submissions in them.
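As a rough illustration of how a reader might process such an RSS feed, the sketch below extracts titles and links from a feed held in a string. It assumes a plain RSS 2.0 layout. The article does not specify the feed's actual structure or address, so the sample feed, its URLs, and the function name `list_new_submissions` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed for illustration; this is not real arXiv output.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example sub-field feed</title>
    <item>
      <title>First example e-print</title>
      <link>http://example.org/abs/0001</link>
    </item>
    <item>
      <title>Second example e-print</title>
      <link>http://example.org/abs/0002</link>
    </item>
  </channel>
</rss>"""


def list_new_submissions(feed_xml):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    entries = []
    for item in root.iter("item"):
        # findtext returns the default when the child element is missing.
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        entries.append((title, link))
    return entries


if __name__ == "__main__":
    for title, link in list_new_submissions(SAMPLE_FEED):
        print(f"{title}: {link}")
```

In practice the feed text would be downloaded from whichever address the chosen sub-field's feed is published at; a feed using a different RSS dialect would need the element lookups adjusted.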
See also