PetaBox
PetaBox is a storage unit from Capricorn Technologies. It was designed by the staff of the Internet Archive and C. R. Saikley to store and process one petabyte (a million gigabytes) of information.

Goals

  • Low power: 6 kW per rack, 60 kW for the entire storage cluster
  • High density: 100+ TB/rack
  • Local computing to process the data (800 low-end PCs)
  • Multi-OS possible, Linux standard
  • Colocation friendly
  • Shipping container friendly: able to be run in a 20' by 8' by 8' shipping container
  • Easy maintenance: one system administrator per petabyte
  • Software to automate full mirroring
  • Easy to scale
  • Inexpensive design
  • Inexpensive storage
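
The power and density goals can be cross-checked with a little arithmetic. The sketch below uses only the figures quoted in the list. The rack count is not stated in the source: it is inferred here by assuming the whole 60 kW cluster budget is spent on racks drawing 6 kW each. The function name and dictionary keys are illustrative.

```python
# Figures quoted in the goals list.
POWER_PER_RACK_KW = 6          # low-power goal, per rack
CLUSTER_POWER_KW = 60          # low-power goal, whole cluster
MIN_DENSITY_TB_PER_RACK = 100  # "100+ TB/rack", taken at its lower bound
LOW_END_PCS = 800              # local computing nodes
TB_PER_PB = 1000               # decimal units: 1 PB = 1000 TB


def cluster_estimate():
    """Derive rough cluster-level numbers from the stated goals.

    Assumption (not in the source): the cluster power budget is
    consumed entirely by racks at the per-rack figure.
    """
    racks = CLUSTER_POWER_KW // POWER_PER_RACK_KW
    capacity_tb = racks * MIN_DENSITY_TB_PER_RACK
    return {
        "racks": racks,
        "capacity_pb": capacity_tb / TB_PER_PB,
        "pcs_per_rack": LOW_END_PCS / racks,
        "watts_per_tb": CLUSTER_POWER_KW * 1000 / capacity_tb,
    }


if __name__ == "__main__":
    for name, value in cluster_estimate().items():
        print(f"{name}: {value}")
```

Under that assumption the goals are mutually consistent: ten racks at the minimum density give one petabyte, which matches the design target, with roughly 80 PCs per rack. These are estimates from the stated goals, not measured figures.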

History

The first 100-terabyte rack became operational at the European Archive in June 2004. The second, an 80-terabyte rack, became operational in San Francisco that same year. The Internet Archive then spun off its PetaBox production to the newly formed company Capricorn Technologies.

Between 2004 and 2007, Capricorn replicated the Internet Archive's deployment of the PetaBox for major academic institutions, digital preservationists, government agencies, high-performance computing (HPC) and major research sites, medical imaging providers, digital image repositories, storage outsourcing sites, and other enterprises. Its largest product uses 750-gigabyte disks. In 2007 the Internet Archive data center housed approximately three petabytes of PetaBox storage technology.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.