
Supplemental Reading for Data Storage

 

Data Storage Measurements

In this reading, you will learn about the different names for measurements of data storage capacities and file sizes. Data storage capacity increases in step with the evolution of computer hardware technology. Larger storage capacities allow for dynamic growth in file sizes. These advances make it possible for companies like Netflix and Hulu to store thousands of feature-length films in high video quality formats. 

There are standardized sets of terms used to name the ever-expanding sizes of data storage and files. For example, the common terms used to describe file sizes and hard drive storage capacity include bytes, kilobytes, megabytes, gigabytes, and terabytes. However, if you are a computer engineer, you might use a different set of terms.

Data storage measurement nomenclature

 

Table illustrating decimal values for data storage measurements

 

  • Decimal nomenclature: kilobyte, megabyte, gigabyte, terabyte, petabyte, exabyte, zettabyte, yottabyte

The decimal naming system for computer storage uses the metric system of prefixes from the International System of Units: kilo, mega, giga, tera, peta, exa, zetta, and yotta. These prefixes may also be referred to as the decimal system of prefixes. The metric/decimal nomenclature represents a base-10 approximation of the actual number of bytes of data storage. The metric system prefixes were selected to simplify the marketing of computer products.

 

Table illustrating binary values for data storage measurements

 

  • Binary nomenclature: kibibyte, mebibyte, gibibyte, tebibyte, pebibyte, exbibyte, zebibyte, yobibyte

The binary naming system is a standard set by the International Organization for Standardization (ISO) in partnership with the International Electrotechnical Commission (IEC). The ISO 80000 and IEC 80000 guides to units of measurement define the International System of Quantities (ISQ). The prefixes kibi-, mebi-, gibi-, tebi-, pebi-, exbi-, zebi-, and yobi- were created by the IEC. Each is a blend of the first two letters of the metric prefix and the first two letters of the word “binary” (example: mega + binary + byte = mebibyte).

Binary measurements of computer data are more accurate than decimal system measurements. While decimal nomenclature is commonly used to market computers and computer parts to the general public, binary nomenclature is often used in computer engineering for numerical accuracy. 
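
To make the difference between the two naming systems concrete, the short Python sketch below expresses the same byte count in decimal and in binary units. The function name describe_bytes, the unit tables, and the 500 GB sample value are illustrative assumptions, not part of any standard or library.

```python
# Unit tables: decimal units step by 1000, binary units step by 1024.
DECIMAL_UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]
BINARY_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]


def describe_bytes(count, base, units):
    """Scale a byte count by repeated division by `base` (1000 or 1024)."""
    value = float(count)
    index = 0
    # Move up one unit at a time until the value fits below the base.
    while value >= base and index < len(units) - 1:
        value /= base
        index += 1
    return f"{value:.2f} {units[index]}"


# Hypothetical example: a drive marketed as 500 GB (decimal).
drive_bytes = 500 * 10**9
print(describe_bytes(drive_bytes, 1000, DECIMAL_UNITS))  # 500.00 GB
print(describe_bytes(drive_bytes, 1024, BINARY_UNITS))   # 465.66 GiB
```

The byte count is identical in both calls; only the divisor and the unit name change, which is why the same drive appears "smaller" when reported in binary units.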

Quantities of storage measurements

As data storage grows, the need for new terminology to describe the exponentially larger byte quantities grows too. The current byte nomenclature, mathematical representations, and storage capacities are as follows:

  • One bit: Also called a binary digit, a bit stores the presence of an electric signal as 1. The absence of an electric signal is stored as 0, which is also the default value of a bit. One bit can store only one value, either 1 or 0. These two possible values are the basis of the binary number system (base-2) that computers use. Place values in a base-2 system increase as powers of 2.

  • One byte: One byte stores eight bits of ones and zeros that translate to a symbol or basic computer instruction. Examples: 01101101 is the byte that translates to the letter “m.” The byte 01111111 tells the computer to delete the character to the right of the cursor.

  • One kilobyte (1 KB): 

    • Kilobyte (KB) decimal format: 10^3 = 1,000 bytes

    • Kibibyte (KiB) binary format: 2^10 = 1,024 bytes

    • Decimal inaccuracy: Off by -2.4% or -24 bytes

    • Name origin: “Kilo-” is a French derivation from the Ancient Greek word for “thousand.” A kilobyte is one thousand bytes.

    • 1 KB can hold: A short text file or a small icon as a 16x16 pixel .gif file.

  • One megabyte (1 MB): 

    • Megabyte (MB) decimal format: 10^6 = 1,000,000 bytes

    • Mebibyte (MiB) binary format: 2^20 = 1,048,576 bytes

    • Decimal inaccuracy: Off by -4.9% or -48,576 bytes

    • Name origin: “Mega-” is derived from the Ancient Greek word for “large.” A megabyte is a large number of bytes.

    • 1 MB can hold: Approximately one minute of music in .mp3 format or a short novel.

  • One gigabyte (1 GB): 

    • Gigabyte (GB) decimal format: 10^9 = 1,000,000,000 bytes

    • Gibibyte (GiB) binary format: 2^30 = 1,073,741,824 bytes

    • Decimal inaccuracy: Off by -7.4% or -73,741,824 bytes

    • Name origin: “Giga-” is derived from the Ancient Greek word for “giant.” A gigabyte is a giant number of bytes.

    • 1 GB can hold: Between 2.5-3 hours of music in .mp3 format or 300 high-resolution images.

  • One terabyte (1 TB): 

    • Terabyte (TB) decimal format: 10^12 = 1,000,000,000,000 bytes

    • Tebibyte (TiB) binary format: 2^40 = 1,099,511,627,776 bytes

    • Decimal inaccuracy: Off by -10.0%

    • Name origin: “Tera-” is a shortened form of “tetra-”, which was derived from the Ancient Greek word for the number four. The 10^12 decimal format can also be written as 1000^4 (one-thousand to the 4th power). “Tera-” in Ancient Greek means “monster.” You might think of the word “terabyte” as a monstrously large number of bytes.

    • 1 TB can hold: Approximately 200,000 songs in .mp3 format or 300 hours of video.

  • One petabyte (1 PB):

    • Petabyte (PB) decimal format: 10^15 = 1,000,000,000,000,000 bytes

    • Pebibyte (PiB) binary format: 2^50 = 1,125,899,906,842,624 bytes

    • Decimal inaccuracy: Off by -12.6%

    • Name origin: “Peta-” is derived from the Ancient Greek word “penta” meaning five. The 10^15 decimal format can also be written as 1000^5 (one-thousand to the 5th power).

    • 1 PB can hold: The content from 1.5 million CD-ROM discs or 500 billion pages of text.

  • One exabyte (1 EB):

    • Exabyte (EB) decimal format: 10^18 = 1,000,000,000,000,000,000 bytes

    • Exbibyte (EiB) binary format: 2^60 = 1,152,921,504,606,846,976 bytes

    • Decimal inaccuracy: Off by -15.3%

    • Name origin: “Exa-” was derived from the Ancient Greek word for six. The 10^18 decimal format can also be written as 1000^6 (one-thousand to the 6th power).

    • 1 EB can hold: Approximately 11 million movies in 4k video resolution or 3,000 copies of the entire United States Library of Congress.

  • One zettabyte (1 ZB):

    • Zettabyte (ZB) decimal format: 10^21 = 1,000,000,000,000,000,000,000 bytes

    • Zebibyte (ZiB) binary format: 2^70 = 1,180,591,620,717,411,303,424 bytes

    • Decimal inaccuracy: Off by -18.1%

    • Name origin: “Zetta-” was derived from the Latin word “septem,” which means seven. The 10^21 decimal format can also be written as 1000^7 (one-thousand to the 7th power).

    • 1 ZB can hold: Seagate reports one zettabyte can hold 30 billion movies in 4k video resolution.

  • One yottabyte (1 YB):

    • Yottabyte (YB) decimal format: 10^24 = 1,000,000,000,000,000,000,000,000 bytes

    • Yobibyte (YiB) binary format: 2^80 = 1,208,925,819,614,629,174,706,176 bytes

    • Decimal inaccuracy: Off by -20.9%

    • Name origin: “Yotta-” is derived from the Ancient Greek word for eight. The 10^24 decimal format can also be written as 1000^8 (one-thousand to the 8th power).

    • 1 YB can hold: In 2011, a cloud storage company estimated that one yottabyte could hold the data of one million data centers.
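
The "decimal inaccuracy" figures in the list above can be reproduced with a few lines of arithmetic. The Python sketch below assumes the percentage is measured relative to the decimal value (for example, 24 / 1,000 = 2.4%), which is the reading that matches the listed numbers. The function and variable names are illustrative.

```python
# The n-th decimal prefix means 1000**n bytes; the matching binary
# prefix means 1024**n bytes (which equals 2**(10 * n)).
PREFIXES = ["kilo/kibi", "mega/mebi", "giga/gibi", "tera/tebi",
            "peta/pebi", "exa/exbi", "zetta/zebi", "yotta/yobi"]


def decimal_shortfall(n):
    """Return (missing_bytes, percent) for the n-th prefix.

    Assumption: percent is taken relative to the decimal value.
    """
    decimal = 1000 ** n
    binary = 1024 ** n
    missing = binary - decimal
    return missing, 100 * missing / decimal


for n, name in enumerate(PREFIXES, start=1):
    missing, percent = decimal_shortfall(n)
    print(f"{name:<11} short by {missing:,} bytes ({percent:.1f}%)")
```

Because each step multiplies the ratio between the two systems by 1.024, the gap widens with every larger prefix, from 2.4% at the kilobyte level to about 20.9% at the yottabyte level.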