The First Bit of Data
A bit is a single digit in binary, and the word actually comes from shortening “binary digit”. A bit is the simplest possible piece of data a machine can read, and it is either a 1 or a 0. A yes or a no. True or false. The bit has been around longer than computers, originating with the punched cards of the 1700s that analog machines could “read”.
If you’ve recently upgraded to Windows 10, you may recall having to check whether your computer is 32-bit or 64-bit. The number describes how much memory the computer’s processor can address by its architecture – is it equipped to read up to 32 consecutive bits of data as an address, or 64? A 32-bit computer has fewer possible memory addresses available to its CPU registers – not much more than 4 GB’s worth, or 2^32 addresses – while a 64-bit computer can address up to 2^64 locations, roughly 16 exabytes in theory (practical machines support far less). This doesn’t mean a 32-bit computer can only store 4 GB of data; it just means it can only keep track of about 4 GB worth of “names” for memory locations at once. The files themselves can be nearly any size as long as there’s storage available for them.
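If you want to sanity-check that arithmetic yourself, here’s a quick Python sketch (the gigabyte and exabyte constants are decimal, matching the units used in this article):

```python
# Address-space arithmetic from the paragraph above (illustrative only).
BYTES_PER_GB = 10**9    # decimal gigabyte
BYTES_PER_EB = 10**18   # decimal exabyte

addresses_32 = 2**32    # distinct addresses a 32-bit pointer can name
addresses_64 = 2**64    # distinct addresses a 64-bit pointer can name

print(addresses_32 / BYTES_PER_GB)   # ~4.29  -> about 4 GB of addressable memory
print(addresses_64 / BYTES_PER_EB)   # ~18.4  -> on the order of 16-18 exabytes
```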
Then, a Byte
A byte is usually eight bits, in compliance with international standards – but it didn’t always have to be. Originally, a byte was as long as needed to represent a character on screen, usually somewhere between two and ten bits, with exceptions down to one and up to forty-eight bits for certain characters. Eight-bit bytes became the standard because of their convenience for the new generation of microprocessors in the 70s: with 8 bits of binary, there are 256 possible arrangements of ones and zeroes. 16 bits would give more possibilities than early character sets needed and could slow the computer down, while 4 bits only allows 16 arrangements, which would mean combining groups of bits anyway to cover a useful character set.
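The “possible arrangements” figure is just 2 raised to the number of bits – a minimal sketch in Python:

```python
# Number of distinct values representable in n bits is 2**n.
for n in (4, 7, 8, 16):
    print(f"{n} bits -> {2**n} possible arrangements")
# 4 bits  -> 16
# 7 bits  -> 128   (ASCII)
# 8 bits  -> 256   (extended ASCII / the modern byte)
# 16 bits -> 65536
```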
8 sounds like the perfect combination of length and possible complexity, at least with the benefit of hindsight. Before 8-bit bytes came along, the government had struggled with incompatible systems across branches due to differing byte sizes. ASCII was the compromise, at seven bits per character, and when commercial microprocessors came along in the 1970s, they were forced to compromise again with extended ASCII, so that commercial and government systems could communicate.
However, not all extended ASCII versions contained the same additions, so Unicode was formed later to try and bridge the gaps between versions. Unicode, a character encoding standard that includes the ASCII characters as its first entries, builds its most common encoding (UTF-8) out of eight-bit bytes, and it’s one of the most widely used character encodings out there. You’ll run into ASCII a lot, too – if you’ve ever opened an article and seen little boxes where characters should be, that’s because it was viewed with a smaller character set (like plain ASCII) than the one it was written with. The viewer doesn’t know what goes there, so it puts a placeholder box!
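Here’s a minimal sketch of that relationship in Python – UTF-8 can encode any character, while plain ASCII gives up on anything outside its 128-character set:

```python
# ASCII vs. UTF-8, using Python's built-in codecs (illustrative only).
text = "café"

print(text.encode("utf-8"))      # b'caf\xc3\xa9' -- the é takes two 8-bit bytes

try:
    text.encode("ascii")         # ASCII only covers 128 characters...
except UnicodeEncodeError as err:
    print(err)                   # ...so 'é' has no ASCII representation at all
```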
1,000 bytes of storage forms a kilobyte, or KB. This is the smallest unit of measure that the average computer user is likely to see written as a unit on their device – not much can be done with less than 1,000 bytes. The smallest document I can currently find on my device is an Excel file with two sheets and no equations in it. That takes up 9 KB. A downloadable “pen” for an art program on my device takes up 2 KB.
PCs before Windows, running MS-DOS, had about 640 KB to work with, not including memory dedicated to essential operations.
The original Donkey Kong machines had approximately 20 kilobytes of content for the entire game.
A megabyte is 1 million bytes, or 1,000 kilobytes. Computers had made some progress post-relays, moving to hard disks for internal storage. IBM’s first computer containing a megabyte (or two) of storage, the System 355, was huge. It was also one of the first models to use disk drives, which read faster than tapes. In 1970, if users didn’t want a fridge, they could invest in the now desk-sized 3 million bytes of IBM’s Model 165 computers, an improvement over GE’s 2.3 million bytes the year before – and the year before that, UNIVAC had unveiled a new machine with separate cores that could be tied together to give users between 14 and 58 megabytes of capacity (as reported in Byte Magazine), at the cost of space. IBM’s System/360 could reach up to 233 megabytes with auxiliary storage, but its size was… prohibitive, reminiscent of that first System 355.
Tapes and drums were competitive with the disk format for a while, but ultimately disk and solid state improved faster and won out (right now it’s looking more and more like SSDs, those solid state drives, will outcompete disks in the future too). During the 80s, the technology improved so much that hard disks became standard (IBM released a home computer with 10 MB of storage in 1983) and floppy disks acted as media transport.
DOOM comes out in 1993 and takes up 2.39 MB for its downloadable file, with smaller, DLC-like packs of fan-created mods coming out along the way.
A gigabyte is 1 billion bytes, or 1,000 megabytes. In 1980, IBM releases another fridge – but it stores up to a gigabyte of information! According to the Merriam-Webster Dictionary, you can pronounce gigabyte as “jig-a-bite”, which just… feels wrong. Back in 1974, IBM had released a 20-foot-long beast of a storage system that could hold up to 236 GB of data on magnetic tape.
In 2000, the first USB sticks (memory sticks, jump drives, etc.) are released to the public with 8 megabyte capacities, and they’re so convenient that floppy disk drives begin disappearing from computer designs in favor of USB ports. USB sticks then improve exponentially, and soon have capacities of one, two, and four gigabytes while floppies struggle to keep up.
Besides being smaller and harder to break, those USB sticks also store more. Where the first USB sticks held 8 MB, the standard floppy disk at the time could only hold 1.44 MB. Knowing how small DOOM is, it would take two floppy disks to hold all of DOOM, but a USB stick only took one. By 2009, USB sticks with capacities of 256 GB were available on the market. That’s roughly 178,000 floppy disks.
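A quick back-of-the-envelope check on those comparisons, in Python (decimal units, as used throughout this article):

```python
import math

# Sizes from the paragraphs above (decimal megabytes/gigabytes).
floppy_mb = 1.44        # standard 3.5" floppy disk
doom_mb = 2.39          # DOOM's downloadable file
usb_2009_gb = 256       # a high-end USB stick circa 2009

print(math.ceil(doom_mb / floppy_mb))            # 2 floppies to hold DOOM
print(round(usb_2009_gb * 1000 / floppy_mb))     # ~177778 floppies in one 256 GB stick
```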
A terabyte is 1 trillion bytes, or 1,000 gigabytes. The first commercial drive with a capacity of one terabyte was sold in 2007 by Hitachi, a Japanese construction and electronics company. The movie Interstellar, released in 2014, featured a depiction of a black hole known as Gargantua – and became famous when it closely resembled a picture of an actual black hole later captured by the Event Horizon Telescope. A ring of light surrounds the black hole in two directions, one due to friction-heated material Gargantua has accumulated, one due to the lensing of light around it. The gravity is so intense that light itself is pulled into orbit around Gargantua’s event horizon and kept there. It took 800 terabytes to fully render the movie and make Gargantua somewhat accurate in terms of light-lensing.
A petabyte is 1 quadrillion bytes, or 1,000 terabytes. This is typically cluster storage, and while it’s available for purchase, it’s very expensive for the average consumer. For comparison, while rendering Interstellar took 800 terabytes, storing it at standard quality takes about 1/200th of a terabyte – roughly 5 GB. You could store approximately 200,000 DVD-quality copies of Interstellar on a petabyte. It took a little less than 5 petabytes of data to take a picture of the real black hole, M87.
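Spelling out that last bit of arithmetic in Python (decimal units again):

```python
# Petabyte math from the paragraph above.
terabyte = 10**12
petabyte = 10**15

copy_size = terabyte / 200          # ~5 GB per DVD-quality copy of Interstellar
print(petabyte / copy_size)         # 200000.0 copies fit in one petabyte
print(800 * terabyte / petabyte)    # 0.8 -- the render data alone was most of a petabyte
```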