First we must define the fundamental basis of how information is processed in any modern computer. A computer is any device that processes information. The most basic unit of information we use today is the binary bit. A bit is the smallest possible unit of information because it has exactly two states, and two states is the minimum difference necessary to convey any information. It's not possible to convey information with something that has only one state, because that would be a constant signal that never changes. Conveying information with more than two states is possible, but more complex; conveying information with two states is therefore the simplest method. This piece of information theory is why binary was selected as the base unit of information in modern computers. It also explains the etymology of the word: the "bi-" prefix means two, and the suffix "-nary" comes from the Latin "-narius", which became the English "-nary", meaning "related to a certain number of items". In this case the "items" are states of information. Furthermore, binary was selected because of the ease with which it translates to the properties of electricity. Electrical switches and circuits have two states, on and off, and translating these states to binary is simple. Developing an electrical device with more than two-state behavior is more complex. This is generally true for mechanical computers as well as computers that use other mediums to delineate state.
Much of the inspiration for binary computing actually came from a textile machine. The Jacquard loom, invented by the Frenchman Joseph Marie Jacquard in 1801, used a system of vertical punchcard stacks and threaded needles to automatically weave patterns into textiles. For each woven layer in the fabric pattern, a punchcard either allowed or blocked a threading needle from passing through the card. This loom was the inspiration for Charles Babbage's Analytical Engine. Charles Babbage was a 19th-century inventor well known for his genius in mathematics and mechanics. Charles began working on the design of the Engine in 1834 and continued to refine it until his death in 1871. While he was unable to build the physical device during his lifetime, the idea was revolutionary and expanded on Jacquard's loom idea much further. Charles' engine borrowed the idea of punchcards representing data, but expanded on it so that the machine could perform general mathematics rather than only modifying loom patterns. Charles also separated the punchcards into two stacks, one for data and one for the machine's instructions. He intended for the punchcards to follow a program that could include steps, loops, and logical checks. This was far ahead of its time because it would've allowed the machine to be reprogrammed.
The evolution of mechanical computers from Charles' general-purpose design took a strange backwards turn after his death stopped further progress. Unfortunately, Charles' design would go unnoticed for years until his work was rediscovered in the 1930s. In the 1800s the mechanical limitations of contemporary technology did not allow him to build a general-purpose Analytical Engine, and he had no feasible way to make his design electronic either. The underlying goal of a general-purpose programmable machine with punchcards did survive, and it inspired a new generation of scientists. The next computer science revolution came during WW2, when Alan Turing and his team at Bletchley Park designed the "Bombe" (named after the earlier Polish "bomba") to decode Nazi Enigma-encrypted radio transmissions. The Bombe is often celebrated as a landmark early computing machine. Polish cryptologists and human computers had developed the first bombes to decode earlier versions of the Enigma machine before and during the Nazi invasion of Poland, and these brave scientists transmitted their findings to British intelligence before and during the war.
The Enigma machine worked by encrypting messages for radio transmission one letter at a time. It used various rotor and dial settings to scramble the German plaintext so that each letter of the message was replaced by a different letter. The settings for the dials were sent securely in advance via couriers; they were closely guarded secrets and were ordered to be destroyed before capture. The rotor settings determined both the encryption and the decryption of the plaintext, which meant Enigma was a symmetric cipher, one that encrypts and decrypts a message with the same settings. These rotor settings were known as the key, because if you knew them, you could read the secret messages. The settings list held by German officers was referred to as a "key list" because it contained a long list of secret rotor and dial settings. The keys were changed daily according to these secret lists; this frequent rotation made it impossible to test all the rotor combinations by hand, and due to the fighting it was hard to gain physical access to the key lists. The Allied forces therefore decided to build logic machines in Britain to test Enigma setting combinations automatically. These machines would try to deduce the key for each day within 24 hours, because messages older than 24 hours quickly lost their tactical relevance.

The Enigma machine had two major flaws which sped up the automated rotor-setting guessing process. The first flaw was that the Enigma machine could not encode a letter to itself. This meant that any rotor combination, or "possible key", that encrypted a letter to itself could not be the real key. Since the Enigma encoded every letter to some other letter, this reduced the possibilities significantly. The second major flaw was that German messages often contained patterns, called "cribs" by the British. These were common German words or phrases used in transmissions, like "WETTER", German for "weather"; they appeared because many transmissions included weather reports or other routine phrasing. By analyzing the patterns of the encrypted transmissions and guessing which cribs might be present, the Allies could eliminate possible rotor combinations (keys). Exploiting these flaws manually was still impossible, because humans would not be fast enough to test all the possible rotor combinations by hand even with shortcuts. Alan Turing's genius idea was to build his own Bombe to automate the guessing process. The Bombe took an Enigma-encrypted message as input and used a series of rotating drums and electrical circuits to test Enigma keys. The test was simple: the machine would stop, or "halt", if the rotor combination under test was a consistent candidate key, meaning it followed the crib pattern and never encrypted a letter to itself, and it would keep stepping through combinations otherwise. This process greatly reduced the number of settings that needed human attention. The candidate keys found by the Bombe were offloaded to human "computers", who used replica Enigma machines to find the correct key.
This breakthrough allowed Enigma traffic to be decoded within 24 hours, and thus allowed the Allies to read intercepted Nazi transmissions without capturing the key sheets from German officers.
The key takeaway from this device was that a logic process could be automated by a machine. However, this was simply the first of many such breakthroughs. Alan's Bombe was revolutionary but suffered from many limitations. It was a specially built machine, meaning it only performed the specific logic check that it was hard-wired to do; if Alan had wanted to perform a different kind of logic check, he would have had to physically modify the machine. Additionally, the Bombe did not use binary and did not automatically store its output. Instead, its "output" was whether it halted or not, indicating a True/False result for the logic check. The rest of the decoding process was still done manually by humans.
Compared to Charles' Analytical Engine, Alan's Bombe was simplistic. The Bombe did not generalize to all of math and was hard-wired, while Charles had envisioned a generally programmable machine long before Alan's breakthrough. Charles' design was more advanced because it envisioned using punchcards as both the program instructions and the data. The Bombe could only process one kind of input, German messages encrypted by the Enigma machine; Charles' engine was imagined as a machine that could accept any input and be programmed to do all of mathematics. Alan's Bombe could only output one type of data, "halt" or "no halt"; Charles' engine could output data that could theoretically represent all kinds of information. Alan's Bombe could not be re-programmed with different instructions, because the logic it used to check Enigma combinations was baked into the circuitry of the device; Charles' engine could be programmed with stacks of punchcards to perform all kinds of different logic checks. While Charles' engine was never built, and was designed more than a century earlier, it was a theoretical leap beyond Alan's Bombe. Luckily, another genius named von Neumann would take the lessons learned from both Alan and Charles and apply them to his breakthrough.
John von Neumann was the genius whose architectural insights shaped the Electronic Numerical Integrator and Computer (ENIAC), built by a team led by John Mauchly and J. Presper Eckert. The ENIAC was the first programmable, electronic, general-purpose digital computer. First completed in 1945, it was not yet capable of being reprogrammed without human intervention. The ENIAC used IBM punchcards for input data and output data, similar in spirit to Charles' Analytical Engine design. It originally used a plugboard to change programs, a tedious manual method of re-programming that required human experts, similar to Alan's Bombe. Later, in April 1948, the ENIAC was upgraded so that its program instructions could be changed without rewiring the plugboard, making it one of the first general-purpose electronic digital computers that could be reprogrammed this way. This was the dream Charles had spent years designing, finally brought to life through the dedicated work of teams of scientists. The ENIAC performed general-purpose math using triode vacuum tubes, whose on/off switching states served as the basic elements for storing and processing digits. It was used for various military computations such as artillery trajectory table calculations.
These breakthroughs form the long chain that led to the development of the first electronic, general-purpose digital binary computers. In order to understand how these devices did math, an explanation of binary mathematics must be given:
In mathematics, binary is written with the digits 0 and 1. Numbers in binary are represented in a base-2 system. A "base" system is a system of counting that uses a fixed number of digit "states" to count things and perform computations. Many base systems have been used historically for different types of counting, including base-60, base-10, base-2, and base-16. In base 2, each digit represents a power of two. This generally holds true for any base system: in base 10, each digit represents a power of 10. Each digit in base 2 has 2 possible options, 0 and 1. This pattern also holds generally: each digit in base 10 has 10 possible options, the numbers 0 through 9. It is possible to convert from any base system to any other base system, such as from base-2 to base-10 or vice versa. Fundamentally, a base system is just a different way to represent the same numbers. Here is how to understand bases and convert between them:
The number 123 in base 10 consists of three digits. We could also write this number as \( 1 * 10^2 + 2 * 10^1 + 3 * 10^0 \). This is how base 10 works: each digit is equal to the value of the digit itself multiplied by 10 raised to the power of the digit's position, counting leftward from the decimal point. The digit immediately to the left of the decimal point uses \( 10^0 \), which equals 1; this is called the "ones place". Moving one digit to the left we have a 2 in the "tens place", where the power of ten used is \( 10^1 \), which equals 10. Moving another digit to the left we have a 1 in the "hundreds place", so the power of ten used is \( 10^2 \). This representation can be extended to any number.
The method of representing a number as a collection of digits and base powers can be extended to other base systems like base 2. The base-10 number 123 can also be written as \( 64 + 32 + 16 + 8 + 2 + 1 \). Each of these numbers in the addition sequence is a power of two. Replicating our earlier pattern with base 2 instead of base 10, we have:
\( 123 = 1 * 2^0 + 1 * 2^1 + 0 * 2^2 + 1 * 2^3 + 1 * 2^4 + 1 * 2^5 + 1 * 2^6 \)
If we simplify the powers of two we see a sequence similar to our earlier addition sequence. The only difference is the extra power of two (4) that is multiplied by 0, because it does not appear in the sum that adds up to \( 123 \):
\( 123 = 1 * 1 + 1 * 2 + 0 * 4 + 1 * 8 + 1 * 16 + 1 * 32 + 1 * 64 \)
We can also observe some other patterns. The left-hand side of each multiplication pair is always 1 or 0, corresponding to our two possible digit values in base 2. Additionally, the right-hand side of each multiplication pair is always a power of two, which increases by a factor of 2 for every digit in the sequence. We can re-construct any base-10 number this way in base 2. In binary notation we write a 1 when that digit's power of two is present in the sum and a 0 when it is not. Increasing powers of two are written right to left, just like increasing powers of ten are written right to left in base-10 numbers. That means this binary number is written as:
\( 123 = 1111011 \)
The final binary number follows the same sequence as the previous multiplication sequence: its digits are the left-hand parts of those multiplications, read in reverse order, because the equation lists the ones place first while binary notation puts the highest power of two on the left. This method converts any integer from base 10 to base 2.
We can also reverse this process to go from base 2 to base 10. Using the same binary number \( 1111011 \), we write it as a sequence of digit values multiplied by each digit's "place" in the sequence. Binary numbers written this way are read right to left just like regular base-10 numbers: the rightmost digit is in the "ones" place, the next digit to the left is in the "twos" place, the third is in the "fours" place, and so on. \( 1111011 \) is thus written as:
\( 1111011 = 1 * 1 + 1 * 2 + 0 * 4 + 1 * 8 + 1 * 16 + 1 * 32 + 1 * 64 \)
We can then see that this is the same sequence from earlier, which we know equals \( 123 \) in base 10.
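To make the conversion procedure concrete, here is a minimal sketch in Python (an illustrative language choice; the function names `to_binary` and `to_decimal` are our own, though Python's built-in `bin()` and `int(s, 2)` do the same job):

```python
def to_binary(n):
    """Convert a non-negative base-10 integer to a binary digit string
    by repeatedly taking the remainder mod 2 (the current digit) and
    halving (moving to the next power of two)."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # current binary digit (ones place first)
        n //= 2                    # move to the next power of two
    return "".join(reversed(digits))  # highest power of two goes on the left

def to_decimal(bits):
    """Convert a binary digit string back to a base-10 integer by
    summing digit * 2^place for each digit, reading right to left."""
    total = 0
    for place, digit in enumerate(reversed(bits)):
        total += int(digit) * 2 ** place
    return total

print(to_binary(123))         # 1111011
print(to_decimal("1111011"))  # 123
```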
Binary numbers can do more than just represent values in a different notation; they can also be added, subtracted, multiplied, and divided just like base-10 numbers.
Binary operations follow the same mathematical rules as base-10 operations. Expressions are written and evaluated in the same left-to-right notation, and the same rules for associativity, commutativity, and order of operations (PEMDAS) apply to base-2 and base-10. In both bases, numbers are written so that their digit values increase by a factor of the base moving from right to left. These places extend to any power of the base, which lets the representation express arbitrarily large values. Binary numbers can be written with any number of digits, but some standard sizes have been agreed upon. A byte is eight bits long and is the smallest standard representation of a binary number. On many architectures (ARM, for example), two bytes (16 bits) is referred to as a "half-word", four bytes (32 bits) as a "word", and eight bytes (64 bits) as a "double word". The pattern between these standard lengths is that the size keeps doubling. Other sizes can be used, but these are the standard data sizes in most computers, circuits, and data theory today.
Binary addition follows the same rules as base-10 addition, but it has simpler outcomes because there are only two states for each digit. When you add two binary numbers you align their digits just like you would for base-10 numbers. To add two base-2 numbers like \( 00000110 \) and \( 00000101 \), you first line up their digits vertically. Doing this we have \( 00000110 + 00000101 \). Notice that the first digits to be added, in the "ones place", are 0 and 1. The second digits to be added, in the "twos place", are 1 and 0. The third digits to be added, in the "fours place", are 1 and 1. The fourth digits to be added, in the "eights place", are 0 and 0, and so on. This example was chosen because it exercises all four possible combinations of two binary digits. These combinations can be represented in a truth table, and each combination has a specific solution. Going through each combination:
\[ \begin{align} 0 + 0 &= 0 \\ 1 + 0 &= 1 \\ 0 + 1 &= 1 \\ 1 + 1 &= 10 \end{align} \]
The last rule means that \( 1 + 1 \) will "carry over" into the next digit to the left. This makes sense logically: 1 is the same value in base ten and base two, occupying the "ones place" in both representations. \( 1 + 1 = 2 \) in base ten, which puts a 2 in the "ones place". In binary, two is represented by a 1 in the "twos place", so we write it as \( 10 \). The carryover digit is similar to adding 6 and 5 in base ten: 6 + 5 carries a "1" into the "tens place" for a result of 11. If we follow our addition example digit by digit, then 00000110 + 00000101 = 00001011. We can check this answer by converting the equation to base ten. \( 00000110 \) equals 6; we can see that it equals \( 4 + 2 \) because there are 1's in the twos place and the fours place, and zeros everywhere else. \( 00000101 \) equals 5; we can see it equals \( 4 + 1 \) because there are 1's in the ones place and the fours place. \( 6 + 5 = 11 \) is the expected result of our addition. We can see that \( 00001011 \) is indeed equal to eleven because we have 1's in the eights place, the twos place, and the ones place, and zeros everywhere else. The value represented is thus \( 8 + 2 + 1 = 11 \). It's important to note that these additive patterns hold for all digits in binary numbers. This means you can add any binary number to any other binary number of any length by following just these four rules. This is equivalent to three rules if you consider that the order of addition doesn't matter because of commutativity: \( 0 + 1 = 1 + 0 \). This simplification of addition is what makes computers powerful. If you wanted to do the same thing in base ten, you'd need a rule for all 55 unique combinations of base-ten digits.
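As a sketch of how these four rules mechanize addition, here is an illustrative Python version that adds two 8-bit binary strings digit by digit, right to left, carrying exactly as described above (the helper name `add_binary` is our own):

```python
def add_binary(a, b, width=8):
    """Add two binary strings using only the four single-digit rules
    plus the carry, working right to left like pencil-and-paper."""
    result = []
    carry = 0
    for i in range(width - 1, -1, -1):         # rightmost digit first
        total = int(a[i]) + int(b[i]) + carry  # 0, 1, 2, or 3
        result.append(str(total % 2))          # digit written in this place
        carry = total // 2                     # 1 + 1 = 10: carry the 1 left
    return "".join(reversed(result))

print(add_binary("00000110", "00000101"))  # 00001011  (6 + 5 = 11)
```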
Binary multiplication has the same properties as base-ten multiplication. Going back to our example with 6 and 5, or 00000110 and 00000101 in binary, we stack these two vertically to perform the computation. \( 5 * 6 = 30 \) in base ten; \( 00000110 * 00000101 = 00011110 \) in base two. Multiplication in binary also has four possible combinations for each pair of digits, like addition. Unlike addition, no single pair of digits produces a carry; carries only appear later, when the partial results are added together. Also unlike addition, the place of each digit we're multiplying, whether it's in the ones place, twos place, or any other place, determines the result. To account for the places, we double the partial result once for each place of both operands. Luckily, binary numbers can easily be doubled by shifting them left, and halved by shifting them right. This is a pattern in base ten as well. For example: \( 200 * 200 = 2 * 2 * 100 * 100 = 4 * 10,000 = 40,000 \). Notice how when we did that multiplication we first multiplied the digits and then shifted the result leftwards four places (equivalent to multiplying by 10,000). The answer is 40,000 because 2 * 2 is 4, and both twos were digits in the hundreds place. Digits in the hundreds place are shifted left of the decimal point by two places. When we multiply two "hundreds place digits" together, we add their shifts. Since both digits were shifted left two places, the total shift of the product must be \( 2 + 2 = 4 \) places.
Unlike addition, we also have to take into account digits that aren't aligned vertically. In base ten, the calculation \( 2 * 20 = 40 \) is an example of this. Going digit by digit, we have 2 and 0 in the ones place. \( 2 * 0 = 0 \), and clearly the result of \( 2 * 20 \) is not just 0. We also have to multiply each place with every other place: the ones place also multiplies the tens place, the hundreds place, the thousands place, and so on. As an example, let's multiply two base-ten numbers with three digits each: \( 123 * 456 \). We multiply the three digits of 123 with each digit of 456 one by one. The calculation is sub-divided into 9 different single-digit multiplications.
\[ \begin{align} 123 * 456 &= \\ 100 * 400 &+ \\ 100 * 50 &+ \\ 100 * 6 &+ \\ 20 * 400 &+ \\ 20 * 50 &+ \\ 20 * 6 &+ \\ 3 * 400 &+ \\ 3 * 50 &+ \\ 3 * 6 &= \\ 40,000 + 5,000 + 600 &+ \\ 8,000 + 1,000 + 120 &+ \\ 1,200 + 150 + 18 &= \\ 56,088 \end{align} \]
We follow this procedure in base two as well. Each digit of 00000110 is multiplied with every digit of 00000101, which for eight-bit numbers would mean 64 individual digit-by-digit multiplications (8 digits multiplied with 8 other digits). Luckily, each single-digit product in binary has only 4 possible input combinations instead of 100. For simplicity, the example numbers 6 and 5 are reduced to half-bytes below, which cuts the work to 16 digit-by-digit multiplications and reduces the number of "times zero" terms in the calculation. The table below shows how a computer breaks down the multiplication digit by digit:
\[ \begin{align} 0110 (6) &\\ * 0101 (5) &=\\ (0 * 1) 0000 * 0001 &+ \\ (0 * 0) 0000 * 0000 &+ \\ (0 * 4) 0000 * 0100 &+ \\ (0 * 0) 0000 * 0000 &+ \\ (2 * 1) 0010 * 0001 &+ \\ (2 * 0) 0010 * 0000 &+ \\ (2 * 4) 0010 * 0100 &+ \\ (2 * 0) 0010 * 0000 &+ \\ (4 * 1) 0100 * 0001 &+ \\ (4 * 0) 0100 * 0000 &+ \\ (4 * 4) 0100 * 0100 &+ \\ (4 * 0) 0100 * 0000 &+ \\ (0 * 1) 0000 * 0001 &+ \\ (0 * 0) 0000 * 0000 &+ \\ (0 * 4) 0000 * 0100 &+ \\ (0 * 0) 0000 * 0000 &= \\ \\ \text{first bit of 6 times all bits of 5: } 0 + 0 + 0 + 0 = 0000 + 0000 + 0000 + 0000 &+ \\ \text{second bit of 6 times all bits of 5: } 2 + 0 + 8 + 0 = 0010 + 0000 + 1000 + 0000 &+ \\ \text{third bit of 6 times all bits of 5: } 4 + 0 + 16 + 0 = 0100 + 0000 + 10000 + 0000 &+ \\ \text{fourth bit of 6 times all bits of 5: } 0 + 0 + 0 + 0 = 0000 + 0000 + 0000 + 0000 &= \\ \\ 16 + 8 + 4 + 2 + \text{lots of zeros} = 30 = 00011110 \end{align} \]
This method is known as a binary "truth table". In base ten, "multiplication tables" are taught to elementary students and memorized to multiply any digit 0-9 with any other digit; multiplication tables are the base-ten equivalent of a truth table. In base 2 the truth table is as follows: \(0 * 0 = 0, 0 * 1 = 0, 1 * 0 = 0, 1 * 1 = 1\). The result is then shifted left based on the place of each digit. For example: \(0101 * 0001 = 0101\text{, which is equivalent to } 5 * 1 = 5 \); \(0101 * 0010 = 1010\text{, which is equivalent to } 5 * 2 = 10 \); \(0101 * 0100 = 00010100\text{, which is equivalent to } 5 * 4 = 20 \). There's an important pattern to note here. If we multiply a binary number by a second binary number that has a single "one" digit and zeros everywhere else, we effectively shift the first number to the left by that digit's place in the second number. This makes sense logically because we are multiplying by a power of 2. When we multiply a number by a power of two we call it a logical shift left, and the amount of shifting equals the power of two we are multiplying by. We did this above with binary 5. Multiplying binary 5 by binary 1 multiplied it by \(2^0\) and shifted it zero places to the left, because 5 * 1 equals 5. Multiplying binary 5 by binary 2 multiplied it by \(2^1\) and shifted it one place to the left to get ten. Multiplying binary 5 by binary 4 multiplied it by \(2^2\) and shifted it two places to the left to get twenty. The pattern continues: multiplying by 8 is a shift to the left of three places because \(8 = 2^3\), and multiplying by one-half is a shift to the right by one place because \(2^{-1} = 1/2\). This works for any power of two. When calculating by hand or by computer, we split the second number in the multiplication (the multiplier) into its powers of two, shift the first number (the multiplicand) by each of those powers, and then add the results of each separate multiplication. This means multiplying a four-digit binary number by another four-digit binary number requires at most four "power of two multiplications", which we call logical shifts, followed by additions of each result to get the final answer. This pattern is called logical shifting: we shift left to multiply by positive powers of two, and shift right to multiply by negative powers of two. Combining these logical shifts with addition is what implements binary multiplication in hardware.
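This correspondence is easy to check in Python, whose integer `<<` and `>>` operators perform exactly these logical shifts (for integers, a right shift is a floor division by a power of two):

```python
x = 5          # binary 0101
print(x << 1)  # 10: shifting left 1 place multiplies by 2^1
print(x << 2)  # 20: shifting left 2 places multiplies by 2^2
print(x << 3)  # 40: shifting left 3 places multiplies by 2^3
print(20 >> 2) # 5: shifting right 2 places divides by 2^2
```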
As an example of multiplication with multiple digits we have: 0101 * 0110 (5 * 6). We break this down into two multiplications, just like we do for base-10, giving us: 0101 * 0110 (5 * 6) = 0101 * 0100 (5 * 4 or 5 * 2^2 or binary 5 shifted left 2 times) + 0101 * 0010 (5 * 2 or 5 * 2^1 or binary 5 shifted left 1 time). The result of the first shift is: 0101 * 0100 = 00010100. The result of the second shift is 0101 * 0010 = 1010. Adding these two sub-results gives us the final result: 00010100 + 1010 = 00011110. Checking this result to make sure it equals base-10 thirty we have: 00011110 = 0 * 1 + 1 * 2 + 1 * 4 + 1 * 8 + 1 * 16 = 2 + 4 + 8 + 16 = 30.
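Putting the pieces together, here is a minimal shift-and-add multiplier sketch in Python. It splits the multiplier into its powers of two, shifts the multiplicand left by each corresponding place, and adds the partial results, mirroring the worked example above (the function name `shift_add_multiply` is illustrative):

```python
def shift_add_multiply(multiplicand, multiplier):
    """Multiply two non-negative integers using only shifts and adds:
    for each 1 bit in the multiplier, add the multiplicand shifted
    left by that bit's place."""
    result = 0
    place = 0
    while multiplier > 0:
        if multiplier & 1:                   # this power of two is present
            result += multiplicand << place  # partial product, shifted left
        multiplier >>= 1                     # move to the next bit
        place += 1
    return result

print(shift_add_multiply(5, 6))  # 30, matching 0101 * 0110 = 00011110
```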
The binary system has its own digit-level procedures for addition, subtraction, multiplication, and division, as well as every other mathematical operation, but no matter what base is used, the mathematical operators obey the same rules. This is important in number theory because it allows us to compute anything using only two states instead of the ten states needed to represent the digits of base 10.
So far binary representation has been expressed as positive whole numbers, the "natural numbers" of mathematics. Negative numbers also exist, so we must represent them in binary as well. Scientists debated for a long time which standard to use for representing negative numbers, but they eventually settled on the two's complement method.
Two's complement is the standard for representing and creating negative numbers. In a two's complement number, the leftmost bit acts as the sign bit: a 0 means the number is positive or zero, and a 1 means it is negative. We call binary numbers encoded this way "signed". Binary numbers that don't use a sign bit and use all bits for value are called "unsigned". The sign bit is not just a flag, though; it carries a negative weight. In an 8-bit signed integer the leftmost bit has a weight of -128, so "00001000" represents positive 8, while "10001000" represents -128 + 8 = -120. This is different from an unsigned integer, where "00001000" would be 8 and "10001000" would be 128 + 8 = 136. The next paragraph shows how to actually negate a number, producing, for example, the bit pattern for -8.
It's also possible to convert a positive binary number into a negative binary number with the same absolute value. This is done with signed binary integers in two steps. The first step is to take the binary number we want to negate, for example 00001000 (which equals 8), and flip every bit, swapping each zero for a one and each one for a zero. In our example, inverting 00001000 gives us 11110111. The second step is to add positive 1, or 00000001, to the inverted number 11110111. Adding these two numbers gives us 11111000, which is the binary representation of "-8". We can check this by adding the binary "8" to "-8" and seeing if the result is 0. If we add 00001000 and 11111000 we get 100000000; since we are working with 8-bit numbers, the carry out of the leftmost bit is discarded, leaving 00000000 and verifying that 11111000 is the correct representation of "-8".
By extending binary to use signed integers we also get subtraction. Subtraction in mathematics is really just addition with negative numbers, which means implementing subtraction is a simple extension of addition and two's complement negation. Borrowing from our earlier examples, say we want a computer to compute "8 - 5 = 3" in binary. First we recognize that eight minus five, "8 - 5", is equivalent to eight plus negative five, "8 + (-5)". Then all we need to do is convert positive five into negative five in binary, and add positive 8 to negative 5 to finish the calculation.
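Here is a minimal sketch of both steps in Python, working on 8-bit values. Masking with `0xFF` keeps results to eight bits, which also discards the carry out of the leftmost bit; the names `negate8` and `subtract8` are our own:

```python
MASK = 0xFF  # keep values to 8 bits

def negate8(x):
    """Two's complement negation: flip every bit, then add one."""
    flipped = x ^ MASK           # step 1: invert all 8 bits
    return (flipped + 1) & MASK  # step 2: add 1, discard any carry out

def subtract8(a, b):
    """a - b implemented as a + (-b); the adder does all the work."""
    return (a + negate8(b)) & MASK

print(format(negate8(0b00001000), "08b"))  # 11111000, i.e. -8
print(subtract8(8, 5))                     # 3
```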
In hardware, this operation is implemented much like the mathematical procedure described above; a gate-level sketch appears in the ALU discussion below.
Historically, computers used a number of different bases for their computations, but all modern computers have settled on base-2 as the ideal way to compute. Base-2 was chosen for its simplicity of representation and because it's easy to implement binary with electrical circuits. Electrical circuits also make good computers for many other reasons. To see why, we will examine what makes a computer "good". In order to make a good computer, you would in theory want a device that has the following properties:
Many devices have attempted to satisfy these requirements, and the best implementation we currently have is the modern electrical binary computer. Modern computers use electrical circuits because they satisfy these requirements better than any alternative:
We represent bits in circuits using high and low voltage. First the engineer decides on a voltage range for the circuit (or sub-circuit). High voltage is the top of that range, usually something like 3.3V, 5V, or 12V, depending on the circuit. Low voltage is the bottom of that range, at 0V. Circuits are designed to store bits as high or low voltages. These voltage levels correspond to electric charge held and switched by transistors. Transistors are circuit devices that can rapidly switch between behaving as a conductor and as an insulator depending on an external input.
In layman's terms, a transistor is a glorified lightswitch, but extremely small and fast. Silicon Valley gets its name from the fact that silicon is a semiconductor element. A semiconductor is a substance that can act as a conductor or an insulator depending on some external factor. As discussed earlier, transistors are circuit devices designed to do exactly this: switch between conducting and insulating. Think about why a lightswitch is useful to us. A lightswitch is only useful if I can control when electricity flows (conducting) and when it doesn't (insulating); otherwise the lights would always be on, or always off, and I wouldn't be able to control them. Similarly, I want to control the bit state held inside a transistor. I can't do that if the transistor is always at high voltage (conducting) or always at low voltage (insulating). The transistor is only useful if I can cause it to switch between on and off states.
A metal copper wire doesn't make a good lightswitch, because electricity always flows.
A rubber wire doesn't make a good lightswitch, because electricity never flows.
A semiconducting silicon transistor makes an excellent lightswitch, because I can control whether it behaves like a conductor or an insulator. This property is why scientists use silicon and other semiconductor materials for transistors. Modern transistors come in many forms and have a long history of improvement. Their predecessor as the electronic switch was the vacuum tube, a large, bulky device that was prevalent through the 1950s. As transistor design evolved, new types of transistors were created. The most prominent is the Metal Oxide Semiconductor Field Effect Transistor (MOSFET), first demonstrated in 1959. This type of transistor is used in all modern computers today. Let's first break down what MOSFET stands for:
Putting this all together, a MOSFET is a transistor with a metal gate separated from a semiconductor channel by a thin oxide (oxygen-based) insulating layer. The "field effect" refers to the electric field created when a voltage is applied to the metal gate: that field changes whether the semiconductor channel underneath conducts or insulates. Engineers create circuits that switch transistors by applying small voltages to these gates, and because the gate is insulated, almost no current flows into it. The gate voltage is the mechanism behind the field effect of the transistor.
The theory behind modern computing was thought up by a genius named George Boole. Boole introduced a new branch of math called Boolean algebra, which is the algebra of logic. Boolean algebra has two states for any logic "question": True or False. These "questions" can be evaluated by themselves, or with other "questions" as part of a complex expression. Ultimately by chaining logic expressions together we can recreate the rest of mathematics and perform calculations. This is useful to humans because we can use two absolute states to do math. As discussed earlier, we store information in binary states. We do this with transistors that store these bits as High and Low voltages. To implement Boolean logic, we first must represent the "True" state as a high voltage, and the "False" state as low voltage. Then we must implement the Boolean operations. The arrangement of transistors is what implements Boolean operations in the physical world. Boolean logic expressions are created by chaining these operators together.
Boolean algebra has the following fundamental operators:
These operators are called gates when implemented as a circuit. These gates are wired together to create Boolean logic expressions. An example expression is 3-AND, which is a logic gate that outputs High Voltage only when all three inputs are High Voltage, otherwise the output is Low Voltage. This is equivalent to the Boolean Expression "A & B & C" where "A, B, C" are the inputs and "&" represents the AND boolean operator. The logic result of this expression is True only if all the inputs are True.
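These operators translate directly into code. Below is an illustrative Python sketch that models gates as functions on True/False values and builds the 3-AND expression out of them (the function names are our own, not a hardware library):

```python
def AND(a, b):
    return a and b

def OR(a, b):
    return a or b

def NOT(a):
    return not a

def three_and(a, b, c):
    """3-AND: True (high voltage) only when all three inputs are True."""
    return AND(AND(a, b), c)

print(three_and(True, True, True))   # True
print(three_and(True, False, True))  # False
```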
Computers are complex arrangements of logic gates that allow us to compute regular math very fast. These calculations drive everything in a computer. All computer graphics, all human-to-computer interactions, the Internet itself, all of it is fundamentally just mathematics. We design a complex circuit to do that math fast and accurately, and we use the circuit to store useful information for later use. We call that circuit a computer.
The structure of a computer starts in the Central Processing Unit, or CPU. The CPU contains the physical transistor circuit that does calculations. The only other part of the computer that processes information is the Graphics Processing Unit, or GPU; some computers have no GPU and only a CPU. All the other parts of the computer exist to support the CPU, store its results, or power it. The physical CPU design starts with the circuitry that allows it to compute math. This sub-section of the CPU is called the Arithmetic Logic Unit, or ALU. The ALU receives inputs from the rest of the CPU and does the actual calculations in hardware.
We will first focus on the implementation of the ALU. To start, we need to implement the four basic mathematical operations on binary inputs in the ALU. We begin with addition, because the addition circuit is the foundation: multiplication, subtraction, and division are all implemented on top of it.
A single-bit "full adder" takes three inputs: one bit from each operand (A and B) and a carry-in bit from the previous place. It produces two outputs: the sum bit, which is A XOR B XOR Carry-in, and the carry-out bit, which is 1 whenever at least two of the three inputs are 1, i.e. (A AND B) OR (Carry-in AND (A XOR B)). Chaining eight full adders together, with each carry-out feeding the next adder's carry-in, adds two bytes. Two's complement negation is implemented with NOT gates that flip every bit of the input, followed by adding 1. Subtraction combines the two circuits: NOT gates invert the second operand, and a carry-in of 1 fed into the first adder supplies the "+1", so the adder ends up adding the two's complement of the operand.
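Here is a sketch of that design in Python, modeling each gate with bitwise operators: a one-bit full adder, eight of them chained into a ripple-carry adder, and a subtractor that inverts the second operand and feeds in a carry of 1 (all names are illustrative; real hardware does this with transistors, not code):

```python
def full_adder(a, b, carry_in):
    """One-bit full adder built from gates:
    sum       = a XOR b XOR carry_in
    carry_out = (a AND b) OR (carry_in AND (a XOR b))"""
    axb = a ^ b
    s = axb ^ carry_in
    carry_out = (a & b) | (carry_in & axb)
    return s, carry_out

def ripple_add8(a_bits, b_bits, carry_in=0):
    """Chain eight full adders; each carry-out feeds the next carry-in.
    Bits are lists of 0/1 with the ones place at index 0."""
    result = []
    carry = carry_in
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        result.append(s)
    return result  # the final carry out is discarded, as in 8-bit hardware

def ripple_sub8(a_bits, b_bits):
    """Subtract by adding the two's complement: NOT gates flip every bit
    of b, and the initial carry-in of 1 supplies the '+1'."""
    not_b = [1 - bit for bit in b_bits]
    return ripple_add8(a_bits, not_b, carry_in=1)

def to_bits(n):  # helper: 8-bit list, ones place first (illustrative)
    return [(n >> i) & 1 for i in range(8)]

def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))

print(from_bits(ripple_add8(to_bits(6), to_bits(5))))  # 11
print(from_bits(ripple_sub8(to_bits(8), to_bits(5))))  # 3
```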
The first computer architectures contained:
Some examples include the Antikythera mechanism and Charles Babbage's Difference Engine. Neither used binary states of Boolean True or False to do its computations. The Antikythera mechanism was analog, meaning it stored information as a continuous flow of state (the positions of its gears); this is different from digital, which stores state in distinct, non-continuous states. These machines were also designed with a specific calculation in mind and could only run one program, and they were mechanically driven instead of electrically driven like modern computers.
The evolution began when a genius named von Neumann came along and proposed that memory storage be made separate from the computation mechanism. The "Von Neumann model" is an upgraded computer architecture that uses a separate memory to store program instructions and data. This was an evolutionary leap because the storage of results and instructions was no longer tied to the processor that computed the results. Before this, computers used the same physical objects to process and store results. Using gears as an example, the Charles Babbage Difference Engine processed the program in the gears of the device and also held the results of the computation in those same gears. This meant that results had to be written down on paper separately by an operator. It meant that if you wanted to change the program's state, you'd erase the results and the current processing, and if you wanted to compute a different program, you'd have to manually adjust the gears yourself. It is this crucial separation of computation and memory that served as the fundamental turning point toward modern computers. By storing data separately and automatically, the computer could also read data as instructions and set its own program state without human intervention. This most fundamental modern computer architecture contains:
Some examples include punchcard computer systems.
As discussed earlier, modern computers use a binary system of electrical circuits, including transistors, to implement Boolean logic. This logic is used to create a computation engine which we call the "Central Processing Unit" or "CPU". The CPU contains sub-circuits that carry out whatever instructions we have designed it to perform. The CPU contains many parts:
There are lots of types of instructions, and lots of instruction sets. Here is an example of one:
A basic CPU carries out each of these instructions in five steps:
A basic memory takes the voltage outputs of the CPU across the bus-wire connection and stores them in a storage medium, which can be done in a variety of ways. It uses a memory address provided by the CPU's instruction to specify where to store the information, and the memory must also be readable in some way. Some very old memory mediums are physically observable, like punchcards, dials, or vacuum tubes. Modern working memory stores information in transistor states, which allows the amount of memory to be very large; separate persistent storage, built from transistors with floating gates (flash), keeps its contents even when the computer is not powered on.
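As a toy model of that idea, here is an illustrative Python sketch of a byte-addressable memory, where the address selects which cell to store into or load from (the `Memory` class and its methods are our own invention, not a real hardware interface):

```python
class Memory:
    """A toy byte-addressable memory: an address selects one cell."""
    def __init__(self, size):
        self.cells = [0] * size  # every cell starts at 0 (all bits low)

    def store(self, address, value):
        self.cells[address] = value & 0xFF  # keep one byte per cell

    def load(self, address):
        return self.cells[address]

ram = Memory(256)
ram.store(0x10, 42)    # the CPU writes a result to address 0x10
print(ram.load(0x10))  # 42: read back later from the same address
```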
Modern computers are much improved from punchcard machines. Early machines were programmed in assembly language, or in the raw machine code beneath it. Assembly language is a circuit-dependent language that is a direct 1-to-1 mnemonic translation of the computer's instruction set; it is the bridge between software and hardware. But humans were never meant to talk in 0s and 1s, so they developed compilers. Compilers are complex programs that turn a human-readable programming language into assembly language for a wide variety of different computer types. One of the first examples was the programming language FORTRAN. Using these more readable and standardized programming languages, humans were able to program computers more efficiently, because a program could be written once in FORTRAN yet run on machines with different instruction sets. At the same time, a hardware evolution was underway that would allow computers to become smaller, faster, more efficient, and more connected.
Modern computers usually include:
A list of the relevant inventions that go into a modern computer include:
While AI seems like a complicated topic, understanding the mathematics behind AI reveals it is simpler than some people think. Fundamentally, AI in its modern form is a pattern recognition machine. All AI architectures use probability and statistics to transform digital inputs into usable patterns. From a high-level view there are a couple of main AI "types". The main AI types in use today are:
All types of AI are built through a similar process. In the following example, assume we are training a Convolutional Neural Network (CNN) to take photos of cars and classify them: