Number systems
Take the binary word 01100010 01110101 01110011 01111001 — what does it mean? That depends on the representation.
1651864441if interpreted as an integer1.13194324E21when interpreted as a floating point (float)"busy"when interpreted as an ASCII string
So the meaning of the bits depends on the intended interpretation.
Decimal System
In most of our everyday life we use the decimal system. So how does this system work?
- Numbers available:
0 1 2 3 4 5 6 7 8 9 - Powers of ten:
10^5 10^4 10^3 10^2 10^1 10^0
We can write for example the number 124 as 1 * 10^2 + 2 * 10^1 + 4 * 10^0.
Arithmetic
Addition
overflow 110
753
491+
-----
1244Subtraction
borrow 1
753
491-
----
262Multiplication
612
24x
-----
2448 <- multiple by 4
12240 <- move one to the left and multiply by 2
-----
14688 <- additionDivision
Have a look at https://www.mathsisfun.com/long_division.html, where this example is explained in more detail.
017
----
25|425
25
----
175
175
----
0Binary System
How can we represent numbers in binary form?
- Numbers available:
0 1 - Powers of two:
2^5 2^4 2^3 2^2 2^1 2^0
Same principles apply as with the decimal system 1101 is 1 * 2^3 + 1 * 2^2 + 0 * 2^1 + 1 * 2^0 or 13 in decimal.
Storing Negative Numbers
The most obvious solution is to use a single bit (commonly the most significant bit) as a sign bit.
+7: 0000 0111
-7: 1000 0111Problems:
- Two zeroes, we get both
0and-0 - Addition needs to check if one of the numbers is negative, see example below
- Range:
(-2^{n-1}+1, 2^{n-1}-1), we lose a number because of the double zero
Let’s have a look at the addition with negative numbers. On the left we do 7 + 3 and on the right we do 7 + -3. We can see that addition works for positive numbers, however it fails for negative, here we get -10 instead of 4.
0000 0111 0000 0111
0000 0011+ 1000 0011+
---------- ---------
0000 1010 1000 1010Two’s Complement
A better system was developed that fixes all of the problems at once. The system is called the two’s complement and looks as follows for 3-bit number. Notice how all negative numbers still have a 1 as most significant bit.
unsigned signed
000 0 0
001 1 1
010 2 2
011 3 3
100 4 -4
101 5 -3
110 6 -2
111 7 -1Benefits:
- Only one zero
- Addition works with negative numbers
- Range:
(-2^{n-1}, 2^{n-1}-1)
This is also how Java stores numbers:
Integer.toBinaryString(2);
Integer.toBinaryString(-2);Convert into two’s complement
Let’s have a look at how to convert a number to its two’s complement, let’s take 3 as an example:
- Convert to one’s complement → flip all bits
- Add 1
3: 0000 0011
1111 1100 <- convert to one's complement
1111 1100
0000 0001 + <- add 1
------------
1111 1101Addition
overflow
7 0000 0111
3+ 0000 0011+
--- ----------
10 0000 1010Subtraction as Addition
overflow 11111 111
7 0000 0111
3- 1111 1101+
--- ----------
4 0000 0100Multiplication
0000 0100 4
1111 1101 -3x
--------------
0000 0100
0000 000
0001 00
0010 0
0100
100
----------
1111 0100 -12Division
0010100
-----------
100 | 1010000
100
-------
00100
100
-----
000Floating Point Numbers
So far we have a look at whole numbers, let’s have a look at numbers with a decimal part, aka floating point numbers.
Decimal System
What you are use to in the decimal system.
10^5 10^4 10^3 10^2 10^1 10^0 . 10^-1 10^-2
1/10 1/100Binary System
We can do the same in the binary system.
2^5 2^4 2^3 2^2 2^1 2^0 . 2^-1 2^-2
1/2 1/4If we have 8-bit unsigned floating point:
integer portion . decimal portion
0000.0000 -> 0
1111.1111 -> 15.9375Some examples:
- Can store
2.5 - Cannot store
16 - Cannot store
2^-5 = 0.03125
We could store 16 or 2^-5 if we are allowed to move the decimal point position.
Scientific Notation
Moving the decimal point is something that is done in the scientific notation.
300 000 000 : 3.0x10^8
0.00000015 : 1.5x10^-7Arithmetic with scientific notation:
3.0x10^8
1.5x10^-7 x
-----------
4.5x10^1
-----------
45IEEE-754 Format (32-bit float)
Floating point numbers are based on this scientific notation.
- 32 bits total
- 1 bit sign
- 8 bits exponent
- 23 bits mantissa
Formula:
(sign) x (1 + mantissa) x 2^(exponent - 127)Example:
0 10000101 10010001000000000000000
|sign| |exponent| |mantissa |
Exponent: 10000101 → 133
Offset: 133 - 127 = 6
Mantissa bits: 10010001 → 1/2 + 1/16 + 1/256
Total: 1 + 1/2 + 1/16 + 1/256 = 401/256
Final value = + (401/256) x 2^6 = 100.25Show in Java:
var bits = Float.floatToIntBits(100.25f);
Integer.toBinaryString(bits);From Decimal to Binary
Let’s take our number 100.25 again and go in the other direction.
Integer part:
We take our number, divide by two and write down the remainder:
100 0 ^
50 0 |
25 1 |
12 0 |
6 0 |
3 1 |
1 1 | Reading up gives 1100100
0Decimal part:
Multiply with two and if the decimal is greater than 1 then 1 else 0.
After it becomes greater then 1, subtract 1.
0.25 * 2 = 0.5 | 0 |
0.5 * 2 = 1 | 1 v Reading down gives 01
0We can now put these together with a dot in between:
1100100.01 * 2^0
Shift to the left until the last 1:
1.10010001 * 2^6, (we did 6 shifts), we now have our mantissa
Next convert the exponent to binary, but first we need to offset 127 (remember that we subtract 127 to get the real exponent, so now we need to add 127 to it)
6 + 127 = 133
133 1
66 0
33 1
16 0
8 0
4 0
2 0
1 1 -> 10000101Final float binary:
0 10000101 10010001 000000000000000Why 0.1 Can’t Be Represented Exactly
There are some problems with floating point numbers, on of the is that not every number can be represented correctly.
0.1 * 2 = 0.2 → 0
0.2 * 2 = 0.4 → 0
0.4 * 2 = 0.8 → 0
0.8 * 2 = 1.6 → 1
0.6 * 2 = 1.2 → 1
0.2 * 2 = 0.4 → 0
...
→ repeating patternYou have seen this also in the decimal system 1/3 → 0.33333….
Java:
bits = Float.floatToRawIntBits(0.1f);
Integer.toBinaryString(bits);Double Precision
The double is defined as follows:
- 64-bit
- 1 bit sign
- 11 bits exponent
- 52 bits mantissa
BigDecimal
If we want to keep precision, Java supports BigDecimal and BigInteger, which have virtually infinite precision.
// Wrong usage
System.out.println("Wrong usage, damage has already been done...");
BigDecimal bd1 = new BigDecimal(0.1);
BigDecimal bd2 = new BigDecimal(5.8);
BigDecimal bd3 = bd1.add(bd2);
bd3.setScale(2, RoundingMode.UP); // bd3 is unchanged
System.out.println("bd3 = " + bd3);
System.out.println("bd3 (rounded) = " + bd3.setScale(2, RoundingMode.UP));
System.out.println();
// Correct usage
System.out.println("Correct usage of BigDecimal");
BigDecimal bd4 = new BigDecimal("0.1");
BigDecimal bd5 = new BigDecimal("5.8");
BigDecimal bd6 = bd4.add(bd5);
System.out.println("bd6 = " + bd6);