A Level Computing - COMP3 Real Numbers





Real Numbers

Standard Form

Very large or very small denary numbers are often written in standard form, which saves writing out a lot of digits. A number in standard form is a number that is at least 1 and less than 10, multiplied by a power of 10.

For example,

5.67 x 10^3 = 5.67 x 1000 = 5670
6.23 x 10^-2 = 6.23 x 0.01 = 0.0623

To convert a denary number to standard form, move the decimal point so that the number lies between 1 and 10. The power of 10 is the number of places the decimal point was moved: positive if it moved to the left, negative if it moved to the right.
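The rule above can be sketched in Python. This is a minimal illustration, not part of the original notes: the function name to_standard_form is hypothetical, and the number is passed as a string so that the standard decimal module can work on its exact digits.

```python
from decimal import Decimal

def to_standard_form(number):
    """Return (mantissa, power) so that number = mantissa x 10^power."""
    value = Decimal(number)
    # adjusted() gives the power of 10 of the most significant digit,
    # which is the number of places the decimal point has to move.
    power = value.adjusted()
    # scaleb() shifts the decimal point without changing any digits.
    mantissa = value.scaleb(-power)
    return mantissa, power

print(to_standard_form("5670"))    # mantissa 5.670, power 3
print(to_standard_form("0.0623"))  # mantissa 6.23, power -2
```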

Real numbers are stored in the computer using a similar principle to standard form. Instead of a power of 10, however, they are stored using a power of 2. The fractional part of the number (its significant digits) is known as the mantissa, and the power of 2 by which it is multiplied is known as the exponent. For simplicity, the examples given here will use 16 bits. In practice real numbers are stored using a minimum of 32 bits. The greater the number of bits for the mantissa, the greater the precision with which the number can be stored. The greater the number of bits for the exponent, the greater the range of numbers that can be stored.

Our 16 bit numbers will use 10 bits for the mantissa and 6 bits for the exponent.

Converting From Denary To Two's Complement Format

6.5
  • Convert the absolute value of the denary number to fixed point binary: 110.1
  • Move the binary point so that the first digit after it is non-zero: .1101 (moved 3 places to the left)
  • Replace the binary point with a zero and pad the right hand side with 0s to make 10 digits: 0110100000
  • If the original number was negative, convert it to two's complement form. This makes the mantissa: 0110100000
  • Convert the number of places the binary point moved into a 6 bit binary number: 000011
  • If the point was moved to the right, convert the number to two's complement form. This makes the exponent: 000011
  • The whole floating point number is the mantissa followed by the exponent: 0110100000000011
0.125
  • Convert the absolute value of the denary number to fixed point binary: 0.001
  • Move the binary point so that the first digit after it is non-zero: .1 (moved 2 places to the right)
  • Replace the binary point with a zero and pad the right hand side with 0s to make 10 digits: 0100000000
  • If the original number was negative, convert it to two's complement form. This makes the mantissa: 0100000000
  • Convert the number of places the binary point moved into a 6 bit binary number: 000010
  • If the point was moved to the right, convert the number to two's complement form. This makes the exponent: 111110
  • The whole floating point number is the mantissa followed by the exponent: 0100000000111110
-42.75
  • Convert the absolute value of the denary number to fixed point binary: 101010.11
  • Move the binary point so that the first digit after it is non-zero: .10101011 (moved 6 places to the left)
  • Replace the binary point with a zero and pad the right hand side with 0s to make 10 digits: 0101010110
  • If the original number was negative, convert it to two's complement form. This makes the mantissa: 1010101010
  • Convert the number of places the binary point moved into a 6 bit binary number: 000110
  • If the point was moved to the right, convert the number to two's complement form. This makes the exponent: 000110
  • The whole floating point number is the mantissa followed by the exponent: 1010101010000110
-0.1875
  • Convert the absolute value of the denary number to fixed point binary: 0.0011
  • Move the binary point so that the first digit after it is non-zero: .11 (moved 2 places to the right)
  • Replace the binary point with a zero and pad the right hand side with 0s to make 10 digits: 0110000000
  • If the original number was negative, convert it to two's complement form. This makes the mantissa: 1010000000
  • Convert the number of places the binary point moved into a 6 bit binary number: 000010
  • If the point was moved to the right, convert the number to two's complement form. This makes the exponent: 111110
  • The whole floating point number is the mantissa followed by the exponent: 1010000000111110
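The conversion steps can also be sketched as a short Python function. This is an illustration under stated assumptions, not a definitive implementation: the names encode, twos_complement, MANTISSA_BITS and EXPONENT_BITS are hypothetical, any mantissa bits that do not fit are simply truncated, and no check is made that the exponent fits in 6 bits.

```python
from fractions import Fraction

MANTISSA_BITS = 10  # assumed layout: 10-bit mantissa ...
EXPONENT_BITS = 6   # ... followed by a 6-bit exponent

def twos_complement(value, bits):
    """Return a signed integer as a two's complement bit string."""
    return format(value & ((1 << bits) - 1), f"0{bits}b")

def encode(x):
    """Encode x as a 16-bit string: mantissa followed by exponent."""
    magnitude = abs(Fraction(x))
    if magnitude == 0:
        return "0" * (MANTISSA_BITS + EXPONENT_BITS)
    exponent = 0
    # Move the binary point left (divide by 2) or right (multiply by 2)
    # until the magnitude lies in the range 0.5 <= magnitude < 1.
    while magnitude >= 1:
        magnitude /= 2
        exponent += 1
    while magnitude < Fraction(1, 2):
        magnitude *= 2
        exponent -= 1
    # One mantissa bit is the sign bit, leaving 9 bits after the point.
    scaled = int(magnitude * 2 ** (MANTISSA_BITS - 1))  # truncates spare bits
    if x < 0:
        scaled = -scaled
    return (twos_complement(scaled, MANTISSA_BITS)
            + twos_complement(exponent, EXPONENT_BITS))
```

With this sketch, encode(6.5) returns 0110100000000011 and encode(-42.75) returns 1010101010000110, matching the worked examples.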


Converting From Two's Complement Format To Denary

0100010000000011
  • Convert the exponent of the number to denary. Perform two's complement if the exponent starts with a 1: 000011 = 3
  • If the mantissa is negative, perform two's complement to convert it to a positive number: 0100010000
  • Replace the first zero with a binary point: .10001
  • Move the binary point the number of places indicated by the exponent (to the right if the exponent is positive, to the left if negative): 100.01
  • Convert the fixed point binary number to denary. Remember to add the negative sign if the mantissa was negative: 4.25
0111000000111110
  • Convert the exponent of the number to denary. Perform two's complement if the exponent starts with a 1: 111110 = -2
  • If the mantissa is negative, perform two's complement to convert it to a positive number: 0111000000
  • Replace the first zero with a binary point: .111
  • Move the binary point the number of places indicated by the exponent (to the right if the exponent is positive, to the left if negative): .00111
  • Convert the fixed point binary number to denary. Remember to add the negative sign if the mantissa was negative: 0.21875
1001111110000111
  • Convert the exponent of the number to denary. Perform two's complement if the exponent starts with a 1: 000111 = 7
  • If the mantissa is negative, perform two's complement to convert it to a positive number: 0110000010
  • Replace the first zero with a binary point: .11000001
  • Move the binary point the number of places indicated by the exponent (to the right if the exponent is positive, to the left if negative): 1100000.1
  • Convert the fixed point binary number to denary. Remember to add the negative sign if the mantissa was negative: -96.5
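The reverse conversion can be sketched in the same way. The function names decode and from_twos_complement are hypothetical. Rather than moving a binary point, this version treats the 10-bit mantissa as a signed fraction with 9 bits after the point and multiplies it by 2 to the power of the exponent, which gives the same value.

```python
from fractions import Fraction

def from_twos_complement(bits):
    """Return the signed integer held in a two's complement bit string."""
    value = int(bits, 2)
    if bits[0] == "1":            # leading 1 means the value is negative
        value -= 1 << len(bits)
    return value

def decode(bits):
    """Decode a 16-bit string (10-bit mantissa, 6-bit exponent) to denary."""
    mantissa = from_twos_complement(bits[:10])
    exponent = from_twos_complement(bits[10:])
    # The mantissa has 9 bits after the binary point, so divide by 2^9.
    return float(Fraction(mantissa, 2 ** 9) * Fraction(2) ** exponent)
```

For example, decode("0100010000000011") returns 4.25.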


IEEE Standard For Floating Point

This system (IEEE 754 single precision) uses 32 bits to represent a number. The bit pattern is slightly different from the two's complement format above. From the left, the bits represent:

1 bit - Sign bit
8 bits - Exponent stored in excess 127 mode (127 is added to the exponent before it is stored)
23 bits - Mantissa (a leading 1-bit is implied, with a binary point after it)
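The three fields can be inspected with Python's standard struct module, which packs a value as a 32-bit IEEE 754 float. This is a minimal sketch; the function name ieee754_fields is hypothetical.

```python
import struct

def ieee754_fields(x):
    """Return (sign, stored exponent, mantissa bits) of x as a 32-bit float."""
    # Pack as a big-endian 32-bit float, then reread the same 4 bytes
    # as an unsigned integer so the bits can be masked out.
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31
    stored_exponent = (raw >> 23) & 0xFF   # true exponent + 127
    mantissa = raw & 0x7FFFFF              # 23 bits after the implied 1.
    return sign, stored_exponent, mantissa

sign, stored_exponent, mantissa = ieee754_fields(6.5)
print(sign, stored_exponent - 127, format(mantissa, "023b"))
# 6.5 = 1.101 x 2^2, so this prints: 0 2 10100000000000000000000
```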

Minifloat Format

Minifloat format is a 16 bit representation of real numbers. It uses a sign bit, a 5-bit exponent stored in excess 15 mode, and 10 mantissa bits with an implied leading 1-bit followed by a binary point.
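Assuming the layout described here matches IEEE 754 half precision (which has the same 1, 5 and 10 bit fields and excess 15 exponent), Python's struct module can show the stored fields through its "e" format code. The function name half_fields is hypothetical.

```python
import struct

def half_fields(x):
    """Return (sign, stored exponent, mantissa bits) of x as a 16-bit float."""
    # "e" packs an IEEE 754 half precision value; "H" rereads the
    # same 2 bytes as an unsigned 16-bit integer.
    (raw,) = struct.unpack(">H", struct.pack(">e", x))
    sign = raw >> 15
    stored_exponent = (raw >> 10) & 0x1F   # true exponent + 15
    mantissa = raw & 0x3FF                 # 10 bits after the implied 1.
    return sign, stored_exponent, mantissa

print(half_fields(6.5))  # 6.5 = 1.101 x 2^2, so this prints (0, 17, 640)
```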

Normalisation Of Floating Point Numbers

Precision

The precision of a floating point number depends on the number of bits used to represent the mantissa. To illustrate this point, consider the following denary number,

42 012 000

We can write this as .42012 x 10^8, using 5 digits for the mantissa. If we only use 4 digits for the mantissa, we get .4201 x 10^8 and lose some accuracy.

If we put the decimal point in another place, say .042012 x 10^9, we need more digits for the mantissa because the leading zero takes up a digit. Systems for representing numbers need to allow the maximum precision for a given number of digits stored.

With binary floating point, numbers are normalised to allow this to happen.
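The loss of accuracy in the denary example can be checked with a small Python sketch. The function name keep_digits is hypothetical; it uses the standard decimal module to keep only a given number of significant digits and discard the rest.

```python
from decimal import Context, ROUND_DOWN

def keep_digits(number, digits):
    """Keep only the first `digits` significant digits of a denary number."""
    # ROUND_DOWN discards the extra digits instead of rounding them.
    context = Context(prec=digits, rounding=ROUND_DOWN)
    return int(context.create_decimal(number))

print(keep_digits("42012000", 5))  # 42012000 (nothing lost)
print(keep_digits("42012000", 4))  # 42010000 (the final 2 is lost)
```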


Example 1

Place 0000100000000110 in normalised form.

0000100000000110 = .000100000 x 2^6

To normalise the number, the binary point should be moved in front of the first non-zero bit. If the binary point is moved n places to the right then the power of 2 is reduced by n.

.000100000 x 2^6 = .100000 x 2^3 = 0100000000000011


Example 2

Place 1110111000000011 in normalised form.

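1110111000000011 = -.00100100 x 2^3

-.00100100 x 2^3 = -.100100 x 2^1

-.100100 x 2^1 = -0100100000000001 (the positive mantissa 0100100000 followed by the exponent 000001)

-0100100000000001 = 1011100000000001 (after converting the mantissa to two's complement)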
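Normalisation can be sketched in Python as follows. The function name normalise is hypothetical, and the method is a swapped-in shortcut rather than the steps used in the examples: it shifts the two's complement mantissa left while its first two bits are the same, reducing the exponent by 1 for each shift. For the two examples above it gives the same results. It does not check that the exponent stays within 6 bits.

```python
def normalise(bits):
    """Normalise a 16-bit string (10-bit mantissa, 6-bit exponent)."""
    mantissa = bits[:10]
    exponent = int(bits[10:], 2)
    if bits[10] == "1":            # exponent is negative in two's complement
        exponent -= 64
    if "1" not in mantissa:        # zero has no normalised form
        return bits
    # While the first two bits match, the leading bit carries no
    # information, so shift left and reduce the power of 2 by one.
    while mantissa[0] == mantissa[1]:
        mantissa = mantissa[1:] + "0"
        exponent -= 1
    return mantissa + format(exponent & 0x3F, "06b")

print(normalise("0000100000000110"))  # 0100000000000011
print(normalise("1110111000000011"))  # 1011100000000001
```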


Key Facts

Normalised numbers always start with 2 different bits (01 for positive, 10 for negative). The mantissa of a positive number is always at least 0.5 and less than 1, and the mantissa of a negative number is always at least -1 and less than -0.5.

Normalisation is used to,

  • ensure the maximum precision for a given number of bits
  • ensure that there is only one representation of a number

