A Level Computing - COMP1 Data Representation

Binary

Denary Numbers

Our number system is called the denary or base 10 system. 10 digits (0 - 9) are used for counting.

You should remember the place value table from your Maths lessons in Year 7.

10000000	1000000	100000	10000	1000	100	10	1
0	0	0	4	7	3	9	5

As each place value moves one to the left, the power of ten is increased by 1. The units column is 10^0, the tens 10^1, the hundreds 10^2 and so on.

Binary Numbers

The place value columns in a binary table are powers of 2. The only valid digits to use are 1 and 0. We call these bits. Just like in the denary table, we add up the results of multiplying each digit in the table by its place value. With 8 bits, the column headings are 128, 64, 32, 16, 8, 4, 2 and 1, so the numbers from 0 to 255 can be represented in pure binary.
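The place value method can be sketched in a few lines of Python. This is only an illustration: the function name binary_to_denary is made up for the example, and the bit pattern is assumed to arrive as a string of 1s and 0s.

```python
def binary_to_denary(bits: str) -> int:
    """Add up each bit multiplied by its place value (a power of 2)."""
    total = 0
    # The rightmost bit has place value 2**0, the next 2**1, and so on.
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total


print(binary_to_denary("00011011"))  # 16 + 8 + 2 + 1 = 27
print(binary_to_denary("11111111"))  # largest 8 bit value, 255
```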

Play the CISCO Binary Game - Tetris but with Binary!

Binary Addition

Addition using binary numbers is relatively straightforward. The following rules need to be observed,
•0 + 0 = 0
•0 + 1 = 1 + 0 = 1
•1 + 1 = 10 (0, carry 1)
•1 + 1 + 1 = 11 (1, carry 1)

In the following example, the binary numbers 1011 (11 in denary) and 10010 (18 in denary) are added together. The result is 11101 (29 in denary).

    01011
  + 10010
  -------
    11101
      1      (carry)

When performing binary addition, you must lay your sums out this way. Do not work things out in your head or use any mental methods that you learned in little school. The potential for error is too great not to take care with something that is quite basic.
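The column-by-column rules can also be written as a short program. The sketch below is one possible way to do it, not the only one. The name add_binary is invented for the example and both numbers are assumed to be unsigned and given as strings.

```python
def add_binary(a: str, b: str) -> str:
    """Add two unsigned binary strings column by column, right to left."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # pad the shorter number with leading 0s
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        column = int(bit_a) + int(bit_b) + carry  # 0, 1, 2 or 3
        result.append(str(column % 2))            # digit written in this column
        carry = column // 2                       # 1 is carried when the column is 2 or 3
    if carry:
        result.append("1")                        # final carry becomes a new leftmost bit
    return "".join(reversed(result))


print(add_binary("1011", "10010"))  # 11101, which is 29 in denary
```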

Representing Negative Numbers

The number shown in the interactive binary place value table is an unsigned binary integer. That means that it is a positive whole number as far as we are concerned.

In denary, when we want to represent a negative number, we simply place a minus sign before it. Binary doesn't work that way.

The Two's Complement system is used to represent negative numbers in binary. The system works a bit like a mileometer. If the mileometer is set at 00000 and is turned back one mile, it would read 99999. A negative binary number always has a 1 as the first bit. This is often referred to as the sign bit.

Convert From Denary To Two's Complement

To convert a negative denary number to binary, first find the binary equivalent of the positive integer.

Eg -27, 27 = 00011011

Change all of the 0s to 1s and all the 1s to 0s. (Flip the bits).

Eg 11100100

Add 1 to the result.

Eg 11100101
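The three steps above (find the positive pattern, flip the bits, add 1) can be sketched as follows. The function name to_twos_complement and the default word length of 8 bits are assumptions made for the example.

```python
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Return the two's complement bit pattern of value for the given word length."""
    if not -(2 ** (bits - 1)) <= value < 2 ** (bits - 1):
        raise ValueError("value does not fit in the word length")
    if value >= 0:
        return format(value, "b").zfill(bits)
    positive = format(-value, "b").zfill(bits)                       # step 1: 27 -> 00011011
    flipped = "".join("1" if b == "0" else "0" for b in positive)    # step 2: flip the bits
    plus_one = (int(flipped, 2) + 1) % (2 ** bits)                   # step 3: add 1
    return format(plus_one, "b").zfill(bits)


print(to_twos_complement(-27))  # 11100101
```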

The following place value table works like the previous one. This one allows you to show negative numbers. Pick a random negative number. Follow the steps described above for converting to two's complement. You should end up with the positive number.

-128	64	32	16	8	4	2	1
1	1	1	0	0	0	1	0

= -30

Convert From Two's Complement To Denary

The Two's Complement method of representing negative numbers makes sense if you think of the sign bit as representing a negative number. In the case of the binary counter above, the first column represents -128. Convert to denary as normal, adding the column heading whenever there is a 1 below it.
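A minimal sketch of this conversion is shown below. It treats the leftmost column as negative and every other column as a normal power of 2. The function name is invented for the example.

```python
def twos_complement_to_denary(bits: str) -> int:
    """Add the column heading wherever there is a 1, with a negative first column."""
    width = len(bits)
    # For 8 bits the headings are -128, 64, 32, 16, 8, 4, 2, 1.
    headings = [-(2 ** (width - 1))] + [2 ** p for p in range(width - 2, -1, -1)]
    return sum(heading for heading, bit in zip(headings, bits) if bit == "1")


print(twos_complement_to_denary("11100010"))  # -128 + 64 + 32 + 2 = -30
print(twos_complement_to_denary("11100101"))  # -27
```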

Binary Subtraction

To perform subtraction in binary, simply convert the number that you are subtracting into Two's Complement form and add it to the other number. Both numbers must use the same word length. Any carry out of the leftmost bit is discarded.
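Here is a rough sketch of subtraction by adding the two's complement. The example values (30 - 27 in 8 bits) are not from the notes above, they are chosen just to show the carry being thrown away. The function name subtract_binary is also an assumption.

```python
def subtract_binary(a: str, b: str) -> str:
    """Compute a - b for two bit patterns of the same word length."""
    bits = len(a)
    flipped = "".join("1" if bit == "0" else "0" for bit in b)
    negated_b = (int(flipped, 2) + 1) % (2 ** bits)  # two's complement of b
    total = int(a, 2) + negated_b                    # add instead of subtracting
    total %= 2 ** bits                               # discard the carry out of the leftmost bit
    return format(total, "b").zfill(bits)


# 30 - 27: 00011110 + 11100101 = 1 00000011, and the leading 1 is discarded.
print(subtract_binary("00011110", "00011011"))  # 00000011
```

If the answer is negative, the result is itself a two's complement pattern, for example 27 - 30 gives 11111101 (-3).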

Key Points To Remember

A positive number always has 0 as the Most Significant Bit.
A negative number always has 1 as the Most Significant Bit.
An even number always has 0 as the Least Significant Bit.
An odd number always has 1 as the Least Significant Bit.
The number of digits used to represent a number is called the word length.
-1 is always represented with a 1 in every bit.

Fixed Point Binary Numbers

So far, you have only looked at ways of representing binary integers. To understand how binary fractions work, first consider how decimal fractions work.

1000	100	10	Units	1/10	1/100	1/1000
0	5	4	9	.3	6	7

In binary, the column headings are powers of 2 rather than 10.

8	4	2	Units	1/2	1/4	1/8	1/16
1	0	1	1	.1	0	1	1

This would store the number 8 + 2 + 1 + 0.5 + 0.125 + 0.0625 = 11.6875
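The same column-heading idea can be sketched in code. The function below is an illustration only. It assumes the caller says how many bits sit after the binary point, because the point itself is not part of the pattern.

```python
def fixed_point_to_denary(bits: str, fraction_bits: int) -> float:
    """Add up the place values of an unsigned fixed point bit pattern."""
    integer_bits = len(bits) - fraction_bits
    total = 0.0
    for index, bit in enumerate(bits):
        # Headings run 8, 4, 2, 1, 1/2, 1/4 ... for 4 integer bits.
        place_value = 2.0 ** (integer_bits - 1 - index)
        total += int(bit) * place_value
    return total


print(fixed_point_to_denary("10111011", 4))  # 11.6875
```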

When storing a fraction in binary digits the binary point is not stored. We need to assume a certain number of bits before the binary point and a number after the binary point. This representation of binary numbers is called fixed point binary.

Assuming 8 bits either side of the binary point, we get the following headings,

128	64	32	16	8	4	2	Units	1/2	1/4	1/8	1/16	1/32	1/64	1/128	1/256
0	0	1	0	1	0	1	1	.0	1	0	0	1	0	1	1

In the above example the number stored would be:

32 + 8 + 2 + 1 + 0.25 + 0.03125 + 0.0078125 + 0.00390625 = 43.29296875

Unless a number can be made by adding together the powers of 2 in the column headings, it is impossible to store it exactly using this method. Errors in conversion of decimal numbers arise unless a large number of bits are allocated to store them.

You can convert any decimal fraction to a binary fraction. Multiply the fractional part of the number by 2. Take the integer part of the result (1 or 0) as the first bit. Repeat this process with the result until you run out of patience. For example, to convert 0.3568 into fixed point binary with 8 bits to the right of the binary point,

0.3568 x 2 = 0.7136 :0
0.7136 x 2 = 1.4272 :1
0.4272 x 2 = 0.8544 :0
0.8544 x 2 = 1.7088 :1
0.7088 x 2 = 1.4176 :1
0.4176 x 2 = 0.8352 :0
0.8352 x 2 = 1.6704 :1
0.6704 x 2 = 1.3408 :1

0.3568 is .01011011

When we convert the binary result back to denary, we get 0.35546875. This isn't too far away - with 16 bits we could get, 0.356796264648437. The precision increases the more bits we use.
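The repeated doubling method can be sketched like this. The function name is made up for the example, and the input is assumed to be a fraction between 0 and 1.

```python
def fraction_to_binary(fraction: float, bits: int) -> str:
    """Convert a fraction (0 <= fraction < 1) to the given number of binary places."""
    result = ""
    for _ in range(bits):
        fraction *= 2            # multiply the fractional part by 2
        if fraction >= 1:
            result += "1"        # the integer part is the next bit
            fraction -= 1        # carry on with the fractional part only
        else:
            result += "0"
    return result


eight_bits = fraction_to_binary(0.3568, 8)
print(eight_bits)                    # 01011011
print(int(eight_bits, 2) / 2 ** 8)   # 0.35546875
```

Calling it with more bits gives a closer approximation, at the cost of more storage.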

Hexadecimal

Hexadecimal Numbers

In a digital computer, everything is represented using the binary number system. Binary patterns are not the easiest for human beings to interpret, mainly because even small numbers require a large number of place value columns. For example, using 16 bits, the pure binary numbers from 0 to 65535 can be represented. It is also worth noting that it takes less space to represent a number in hexadecimal on the screen. In this sense, hexadecimal can be said to be a shorthand for binary patterns.

Hexadecimal is base 16. Each place value is a power of 16 and allows the digits 0 - 9 and the letters A - F to be used to represent values from 0 - 15 in each column.

16 Units
2 B
=43

Converting From Binary To Hex

Take the 8 bit binary number, 10111001. Chop it into 2 groups of 4 bits. Write the hexadecimal for the new binary values created.

1011 = B and 1001 = 9, so 10111001 is B9 in hexadecimal.

To convert from denary to hex, first convert to binary and then follow the same procedure.

You can do the same in reverse. Converting from hex to denary is made a little easier if you do the binary conversion first using the method outlined.
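The grouping method works in both directions, as this short sketch tries to show. The function names and the HEX_DIGITS constant are assumptions for the example. Python has built-in ways of doing this, but they hide the groups of 4 bits.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits: str) -> str:
    """Chop the bits into groups of 4 and write the hex digit for each group."""
    padded = bits.zfill((len(bits) + 3) // 4 * 4)  # pad on the left to a multiple of 4
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


def hex_to_binary(hex_digits: str) -> str:
    """Replace each hex digit with its 4 bit pattern."""
    return "".join(format(HEX_DIGITS.index(d), "04b") for d in hex_digits.upper())


print(binary_to_hex("10111001"))  # B9
print(hex_to_binary("2B"))        # 00101011, which is 43 in denary
```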

Whichever methods you use, it makes sense to check your answer carefully. As long as you remember how to form the headings for the place value chart, you can always work from first principles.

A good Hex conversion game is:
http://people.sinclair.edu/nickreeder/Flash/binHex.htm

Character Coding

Character coding schemes use binary patterns to represent character data (text).

A common code in all computers ensures that information can easily be transferred between machines.

American Standard Code For Information Interchange (ASCII)

7 bits are used allowing 128 different characters to be represented. Each character in the standard set is assigned a number. The 7 bit binary representation of that number is used to represent that character.

The first 32 characters are control codes such as TAB or Line Feed. Digits, lower and upper case letters and standard symbols are represented. The extended ASCII uses the eighth bit and codes more characters and symbols.

A number will be stored differently depending on whether it is being displayed or used for calculations. The ASCII representation of the character data "23" is not the same as the pure binary pattern for this number using the same number of bits. The representation of this data within the computer system depends on the context and use of the data.
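A quick way to see the difference is to print both patterns. This is only a sketch. The choice of 16 bits for the pure binary value is made so that both representations use the same number of bits.

```python
text = "23"

# Character data: one 8 bit ASCII code per character.
ascii_bits = " ".join(format(ord(ch), "08b") for ch in text)

# Numeric data: the value 23 stored as a 16 bit pure binary integer.
pure_binary = format(int(text), "016b")

print(ascii_bits)   # 00110010 00110011
print(pure_binary)  # 0000000000010111
```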

ASCII Table

It's worth looking carefully at the ASCII table. Look, for example, at the codes for upper case letters and compare them with the corresponding codes for lower case. There is a difference of only one bit (the bit with place value 32). Another nice feature of the codes chosen is that you only need to change 2 bits to convert the character code of a digit into the binary representation of that digit.

Decimal	Binary	Hexadecimal	Character
0	0	0	NUL (null)
1	1	1	SOH (start of heading)
2	10	2	STX (start of text)
3	11	3	ETX (end of text)
4	100	4	EOT (end of transmission)
5	101	5	ENQ (enquiry)
6	110	6	ACK (acknowledge)
7	111	7	BEL (bell)
8	1000	8	BS (backspace)
9	1001	9	TAB (horizontal tab)
10	1010	A	LF (NL line feed, new line)
11	1011	B	VT (vertical tab)
12	1100	C	FF (NP form feed, new page)
13	1101	D	CR (carriage return)
14	1110	E	SO (shift out)
15	1111	F	SI (shift in)
16	10000	10	DLE (data link escape)
17	10001	11	DC1 (device control 1)
18	10010	12	DC2 (device control 2)
19	10011	13	DC3 (device control 3)
20	10100	14	DC4 (device control 4)
21	10101	15	NAK (negative acknowledge)
22	10110	16	SYN (synchronous idle)
23	10111	17	ETB (end of trans. block)
24	11000	18	CAN (cancel)
25	11001	19	EM (end of medium)
26	11010	1A	SUB (substitute)
27	11011	1B	ESC (escape)
28	11100	1C	FS (file separator)
29	11101	1D	GS (group separator)
30	11110	1E	RS (record separator)
31	11111	1F	US (unit separator)
32	100000	20	SPACE
33	100001	21	!
34	100010	22	"
35	100011	23	#
36	100100	24	$
37	100101	25	%
38	100110	26	&
39	100111	27	'
40	101000	28	(
41	101001	29	)
42	101010	2A	*
43	101011	2B	+
44	101100	2C	,
45	101101	2D	-
46	101110	2E	.
47	101111	2F	/
48	110000	30	0
49	110001	31	1
50	110010	32	2
51	110011	33	3
52	110100	34	4
53	110101	35	5
54	110110	36	6
55	110111	37	7
56	111000	38	8
57	111001	39	9
58	111010	3A	:
59	111011	3B	;
60	111100	3C	<
61	111101	3D	=
62	111110	3E	>
63	111111	3F	?
64	1000000	40	@
65	1000001	41	A
66	1000010	42	B
67	1000011	43	C
68	1000100	44	D
69	1000101	45	E
70	1000110	46	F
71	1000111	47	G
72	1001000	48	H
73	1001001	49	I
74	1001010	4A	J
75	1001011	4B	K
76	1001100	4C	L
77	1001101	4D	M
78	1001110	4E	N
79	1001111	4F	O
80	1010000	50	P
81	1010001	51	Q
82	1010010	52	R
83	1010011	53	S
84	1010100	54	T
85	1010101	55	U
86	1010110	56	V
87	1010111	57	W
88	1011000	58	X
89	1011001	59	Y
90	1011010	5A	Z
91	1011011	5B	[
92	1011100	5C	\
93	1011101	5D	]
94	1011110	5E	^
95	1011111	5F	_
96	1100000	60	`
97	1100001	61	a
98	1100010	62	b
99	1100011	63	c
100	1100100	64	d
101	1100101	65	e
102	1100110	66	f
103	1100111	67	g
104	1101000	68	h
105	1101001	69	i
106	1101010	6A	j
107	1101011	6B	k
108	1101100	6C	l
109	1101101	6D	m
110	1101110	6E	n
111	1101111	6F	o
112	1110000	70	p
113	1110001	71	q
114	1110010	72	r
115	1110011	73	s
116	1110100	74	t
117	1110101	75	u
118	1110110	76	v
119	1110111	77	w
120	1111000	78	x
121	1111001	79	y
122	1111010	7A	z
123	1111011	7B	{
124	1111100	7C	|
125	1111101	7D	}
126	1111110	7E	~
127	1111111	7F	DEL (delete)
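Two patterns in the table are worth trying out in code: upper and lower case letters differ in one bit, and the codes for the digits differ from their values in two bits. The sketch below uses bitwise operators to show this. The function names are invented, and the functions only make sense for letters and digits respectively.

```python
def toggle_case(letter: str) -> str:
    """Flip the bit with place value 32, the only bit that differs between cases."""
    return chr(ord(letter) ^ 0b0100000)


def digit_value(character: str) -> int:
    """Clear the bits with place values 32 and 16 to turn '0'..'9' into 0..9."""
    return ord(character) & 0b0001111


print(toggle_case("A"))   # a  (1000001 becomes 1100001)
print(toggle_case("z"))   # Z
print(digit_value("7"))   # 7  (0110111 becomes 0000111)
```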

Unicode

The ASCII codes can now be considered to be a subset of Unicode. In fact, the first 128 characters of both are the same. Many common character encoding systems, like UTF-8, are backwards-compatible with ASCII.
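Backwards compatibility can be checked with Python's built-in string encoding. This is a small sketch. The variable names are just for the example and the euro sign is used as an arbitrary character outside the ASCII range.

```python
letter = "A"
ascii_bytes = letter.encode("ascii")
utf8_bytes = letter.encode("utf-8")
print(ord(letter), ascii_bytes == utf8_bytes)  # 65 True

euro = "\u20ac"                    # the euro sign, which has no ASCII code
euro_bytes = euro.encode("utf-8")
print(len(euro_bytes))             # 3, so UTF-8 needs more than one byte here
```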

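Unicode is so named because of the intention that it describes a universal character set. That is, it aims to contain the characters of all languages and scripts. There are well over 100 000 characters in the Unicode character set. Early versions of the standard used 16 bits to represent each character. The current standard allows code points from 0 to 10FFFF in hexadecimal, which are stored using encodings such as UTF-8, UTF-16 and UTF-32.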

The Unicode Consortium manages the standards for Unicode including the addition of new symbols where necessary.

Bitmapped and Vector Graphics

Bitmapped Graphics

In bitmapped graphics, the image is divided into a grid of picture elements or pixels. When an image is loaded, the binary codes that represent the colour of each pixel are transferred to memory. The term bitmap comes from the way that each binary code is 'mapped' to a single location in memory.

Screen Resolution

The VDU screen is also divided into pixels. The higher the resolution of the display, the greater the number of pixels for the given screen. Having more pixels does not in itself make an image sharper. What matters is the size of each pixel. The smaller the pixels on screen, the sharper the image appears.

Typical display resolutions are,
1440 x 900
1024 x 768
800 x 600
640 x 480

Colour Depth

Colour depth is the number of bits used to represent the colour of each pixel. The more bits, the greater the range of colours that can be displayed.

1 Bit Colour: A single bit is used to produce a black and white image. Often called monochrome. 0 = black, 1 = white.

8 Bit Colour: A byte is used to represent a palette of 256 colours.

The RGB model expresses colours in terms of the relative brightness of red, green and blue in the colour. 12 bit Direct Colour assigns 4 bits to each of these channels and allows for the representation of 4096 distinct colours.

True Colour display requires 24 bits per pixel, 8 bits for each channel. It allows for about 16.7 million colours (2^24 = 16 777 216), which is generally taken to be at least as many as the human eye can tell apart.

Modern PCs have a word size of 32 or 64 bits, so they can manipulate at least 32 bits simultaneously. Colours are often expressed using 32 bits. The spare 8 bits are either ignored or used to represent alpha (for partial transparency).
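The link between colour depth and the number of colours, and the way three channels share one value, can be sketched as follows. The function names are invented, and putting red in the highest bits is an assumption. Real file formats and graphics libraries vary in the order they use.

```python
def colours_available(bits_per_pixel: int) -> int:
    """Each extra bit doubles the number of colours that can be represented."""
    return 2 ** bits_per_pixel


def pack_rgb(red: int, green: int, blue: int) -> int:
    """Pack three 8 bit channels (0 - 255 each) into one 24 bit value."""
    return (red << 16) | (green << 8) | blue


for depth in (1, 8, 12, 24):
    print(depth, colours_available(depth))   # 2, 256, 4096, 16777216

print(format(pack_rgb(255, 128, 0), "06X"))  # FF8000
```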

The RGB model is based on emitted light. We see the colour of most objects by receiving light reflected from their surface. For this reason, printing requires a different model. CMYK (Cyan, Magenta, Yellow, Black) is a common standard.

The Web Safe Palette is based on 216 colours that could reliably be assumed to display the same way in all web browsers. Major differences in the way browsers implement HTML, CSS and JavaScript can cause headaches when a web designer wants to ensure that a page displays and works the same way regardless of the browser.

Memory Requirements

100 x 100 pixels means 10000 pixels total. If 24 bit colour is used, we need 3 bytes for each pixel, 30000 bytes in total. Test this assertion using a graphics package. Bitmap files also include some header information - allowing for that you should come fairly close to the figure you calculate.
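The calculation generalises to any size and colour depth. The sketch below ignores the header information, and the function name is an assumption for the example.

```python
def bitmap_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Bytes needed for the pixel data alone, rounded up to a whole byte."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8


print(bitmap_bytes(100, 100, 24))  # 30000
print(bitmap_bytes(100, 100, 1))   # 1250 for a monochrome image
```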

Encoding Data In Images

If you replaced the bytes used to store the colour of every hundredth pixel with ASCII codes, you could encode a message in an image without changing enough about the way it looks to give yourself away. There is enough information in the Visual Programming section of the C# to do this.
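The idea can be tried without a real image file. In the sketch below, the pixel data is assumed to be a simple sequence of bytes, one byte per pixel, and the function names are hypothetical. A real bitmap would need its header skipped and might use several bytes per pixel.

```python
def hide_message(pixels: bytearray, message: str, step: int = 100) -> bytearray:
    """Overwrite every step-th pixel byte with the ASCII code of one character."""
    if len(message) * step > len(pixels):
        raise ValueError("image is too small for this message")
    hidden = bytearray(pixels)          # work on a copy, leave the original alone
    for i, character in enumerate(message):
        hidden[i * step] = ord(character)
    return hidden


def reveal_message(pixels: bytearray, length: int, step: int = 100) -> str:
    """Read the characters back from every step-th pixel byte."""
    return "".join(chr(pixels[i * step]) for i in range(length))


image = bytearray(1000)                 # a stand-in for 1000 pixels of image data
secret = hide_message(image, "Hi")
print(reveal_message(secret, 2))        # Hi
```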

Vector Graphics

Bitmapped graphics represent an image by listing the colours of each pixel. Vector graphics list the details of the shapes that make up the drawing as a whole. Sizes are relative to the drawing as a whole and the resulting format is scalable.

If you have used clipart or have designed in Flash, the chances are you have already used a vector graphic. Notice that no quality is lost when the image is scaled up.

In the vector graphic format, object information is stored in the Drawing List file.

Each part of the drawing is listed as a series of commands,
Line (10,100,50,100,red,4)
Rectangle(50,50,100,50, filled, black)
Each instruction in the drawing list lists the properties of the object being drawn, the thickness of a line, coordinates, font size, brush style, fill colour, border colour and thickness and more...

Geometric images require fewer bytes to store than in bitmap format. Complex shapes and filled shapes are better stored as bitmaps. Vector graphics scale without distortion.

It is also worth observing that any graphic displayed on screen is effectively a bitmap.

Image Compression

Data compression is all about squishing data into a smaller number of bytes. A variety of techniques are used to reduce the size of a bitmap.

Run-Length Encoding

This technique is based on the repetition of colours in an image. If you read the image from the top left, reading a row of pixels at a time, you often get runs of pixels of the same colour. If there are 3 or more pixels of the same colour in a row, storing the number in the run and the colour is more efficient.

The image above can be described as follows,

White 71
Orange 8
White 22
Orange 8
White 22
Orange 8
White 22

RLE is lossless compression. No information is lost in the image compression. RLE was used in the now defunct pcx format.
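A minimal sketch of run-length encoding is shown below. Colours are represented by their names to match the description above. A real file format would store colour codes and counts as bytes. The function names are assumptions for the example.

```python
def run_length_encode(pixels):
    """Turn a list of pixel colours into a list of (colour, run length) pairs."""
    runs = []
    for colour in pixels:
        if runs and runs[-1][0] == colour:
            runs[-1][1] += 1            # same colour as before, extend the run
        else:
            runs.append([colour, 1])    # a new run starts here
    return [(colour, count) for colour, count in runs]


def run_length_decode(runs):
    """Rebuild the original pixel list from the runs."""
    pixels = []
    for colour, count in runs:
        pixels.extend([colour] * count)
    return pixels


row = ["White"] * 71 + ["Orange"] * 8 + ["White"] * 22
encoded = run_length_encode(row)
print(encoded)                            # [('White', 71), ('Orange', 8), ('White', 22)]
print(run_length_decode(encoded) == row)  # True, nothing was lost
```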

Compression techniques that involve the loss of information are known as lossy compression.

JPEG

The Joint Photographic Experts Group created the standard. The standard specifies the algorithm used for compression and decompression (codec). The compression is lossy - some visual quality is lost.

The jpg format is good with photographs. It is not so good with high contrast pictures like screenshots or computer art. It relies on the eye not being sensitive to tiny changes of colour.

Compression can be varied when the image is saved. This format recompresses each time it is saved and repeated saving may lose quality. You should always work with uncompressed formats before saving in the target format.

GIF

The Graphics Interchange Format is based on limiting the colours used in the image. Typically up to 256 colours are used to make the palette - a table assigning up to 256 colours to the numbers 0 - 255. The pixel data for the image is then stored using the 8 bit number that represents the colour's position in the table.

Good for cartoony stuff and computer art - supports transparency.

PNG

Portable Network Graphics uses lossless data compression. PNG is an open format that was created to improve upon the GIF. The compression method used in GIF was patented and there were licensing costs to developers using the format. This was also a motivating factor in the uptake of the PNG.

SVG

Scalable Vector Graphics is a language for describing 2D graphics and graphical applications in XML. It is principally used for vector graphics on the WWW.

Sound

What Is Sound?

'a mechanical disturbance from a state of equilibrium that propagates through an elastic material medium.'
Encyclopaedia Britannica

'an air pressure wave'
A Computing Text Book

In an analogue system, sound is captured by a transducer (perhaps a microphone) that produces an electrical signal that varies in proportion to the pressure created by the sound. That electrical signal can be transmitted or stored on a suitable medium (eg magnetic tape).

For the sound to be heard, the electrical signal must be used to recreate the original sound by vibrating a mechanical surface in a speaker. The term fidelity refers to the precision with which the original sound wave is recreated.

Vinyl

On vinyl LPs, sound was encoded into the shape of a spiral groove that ran across the surface of the record. To play the record, a fine needle followed the tiny changes in this groove, reading the changes in sound.

Playing vinyl produced a warm, rich sound that you don't get with a CD. Vinyl was no less reliable than CD. Album covers were much more impressive and the music lover's experience was richer for it.

As with many changes in format, the public were seriously ripped off when CDs were introduced.

Analogue Data

Physical quantities such as temperature and pressure vary continuously over time.
[Graph: an analogue quantity (temperature) varying continuously over time]

Digital Data

Digital data is discontinuous. It varies in discrete steps. Imagine that the temperature shown in the graph is measured at regular intervals and recorded as a series of discrete values. That is digital.

Data & Signals

An analogue signal is an electrical signal that varies continuously over time. A digital signal is an electrical signal that changes in discrete steps.
An Analogue To Digital Converter samples sound at regular intervals and records each value as a digital value.

Pulse Code Modulation

PCM is a process for coding sampled analogue signals by recording the height of each sample in a binary electrical equivalent.

A-D Conversion

Samples are taken of the analogue signal at fixed and regular intervals of time. The samples are represented as narrow voltage pulses, proportional in height to the original signal. This is called Pulse Amplitude Modulation (PAM).

PCM data is produced by quantising the PAM samples. That means that the height of each sample is approximated using an integer value of n bits. The height of each PCM pulse is encoded using n bits to produce the output in binary signal form.
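Quantising can be sketched in a few lines. The example assumes that each sample height has already been scaled to lie between 0.0 and 1.0. That scaling, and the function name, are assumptions made for the illustration.

```python
def quantise(sample: float, bits: int) -> int:
    """Approximate a sample height (0.0 to 1.0) with an integer of the given size."""
    levels = 2 ** bits                 # n bits give 2**n possible levels
    level = int(sample * levels)       # round down to the level below
    return min(level, levels - 1)      # a height of exactly 1.0 uses the top level


samples = [0.0, 0.25, 0.5, 0.75, 1.0]
print([quantise(s, 3) for s in samples])   # [0, 2, 4, 6, 7]
print(format(quantise(0.5, 8), "08b"))     # 10000000
```

With only 3 bits there are 8 levels, so many different heights end up with the same code. More bits per sample give a closer approximation.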

D-A Conversion

In order to play the digital sound, the process of conversion is reversed. The DAC produces a signal which is an approximation of the original signal.

The entire encoding and decoding process is represented in the diagram.

[Diagram: A-D conversion followed by D-A conversion]

The Maths Of Sampling

An analogue signal with a frequency of 1000Hz is converted to a PCM signal by sampling at a frequency of 2000Hz (2000 samples per second). Each sample is encoded in 8 bits using PCM coding.

How many bytes of storage are required to encode 10 seconds of the analogue signal?

2000 samples are taken each second.
10 x 2000 = 20 000 samples.
1 byte (8 bits) is needed for each sample.
20 000 x 1 byte = 20 000 bytes.

Sampled Sound

Sampling Rate = The number of samples taken each second. Measured in Hz or kHz.
Sampling Resolution = The number of bits used to store each sample. Measured in bits.
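The storage calculation follows the same pattern for any sampling rate and resolution. The sketch below assumes a single channel (mono) and ignores any file header. The function name is invented for the example.

```python
def sound_storage_bytes(sample_rate_hz: int, bits_per_sample: int, seconds: int) -> int:
    """Bytes needed to store uncompressed, single channel sampled sound."""
    total_samples = sample_rate_hz * seconds
    return total_samples * bits_per_sample // 8


print(sound_storage_bytes(2000, 8, 10))    # 20000
print(sound_storage_bytes(44100, 16, 60))  # 5292000
```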

How Often To Sample

Nyquist's theorem states that we must sample at a rate of at least twice the highest frequency in the signal in order not to miss meaningful changes in the original signal.

The bandwidth of an analogue signal is the difference between the highest frequency and the lowest frequency of the signal. This is the maximum frequency range of the signal. The Nyquist interval is one over twice the bandwidth.

For a signal with a frequency range of 3000Hz, sampling intervals should be no more than 1/6000 seconds apart.
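As a quick check of the arithmetic, here is a small sketch. The function names are made up for the example.

```python
def minimum_sampling_rate(bandwidth_hz: float) -> float:
    """Sample at twice the bandwidth, in samples per second."""
    return 2 * bandwidth_hz


def nyquist_interval(bandwidth_hz: float) -> float:
    """Longest allowed gap between samples, in seconds."""
    return 1 / (2 * bandwidth_hz)


print(minimum_sampling_rate(3000))             # 6000
print(nyquist_interval(3000) == 1 / 6000)      # True
```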

Sampling Rate

The higher the rate, the more often samples are taken, the more accurate the representation of the sound.

Sampling Resolution

A more accurate representation of the analogue signal can be achieved if more bits are used to store each sample.

File Formats

WAV is a very common format for storing digitised sound. The WAV format allows variation of sampling rate and resolution. File sizes are relatively large and fidelity is good.

Compressed Audio

MPEG audio files come in a variety of flavours and can carry extensions such as .mp2, .mpa, .mp3, .mp4.

The compression algorithms used to produce the files are based on psychoacoustic modelling - removing frequencies that the brain and ear will not miss.

File sizes are substantially reduced - some loss of quality can occur.

Compressed Speech

Data compression techniques are also used for encoding speech. These are different techniques to those used for compressing music.

Sound Mixing

Mixing sounds from different sources into a single file can be fun. Digital encoding of audio information makes this even easier than it used to be.

Sound Synthesis

Sound synthesis means approximating real-world sounds using electrical equipment. Things like pianos aren't too bad. Some instruments are harder to emulate. MIDI (Musical Instrument Digital Interface) stores no sound data, simply notes, duration and instruments. This makes the files very compact.

Streaming Audio

An audio streaming client (say, RealPlayer) starts receiving audio data from a remote location. This data is stored in a buffer. Once there are a few seconds of data in the buffer, the client begins playback from the buffer. As long as the buffer does not run out of data, the sound will play without pause.
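The buffering behaviour can be imitated with a very simple simulation. This is a toy model and not how any particular player works. Time moves in whole seconds, the list says how many seconds of audio arrive during each second, and the threshold of 3 seconds is an arbitrary choice.

```python
def simulate_stream(arrivals, start_threshold=3):
    """Return what the client does during each second: buffering, play or pause."""
    buffered = 0       # seconds of audio waiting in the buffer
    playing = False
    log = []
    for received in arrivals:
        buffered += received
        if not playing and buffered >= start_threshold:
            playing = True              # enough data has built up, start playback
        if playing and buffered >= 1:
            buffered -= 1               # one second of audio is played
            log.append("play")
        elif playing:
            log.append("pause")         # the buffer ran out of data
        else:
            log.append("buffering")
    return log


print(simulate_stream([1, 1, 1, 1, 0, 0, 0, 0, 1]))
```

In this run the client buffers for two seconds, plays for four, pauses for two when the data stops arriving, and then plays again.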

Sound Editing

Representing sound in digital form allows for editing. This might mean removing background noise or specific frequencies. It might mean cropping or merging with other sounds.
