Fundamentals of data representation

Table of Contents

1 Number Systems

Learn It

  • Natural numbers are numbers that occur commonly and obviously in nature. As such, it is they are whole, non-negative numbers. The set of natural numbers, denoted N, can be defined in either of two ways:.
N=[0,1,2,3,...]
N=(1,2,3,...)
  • Integer numbers are whole numbers. The set of integer numbers, denoted Z, can be defined in the following way:.
Z = { ..., -3, -2, -1, 0, 1, 2, 3, ... }
  • Rational numbers are numbers that can be written as fractions. By default, all integers are rational numbers. A set of rational numbers can be expressed as:
Q = { 1/3, 3/4, 1/6, 0.5, -3, -2, -1, 0, 1, 2, 3, ... }
  • Irrational numbers An irrational number is one that cannot be written as a fraction, for example √2. A set of rational numbers can be expressed as:
I = { √2, √3, Π}
  • Real numbers are possible real world quantities. This includes both the rational and irrational numbers. A set of real number, denoted as R:
R = {√2, 1.23, 3.142, -3, -2, -1, 0, 111, 245.56, 3, ... }
  • Ordinal numbers When objects are placed in order, ordinal numbers are used to tell their position.
  • For example, the following set of ordered objects, 'a' is the 1st object, 'b' is the 2nd, so on and so forth.
S = {‘a’, ‘b’, ‘c’, ‘d’}
  • Counting and measurement :
    • natural numbers are used for counting.
    • real numbers used for measurement.

2 Number Base

Learn It - decimal, binary and hexadecimal

  • decimal numbers, or denary numbers also known as base 10 numbers. We use base 10, or decimal numbers in our daily lives. Base-10 number system has 10 digits to make up all numbers:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9
  1. Decimal numbers have place values of 100, 101, 102, 103, 104, etc
  2. Decimal numbers OR base 10 numbers are wrtten as:
Number10, eg 6710
  • binary numbers are made of only two digits, 0 and 1.
    1. Binary numbers have place values of 20, 21, 22, 23, 24, etc
    2. Binary numbers or base 2 numbers can be written as:
Number2, eg 10012
  • hexadecimal numbers are base 16 numbers. Hexadecimal number system has 16 symbols:
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
    
    • the notation of hexadecimal

      Several different notations are used to represent hexadecimal in computing. The prefix "0x" is widespread due to its use in Unix and C, some denote hexadecimal values using a suffix or subscript. For example, user prefix: 0x2AF3 or use subscript 2AF316.

    • the purpose of hexadecimal
      1. In computing, it is very inconvenient to write or read a long string of 1s and 0s. Hexadecimal is used to "compress" binary bit patterns. Each hex digit represents 4 binary bits (or a nibble).
      2. Hex is commonly used in computer memory addresses.
      3. Binary numbers are difficult for programmers to work with.
      4. If we're building a web-page, then chances are we're going to want to use colour.
      5. In general computers use the RGB colour model, where any colour we need is made up by additive mixing Red, Green and Blue.
      6. The value of RGB colours ranges from 0,0,0 up to 255,255,255.

Here's a colour that is represented as R-159, G-028, B-173.

Try It

  • Convert each of the 3 digit denary numbers to binary, as see what you get.
  • Do you think that writing the binary number for the colour would be convenient.

Document It

  • Note down why it is that programmers don't like using binary in their code.

Hexadecimal is easier

  • Learn It
    • Coders use Hexadecimal as it's easy to convert it to binary, but it is more succinct.
    • When counting in Hexadecimal, we use the letters A-F after reaching 9.
    • 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
    • Normally we would use a table to aid in the conversion.
    Binary Hexadecimal Decimal
    0000 0 0
    0001 1 1
    0010 2 2
    0011 3 3
    0100 4 4
    0101 5 5
    0110 6 6
    0111 7 7
    1000 8 8
    1001 9 9
    1010 A 10
    1011 B 11
    1100 C 12
    1101 D 13
    1110 E 14
    1111 F 15
    • We can take a binary number, such as 10101101.
    • Split it up into nibbles - 1010 1101
    • Convert each to Hex - AD
  • Try It
    • Convert each of the binary numbers you worked out before, into their hex equivalents.
    • Now go on line and look up the hex code you've calculated (place a # in front of the number in your search) and check it matches the colour on this web-page.
  • Learn It
    • With experience, you'll not need a table, as you'll come to recognise the hex equivalent for each nibble.
    • However, if you don't have a conversion table or can't remember the values, you can convert between binary and hex, by using denary as a middle step.

Learn It - Counting in Binary and Hexadecimal

  • Counting in Binary starts with 0, then 1. Since there are only '0' and '1' to be used, when counting the next number (decimal 2), we need to move the place value to the left and use '1' to indicate that the 2's place is used. Usig this method, counting in binary from 0 to 8 is like so:
0, 1, 10, 11, 100, 101, 110, 111, 1000
  • Have a look at this page on counting in binary and hex

Learn It - Conversion between decimal,binary and hexadecimal

  1. To convert binary to decimal:
         Using binary number 1011 as an example:
    Step 1: write the place values of each bit
     place value          8     4     2    1
     binary number        1     0     1    1
    
    Step 2: if there is a '0' bit under the place value, make it 0, else, write down the place value
     place value           8     4     2    1
     binary number:        1     0     1    1
                           8     0     2    1
    
    Step 3: add all the place values with a bit '1' (on bit) together and you will get the decimal value for the binary number
                           8     4     2    1
                           1     0     1    1
                           8         + 2  + 1 = 11
    
    Therefore, binary 1011 is decimal number 11.
    
  2. To convert decimal to binary:
    Using decimal number 27 as an example:
    Step 1:  find the highest place value in the decimal number 27. In this case, it is 16.
     Because the next place value will be 32 which is too large for 27. Place a '1' bit under 16.
     place value          32    16    8    4    2    1
     binary number              1     
    
    Step 2: subtract 16 from 27, now we have 11. Repeat the last step on the remaining decimal number 11
     place value:         32    16    8    4    2    1
     binary number:             1     1
    
    Step 3: repeat step 2, now we have the remaining 3 left. The next place value 4 is too big, so will not be used. 
    We place a bit '0' underneath it.
     place value:         32    16    8    4    2    1
     binary number:             1     1    0
    
    Step 4: Since there is a 2 and 1 in the remaining 3, we place a bit '1' underneath them
     place value:         32    16    8    4    2    1
     binary number:             1     1    0    1    1
    
    Therefore, the decimal number 27 is decimal number 11011.
    
  3. To convert a binary bit pattern into hex:
    example 1: Binary bit pattern: 10011110
    Group the bits in 4
    starting from the LSB (least significant bit, the rightmost bit): 
    There are exactly 2 groups of 4-bits
        binary: 1001 1110
        hex:     9    E
    Therefore, the binary pattern 10011110 can be written in hex as 9E.
    
    example 2: Binary bit pattern: 1001111.
    There are 7 bits.
    In this case, you group the bits in 4
    starting from the LSB (least significant bit, the rightmost bit)
    You pad the remaining 3-bits with a leading 0:
          binary: 0100 1111
          hex:     4    F 
    Therefore, the binary pattern 1001111 can be written in hex as 4F
    
  4. to convert hex to decimal:
    example 1: Hex number: C4
    Work out the place values for each hex character.
         The C is at the 16^1 and the 4 is at the 16^0
    Work out the C hex character's decimal value, which is 12
    Work out the 4 hex character's decimal value, which is 4
    
          12x16^1 + 4x16^0 = 196
    
    Therefore, the hex number C4 in decimal is 196.
    
    example 2: Hex number: F16
    Work out the place values for each hex character.
         The F is at the 16^2,1 at 16^1 and 6 at the 16^0
    Work out the F hex character's decimal value, which is 15
    Work out the 1 hex character's decimal value, which is 1
    Work out the 6 hex character's decimal value, which is 6
    
          15x16^2 + 1x16^1 + 6x16^0 = 3862
    
    Therefore, the hex number F16 in decimal is 3862
    
  5. You can confirm the above example's correctness by convert each hex number into binary, then to decimal.
    Example 1: hex number C4
    
     4 in binary: 0100
     C in binary: 1100
     The binary number for hex 4C therefore is: 100 1100
    
        1  1  0  0  0 1  0  0
       128 64 32 16 8 4  2  1     = 128+64+4=196
    

Try It

  1. Convert the following binary numbers to hex and show your working. Check them by using the table.
    • 11110011
    • 00101100
    • 10101010
  2. Play this binary game to see how good you are!

3 Bits and Bytes

Learn It

  • a bit is either 0 OR 1.
    1. Binary numbers are made of 0s and 1s
    2. A byte is a group 8 bits
    3. n bits can make 2n different binary values. For example, using 3 bits, we can make 23=8 different binary values:
000, 001, 010, 011, 100, 101, 110, 111
  • A nibble is a group of 4 bits.
  • A byte (a group 8 bits) is a unit of storage in a computer which contains 8-bits and can store 256 different values: 0 to 255. Letters are usually stored in a byte for example.
  • A kilobyte is 1000(103) bytes. But a kibibyte is 1024(210) bytes.kibi is the ki lo in bi nary.Since computers store and measure things in binary, to differentiate the base 10 and base 2 units, here is a table of those unit's names:
Kilobytes (KB) 1000 Kibibyte (KiB) 1024  
Megabyte (MB) 10002 Mebibyte (MiB) 10242  
Gigabyte (GB) 10003 Gibibyte (GiB) 10243  
Terabyte (TB) 10004 Tebibyte (TiB) 10244  
Petabyte (PB) 10005 Pebibyte (PiB) 10245  
… … … …        

4 Binary Number System

Learn It - unsinged and signed binary numbers

  • Unsigned binary numbers do not have a bit to indicate they are positive or negative numbers. If you have a 8-bit unsigned binary number, the maximum value can be is 255 and there are 256 (28) different values.
  • Singned binary numbers use one bit to indicate positive (+) or negative (-). In this case, if you have a 8-bit signed binary number,

the maximum value can be is 127 and there are 128 (27) different values. In signed binary numbers, the left most(the most significant bit AKA msb) is used to indicate positive (0) or negative(1).

  1. a 8-bit positive binary number: 01010011
  2. a 8-bit negative binary number: 11010011

Learn It - unsigned binary number arithmetics

  • Add two unsigned binary integers

While adding binary numbers together, you need to remember that:

  1. 0+0=0
  2. 0+1=1
  3. 1+1=10 –— the 1 in "10" is being carried over to the 2's place
  4. 1+1+1=10+1=11

Now, lets apply the above knowledge to some examples:

example 1:                                  example 2:

       1010                                     1010              
      +0010                                    +1111
        1 <-- the carry over bit                 1 <--- the carry over bit 
   --------                                     1 <--- the carry over bit
       1100                                 -----------
                                                11000
  • multiply two unsigned binary integers

When multiply binary number, it works the same way as decimal numbers, you need to remember:

  1. 0*0=0
  2. 1*1=1
  3. 0*1=0
  4. You should end up with the total number of bits of the two unsigned binary integers.
example 1:                                  example 2:

       11                                     1010              
      x10                                     x101
   -------                                ----------
       00                                     1010
      11                                     0000    
      110 <-- the product                   1010
                                            110010 <--- the product

Learn It - signed integer representations

There are three schemes for signed integers:

  1. sign magnitude
  2. 1's compliment
  3. 2's compliment –— this is the only one the exam will cover

All the above schemes use the left most bit, called the sign bit to indicate a number is negative (1) or positive(0).

  • sign magnitude:

    In this scheme, a 8-bit binary number 1000 0001 will have the first (msb) indicates "-" and the rest of the bit 000 0001 will be the magnintude of the number. Therefore, 1000 0001 in binary is -1 in decimal.

  • 1's compliment:
    1. Again, the most significant bit (msb) is the sign bit, with value of 0 representing positive integers and 1 representing negative integers.
    2. The remaining n-1 bits represents the magnitude of the integer, as follows:

    For positive integers, the absolute value of the integer is equal to "the magnitude of the (n-1)-bit binary pattern". For negative integers, the absolute value of the integer is equal to "the magnitude of the complement (inverse) of the (n-1)-bit binary pattern" (hence called 1's complement).

    Example: the 8-bit pattern 1000 0001 represents a negative number (msb=1). The inverse (complement) of the remaining bit pattern is 111 1110 which is 126. Thus we get -126.

  • 2's compliment:
    1. The most significant bit (msb) is the sign bit, with value of 0 representing positive integers and 1 representing negative integers.
    2. The remaining n-1 bits represents the magnitude of the integer, as follows:

      For positive integers, the absolute value of the integer is equal to "the magnitude of the (n-1)-bit binary pattern".

      For negative integers, the absolute value of the integer is equal to "the magnitude of the complement of the (n-1)-bit binary pattern plus one" (hence called 2's complement).

Learn It - the range of values with fixed bits (n-bit pattern) in 2's complement

  • In 2's complement, the msb is used for sign (0 for positive AND 1 for negative).
  • The remaining n-1 bit is used for the absolute number value (magnitude). Therefore, the maximum magnitude of a n-bit pattern can have is 2n-1.
  • The range of values will be between -2n-1 and 2n-1-1
  • an example of the range of values with 8-bit (from -128 to 127)

    2sComp.png

Learn It - convert decimal to 2’s complement representation in binary:

  1. Work out the sign bit If the decimal number is positive, the msb is 0 If the decimal number is negative, the msb is 1
  2. Work out the magnitude of the decimal number in binary
    1. If the decimal number is negative, invert the bit pattern from step 2 and add 1
    2. If the decimal number is positive, move to step 3
  3. Place the sign bit (0 or 1) at the leftmost position of the bit pattern from step 2
For example, to convert -7 into a 8-bit binary in 2's compliment

Step 1: negative number → sign bit (msb) is 1 

Step 2: decimal 7 in binary → 000 0111

Step 3: Invert the bits: 000 0111 ⇒ 111 1000

Step 4: Add 1 to the inverted bits: 111 1000 + 1 = 111 1001

Step 3 and step 4 can be also replaced by finding the first '1' bit
 from the bit pattern result from step 2 
 and simply flip all the bits to its left.

Step 5: Place the sign bit at the msb position to make the resulting 8-bit 1111 1001

Learn It - convert 2’s complement representation to decimal:

  1. Check the sign bit If the sign bit is 0, the number is positive and its absolute value is the binary value of the remaining n-1 bits. Stop here. If the sign bit is 1, the number is negative, and proceed to step 2.
  2. Invert the remaining bits and plus 1 to get the absolute value (magnitude) of the negative number.
For example,
8-bit pattern: 1 100 0100
Sign bit is 1 → negative (-)

Invert the remaining bits: 100 0100⇒ 011 1011

Add 1 to the inverted bits: 011 1011 + 1 = 011 1100 which in decimal is 60

Hence, 8-bit 1100 0100 in 2's compliment has the decimal value of -60

Step 2 can also be done by checking the remaining bits from the right
 (least-significant bit AKA lsb). Look for the first occurrence of 1.
 Flip all the bits to the left of that first occurrence of 1.
 The flipped pattern gives the absolute value.

Learn It - 2's compliment addition and bit overflow

  • When adding two binary numbers in 2's compliment, you simply perform the addition as you would with any binary number. However, things can become undesirable when the resulting sum has more than the fixed number of bits computers used to store numbers. For example, a computer uses 8-bits to store number using 2's compliment, the resulting sum has 9-bits. This problem is called overflow.

    See the example below:

    For example, adding two positive numbers in 2's compliment with fixed 8-bits.
    
                01111110
                00111010 +
           ---------------
                10111000
    There are two problems with the result:
      1. The result is a negative number!
      2. Since the MSB bit is for sign and the remaining 7-bits for the value,
      the result has 8-bits, it causes overflow - more bits than the computer can store.
    

Learn It - 2's compliment subtraction

  1. When perfroming decimal subtraction, for example, 12-9, you practically perform addition 12+(-9)
  2. The same principle applies to binary numbers — so change the second binary to negative first, then just perform addition .
For example, subtracting 1(0001) from 7(0111) using 2's complement with 4-bit pattern.

Step 1: convert 0001 to its negative equivalent in 2's complement.
       1) invert the bits -> 1110
       2) add 1           -> 1111   <- note the msb is 1

Step 2: perform the additon:  0111 + 1111 = 10110

Step 3: Since the result has 5 bits, which is one bit more than the original 4 bits,
        the addition has caused overflow.
 Wherever there is an overflow bit in 2's complement,
 discard that extra bit. 
 The result 0110 is the answer.

Learn It - represent fraction numbers in binary

  1. Now we know we can represent any integers (positive or negative) using binary numbers. However, to represent fractions or numbers with decimal points in computers, it is not so easy.
  2. First, lets examine fractions in decimal numbers. For example, 52.64 can be written 50 + 2 + 0.6 + 0.04, or in their corresponding place:
    place 10s 1s . 1/10 1/100
      5 2 . 6 4
  3. Similarly, in binary (base two) notation, the fraction 4.625 can be represented in the following way: 100.101
    place 4s 2s 1s . 1/2 1/4 1/8
      1 0 0 . 1 0 1

    The binary 100 is decimal 4, while the .101 uses fractions of (1/2)n to proximately make the remaining although in this case, 0.625 can be made of (1/2)1 + (1/2)3 exactly.

    Now lets check if we can convert the following numbers from decimal to binary:

    Example 1: convert 3.14 to binary
           1) convert 3 into binary -> 11
           2) convert .14 into binary:
               1/2 = 0.5, large than 0.14, not use ->0
               1/4 = 0.25, still large than 0.14, not use ->0
               1/8 = 0.125, smaller than 0.14, use ->1,  0.14-0.125=0.015 remaining
               1/16 = 0.0625, large than 0.015, not use ->0
               1/32 = 0.03125, too large, not use ->0
               1/64 = 0.015625, too large but it is close enough, use ->1
            The result for 0.14 in binry is approximately: .001001
     Therefore, 3.14 in decimal to binary is 11.001001
    
  4. Fraction numbers can be represented in binary using a fixed number of bits with fixed number of bits for the values before and after the point.

    For example, using 8-bits, one can dedicate the first 4 bits for the number before the point and the remaining 4 bits for the number after the point.

    • to convert a decimal number 9.25 to a 8-bit unsigned binary with the above fixed point system:
      Example : convert 7.25 to binary using 8-bits fixed point
             1) convert 7 into binary in 4-bits -> 0111
             2) convert .25 into binary:
                 1/2 = 0.5, large than 0.25, not use ->0
                 1/4 = 0.25, just right use ->1
      
           Therefore, 7.25 in decimal to 8-bits binary is 0111 0100
      

Two things we have learned from the above example:

  1. using limited number of bits to represent fractions in computers cannot always be accurate.
  2. to increase number representation accuracy, you need to increase number of bits which result in increase in storage.

5 Information Coding Systems

Learn It - ASCII and Unicode

  • ASCII: American Standard Code for Information Interchange. It is the standard code for representing the characters on the keyboard. ASCII uses 7 bits which form 128 different bit combination, more than enough to cover all of the characters on a standard English-anguage keyboard. See this link for the complete ASCII code table.

"Originally based on the English alphabet, ASCII encodes 128 specified characters into seven-bit binary integers as shown by the ASCII chart on the right.The characters encoded are numbers 0 to 9, lowercase letters a to z, uppercase letters A to Z, basic punctuation symbols, control codes that originated with Teletype machines, and a space. For example, lowercase j would become binary 1101010 and decimal 106. ASCII includes definitions for 128 characters: 33 are non-printing control characters (many now obsolete)that affect how text and space are processedand 95 printable characters, including the space." – from wikipedia

  • The ASCII has been used for a long time. But it has some serious shortcomings:
    1. It only uses English alphabets
    2. It is limited to 7-bits, so it can only represent 128 distinct characters.
    3. It is not usable for non-latin languages, such as Chinese
  • Character form of a decimal digit In ASCII, the number character is not the same as the actual number value. For example, the ASCII value 011 0100 will print the character '4', the binary value is actually equal to the decimal number 52. Therefore ASCII cannot be used for arithmetic.
  • ASCII code has been extended to use 8 bits during the 1980s to include an additional 128 combinations to represent more symbols.
  • Unicode (Unique, Universal, and Uniform character enCoding) has been introduced to address the shortcomings with ASCII. The latest version of Unicode contains a repertoire of more than 120,000 characters covering 129 modern and historic scripts, as well as multiple symbol sets. ASCII character encoding is a subset of Unicode.
    1. Unicode has three main encoding forms, UTF-8, UTF-16 and UTF-32.
    2. UTF (Unicode transformation format) is an algorithmic mapping from every Unicode code character called point to a unique byte sequence.
    3. Depending on the encoding form you choose (UTF-8, UTF-16, or UTF-32), each character will then be represented either as a sequence of one to four 8-bits (bytes), one or two 16-bit code units, or a single 32-bit code unit.For example, in UTF-16, 2 bytes (16 bits) combinations are used to code each character.
    4. UTF-8 is the dominant character encoding for the World Wide Web, accounting for 84.6% of all Web pages in August 2015 .
    5. While there is one global standard, this means one character in unicode scheme could use up to 4 bytes, significantly increasing file sizes and data transmission time.
  • Unicode advantages over ASCII
    1. Can have representation of a greater range of characters;
    2. More languages or all (modern) languages can be represented (in one character set);
    3. Improved portability of documents in UNICODE as each character has a unique representation in UNICODE;

Learn It - Error Checking and Correction

During transmission, digital signals suffer from noise that can introduce errors in the binary bits travelling from one system to other. That means a 0 bit may change to 1 or a 1 bit may change to 0. error.jpg

  • Parity bit is the simplest technique used to check the accuracy of data transimission. In this scheme, the MSB(most significant bit) of an 8-bits word is used as the parity bit and the remaining 7 bits are used as data or message bits. The parity of 8-bits transmitted word can be either even parity or odd parity. Between two parties start data transmission, they MUST agree on using even or odd parity.

parity_check.jpg

  1. For even parity, the parity bit is set to 1 or 0 such that the number of "1 bits" in the entire word is even.
  2. For odd parity, the parity bit is set to 1 or 0 such that the number of "1 bits" in the entire word is odd.
  3. Parity checking at the receiver can detect the presence of an error if the parity of the receiver signal is different from the expected parity. That means, if it is known that the parity of the transmitted signal is always going to be "even" and if the received signal has an odd parity, then the receiver can conclude that the received signal is not correct. If an error is detected, then the receiver will ignore the received byte and request for retransmission of the same byte to the transmitter.

error_detection.jpg

  • Majority voting is another technique used to detect error. I this scheme, each bit is transmitted three times (redundancy) in the hope that majority of the bits in each repetition will be correct.

    For example, we want to transmit 8-bits of data, for every bit, we send it three times. At the recieving end, the following bits are seen:

    100 110 001 000 101 010 001 111

    By using majority voting, the first bit is 0, the second is 1, so on and so forth. So the recieving end concludes that the correct data is: 01001001.

  • Check sum is a mathmatical algorithm that is applied to a sequence of data to create a checksum value which is transmitted with the data. The same algorithm is applied to the data at the recieving end. If the two checksums match, the data transmission has been successful; if not, an error must have occurred and a re-transmission is requested.

    A simple checksum algorithm could be like so:

    1. the data is being divided into 4-bits groups.
    2. the sum of each group's numerical values is the checksum.
    3. if any bits change occurred during transmission, then it is likely, but not guaranteed to change the checksum.

    checksum.png

  • Check digit is another scheme designed to check for errors in data input or transmission. It uses an extra digit at the end of a sequence of data. The numbers printed normally above or just below the bar codes you see on product packages and books use this scheme.

    UPC (Universal Product Code) example:

UPCcode.jpg

ISBN (International Standard Book Number) 13 example: ISBN13.gif

6 representing Images, Sound and Other Data

Learn It - represent images in computers

  • An image is made up of tiny dots called pixels.
  • A pixel is the smallest element can be drawn on screen.
  • The more pixels on the screen, the higher the resolution and the better the quality of the picture will be. Resolution of an image is expressed directly as width of image in pixels by height of image in pixels using notation width x height. Alternatively, resolution can be expressed in number of dots per inch where a dot is a pixel.
  • The smaller the pixels the finer the detail that can be displayed on the screen.
  • Each pixel is represented by the same number of bits. The number of bits used for each pixel is called colour depth. A 1 bit colour depth can only represent 2 colours. For example, a black and white image can use 0 for black and 1 for white.
  • The more bits used in each pixel (higher colour depth), the more colours can be represented (richer colours), the more storage is required.
  • Graphics can be classed as either: Bitmapped graphics (painting) Vector graphics (drawing).
    • Bitmapped Graphics are stored in a two dimensional array using binary numbers to represent the colours in the pixels.
      • Some examples of bitmap: png, gif, jpg
      • bitmap image files may contain metadata to describe their height, width and colour depth.
      • by using the metadata in a bitmap, we can calculate:
        1. the size of the image:

          height (number of pixels) x width (number of pixels) x colour depth(number of bits per pixel)

        2. the resolution of the image:

          height (number of pixels) x width (number of pixels)

    • Vector gaphics use coordinates and geometry to precisely define the parts of the image using mathmatic formula. It is more efficient than bitmaps at storing large areas of the same colour because it does not need to store every pixel as a bitmap does.
      1. drawing is made up of drawing objects(circle, rectangle etc).
      2. Objects are stored as drawing commands or drawing list. An example drawing list for a circle would include: centre coordinate, radius, line thickness, line colour, fill or not.
      3. different objects have a defined set of properties.
      4. some properties use mathematical equations or formulae.

        Some examples of vector: SVG

  • Regardless of bitmap or vector graphics, all images are converted (rasterised ) and displayed in pixels.

Learn It - Bitmap and vector graphics comparisons

  • Advantages of vector graphics
    • (For geometric images) less storage space/memory likely to be needed;
    • (For geometric images) will load faster from secondary storage;
    • (For geometric images) will download faster;
    • Can be scaled/resized without distortion;
    • Image can be more easily searched for particular objects;
    • Can easily manipulate individual objects in an image;
  • Disadvantages of vector graphics
    • Only appropriate for images made of geometric shapes;
    • Unsuitable if colour of each pixel is likely to vary such as a digital photograph;
    • Some drawing tools are unlikely to be be available when using vector graphics (eg spray paint, blurring);
    • for complex images, can take longer to render an image;

Learn It - represent sound in computers

  1. Analogue and Digital data
    • Analogue data have continous values with infinite possible values in between. Sound waves are analogue data (sinewave like).
    • Digital data have discrete and finite set of values.
    • To store sound in computers, it must be converted into digital data first.
  2. Convert analogue data into digital data
    • Analogue to digital conversion (ADC) happens when we record sound to be used on a computer. This is accomplished normally by a ADC converter. ADC.png
    • The number of bits used to represent a value/amplitude, or one sample in analogue data is called sample resolution. The more bits used (higher sample resolution), the wider range of values can be represented.
    • Another factor affecting the quality of ADC is the sampling rate or frequency - the number of samples taken per second, expressed as Hz. sample-res-comparison.jpg
    • The Nyquist Theorem, also known as the sampling theorem, is a principle that engineers follow in the digitisation of analog signals. It states that sample rate should be at a frequency which is at least twice the value of the highest frequency in the sampled signal.

      Take a look at this site for more info on Nyquist Theorem.

    • Bit rate - the number of bits required to store 1 second of sound
    • To calculate a resulting digital file size:
      file size (in bits) = sampling rate x sample resolution x length of sound
      
      OR
      
      file size (in bits) = bit rate x length of sound
      
  3. Convert digital to analogue
    • Digital to analogue conversion (DAC) happens when digital sound files(such as CD, mp3 files) are played back as sound waves to be heard. The analogue wave produced may differ significantly from the original sound wave due to the fact that conversion has to best "guess" the fit of a continous wave over a finite set of values (by interpolation algorithms).
  4. Musical Instrument Digital Interface (MIDI)

    " MIDI is a technical standard that describes a protocol, digital interface and connectors and allows a wide variety of electronic musical instruments, computers and other related devices to connect and communicate with one another. _wikipedia"

    • MIDI basics MIDI is a communication standard developed in the early 1980s for electronic musical instruments and computers.

      Since MIDI is a communication protocol, it uses a standard collection of messages to communicate. See this for a list of MIDI messages.

      Unlike digital audio files (.wav, .aiff, etc.) or CDs, a MIDI file does not need to capture and store actual sounds. Instead, the MIDI file can be just a list of events which describe the specific steps that a soundcard or other playback device must take to generate ceratin sounds.

    • MIDI audio files MIDI files is a list of time-stamped MIDI event messages that are recordings of musical actions in sequence. MIDI event messages contain MIDI instructions for notes, volumes, sounds, and even effects. MIDI files tend to be significantly smaller than equivalent digitised waveform files. All popular computer platforms can play MIDI files (*.mid).

      Advantages of MIDI files:

      1. MIDI files are much more compact than digital audio files.
      2. MIDI files embedded in web pages load and play more quickly than their digital equivalent.
      3. MIDI data is completely editable. A particular instrument can be removed from the song and/or a particular instrument can be changed by another just by selecting it.
      4. MIDI files may sound better than digital audio files if the MIDI sound source you are using his of high quality.

7 Data compression

Why data compression?

  • Human's hearing frequency range is commonly considered between 20Hz to 20kHz. Based on Nyquist Theorem, the sampling rate for CD and most audio files is 44.1kHz. Lets exam how big file size will be for a typical sound recording on a CD that is 3 miutes long. A sampling resolution of 16 bits is used when recording an audio CD.

    Lets examine the raw file size for the above audio recording:

    total file size in bits = 44.1kHz x 16 x 3 x 60=127 008 000 bits
    total file size in bytes = 127008000/8=15876000 bytes
    That is close to 16MB
    
  • It is easy to see why compression is needed:
    1. Compression reduces file size thus storage requirements.
    2. Compressed files are transmitted quickly over the internet.

How compression works?

There are two types of compression: lossy and lossless

  • Lossy compression lose accuracy but tends to result in smaller file size than lossless.
  • Lossless compression doesn't lose any accuracy and can be decompressed into an identical copy of the original audio data.

Lossless compression

  • FLAC is one example of lossless compression file format. It uses run length encoding(RLE), which looks for repeated patterns in the sound file, and instead of recording each pattern separately, it stores information on how many times the pattern occurs in a row.

    The following is an example how the run length encoding work:

Lossy compression

  • mp3 is one of the lossy compression. mp3 reduces the raw file size by a factor of 10 to 14 without losing noticeably the original quality. When you rip music from a CD, most likely you are compressing the CD audio files using mp3. Besides using other normal compression algorithms, mp3 and other lossy compression uses the algorithm called perpetual coding, which uses the nature with human hearing, specifically,
    1. Certain sounds (below or above the human audible frequency boundaries) human cannot hear.
    2. Certain sounds human can hear much better than others.
    3. If two sounds played simultaneously, humans can hear the louder one better than the softer one.

    Using the above facts, certain parts of the sound track can be removed to save space without losing much quality but with much reduced file size which is better suited for mobile devices and internet transmissions.

Image compression

We have learned previously on how to calculate a bitmap image size by using bit depth, and resolution. Bitmap images could be quite big. Image compression is much like the audio file compression. There are lossless and lossy compression, GIF image files are lossless while JPEG files are lossy.

GIF images are compressed using the Lempel-Ziv-Welch (LZW) lossless data compression technique to reduce the file size without degrading the visual quality. LZW is a dictionary based compression algorithm.

"A lossless compression program can't do much with this type of file. While large parts of the picture may look the same – the whole sky is blue, for example – most of the individual pixels are a little bit different. To make this picture smaller without compromising the resolution, you have to change the color value for certain pixels. If the picture had a lot of blue sky, the program would pick one color of blue that could be used for every pixel. Then, the program rewrites the file so that the value for every sky pixel refers back to this information. If the compression scheme works well, you won't notice the change, but the file size will be significantly reduced. – Howstuffworks"

File compression (lossless) works by using the LZW (see above) adaptive dictionary based algorithm. It finds repeated patterns in a file. It then build a dictionary of repeated patterns and writes the original file by pointing to this dictionary wherever repeated patterns encountered. If a file has a lot of repeated patterns, the rate of reduction typically increases with file size. When a compressed file is downloaded from the Intenet, the dictionary is also downloaded so the decompression software can decompress the file to its original form.

This kind of compression works best for text files, not so well with images and sounds as they do not have many repeated patterns.

zip files are examples of file compression.

8 Encryption

key terms:

  • plaintext: the original message or data
  • ciphertext: the encrypted message or data
  • cipher: the algorithm used to encrypt the plaintext
  • key: the secret to lock or unlock the ciphertext

Purpose of encryption is to prevent unauthorised third parties from accessing data. Without the key used to encrypt the plaintext, ciphertext will not be meaningful.

There are many different ciphers in the world. Some are harder to break than others. But there is only one cipher, Vernam Cipher, that is still proven unbreakable.

  1. The Vernam Cipher
    • Invented by the American scientist Gilbert Vernam, It works by using a true random one-time encryption key, AKA, one-time pad. It must be the same or longer in characters than the plaintext. The plaintext’s each character is expressed in ASCII binary. The ciphertext is produced by XORing the plaintext binary with the one-time pad ASCII binary bit by bit. When two bits (0 or 1) have been applied the XOR logic operation, it produces a truth table like the following:
      input input2 output
      0 0 0
      0 1 1
      1 0 1
      1 1 0

      Try to image you have only a XOR output like the table above '0110'. Now try just to break the first bit '0'. The posibility of the key is '0' or '1' is equal! The same applies to the other 3 bits.

  2. Vernam Cipher is the only non-breakable cipher because:
    • Ciphertext contains no useful information about plaintext;
    • Mapping from plaintext to ciphertext (or vice-versa) is different for each letter position in the plaintext/ciphertext;
    • Brute force mapping cannot reveal plaintext due to too many possible keys to use brute force;
    • Frequency analysis does not help as different plaintext letters can map onto the same ciphertext letter;

9 Sample Exam Questions(7516/2)

Practice the following sample exam questions

Question 1: R denotes the set of real numbers, which includes the natural numbers, the rational numbers and the irrational numbers.

  1. Give one example of a natural number.[1 mark]
  2. Give one example of an irrational number.[1 mark]

Question 2:

  1. What is the decimal equivalent of the hexadecimal number D616? Show your working.[2 marks]
  2. Represent the decimal value 9.37510 as an unsigned binary fixed point number, with 4 bits before and 4 bits after the binary point.[2 marks]
  3. Represent the decimal value -6710 as an 8-bit two’s complement binary integer. [2 marks]
  4. A computer represents numbers using 8-bit two’s complement binary. Using this representation perform the calculation:
            01001000
            01100011 +
       ---------------
    answer
    
  5. What problem has resulted from performing the calculation using 8-bit two’s complement binary?

Question 3:The ASCII binary code for character a is 11000012

  1. Explain what is mean by a character code. [1 mark]
  2. Complete Table 1 to show how the word be would be encoded in the binary form of ASCII [2 marks]
    character binary form of ASCII
    b  
    e  

    A program has been developed to convert a string so that all of its characters are in upper case.

    The computer does this by taking each character’s ASCII binary code and applying a bitwise AND operation to it, using the mask 10111112.

  3. Convert the lower case character c, ASCII code 11000112, into the upper case character C using the method described above. [1 mark]

Question 4: R denotes the set of real numbers. A real number can either be rational or irrational.

  1. Describe the set of rational numbers. [2 marks]
  2. Explain why all integers are rational numbers. [1 mark]

Question 5:

  1. What is the binary equivalent of the decimal number 10210? [1 mark]
  2. What is the hexadecimal equivalent of the decimal number 8710? Show your working. [2 marks]
  3. Provide an example of where we continue to use hexadecimal notation to represent data in computing and explain why we do not use binary. [2 marks]
  4. A computer represents numbers using 8-bit two’s complement binary. Using this representation perform the calculation showing all your working: [2 marks]
            00001001
            00000011 x
       ---------------
    answer
    

    A number is to be represented in binary using 6 bits and twos complement.

  5. What is the largest possible positive number that can be represented using this representation. [1 mark]
  6. Explain the difference between a kibibyte and a kilobyte.[1 mark]

Question 6:The ASCII binary code for character a is 11000012

  1. If the ASCII character has been received during a transmission, with the most significant (leftmost) bit being used as a parity bit and the odd parity system in use, explain whether or not the character has been received correctly and how you have determined this. [2 marks]
  2. A system uses majority voting to send ASCII characters from one device to another. The receiver obtains the following for the transmission of one ASCII character

    000 010 011 111 110 000 010 011

    Determine the 8-bits that the receiver should use to represent the transmitted ASCII character. [1 mark]