CSIS 625 Week 5

Error Detection & Correction

Copyright 2001 and 2002 – Daniel R. Oelke

For use by students of CSIS 625 for purposes of this class only.

  1. Overview
    1. Error Control
      1. Error Detection
      2. Error Correction
    2. Data Link Layer
      1. Line Discipline
      2. Flow Control
      3. Error Control
    3. Byte Oriented Link Control
    4. Bit Oriented Link Control
      1. HDLC
  2. Error Detection
    1. Errors always occur during transmission
      1. Point of Error Detection is to detect these errors so we can correct them.
      2. Some systems fix these errors by re-transmission.
      3. Some systems fix these errors using Forward Error Correction (FEC)
    2. Vocabulary
      1. Bit Error - an error that changes only one bit
      2. Burst Error - an error that changes several adjacent bits.
      3. Coding Violations - When the line coding mechanism tells us of an error.
        1. Need to use something like Bipolar-AMI, 8b10b, etc.
    3. Error Detection - Parity Bits
      1. Parity Bit - one extra bit of data sent with every data packet.
        1. Even Parity - The extra bit is set so that the number of bits is always even
        2. Odd Parity - The extra bit is set so that the number of bits is always odd
        3. Receiver checks parity and discards data if parity is not valid
        4. Regular parity is also known as Vertical Redundancy Check
        5. Parity term is sometimes mis-used to refer to any redundant bit
    4. Error Detection - BIP
      1. BIP-8 - Bit Interleaved Parity - 8 bits
        1. BIP is also known as Longitudinal Redundancy Check
        2. Takes bytes and calculates a parity bit for each bit position
        3. Provides better detection for burst errors.
      2. It is technically possible to have BIP-16 or BIP-32
        1. Not commonly seen
        2. BIP-8 is by far the most common system.
    5. Error Detection - Checksum
      1. Checksum - a byte added to the end of the frame to catch errors
      2. Simplest form is calculated by adding up all the bytes in the frame
      3. There are several algorithms - so check specifics on any one algorithm
      4. BIP-8 is sometimes called a checksum
        1. Start with 0x00 and do bitwise XOR of every byte in the message
      5. Checksums do not catch all kinds of errors.
        1. For example: They don’t catch blocks of data being re-arranged.
    6. Error Detection - CRC
      1. CRC - Cyclic Redundancy Check
      2. Catches many errors that Parity or BIP-8 will miss.
      3. Adds bits to the end of a frame so that it can be evenly divided by a number
      4. Usually 8, 16, or 32-bit CRC
      5. Easy to implement in hardware with shift register and feedback xor’s.
      6. See book for examples
      7. CRC - Strength
        1. CRC’s are considered quite strong
        2. All bursts of errors £ r (r = length of CRC)
        3. All odd number of errors
        4. Probability of missing error
          1. if burst is r+1 => 0.5r-1 (0.532-1 = 4.6E-10)
          2. if burst is > r+1 => 0.5r (0.532 = 2.3E-10)
      8. See also: http://www.ross.net/crc/
    7. Error Rates - BER
      1. BER - Bit Error Rate - the probability of a bit being changed
        1. Used to describe transmission lines
        2. All transmission lines have some non-zero BER
        3. Calculate by counting the number of errors and dividing by the number of bits through the system.
          1. Need to adjust for errors that occur but are not detected
          2. For example - an odd number of errors when using parity
      2. Normal practice is to measure 10 times the period to report a BER
        1. This means that to state the line has a 1E-10 error rate (or better) you must measure 1E+11 bits
        2. A T1 would require 18 hours to have 1E+11 bits go through.
      3. BER and real transmission systems
        1. Voice sounds ok at BER of 1E-5
        2. Voice till intelligible at BER of 1E-3
        3. Data doesn’t work very well at BER below 1E-7
          1. Retransmission of packets occurs so often that it is difficult to get a very high data rate.
  3. Error Correction
    1. FEC - Forward Error Correction
      1. FEC – A method by which errors in received data can be corrected without requiring retransmission
      2. Sometimes it is advantageous to correct an error instead of just detecting it.
        1. May take too long for retransmission to occur
          1. Real-time Data may not be able to wait for the round trip time
        2. May not be possible to ask for retransmission
          1. One way transmissions – Satellite, Broadcast data, etc.
      3. Brute force method:
        1. Send data repeatedly
        2. To correct n errors, send data 2n+1 times
        3. Not very efficient - there are better ways.
        4. This is almost never used in real systems.
      4. Hamming codes or Reed-Solomon codes do it much more efficiently
    2. Hamming Distance
      1. Hamming distance - the number of bits that have to be changed to go from one valid code to another.
        1. To detect d errors -Hamming distance of d+1
          1. parity gives a Hamming distance of 2, so it detects up to one bit error
        2. To correct d errors - Hamming distance of 2d+1
          1. Need to be able to distinguish "closest" valid code
    3. Hamming Code
      1. A symbol is a block of bits to be transmitted from source to destination.
      2. Hamming Code - an error correcting code to correct 1 bit error in a symbol
        1. For 4 data bits, 3 redundancy bits needed
        2. For 8 data bits, 5 redundancy bits needed
      3. Hamming Codes may also be used to detect up to 2 bit errors in the code word.
      4. Rule for Hamming Codes:
        1. (d + p + 1) £ 2P
          1. d = message bits (data)
          2. p = redundant bits (parity
      5. Codes often notated as (c,d)
        1. c = code size or d + p
        2. Therefore a (12,8) code has 8 data bits and 4 parity bits.
          1. A 50% overhead.
          2. (12 + 1) < 24
      6. To handle burst errors, multiple codes can be interleaved.
        1. Errors normally occur in bursts.
        2. Hamming code can handle only one bit error in the code word.
        3. Solution is to spread out the bits so that more than one won’t be hit by a burst of errors.
      7. Perfect code – an error correcting code where all possible codes are within hamming distance of a symbol.
        1. No "wasted" space in the code.
        2. Most efficient code that is possible for that symbol size, and code size.
      8. For Hamming codes a perfect code is when there is equality in the Hamming Code equation above
        1. (7,4) code is an example – m = 4, r = 3
        2. (4 + 3 + 1) = 23
    4. Reed-Solomon codes
      1. Another error correcting code
      2. Designed to have redundant symbols
      3. Allows for whole symbols to be errored
        1. It handles bursts of errors
      4. Common applications
        1. CD Players,
        2. Spacecraft communication
        3. DSL lines
      5. Notation - RS(NN,KK)
        1. MM - the code symbol size in bits
        2. NN - the block size in symbols (2**MM - 1)
        3. KK - the number of data symbols per block
          1. KK < NN
      6. Can correct (NN-KK)/2 errors per block
      7. If known that there are missing symbols, then RS can correct NN-KK "erasures"
      8. Common Reed Solomon Codes
        1. RS (255,223)
          1. Sends 255 8 bit characters, 223 of them are data and 32 are parity
          2. Can correct 16 errors or 32 erasures
        2. RS(204,188) - Used in Digital Video
          1. Sends 204 8 bit characters, 188 of them are data and 16 are parity
          2. Can correct 8 symbol errors
      9. Can be done on other than 8-bit characters
    5. More on FEC
      1. FEC is often used to improve the BER of lines.
        1. Can be more cost effective than boosting the power sent over the line
        2. Retransmission at higher levels may not be practical
      2. Error Correcting web pages
        1. http://www.piclist.com/techref/method/errors.htm
        2. http://people.qualcomm.com/karn/code/fec/
        3. http://www.engelschall.com/u/sb/hamming/
  4. Line Discipline
    1. Line Discipline defines who can send and when they can send.
    2. Simplex system
      1. Who is always known
      2. When is usually any time
    3. ENQ/ACK system
      1. Used in point to point systems
      2. Process
        1. Sender sends a ENQ (Enquiry) to the intended receiver.
        2. Receiver sends back an ACK (Acknowledgement) when it is ready to receive
        3. Sender sends data
        4. Sender sends EOT (End of Transmission) when done sending.
        5. Receiver may send a NAK (Negative Acknowledgement) if it isn’t ready.
      3. Common to have one node a primary and another a secondary
        1. Primary is responsible for initiating communication
      4. Some systems have nodes as peers
        1. Either can initiate a transfer
      5. In full-duplex system, data and control messages can be sent simultaneously
    4. Poll & Select system
      1. Always has a primary and secondary nodes
        1. May be point-to-point or multi-point
      2. Process - Poll
        1. Primary node requests data from the secondary
        2. If no data, secondary responds with a NAK
        3. Otherwise secondary responds with the data, and the primary ACKs this data.
      3. Process - Select
        1. Primary node sends message to secondary
        2. Secondary responds with ACK if it can accept data
        3. Primary sends data to secondary
        4. Secondary responds with ACK after successful transfer of data
    5. Network Addresses
      1. In a multi-point system, addresses needed so that nodes know who is talking to who
      2. Simple in some systems where physical position or dip-switch sets it.
      3. Ethernet defines that all manufactures of devices include a unique 48-bit (6-byte) address in every device.
        1. Polling 2^48 devices isn’t practical
        2. at 10Mbps, with a 60 byte poll it would take
          1. 2^24 / 10Mbps * 60 * 8 bytes = 1.35E+10 seconds
          2. or about 428 years
    6. Media Access Control
      1. MAC - Media access Control
      2. A multipoint network with no clear master and slave.
        1. master == primary
        2. slave == secondary
      3. In this system, there needs to be more complicated protocol than ENQ/ACK or select/poll to figure out who sends when
      4. See later on in this lecture….
  5. Flow Control
    1. Flow control - a protocol to ensure that the sender does not overwhelm a receiving station
    2. Incoming data must be checked and processed before it can be used
      1. This is often slower than the transmission rate
      2. Receivers need to buffer received data until it is processed
      3. Buffers are always limited
    3. Stop-n-Wait
      1. Stop and Wait flow control is simplest form of flow control
      2. Process
        1. Sender sends one frame of data
        2. Sender waits for ACK before sending more
        3. Receiver can slow down process by waiting to send ACK
      3. May be simple - but it is inefficient
      4. Stop-n-Wait efficiency
        1. Entire set of data is divided into n frames
        2. t
        3. prop = time to go from sender to receiver
          1. depends on distance and medium
        4. t
        5. frame = time to send one data frame
          1. depends on bit rate and frame size
        6. TD = n(2 tprop + tframe) = Time to send data
        7. Actual transmission of data is n* tframe
        8. Efficiency is (n* tframe )/ n(2 tprop + tframe)
        9. = 1 / (1 +2(tprop / tframe))
        10. If tprop is small and tframefor frame is large, it approaches 100%, but never reaches it.
    4. Sliding Window Protocol
      1. Sliding Window flow control allows for more than one frame to be sent before an acknowledgement is received
        1. frames are numbered from 0 to n-1
        2. Sender sends frames up until it has sent n-1 frames
        3. Receiver sends back an ACK with the number of the next frame it expects
        4. Sender can now send up to this new number
        5. Maximum window size is n-1 frames
      2. Sliding Window Efficiency
        1. If the sliding window is big enough, and the receiver is fast enough, then 100% efficiency can be achieved
        2. To get 100% -
          1. 2 * t
          2. prop £ ((n - 1) * tframe )
          3. t
          4. prop = Time for a frame to propagate from one end to the other of the system
          5. t
          6. frame = Time to send a frame (based on frame size and bit rate)
  6. Error Control
    1. Error Control is the use of error detection combined with retransmission of data that has had an error detected in it.
    2. Error detection is achieved
      1. using parity, checksums, CRC, etc
      2. Extra data added to each frame
    3. ARQ - Automatic Repeat Request
      1. This abbreviation isn’t really used much
    4. Stop-n-Wait Error Control
      1. Extension of Stop-n-Wait flow control
        1. When a good frame is received, an ACK sent back to originator.
        2. When a damaged frame is received, a NAK is sent back to originator
        3. When a NAK is received, the same frame is retransmitted
      2. Any frame may be dropped - data or NAK or ACK.
        1. If ACK isn’t received after some time, a NAK is assumed
        2. If a NAK is lost - timeout and retransmission
        3. If a ACK is lost timeout and retransmission
          1. Frames must be numbered so receiver can throw away duplicates
        4. If data is lost - timeout and retransmission
      3. Has same efficiency issues
    5. Go-back-N Error Control
      1. Uses a Sliding window
      2. If a NAK is received or an ACK timeout occurs
        1. all data since last ACK received is resent
        2. This may be several frames
      3. ACKs contain next expected frame number
      4. NAKs contain errored frame number
    6. Selective-Reject Error Control
      1. Uses a Sliding window
        1. If a NAK is received, only the damaged frame is resent
      2. Selective Reject is more complicated to implement
        1. Receiver must contain the sorting logic to enable it to reorder frames
        2. Sending device must contain searching mechanism to enable it to find and select only the requested frame for retransmission
        3. Buffer in the receiver must keep all previously received frames on hold until all retransmissions have been sorted, duplicates identified and discarded
        4. ACK numbers must refer to the frame received instead of next frame expected
      3. More efficient as it resends less data
        1. If errors don’t occur very often - there isn’t a big difference in efficiency
      4. Selective Reject is rarely implemented over Go-back-N method
        1. Complication makes it slower to get products made
        2. Increased Efficiency is minimal and not worth the extra effort
  7. Byte Oriented Link Control
    1. List
    2. here is all from Modem communications world
      1. Normally people don’t address these as link layer controls
    3. Not just data-link as most of these protocols addressed issues of file-transfer
    4. XMODEM
      1. Stop-n-Wait error control
      2. Fixed data field of 128 bytes, CRC-8
    5. YMODEM
      1. increased data field to 1024 bytes, CRC-16
    6. ZMODEM
      1. More features
      2. Sliding window
    7. Kermit
      1. sliding window
      2. Also a terminal emulation package
    8. BSC - Binary synchronous communication
      1. Few people care about this one
  8. Bit Oriented Link Control
    1. Many bit Oriented protocols today are oriented around HDLC
    2. Ethernet, Token-Ring and other LAN technologies are also bit-oriented
    3. SONET, T1, and most telephony systems are bit-oriented.
    4. It is common to use HDLC over a SONET or T1 system
  9. HDLC – High-Level Data Link Control
    1. HDLC node types
      1. Primary station
        1. Has complete control over the link
        2. Can be used in point-to-point or multi-link
      2. Secondary station
        1. Receives commands from the primary and responds accordingly
      3. Combined station
        1. Can both command and respond to commands
        2. Used in peer to peer (or balanced) configurations
    2. HDLC Configurations
      1. Unbalanced
        1. One primary node and one or more secondary nodes
        2. May be point-to-point or multi-point
        3. Full or half-duplex
      2. Balanced
        1. point-to-point topology only
        2. Both nodes are combined stations
    3. HDLC Modes of Communication
      1. NRM - Normal Response mode
        1. Standard Primary-secondary relationship
        2. Secondary can transmit only when polled
      2. ARM - Asynchronous Response Mode
        1. A secondary may initiate a transmission without permission from the primary
      3. ABM - Asynchronous Balanced Mode
        1. Combined stations - either can initiate
        2. Point-to-point topology
    4. HDLC Frame structure
      1. Frame of data consists of:
        1. Flag: 8 bits
        2. Address: 1 or more bytes
        3. Control: 8 or 16 bits
        4. Information: many bits
        5. Frame Check Sequence (FCS): 16 or 32 bits
        6. Flag: 8 bits
      2. HDLC Flag byte
        1. The flag is a special bit pattern that signifies the start/end of a frame.
        2. A single flag byte may be used to end one frame and start another
        3. Flag bit pattern is 0111 1110
        4. Bit stuffing is used to make sure this pattern does not occur any where else.
        5. Bit stuffing occurs on all bytes (control or data) that are not flag bytes
      3. HDLC Address field
        1. First seven bits are used for address
        2. Last bit - if a 1 then there are no more address bytes
        3. Last bit - if a 0 then there is another address byte
        4. Last bit of last address byte is always a 1
        5. Number of address bits is a multiple of 7
      4. HDLC Control Field
        1. One or two byte segment for flow management
        2. Control fields change based on type of frame
        3. If first bit is 0
          1. frame is an I-Frame
        4. If first bit is 1 and second 0
          1. frame is an S-Frame
        5. If first bit is 1 and second 1
          1. frame is a U-Frame
        6. I-Frame - Information Frame
          1. 3 bits for send sequence number
          2. 3 bits for receive sequence number (ACKs)
          3. 1-bit - Poll/Final bit
            1. From secondary - set to 1 if last I-frame of response
            2. From primary - set to 1 if polling secondary for data
        7. S-Frame - Supervisory Frame
          1. 3 bits for receive sequence number
          2. 2 bits for "supervisory" functions
          3. 1 bit - Poll/final bit
          4. S-Frames used for ACK when no data
        8. U-Frame - Unnumbered frame
          1. 1 bit- Poll/Final bit
          2. 5 bits - "Unnumbered function" bits
        9. S-frames can be used for:
          1. reject
          2. selective rejects
          3. receiver ready
          4. receiver not ready
        10. U-frames can be used for
          1. Mode setting (primary/secondary)
          2. unnumbered information transfer
          3. recovery - bad commands, etc
          4. initialization
          5. disconnect
          6. test command/response
      5. HDLC - Information field
        1. The information field is a series of bits until the next flag byte is found
        2. S-Frame has no information field
        3. The last 16 or 32 bits of the Information field is actually the FCS
      6. FCS - Frame check sequence
        1. a 16 or 32-bit CRC on the data