1.1 Data Representation


Character Sets

1. What is a Character Set?

A Character Set is a defined list of characters recognized by computer hardware and software. Each character is assigned a unique binary number (binary code).

Without a character set, a computer would just see a string of bits and wouldn't know if they represented a number, a sound, or the letter 'A'.

2. ASCII (American Standard Code for Information Interchange)

ASCII was the first widely used character set. It originally used 7 bits, providing $2^7$ ($128$) unique characters.

Example Mapping:
Character 'A' → Denary 65 → Binary 01000001
Character 'a' → Denary 97 → Binary 01100001
Character '!' → Denary 33 → Binary 00100001
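The mappings above can be checked with Python's built-in `ord()` (character to code) and `chr()` (code to character); this is a quick sketch, not part of the ASCII standard itself:

```python
# Look up the denary and binary codes for each example character.
for ch in ["A", "a", "!"]:
    code = ord(ch)                        # character -> denary code
    print(ch, code, format(code, "08b"))  # binary, padded to 8 bits
# A 65 01000001
# a 97 01100001
# ! 33 00100001

# chr() performs the reverse mapping: code -> character.
print(chr(65))  # A
```

Note that 'A' (65) and 'a' (97) differ by exactly 32, i.e. one bit, which is why case conversion is cheap in ASCII.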

Extended ASCII: Later updated to 8 bits ($2^8$), allowing for $256$ characters. This added mathematical symbols and some non-English characters.

3. Unicode

As computing went global, 256 characters weren't enough for languages like Chinese, Arabic, or Hindi. Unicode was created to represent every character in every language.

  • Commonly encoded using 16 bits (65,536 possible values) or 32 bits (over 4 billion possible values).
  • The first 128 codes in Unicode are identical to ASCII, making it "backward compatible."
  • Includes Emojis and historical scripts (like Hieroglyphics).
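Both the backward compatibility with ASCII and the need for wider codes can be seen directly in Python, where every string character has a Unicode code point (a sketch for illustration):

```python
# The first 128 Unicode code points are identical to ASCII.
assert ord("A") == 65          # same code as in 7-bit ASCII

# Characters outside ASCII need larger code points.
print(hex(ord("€")))           # 0x20ac  -> fits in 16 bits
print(hex(ord("😀")))          # 0x1f600 -> needs more than 16 bits
```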

4. ASCII vs. Unicode: Comparison

Feature              | ASCII                 | Unicode
---------------------|-----------------------|----------------------------
Bits per character   | 7 or 8 bits           | 16 or 32 bits
Number of characters | 128 to 256            | Over 1 million (currently)
Storage requirements | Low (1 byte per char) | High (2 to 4 bytes per char)
Global use           | English/Western only  | Universal (all languages)
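The storage difference in the table can be demonstrated by encoding the same text in different ways; the exact byte counts below assume the UTF-16/UTF-32 encodings without a byte-order mark:

```python
text = "Hello"  # 5 characters

# ASCII: 1 byte per character.
print(len(text.encode("ascii")))      # 5 bytes

# UTF-16 (little-endian, no BOM): 2 bytes per character here.
print(len(text.encode("utf-16-le")))  # 10 bytes

# UTF-32 (little-endian, no BOM): 4 bytes per character.
print(len(text.encode("utf-32-le")))  # 20 bytes
```

This is the trade-off the exam tip below refers to: Unicode's universal coverage costs extra storage per character.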

5. Key Exam Terms

Character:
A single symbol (letter, number, or punctuation mark).
Alphanumeric:
Characters drawn from the combined set of letters and digits (A–Z, a–z, 0–9).
Control Characters:
Non-printing characters that perform actions, such as "Shift," "Backspace," or "Enter."
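Control characters occupy ASCII codes 0 to 31 (plus 127); Python's escape sequences give an easy way to inspect a few of them (a sketch for illustration):

```python
# Control characters have codes but no visible printed symbol.
print(ord("\n"))   # 10 - line feed, produced by the Enter key
print(ord("\t"))   # 9  - horizontal tab
print(ord("\b"))   # 8  - backspace
```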
💡 Exam Tip: If an exam question asks why Unicode is better than ASCII, mention that it allows for global communication and supports multilingual applications, even though it requires more storage space.