Character Sets
1. What is a Character Set?
A Character Set is a defined list of characters recognized by computer hardware and software. Each character is assigned a unique binary number (binary code).
Without a character set, a computer would just see a string of bits and wouldn't know if they represented a number, a sound, or the letter 'A'.
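As a rough illustration of that point, the sketch below reads the same 8 bits in two ways: once as a plain number and once through the ASCII character set. The variable names are illustrative only.

```python
# The same 8 bits, interpreted two different ways.
bits = "01000001"
raw = int(bits, 2).to_bytes(1, "big")   # one byte holding exactly those bits

as_number = int.from_bytes(raw, "big")  # read the byte as an unsigned integer
as_text = raw.decode("ascii")           # read the byte using the ASCII character set

print(as_number)  # 65
print(as_text)    # A
```

The bits never change; only the rule used to interpret them does.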
2. ASCII (American Standard Code for Information Interchange)
ASCII was one of the first widely used character sets. It originally used 7 bits, providing $2^7$ ($128$) unique character codes.
Example Mapping:
Character 'A' → Denary 65 → Binary 01000001
Character 'a' → Denary 97 → Binary 01100001
Character '!' → Denary 33 → Binary 00100001
Extended ASCII: Later 8-bit versions ($2^8$) allow for $256$ characters. The extra $128$ codes were used for mathematical symbols and some non-English (for example, accented) characters.
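The mappings above can be checked with a short Python sketch. It relies on the built-in `ord` and `format` functions; the helper name `describe` is made up for this example.

```python
def describe(ch):
    """Return (denary code, 8-bit binary string) for a single character."""
    code = ord(ch)                    # character -> denary code
    return code, format(code, "08b")  # pad the binary form to 8 bits

for ch in "Aa!":
    code, binary = describe(ch)
    print(f"Character {ch!r} -> Denary {code} -> Binary {binary}")

# Number of distinct codes available with 7 and 8 bits.
print(2 ** 7)  # 128
print(2 ** 8)  # 256
```

Going the other way, `chr(65)` returns `'A'`.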
3. Unicode
As computing went global, 256 characters weren't enough for languages like Chinese, Arabic, or Hindi. Unicode was created to represent every character in every language.
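- Gives every character a unique number (a code point). Code points are stored using an encoding: UTF-8 uses 8 to 32 bits per character, UTF-16 uses 16 or 32 bits, and UTF-32 always uses 32 bits. 16 bits alone allow only $2^{16}$ ($65{,}536$) codes, which is why larger encodings are needed.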
- The first 128 codes in Unicode are identical to ASCII, making it "backward compatible."
- Includes emoji and historical scripts (like Egyptian hieroglyphs).
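A small sketch of these points, assuming the text is stored in UTF-8 (one common Unicode encoding). The helper name `code_point_info` is hypothetical, chosen just for this example.

```python
def code_point_info(ch):
    """Return the Unicode code point label and the UTF-8 size in bytes."""
    return f"U+{ord(ch):04X}", len(ch.encode("utf-8"))

# "\u00e9" is the accented letter e-acute.
for ch in ["A", "\u00e9", "中", "😀"]:
    label, size = code_point_info(ch)
    print(f"{ch!r}: {label}, {size} byte(s) in UTF-8")

# Backward compatibility: plain ASCII text produces identical bytes in UTF-8.
text = "Hello!"
print(text.encode("ascii") == text.encode("utf-8"))  # True
```

Characters outside the ASCII range need more than one byte, while ASCII characters keep their original codes.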
4. ASCII vs. Unicode: Comparison
| Feature | ASCII | Unicode |
|---|---|---|
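| Bits per Character | 7 or 8 bits | 8 to 32 bits, depending on the encoding (UTF-8, UTF-16, UTF-32) |
| Number of Characters | 128 to 256 | Room for over 1 million code points (roughly 150,000 assigned so far) |
| Storage Requirements | Low (1 byte per char) | Higher (1 to 4 bytes per char, depending on the encoding) |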
| Global Use | English/Western only | Universal (all languages) |
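The storage row can be made concrete with a quick measurement. This is only a sketch: the function name `storage_bytes` is invented here, and the little-endian codec variants are used so that no byte-order mark is counted.

```python
def storage_bytes(text):
    """Return the number of bytes needed to store text in each encoding."""
    return {
        # ASCII cannot store non-ASCII text at all, so report None in that case.
        "ascii": len(text.encode("ascii")) if text.isascii() else None,
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }

print(storage_bytes("Hello"))
# {'ascii': 5, 'utf-8': 5, 'utf-16': 10, 'utf-32': 20}
print(storage_bytes("你好"))
# {'ascii': None, 'utf-8': 6, 'utf-16': 4, 'utf-32': 8}
```

English text costs the same in ASCII and UTF-8, while the fixed-width Unicode encodings use two or four times the space.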
5. Key Exam Terms
- Character: A single symbol (letter, digit, punctuation mark, or space).
- Alphanumeric: Characters that are letters or digits (A-Z, a-z, 0-9).
- Control Characters: Non-printing characters that perform an action instead of displaying a symbol, such as Backspace, Tab, or Enter (new line).
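These terms can be tried out with a minimal sketch. It assumes the ASCII convention that codes 0 to 31 and 127 are control characters; the function name `classify` is hypothetical.

```python
def classify(ch):
    """Label a single character as control, alphanumeric, or other."""
    code = ord(ch)
    if code < 32 or code == 127:  # ASCII control range (assumption stated above)
        return "control"
    if ch.isalnum():              # letters and digits
        return "alphanumeric"
    return "other"                # punctuation, space, symbols

for ch in ["A", "7", "!", "\b", "\n"]:
    print(repr(ch), classify(ch))
```

Here `"\b"` is Backspace and `"\n"` is the new-line character produced by Enter.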
💡 Exam Tip: If an exam question asks why Unicode is better than ASCII, mention that it allows for global communication and supports multilingual applications, even though it requires more storage space.