1.2 Data Storage

Data Compression

1. Why Compress Data?

Compression is the process of reducing the number of bits needed to store a file. This matters for several reasons:

  • Faster Transmission: Files take less time to upload/download or send via email.
  • Storage Savings: More files can be stored on a drive (SSD/HDD).
  • Streaming: Required for smooth video (YouTube/Netflix) and music streaming.
  • Website Speed: Smaller images help web pages load faster.
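The transmission benefit can be checked with simple arithmetic: transfer time is file size (in bits) divided by connection speed (in bits per second). The sketch below is illustrative only; the function name `transfer_seconds`, the file sizes, and the 20 Mbit/s speed are made-up assumptions, and real transfers also have overheads that are ignored here.

```python
def transfer_seconds(size_megabytes: float, speed_megabits_per_s: float) -> float:
    """Estimate transfer time: size in megabits divided by speed in megabits/second."""
    size_megabits = size_megabytes * 8  # 1 byte = 8 bits
    return size_megabits / speed_megabits_per_s


# Hypothetical example: a 50 MB file compressed down to 10 MB, sent at 20 Mbit/s.
original_time = transfer_seconds(50, 20)
compressed_time = transfer_seconds(10, 20)

print(original_time)    # 20.0
print(compressed_time)  # 4.0
```

With these assumed numbers, the compressed file arrives in a fifth of the time.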

2. Lossy vs. Lossless Compression

Lossy

Permanently removes "unnecessary" data that the human eye or ear cannot easily perceive.

File size: Significantly reduced.

Formats: JPEG (images), MP3 (audio), MP4 (video).

Lossless

Reduces file size without losing any original data. The file can be reconstructed exactly.

File size: Moderately reduced.

Formats: PNG, GIF, ZIP, FLAC.
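The "reconstructed exactly" property can be demonstrated with Python's standard `zlib` module, which implements DEFLATE, the method typically used inside ZIP files. This is a minimal sketch; the sample data is made up and deliberately repetitive so that it compresses well.

```python
import zlib

# Made-up, repetitive sample data (1000 bytes).
original = b"the quick brown fox " * 50

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original))         # 1000
print(len(compressed) < len(original))  # True: the compressed form is smaller
print(restored == original)  # True: every byte is recovered
```

The key point is the last line: after decompression, the data is identical to the original, byte for byte.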

3. How Lossless Works (RLE)

One common method of lossless compression is Run-Length Encoding (RLE). It looks for runs of consecutive repeating data and stores each run as a count followed by a single value.

Uncompressed Data:
AAAAABBBCCCDDDDD

RLE Compressed:
5A3B3C5D

In images, RLE works by identifying long runs of identically colored pixels.
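The RLE example above can be sketched in a few lines of Python. This is one possible implementation, not a standard one: the function names `rle_encode` and `rle_decode` are illustrative, and the sketch assumes the input text contains no digits (otherwise counts and data would be confused).

```python
import re
from itertools import groupby


def rle_encode(text: str) -> str:
    """Replace each run of identical characters with '<count><character>'."""
    return "".join(f"{len(list(group))}{char}" for char, group in groupby(text))


def rle_decode(encoded: str) -> str:
    """Expand each '<count><character>' pair back into the original run."""
    pairs = re.findall(r"(\d+)(\D)", encoded)
    return "".join(char * int(count) for count, char in pairs)


print(rle_encode("AAAAABBBCCCDDDDD"))  # 5A3B3C5D
print(rle_decode("5A3B3C5D"))          # AAAAABBBCCCDDDDD
print(rle_encode("ABCD"))              # 1A1B1C1D
```

The last line shows a limitation: data with no repeated runs ends up longer after RLE, so the method only helps when runs are common.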

4. How Lossy Works

Lossy compression algorithms use Perceptual Coding:

  • Images (JPEG): Stores color information in less detail than brightness and simplifies fine detail in areas where the eye won't notice a change.
  • Sound (MP3): Removes frequencies that the human ear cannot hear and removes quieter sounds that are "masked" by louder sounds.
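Real JPEG and MP3 encoders are far more involved, but the core idea of throwing away fine detail can be illustrated with simple quantization: rounding each value to a coarser step. The sketch below is not the JPEG or MP3 algorithm; the function name `quantize` and the pixel values are made up for illustration.

```python
def quantize(values: list[int], step: int) -> list[int]:
    """Round each value down to a multiple of `step`, discarding small differences."""
    return [(value // step) * step for value in values]


# Made-up 8-bit brightness values with small variations the eye may not notice.
pixels = [200, 201, 203, 202, 90, 91, 93, 92]
lossy = quantize(pixels, 16)

print(lossy)            # [192, 192, 192, 192, 80, 80, 80, 80]
print(lossy == pixels)  # False: the original values cannot be recovered
```

After quantization the data contains long runs of identical values, which a method such as RLE can then store compactly. The small differences between the original pixels are gone for good, which is what makes the compression lossy.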

5. Comparison Summary

Feature | Lossy | Lossless
Original Quality | Lost permanently | Kept perfectly
Compression Ratio | Very High (Tiny files) | Low (Larger files)
Best for... | Photos, Video, Streaming | Text files, Spreadsheets, Code

⚠️ Exam Note: Lossy compression must not be used for software programs or text files. If even one bit of a program's code is lost, the entire program may fail to run!
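The effect of a single changed bit on code can be shown with a small experiment. This is an illustrative sketch only: the one-line program and the helper `is_valid_program` are made up, and the bit chosen is one that happens to turn "(" into ")".

```python
def is_valid_program(code: bytes) -> bool:
    """Return True if the bytes form syntactically valid Python source."""
    try:
        compile(code.decode("ascii"), "<example>", "exec")
        return True
    except SyntaxError:
        return False


source = b"print(1)"       # a tiny valid program (made up for illustration)
damaged = bytearray(source)
damaged[5] ^= 0b00000001   # flip the lowest bit of "(" (0x28), giving ")" (0x29)

print(bytes(damaged))                    # b'print)1)'
print(is_valid_program(source))          # True
print(is_valid_program(bytes(damaged)))  # False
```

Changing one bit out of 64 is enough to stop this program from running at all, which is why code and text need lossless compression.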