--- tanszek:oktatas:techcomm:multimedia_compression [2024/11/19 08:07] – [Video Frame Compression Example] knehez
+++ tanszek:oktatas:techcomm:multimedia_compression [2024/11/19 11:10] (current) – knehez
@@ Line 1: / Line 1: @@
 ===== Multimedia Compression Methods =====
-Multimedia files like audio, video, and images are often very large in their uncompressed form. Compression is used to reduce the amount of data required to store or transmit these files, making storage more efficient and reducing bandwidth requirements for transmission. There are two main types of compression methods:
+Multimedia files like audio, video, and images are often very large in their uncompressed form. Compression is used to reduce the amount of data required to store or transmit this information, making storage more efficient and reducing bandwidth requirements for transmission. There are two main types of compression methods:
-   - **Lossless Compression**: This preserves all the original data perfectly, meaning that the decompressed file is identical to the original. Common examples include **PNG** for images and **FLAC** for audio.
+   - **Lossless Compression**: This preserves all the original data perfectly, meaning that the decompressed file is identical to the original. Examples include **PNG** for images and **FLAC** for audio.
-   - **Lossy Compression**: This reduces the file size by removing data that is less important or less noticeable to human perception, often sacrificing some quality in the process. **JPEG** for images and **MP3** for audio are popular lossy formats.
+   - **Lossy Compression**: This reduces the file size by removing data that is less important or less noticeable to human perception, often sacrificing some quality in the process. **JPEG** for images and **MP3** for audio are popular lossy formats.
 ==== Lossy Compression Techniques in Multimedia ====
-Lossy compression techniques are often used for **audio, video, and images**, aiming to remove data that is not perceptually significant to humans.
+Lossy compression techniques are often used for **audio, video, and images** to remove data that is not perceptually significant to humans.
    - **Human Perception Optimization**: Lossy compression exploits limitations in the human eye and ear:
      - For images, the human eye is **less sensitive** to subtle changes in **high spatial frequency** areas (fine details), which allows image compression methods like **JPEG** to reduce the size by dropping some fine-grained details.
@@ Line 11: / Line 11: @@
 ==== Audio Sampling Example ====
-When audio is recorded (e.g., with a microphone), it needs to be **digitized** for storage. The **sampling rate** is the number of times per second that the audio signal is measured.
+When audio is recorded (e.g., with a microphone), it needs to be **digitized** for storage. The **sampling rate** is the number of times the audio signal is measured per second.
-   - **CD Quality Audio**: Has a sampling rate of **44.1 kHz** (44,100 samples per second) and each sample is represented as a **16-bit** value. In stereo audio, two channels (left and right) are recorded, which means twice the amount of data is needed.
+   - **CD Quality Audio**: Uses a sampling rate of **44.1 kHz** (44,100 samples per second), with each sample stored as a **16-bit** value. In stereo audio, two channels (left and right) are recorded, which means twice the amount of data is needed.
    - Example Calculation:
      - **1 second of stereo CD quality audio**:
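The remaining steps of this calculation fall outside the hunk shown here. As a rough sketch that uses only the figures stated above (44.1 kHz, 16-bit samples, two channels), the uncompressed data rate can be worked out as follows. The function and constant names are illustrative, not taken from the page.

```python
# Figures taken from the text: CD-quality audio parameters.
SAMPLE_RATE_HZ = 44_100   # samples per second, per channel
BITS_PER_SAMPLE = 16
CHANNELS = 2              # stereo: left and right


def uncompressed_audio_bits(seconds: float) -> int:
    """Return the number of bits needed for uncompressed PCM audio."""
    return int(SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS * seconds)


bits_per_second = uncompressed_audio_bits(1)
bytes_per_second = bits_per_second // 8

print(bits_per_second)    # 1411200 bits for one second
print(bytes_per_second)   # 176400 bytes for one second
```

At this rate one minute of stereo audio takes a little over 10 MB before any compression is applied.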
@@ Line 29: / Line 29: @@
 ==== Two-Dimensional Fourier Transform ====
-The **Fourier Transform** is a mathematical operation that transforms a signal from the **time domain** (or spatial domain, for images) to the **frequency domain**.
+The **Fourier Transform** is a mathematical operation transforming a signal from the **time domain** (or spatial domain, for images) to the **frequency domain**.
-   - In multimedia compression, **2D Fourier Transform** is used for images. It converts image pixels into frequency components. These frequency components can then be analyzed, and parts that are less significant to human perception can be discarded.
+   - In multimedia compression, **2D Fourier Transform** is used for images. It converts image pixels into frequency components. These frequency components can then be analyzed, and parts less significant to human perception can be discarded.
-   - **Example**: Imagine an audio waveform where the amplitude is plotted over time. The Fourier Transform decomposes this waveform into its frequency components—essentially figuring out which notes (frequencies) are being played and how strong they are. This concept is applied to images as well, allowing more effective compression.
+   - **Example**: Imagine an audio waveform where the amplitude is plotted over time. The Fourier Transform decomposes this waveform into its frequency components — figuring out which notes (frequencies) are being played and how loud they are. This concept is applied to images as well, allowing more effective compression.
    - In practice, this means that areas of an image with **high-frequency details** (e.g., fine patterns) can be simplified or removed during compression without significantly impacting the perceived quality. This is what is exploited in compression standards like **JPEG** to achieve significant size reduction.
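The idea of dropping high-frequency detail can be sketched on a tiny image block. The example below is a minimal illustration using a naive 2D discrete Fourier transform written with the standard library only; it is not how JPEG is implemented (JPEG uses the closely related discrete cosine transform on 8x8 blocks, followed by quantization). The test block, the `cutoff` value, and all function names are assumptions made for the sketch.

```python
import cmath
import math


def dft2(block):
    """Naive 2D discrete Fourier transform of a square N x N block."""
    n = len(block)
    return [[sum(block[y][x] * cmath.exp(-2j * cmath.pi * (u * x + v * y) / n)
                 for y in range(n) for x in range(n))
             for u in range(n)]
            for v in range(n)]


def idft2(coeffs):
    """Inverse of dft2; returns the real part of each reconstructed pixel."""
    n = len(coeffs)
    return [[(sum(coeffs[v][u] * cmath.exp(2j * cmath.pi * (u * x + v * y) / n)
                  for v in range(n) for u in range(n)) / (n * n)).real
             for x in range(n)]
            for y in range(n)]


def low_pass(coeffs, cutoff):
    """Zero every coefficient whose frequency exceeds cutoff on either axis."""
    n = len(coeffs)

    def freq(k):
        # Index k and index n - k describe the same frequency magnitude.
        return min(k, n - k)

    return [[c if freq(u) <= cutoff and freq(v) <= cutoff else 0
             for u, c in enumerate(row)]
            for v, row in enumerate(coeffs)]


N = 8
# Smooth horizontal wave (low frequency) plus a faint checkerboard (high frequency).
image = [[128 + 50 * math.cos(2 * math.pi * x / N) + 5 * (-1) ** (x + y)
          for x in range(N)]
         for y in range(N)]

filtered = low_pass(dft2(image), cutoff=2)
restored = idft2(filtered)

kept = sum(1 for row in filtered for c in row if abs(c) > 1e-9)
max_error = max(abs(a - b)
                for row_a, row_b in zip(image, restored)
                for a, b in zip(row_a, row_b))

print(kept)                  # number of non-zero coefficients that survive
print(round(max_error, 6))   # 5.0, the amplitude of the removed checkerboard
```

Here the smooth wave survives intact while the fine checkerboard is removed, so each pixel changes by only the small checkerboard amplitude. Real codecs reduce precision gradually rather than cutting off sharply, but the principle is the same.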
@@ Line 38: / Line 38: @@
 The analogy mentioned in the text compares the **Fourier Transform** to recognizing musical notes in an audio recording:
    - Imagine you have a **mono recording** of music with different notes being played over time. The **Fourier Transform** is like figuring out which notes are being played (e.g., **C#**, **C**) during different time intervals.
-   - This is similar to trying to write the **musical score** (sheet music) just by listening to the audio. By focusing only on the most important notes (the **note heads**), you would end up with a much more **compressed** version of the original sound, while still retaining most of the important information.
+   - This is similar to trying to write the **musical score** (sheet music) just by listening to the audio. By focusing only on the most important notes (the **note heads**), you would end up with a much more **compressed** version of the original sound while still retaining most of the important information.
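The note-recognition analogy can be sketched with a one-dimensional discrete Fourier transform. The example below is an illustration under assumed values: the sample rate, the three test tones (440 Hz, 660 Hz, and a faint 1500 Hz), and the helper names are all hypothetical and chosen so that each tone falls exactly on one frequency bin.

```python
import cmath
import math

SAMPLE_RATE = 8000   # samples per second (assumed for this sketch)
N = 800              # 0.1 s of audio, giving a 10 Hz frequency resolution


def tone(freq_hz, amplitude):
    """Generate N samples of a sine wave at the given frequency."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE)
            for t in range(N)]


def spectrum(signal, rate):
    """Return (frequency in Hz, amplitude) pairs from a naive DFT.

    Only bins below half the sample rate are computed. The amplitude
    scaling 2 * |X[k]| / n is valid for the non-zero frequency bins.
    """
    n = len(signal)
    pairs = []
    for k in range(n // 2):
        coeff = sum(s * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, s in enumerate(signal))
        pairs.append((k * rate / n, 2 * abs(coeff) / n))
    return pairs


def strongest_tones(signal, rate, count):
    """Keep only the `count` loudest frequency components."""
    return sorted(spectrum(signal, rate), key=lambda fa: fa[1], reverse=True)[:count]


# A "recording" with two clear notes and one very faint one.
recording = [a + b + c for a, b, c in zip(tone(440, 1.0),
                                          tone(660, 0.5),
                                          tone(1500, 0.05))]

notes = strongest_tones(recording, SAMPLE_RATE, count=2)
for freq, amp in notes:
    print(freq, round(amp, 3))
# 440.0 1.0
# 660.0 0.5
```

Storing two (frequency, amplitude) pairs in place of 800 samples is the "sheet music" version of the recording: the faint 1500 Hz component is lost, but the dominant notes are kept.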
 ==== Human Sensory Limitations and Compression ====
    - **Vision**: The human eye is more sensitive to **low spatial frequencies** (smooth gradients) and less sensitive to **high spatial frequencies** (fine details or noise). This property is used in image and video compression to drop unnecessary detail in complex patterns, which most viewers won’t notice.
-   - **Hearing**: The human ear is more sensitive to certain frequency ranges. **MP3 compression** uses this by discarding audio data that is in ranges we typically cannot hear well.
+   - **Hearing**: The human ear is more sensitive to certain frequency ranges. **MP3 compression** uses this by discarding audio data in ranges we typically cannot hear well.
 ==== Summary ====
-Multimedia compression methods, particularly lossy ones, take advantage of **human sensory limitations** to reduce data size without noticeable loss in quality. Techniques such as **Fourier Transform** allow multimedia compression algorithms to identify and remove data that is less perceptible, thereby achieving substantial compression ratios. These methods make it possible to store and transmit multimedia content effectively, without overwhelming data storage capacities or requiring impractical bandwidth.
+Multimedia compression methods, particularly lossy ones, take advantage of **human sensory limitations** to reduce data size without noticeable loss in quality. Techniques such as the **Fourier Transform** allow multimedia compression algorithms to identify and remove less perceptible data, thereby achieving substantial compression ratios. These methods make it possible to store and transmit multimedia content effectively, without overwhelming storage capacity or requiring impractical bandwidth.