File compression: how does it work?
The more heavily a file is compressed with lossy compression, the more noticeable the reduction in quality becomes. Lossy compression also does not work well for files in which every piece of data is crucial; compressing a spreadsheet this way, for example, would yield unusable results. Lossless compression, by contrast, reduces file size without discarding any information. Instead, it works by removing redundancies within the data to reduce the overall file size.
With lossless compression, it is possible to reconstruct the original file perfectly. For example, ZIP, the most common lossless compression format, is often used for program files in Windows because it preserves all of the original information.
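The round trip described above can be tried with Python's standard `zlib` module. This is only a minimal sketch: `zlib` implements DEFLATE, the method most ZIP archives use, but it does not read or write `.zip` files itself, and the sample text is made up for illustration.

```python
import zlib

# Hypothetical sample data: a highly repetitive byte string.
original = b"ask what you can do for your country. " * 50

compressed = zlib.compress(original)    # lossless compression (DEFLATE)
restored = zlib.decompress(compressed)  # exact reconstruction

print(len(original), "bytes before,", len(compressed), "bytes after")
print("identical after round trip:", restored == original)
```

Because the input is so repetitive, the compressed form is much smaller, and decompression returns exactly the original bytes.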
Decompressing (unzipping) the file produces an executable program that would otherwise be useless with lossy compression.
Because of the large size of these files, downloading them can take hours.
To solve this problem and make better use of disk space, large files are compressed using various software. Once downloaded, they can be decompressed and viewed using a decompression program. Compression software works by using mathematical algorithms to scan a file's data for repeating patterns. Once the software has identified a repeating pattern, it replaces that pattern with a smaller piece of data, or code, that takes up less room and also records where the pattern occurs.
For example, in a picture, compression software replaces every instance of the color red with a code for red that also indicates everywhere in the picture red occurs. The name of a compressed file usually ends with a short suffix. These suffixes are called extensions, and they indicate different compression formats, that is, different types of software used to compress files.
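One simple way to picture the "code for red" idea is run-length encoding, which stores each color once together with how many times it repeats in a row. This is a swapped-in illustration, not the method of any particular image format; the function names `rle_encode` and `rle_decode` and the sample row of pixels are assumptions made for this sketch.

```python
from itertools import groupby

def rle_encode(pixels):
    """Collapse each run of identical values into a (value, run_length) pair."""
    return [(color, len(list(run))) for color, run in groupby(pixels)]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [color for color, count in runs for _ in range(count)]

# Hypothetical row of pixels: mostly red with a short blue stretch.
row = ["red"] * 5 + ["blue"] * 2 + ["red"] * 3
runs = rle_encode(row)

print(runs)
print("identical after round trip:", rle_decode(runs) == row)
```

Ten pixel values shrink to three pairs, and decoding restores the row exactly. Real formats use more elaborate schemes, but the principle of replacing repetition with a short code is the same.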
Then, we simply write the number instead of writing out the whole word. If you knew the system, you could easily reconstruct the original phrase using only this dictionary and number pattern. This is what the expansion program on your computer does when it expands a downloaded file. You might also have encountered compressed files that open themselves up. To create this sort of file, the programmer includes a simple expansion program with the compressed file. It automatically reconstructs the original file once it's downloaded.
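The dictionary-and-number scheme can be sketched in a few lines. The code below assumes a simplified, lowercase version of Kennedy's phrase with no final period, and it assumes the text contains no tokens that are themselves numbers; the function names are hypothetical, chosen only for this illustration.

```python
from collections import Counter

def build_dictionary(text):
    """Number every word that occurs more than once, in order of first appearance."""
    words = text.split(" ")
    counts = Counter(words)
    dictionary = {}
    for word in words:
        if counts[word] > 1 and word not in dictionary:
            dictionary[word] = str(len(dictionary) + 1)
    return dictionary

def encode(text, dictionary):
    """Replace each dictionary word with its number; leave other tokens alone."""
    return " ".join(dictionary.get(word, word) for word in text.split(" "))

def decode(encoded, dictionary):
    """Replace each number with its word again (the 'expansion' step)."""
    reverse = {number: word for word, number in dictionary.items()}
    return " ".join(reverse.get(token, token) for token in encoded.split(" "))

phrase = "ask not what your country can do for you -- ask what you can do for your country"
dictionary = build_dictionary(phrase)
encoded = encode(phrase, dictionary)

print(dictionary)
print(encoded)
print("identical after round trip:", decode(encoded, dictionary) == phrase)
```

Anyone holding both the dictionary and the number pattern can rebuild the phrase exactly, which is what an expansion program does.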
But how much space have we actually saved with this system? In an actual compression scheme, figuring out the various file requirements would be fairly complicated; but for our purposes, let's go back to the idea that every character and every space takes up one unit of memory.
We already saw that the full phrase takes up 79 units. Our compressed sentence (including spaces) takes up 37 units, and the dictionary (words and numbers) also takes up 37 units. This gives us a file size of 74 units, a saving of only 5 units, or about 6 percent, so we haven't reduced the file size by very much. But this is only one sentence!
You can imagine that if the compression program worked through the rest of Kennedy's speech, it would find these words and others repeated many more times. And, as we'll see in the next section, it would also be rewriting the dictionary to get the most efficient organization possible. In our previous example, we picked out all the repeated words and put those in a dictionary.
To us, this is the most obvious way to write a dictionary. But a compression program sees it quite differently: It doesn't have any concept of separate words -- it only looks for patterns. And in order to reduce the file size as much as possible, it carefully selects which patterns to include in the dictionary.
If we approach the phrase from this perspective, we end up with a completely different dictionary. If the compression program scanned Kennedy's phrase, the first redundancy it would come across would be only a couple of letters long.
In "ask not what your," there is a repeated pattern of the letter "t" followed by a space -- in "not" and "what." But in this short phrase, this pattern doesn't occur often enough to make it a worthwhile dictionary entry, so the program would eventually overwrite it. The next thing the program might notice is "ou," which appears in both "your" and "country."
But as the compression program worked through this sentence, it would quickly discover a better choice for a dictionary entry: Not only is "ou" repeated, but the entire words "your" and "country" are both repeated, and they are actually repeated together, as the phrase "your country." The phrase "can do for" is also repeated, one time followed by "your" and one time followed by "you," giving us a repeated pattern of "can do for you."
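To see what looking for patterns rather than words means, here is a brute-force sketch that finds the longest run of characters occurring at least twice. It again assumes a simplified lowercase phrase without the final period, and `longest_repeated_substring` is a hypothetical helper; real compressors use far faster search structures.

```python
def longest_repeated_substring(text):
    """Return the longest substring that occurs at least twice in text.

    Brute force: try the longest lengths first and stop at the first hit.
    The two occurrences are allowed to overlap.
    """
    n = len(text)
    for length in range(n - 1, 0, -1):
        for start in range(n - length + 1):
            candidate = text[start:start + length]
            # Look for a second occurrence that starts later in the text.
            if text.find(candidate, start + 1) != -1:
                return candidate
    return ""

phrase = "ask not what your country can do for you -- ask what you can do for your country"

print(repr(longest_repeated_substring(phrase)))
print("'your country' occurs", phrase.count("your country"), "times")
```

The search has no notion of words: the pattern it reports includes the space in front of "can", because spaces are just characters that happen to repeat too.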
This ability to rewrite the dictionary is the "adaptive" part of the LZ adaptive dictionary-based algorithm. The way a program actually does this is fairly complicated, as you can see by the discussions on Data-Compression.
No matter what specific method you use, this in-depth searching system lets you compress the file much more efficiently than you could by just picking out words.
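As a rough illustration of an adaptive dictionary, here is a minimal sketch of LZW, one well-known member of the LZ family. It is not necessarily the variant the text has in mind; it assumes every character has a code point below 256, and it builds the dictionary on the fly during both compression and decompression, so no dictionary has to be stored with the data.

```python
def lzw_compress(text):
    """Turn text into a list of integer codes, growing the dictionary as it goes."""
    dictionary = {chr(i): i for i in range(256)}  # start with single characters
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate                   # keep extending the known pattern
        else:
            codes.append(dictionary[current])     # emit code for the longest match
            dictionary[candidate] = next_code     # learn the new, longer pattern
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes

def lzw_decompress(codes):
    """Rebuild the text, reconstructing the same dictionary from the codes alone."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    pieces = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            entry = previous + previous[0]        # code defined by the step that emitted it
        else:
            raise ValueError("invalid LZW code")
        pieces.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(pieces)

phrase = "ask not what your country can do for you -- ask what you can do for your country"
codes = lzw_compress(phrase)

print(len(phrase), "characters ->", len(codes), "codes")
print("identical after round trip:", lzw_decompress(codes) == phrase)
```

On a single sentence the gain is modest, but the dictionary keeps growing as more text is processed, so longer inputs with more repetition compress better.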