how-to-do-this-python-project

Overview: In this project, you will be creating code to compress and decompress image files.

Goal: To gain experience with compression algorithms.

Requirements: You will implement a compression algorithm in Python 3.x. You have been given a set of
example images, and your goal is to compress them as much as possible without losing any perceptible
information: upon decompression, they should appear identical to the original images.

Images are essentially stored as a series of points of color (pixels), where each point is represented
as a combination of red, green and blue (RGB). Each component of the RGB value ranges from 0 to 255,
so, for example, (100, 0, 200) would represent a shade of purple. Using a fixed-length encoding, each
component of the RGB value requires 8 bits to encode (2^8 = 256), meaning that the entire RGB value
requires 24 bits to encode. You could use a compression algorithm like Huffman encoding to reduce the
number of bits needed for more common values and thereby reduce the total number of bits needed to
encode your image.
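To make the idea concrete, here is a minimal sketch of Huffman code construction. It assumes that each whole (r, g, b) tuple is treated as one symbol, which is only one possible design; the function name build_huffman_codes and the sample pixel list are hypothetical. The bit count it prints covers only the encoded pixels, not the space needed to store the code table itself.

```python
import heapq
from collections import Counter


def build_huffman_codes(symbols):
    """Return a dict mapping each distinct symbol to a string of bits."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}

    # Heap entries are (weight, tie_breaker, tree). A tree is either
    # ("leaf", symbol) or ("node", left_tree, right_tree). The unique
    # tie_breaker ensures two trees are never compared directly.
    heap = [(w, i, ("leaf", s)) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)  # two lightest trees
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, ("node", t1, t2)))
        next_id += 1

    # Walk the finished tree: left edges add "0", right edges add "1".
    codes = {}
    stack = [(heap[0][2], "")]
    while stack:
        tree, prefix = stack.pop()
        if tree[0] == "leaf":
            codes[tree[1]] = prefix
        else:
            stack.append((tree[1], prefix + "0"))
            stack.append((tree[2], prefix + "1"))
    return codes


# Hypothetical sample: 6 white pixels, 3 purple pixels, 1 black pixel.
pixels = [(255, 255, 255)] * 6 + [(100, 0, 200)] * 3 + [(0, 0, 0)]
codes = build_huffman_codes(pixels)
encoded_bits = sum(len(codes[p]) for p in pixels)
fixed_bits = 24 * len(pixels)
print(codes)
print(encoded_bits, "bits encoded vs", fixed_bits, "bits fixed-length")
```

In this sample the most common color receives the shortest code, which is where the savings over 24 bits per pixel come from.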

You can use the Pillow library for Python to read in image files and extract the RGB values of
individual points. To install the library in PyCharm, go to Project Settings -> Project Interpreter.
To the right of the list of packages will be a small + sign. Click on it to open the window of
available packages and search for Pillow. Since there are several similarly named libraries, you want
the one written by Alex Clark, and its full name should be Python Imaging Library (Fork). Then click
on Install Package. (You may get an error if you have an older version of pip installed; it would be
listed on the prior page of installed packages. If the error occurs, first install the latest version
of pip, then install Pillow.)

Once you have Pillow installed, you can include the library into your project code with:

from PIL import Image

You can then open image files using, for example:

img = Image.open("powercat.bmp")

If you wanted to display the image in your standard image viewer, you could use:

img.show()

If you want to access the pixel values of points in an image, you’ll first need to call load() to
create a pixel access object; you can then access individual pixels using index notation:

width, height = img.size
px = img.load()
for x in range(width):
    for y in range(height):
        print(px[x, y])
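Rather than printing every pixel, a frequency-based compressor usually starts by tallying how often each color occurs. The sketch below shows one way to do that; count_colors is a hypothetical helper that accepts any object indexable as px[x, y], so the object returned by img.load() should work. A plain dict stands in for it here so the example runs without an image file.

```python
from collections import Counter


def count_colors(px, width, height):
    """Count how often each pixel value occurs in a width x height grid."""
    counts = Counter()
    for y in range(height):
        for x in range(width):
            counts[px[x, y]] += 1
    return counts


# Stand-in for the object returned by img.load(): a dict keyed by (x, y).
fake_px = {
    (0, 0): (255, 255, 255), (1, 0): (255, 255, 255),
    (0, 1): (100, 0, 200),   (1, 1): (255, 255, 255),
}
color_counts = count_colors(fake_px, 2, 2)
print(color_counts.most_common())
```

If an image is not stored in RGB mode, px[x, y] may return a single number instead of a tuple; calling img.convert("RGB") before load() is one way to get consistent three-component values.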

Your code should read in an image file, compute how many bits are required for a fixed-length
encoding, and then apply a compression algorithm to create a smaller encoding. You need to implement
the compression yourself; you cannot use a compression library. You should output how many bits are
required to store the image in your compressed format as well as the compression ratio achieved. When
it comes to saving your compressed image, you won’t be able to save it in a standard image format,
since you will have created your own encoding, but you can save the sequence of bits into a text or
binary file.
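A text file of '0' and '1' characters is the simplest way to save the bits, but each character then occupies a full byte on disk. If you prefer a binary file, the sketch below shows one possible way to pack a bit string into bytes and recover it. The helper names pack_bits and unpack_bits, the one-byte padding header, and the file name are all assumptions made for illustration, not a required format.

```python
import os
import tempfile


def pack_bits(bits):
    """Pack a string of '0'/'1' characters into bytes.

    The first byte records how many padding bits were appended, so the
    exact bit sequence can be recovered later.
    """
    padding = (8 - len(bits) % 8) % 8
    padded = bits + "0" * padding
    body = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return bytes([padding]) + body


def unpack_bits(data):
    """Inverse of pack_bits: recover the original string of bits."""
    padding = data[0]
    bits = "".join(f"{byte:08b}" for byte in data[1:])
    return bits[:len(bits) - padding] if padding else bits


bits = "1011001110001"  # 13 bits, standing in for an encoder's output
path = os.path.join(tempfile.mkdtemp(), "example.bin")  # hypothetical name
with open(path, "wb") as f:
    f.write(pack_bits(bits))
with open(path, "rb") as f:
    restored = unpack_bits(f.read())
print(restored == bits, os.path.getsize(path), "bytes on disk")
```

The padding byte is needed because files are stored in whole bytes, while a variable-length encoding rarely ends exactly on a byte boundary.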

Your code should also be able to prompt the user for the filename of a text file containing a
compressed sequence of bits and then decompress that file into the original image. You can assume that
the file uses the same compression format as the last file you compressed. So, for example, if you
compressed pacificat.bmp into a series of bits stored in pacificat.txt and the user then asked you to
decompress alt_pacificat.txt, you could assume that alt_pacificat.txt used the same compression data
structure as pacificat.txt (it might be a subset of the data from the original image, for example).
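As an illustration of decompression with a retained data structure, the sketch below decodes a bit string using a prefix-code table kept from an earlier compression run. The function name decode_bits and the sample table are hypothetical, and the sketch assumes the image width and height are known (for example, remembered from the last compression or stored alongside the bits).

```python
def decode_bits(bits, codes, width, height):
    """Rebuild rows of pixels from a bit string and a prefix-code table."""
    lookup = {code: symbol for symbol, code in codes.items()}
    pixels, current = [], ""
    for bit in bits:
        current += bit
        # In a prefix code, the first match is always a complete symbol.
        if current in lookup:
            pixels.append(lookup[current])
            current = ""
    if current or len(pixels) != width * height:
        raise ValueError("bit stream does not match the code table or size")
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


# Hypothetical code table kept from the last compression run.
codes = {(255, 255, 255): "1", (0, 0, 0): "00", (100, 0, 200): "01"}
rows = decode_bits("101100", codes, width=2, height=2)
print(rows)
```

The decoded rows could then be written into a new image, for instance with Pillow's Image.new and putpixel, so that the result can be displayed with show().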

There are a number of libraries that can help you store formatted data in a file from Python. If you
research the options and find a way to store your compression data structure in a file, such that the
user can select both a bit file and a data structure file and use the data structure to decompress the
bit file, then you can earn an extra half point on the rubric.
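One option among several is the json module from the standard library. The sketch below assumes the compression data structure is a dict mapping (r, g, b) tuples to bit strings; that layout, the helper names save_code_table and load_code_table, and the file name are illustrative assumptions only. Because JSON objects cannot use tuples as keys, each entry is stored as a small list.

```python
import json
import os
import tempfile


def save_code_table(path, codes, width, height):
    """Write the code table and image size to a JSON file."""
    payload = {
        "width": width,
        "height": height,
        # JSON has no tuple keys, so store each entry as [r, g, b, code].
        "codes": [[r, g, b, code] for (r, g, b), code in codes.items()],
    }
    with open(path, "w") as f:
        json.dump(payload, f)


def load_code_table(path):
    """Read back the table written by save_code_table."""
    with open(path) as f:
        payload = json.load(f)
    codes = {(r, g, b): code for r, g, b, code in payload["codes"]}
    return codes, payload["width"], payload["height"]


codes = {(255, 255, 255): "1", (0, 0, 0): "00", (100, 0, 200): "01"}
table_path = os.path.join(tempfile.mkdtemp(), "table.json")  # hypothetical
save_code_table(table_path, codes, 2, 2)
loaded_codes, width, height = load_code_table(table_path)
print(loaded_codes == codes, width, height)
```

Storing the width and height with the table is a design choice made here so that a bit file can be decoded without the original image; other layouts would work equally well.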

Your code should compute the following statistics each time the compression code is called on a file:

• Runtime in milliseconds of the compression process, including any time needed to create the data
  structures used to help with the compression process.
• Number of bits needed by your algorithm to encode the contents of the file.
• Number of bits needed by a fixed-length encoding to encode the contents of the file.
• The compression ratio achieved.

For the decompression process, you only need to track the runtime in milliseconds.
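The statistics above can be gathered with a small timing wrapper, sketched below. The names timed_ms, compression_stats and toy_compress are hypothetical, and toy_compress is only a placeholder for your own encoder. The sketch also assumes the compression ratio is defined as fixed-length bits divided by compressed bits; if your course defines the ratio the other way around, invert it.

```python
import time


def timed_ms(func, *args):
    """Run func(*args) and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def compression_stats(num_pixels, compressed_bits):
    """Return fixed-length bits, compressed bits, and their ratio."""
    fixed_bits = num_pixels * 24  # 8 bits each for red, green and blue
    # Assumed definition: ratio = fixed-length bits / compressed bits.
    ratio = fixed_bits / compressed_bits if compressed_bits else float("inf")
    return {"fixed_bits": fixed_bits,
            "compressed_bits": compressed_bits,
            "ratio": ratio}


def toy_compress(pixels):
    """Placeholder encoder: pretends every pixel costs 6 bits."""
    return "0" * (6 * len(pixels))


pixels = [(255, 255, 255)] * 100
bits, elapsed_ms = timed_ms(toy_compress, pixels)
stats = compression_stats(len(pixels), len(bits))
print(f"{elapsed_ms:.3f} ms", stats)
```

To match the first statistic, the timed function should include building any data structures (such as a frequency table or code tree), not just the final encoding pass. The same wrapper can time decompression.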

Algorithm: In addition to your code, you need to turn in a well-formatted document containing the
pseudo-code for your core algorithm (written in the same format as the textbook uses for pseudo-code)
along with any proofs you decide to include about your algorithm; see the grading section below.