This lesson looks at the basics of processing digital images using Python. Digital images are arrays of dots, called pixels, the small points of color which make up an image. We will manipulate the pixels one at a time to change the appearance of the image. Note that Python is rather slow for this kind of application. Serious image processing applications are usually written in a language such as C, which is designed more for execution speed than programmer convenience. However, such programs do exactly the same sorts of computation as we will perform.
Instead of beginning by starting Idle, download the following test images.
To download, right click and select save target or save image. The test images are small since Python tends to be slow at this, as noted above. They are GIFs, since that's what our example tool reads. You may save them anywhere you like, but remember the location since you'll have to find them again.
Now, download this Python program imdisp.pyw and save it on your desktop. Again, right click and tell the browser to save the target.
Okay, I was just putting it off. Start Idle, as before. This time, though, use the File menu to open the file imdisp.pyw you saved in the last step.
Run the program. You will see a window appear with a standard File menu at the top, an empty area, and six buttons at the bottom. Use the File menu at the top to Open the image Python_royal_35.gif which you downloaded earlier. It will appear in the window, though not all that quickly.
Press the Brighter button. After a short delay, the picture will get brighter. Try this a couple of times.
Press any of the other buttons. You'll just get a dialog box noting that this function is not yet implemented.
This program uses a number of Python features which we have not discussed. For the purposes of this lesson, most of this program will remain a mystery. But feel free to look at it all you like for your own enlightenment.
The portion we will work with in this lesson starts near the bottom of the file, beginning with the comment that says START HERE. It looks like this:
The program again uses Tkinter. The function mkbut is used to create each of the buttons at the bottom of the window. Each one is created with standard colors, with text and location sent as parameters. This is done with the six calls to mkbut which follow the function definition. Each is created with the command function sorry, which merely shows a window indicating that this function is not implemented. You will see this message if you click any button besides Brighter.
Following the button creation is the actual implementation of the Brighter button. The function dolighten performs the operation, and the call to lighten.configure assigns this new command function in place of sorry. Push the Brighter button to see what it does. Push it again.
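The pattern just described (create every button with a placeholder command, then swap in a real one) can be sketched in a few lines of Tkinter. This is only an illustration, not the code from imdisp.pyw: the name build_window, the message text, the grid layout, and the simplified mkbut parameters are all assumptions, and the stand-in dolighten here just prints a message.

```python
NOT_IMPLEMENTED = "Sorry, this function is not yet implemented."


def build_window():
    """Build a tiny window with two buttons (hypothetical layout)."""
    # Imported here so that merely loading this file does not require Tk.
    import tkinter as tk
    from tkinter import messagebox

    root = tk.Tk()

    def sorry():
        # Placeholder command: just tell the user nothing happens yet.
        messagebox.showinfo("Sorry", NOT_IMPLEMENTED)

    def mkbut(text, column):
        # Every button starts out wired to the placeholder command.
        but = tk.Button(root, text=text, command=sorry)
        but.grid(row=0, column=column)
        return but

    lighten = mkbut("Brighter", 0)
    mkbut("Red", 1)

    def dolighten():
        # Stand-in for the real work; the lesson's version changes pixels.
        print("Brighter pressed")

    # Replace the placeholder with the real command for this one button.
    lighten.configure(command=dolighten)
    return root
```

Calling build_window().mainloop() would show the window. The Red button still pops up the "not yet implemented" message, while Brighter now runs dolighten.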
If you want the original image back, you can open the file again. The buttons only change the displayed version.
The Brighter button does just what it says: It makes the picture brighter. The image is stored in an object called pic.
The picture is a two-dimensional array of pixels. Its size in pixels is described by pic.height() and pic.width(). Any particular pixel can be accessed by pic.idat[r][c], where r and c are the row and column number. Each can be any integer expression, ranging from 0 to pic.height()-1 and pic.width()-1, respectively.
We used subscripting in Lesson 2 and Lesson 3 to select individual characters from a string. We are applying the same operation here to select pixels out of an image.
Very often, we access the pixels with a double loop of the sort you see in dolighten. The outer for goes through each legal row number. For each row, we go through each legal column number. Inside the two loops, the program generates each possible pair of row and column numbers, and therefore refers to each pixel. The body of the double for loop, then, is executed exactly once for each pixel in the image.
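If you want to convince yourself that the double loop really touches every pixel exactly once, here is a small sketch you can run outside imdisp.pyw. The Picture class is a made-up stand-in that only imitates the height(), width() and idat parts described above; it is not the real class from the program.

```python
class Picture:
    """Hypothetical stand-in for the lesson's pic object."""

    def __init__(self, height, width):
        # idat is a list of rows; each row is a list of (dummy) pixels.
        self.idat = [[0] * width for _ in range(height)]

    def height(self):
        return len(self.idat)

    def width(self):
        return len(self.idat[0])


pic = Picture(3, 4)
visited = []
for r in range(pic.height()):        # each legal row number
    for c in range(pic.width()):     # each legal column number
        visited.append((r, c))       # the loop body runs once per pixel

print(len(visited))             # 12
print(visited[0], visited[-1])  # (0, 0) (2, 3)
```

A 3 by 4 picture has 12 pixels, and the loop body runs 12 times, from row 0, column 0 through row 2, column 3.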
Each pixel has three parts, r, g and b, for red, green and blue. Each is an integer value from 0 to 255 indicating how brightly that particular color is turned on. You can access and change these values, as shown here.
Consider the loop body in dolighten. First, the pixel at the current row and column is given the name p. Then, p.r is just an integer variable that is part of the pixel p. We could almost just add 30 to it, but the result could go past 255, the largest legal value. The min function returns the lower of its two arguments. If the sum p.r + 30 is less than 255, min returns that. If it is more, min returns 255.
We then do the same thing for green and blue.
Finally, you must say pic.sync() before the changes you make actually appear on the screen.*
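Putting the pieces together, a brightening loop might look roughly like the sketch below. The Pixel and Picture classes are hypothetical stand-ins so the example runs on its own, the function is named lighten_sketch to make clear it is not copied from imdisp.pyw, and sync() here does nothing because there is no screen to update.

```python
class Pixel:
    """Hypothetical pixel with red, green and blue parts (0 to 255)."""

    def __init__(self, r, g, b):
        self.r, self.g, self.b = r, g, b


class Picture:
    """Hypothetical stand-in for the lesson's pic object."""

    def __init__(self, rows):
        self.idat = rows

    def height(self):
        return len(self.idat)

    def width(self):
        return len(self.idat[0])

    def sync(self):
        pass  # the real method copies idat to the display


def lighten_sketch(pic):
    for r in range(pic.height()):
        for c in range(pic.width()):
            p = pic.idat[r][c]
            # Add 30 to each color, but never go above 255.
            p.r = min(p.r + 30, 255)
            p.g = min(p.g + 30, 255)
            p.b = min(p.b + 30, 255)
    pic.sync()


pic = Picture([[Pixel(10, 200, 240)]])
lighten_sketch(pic)
p = pic.idat[0][0]
print(p.r, p.g, p.b)  # 40 230 255
```

Notice that the blue part started at 240, so min held it at 255 instead of letting it reach 270.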
dolighten
to eliminate the
max
. That is, change the first one from
.configure()
statement.
The implementations vary somewhat in difficulty.
The int builtin function converts from floating point to integer by discarding the fractional part. As you can see (perhaps with the help of a calculator), a component of 30 becomes 0, and 225 becomes 255. Smaller values are reduced: 45 becomes 20; larger ones are increased: 180 becomes 196. Middling values are changed little: 120 becomes 118. The ratio 255/195 is the maximum pixel value you want divided by the maximum value of c - 30. Adding 0.5 is a very old trick to get the int function to round to the nearest integer. (If you don't believe it, try a few.) You might want to create an extra function to perform this mapping; then your loop body can call it in three places, one for each color component.
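One way such a mapping function might look is sketched below. The formula is worked out from the numbers quoted above (30 goes to 0, 225 goes to 255, and so on); the name stretch and the decision to clamp anything below 30 to 0 and anything above 225 to 255 are assumptions for this sketch.

```python
def stretch(c):
    """Spread the range 30..225 out over 0..255 (hypothetical helper)."""
    # Scale by 255/195, then add 0.5 so that int() rounds to nearest.
    value = int((c - 30) * 255 / 195 + 0.5)
    # Assumption: values that fall outside 0..255 are clamped.
    return max(0, min(255, value))


for c in (30, 45, 120, 180, 225):
    print(c, "becomes", stretch(c))
# 30 becomes 0
# 45 becomes 20
# 120 becomes 118
# 180 becomes 196
# 225 becomes 255
```

Inside the double loop you would then call it three times, once each for p.r, p.g and p.b.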
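To flip the image, pixels in row 0 must exchange with pixels in row pic.height()-1. Pixels in row 1 must exchange with pixels in row pic.height()-2, and so forth. You can write a for loop that goes through the row numbers in the top half of the image. The pixels in each row must then be exchanged with those in row pic.height() - row - 1. In Python, you can exchange two values with a single tuple assignment.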
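As a small illustration of the exchange, here is the swap on two plain variables, followed by the same row arithmetic on an ordinary list of lists. The list rows is a made-up stand-in for image data, and swapping whole rows at once is a shortcut for this sketch; with the real pic you would exchange the pixels one at a time.

```python
# Exchange two values in one assignment.
a, b = 1, 2
a, b = b, a
print(a, b)  # 2 1

# The same idea applied to rows: row pairs with height - row - 1.
rows = [["top"], ["middle"], ["bottom"]]
height = len(rows)
for row in range(height // 2):       # only the top half, or we swap back
    other = height - row - 1
    rows[row], rows[other] = rows[other], rows[row]

print(rows)  # [['bottom'], ['middle'], ['top']]
```

The loop stops halfway down because going all the way would exchange every pair twice and leave the picture as it started.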
[Images: sample results for the Red, Blk & White, Contrast, Flip, and Pixelized buttons]
Try your operations on each of the test images to see what happens. Photoshop is a tad more complex, but makes the same kinds of calculations you are making.
Try applying some of your operations multiple times. Except for Flip, once you have applied the operation, you cannot get your original image back, except by reloading it from the file. Could you create buttons to undo the other transforms without keeping a copy of the original image around? Could you create a Darker button that would always exactly reverse the Brighter button?
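To get you started on the Darker question, here is a tiny experiment on single color values. The functions brighter and darker are hypothetical: they assume Brighter adds 30 with a cap of 255, as described earlier, and that a Darker button would subtract 30 with a floor of 0.

```python
def brighter(v):
    return min(v + 30, 255)


def darker(v):
    return max(v - 30, 0)


for v in (100, 240):
    print(v, "->", brighter(v), "->", darker(brighter(v)))
# 100 -> 130 -> 100
# 240 -> 255 -> 225
```

The value 100 comes back unchanged, but 240 does not. What does that tell you about the very bright parts of a picture?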
Computers are high-speed detail artists. An image is manipulated by doing many fairly simple calculations over and over for each pixel in the image. The human mind sees a whole image, but the computer only "looks" at one pixel at a time. This is also the primary difficulty of programming. The computer executes many small instructions quickly, but it has no sense of what the program does. It is the programmer who must make the connection between the program and its parts. And that can be a challenge.
*This last step is needed because pic.idat is actually a copy of what is displayed on the screen, not the screen data itself. Your code updates the idat copy, and the sync method copies idat to the display area. The reason why this is needed is much longer than this footnote.
Copyright 2005, 2006 Thomas W Bennet • Image Credits • Terms of use: Creative Commons