Chapter 16. The Python Programming Language

In This Chapter

  • Manipulating numbers

  • Working with strings of characters

  • Making use of lists

  • Uncovering the fundamental structure of a Python program

SPSS has added Python, a general-purpose programming language, that can be used as an SPSS scripting language. Chapter 17 is about using Python inside SPSS, but this chapter is about the Python programming language itself. If you're not a programmer, don't worry; Python is famous for being easy to learn. And you might think it's named after a snake, but it isn't. It's named after Monty Python's Flying Circus. I just thought I'd mention that in case you thought things were going to get serious. And now for something completely different.

Instructing Python: You Type It In and Python Does It

If you give Python an instruction that it understands, it will obey the instruction and do something. It's very obliging that way. But you have to be specific when you tell it what you want it to do.

If you want a Python of your own, outside of the one that comes with SPSS, you can download and install one from the Internet for free. By playing with your own Python, you can see how the examples in this chapter work. The only way to really get to know a programming language is to fiddle around with it and write some programs of your own. Sometimes you get great insight into programming from finding out what doesn't work.

Python is an interpreter. That is, instead of taking your set of program instructions and translating them to machine language, it just reads and obeys whatever you type. In effect, it reads your commands as it would read a script — so Python programs are also called scripts.

Python can be used to generate graphic displays, communicate over the Internet, make calls into the operating system, and some other things that we won't be messing with. This chapter shows you just enough basic Python to get you comfortable with writing scripts for SPSS.

If you want to get your own private Python, you can get your own standalone version at www.python.org. When you fire up the standalone version of Python, it displays >>> as a prompt for you to give it some instructions. If you type something it knows how to do, it will do it. If you type something it doesn't understand, it will complain — but it won't bite. Remember, it's not a snake.

Understanding the Way Python Does Arithmetic

Statistics is made out of arithmetic, and Python is good at arithmetic. You can enter any expression you want, and Python will do the calculations and give you the answer.

Let's start with something simple. At the prompt, type a simple addition such as the following. Python comes back with a result:

>>> 2 + 2
4

You can use multiplication, division, decimal points, parentheses, and all sorts of fancy stuff:

>>> (88 + 2) / 6
15

The symbol for multiplication is the asterisk:

>>> 10 * 10
100

If you do integers, Python does integers. If you do decimal points, Python does decimal points. Integer arithmetic just chops off, like this:

>>> 7/2
3

And arithmetic that uses decimal points (floating-point arithmetic) keeps the fractional portion in decimal form, like this:

>>> 7.0/2.0
3.5

Note

Actually, the behavior of integer division varies with your version of Python. In Python 2, dividing one integer by another chops off the fraction; in Python 3, the / operator keeps the fraction, so 7/2 gives 3.5. It's always best to put in the decimal points just to make sure what your result will be. If you always use decimal points in your numbers, the behavior is consistent for all versions of Python.

You can mix integer and decimal numbers in the same expression, but watch what you're doing. Whenever any operation involves at least one number with a decimal point, Python treats all the numbers as if they have decimal points. For example:

>>> 7/2.0
3.5

However, you have to be really careful when you mix the number types like that. You could get something other than what you expect. The following two examples look the same, but they could be different, depending on your version of Python:

>>> 7.0/2.0 + 4.5
8.0
>>> 7/2 + 4.5
7.5

The first example performs a decimal-point division and winds up adding 3.5 to 4.5. The second example performs an integer division — which chops off the decimal part — and winds up adding 3 to 4.5. These results are different in a very practical sense: One of them is wrong for whatever you happen to be calculating.
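
If you'd rather not depend on which version of Python you happen to be running, here's a small sketch of one way to say exactly what you mean. It leans on the // operator (floor division), which this chapter doesn't otherwise cover but which is a standard part of the language. The variable names are made up for illustration.

```python
# Floor division (//) rounds the quotient down to a whole number,
# no matter which version of Python runs it.
chopped = 7 // 2

# Putting decimal points in the numbers always keeps the fraction.
exact = 7.0 / 2.0

# The two sums from the text, written so the version no longer matters.
with_chop = 7 // 2 + 4.5
with_fraction = 7.0 / 2.0 + 4.5

print(chopped, exact, with_chop, with_fraction)
```

Written this way, a reader can tell which of the two answers you were after.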

Instead of just printing the numbers on the display, as we've done so far, you can store them in a name, called a variable. The three dimensions of a box could be stored in variables this way:

>>> height=20.0
>>> width=9.0
>>> depth=12.0

No number is displayed by these statements because simply storing a number somewhere doesn't display it. Python remembers those names and numbers for you. You can calculate the volume of the box and have it displayed this way:

>>> height * width * depth
2160.0

Note

In Python, the equal sign (=) is the assignment operator: It takes the value from whatever you put on the right and stores it in whatever location you name on the left. It writes over whatever was there before.

If you want, you could store the volume of the box in the current example in another variable and then display it, like this:

>>> volume = height * width * depth
>>> volume
2160.0

Warning

Whatever name you enter is the one Python uses. If you spell it wrong, it's a different name, so use names that are easy to spell. And don't use things like the uppercase letter I and the lowercase letter l because they can be mistaken for each other and confused with the number 1. And watch out for the letter O and the number 0.

Python has the memory of an elephant snake. After you stick a value in a variable, Python will remember it forever. (Well, at least until you stop running the program.) If you want to really save a value, write it to a file on disk so you can read it back. (That's easy to do, and we get to it later.)

As you've seen, if you simply name a value or a variable, Python prints it for you. You can also use the print function, like this:

>>> print(volume)
2160.0
>>> print(height,width,depth)
20.0 9.0 12.0

Python remembered these variables from earlier. And you can see how the print function can handle more than one value at a time.
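
Here's a minimal sketch of the same box written as a script instead of typed at the prompt. It also shows the assignment operator writing over an old value. The name first_volume is my own addition, used only to hang on to the earlier result.

```python
height = 20.0
width = 9.0
depth = 12.0

# Store the result instead of just displaying it.
volume = height * width * depth
first_volume = volume
print(first_volume)

# Assigning to depth again writes over the old value...
depth = 6.0

# ...but volume keeps its old number until you calculate it again.
volume = height * width * depth
print(volume)
```

Notice that changing depth doesn't reach back and change volume. Python stores numbers, not formulas.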

Understanding the Way Python Handles Words

If you want Python to notice what you're saying, you'll have to put it in quotes. You can use single or double quotes, but whichever one you use at the start is the one you must use at the finish. Like this:

>>> 'Single quotes'
'Single quotes'
>>> "Double quotes"
'Double quotes'

If you enter a quoted string by itself this way, Python echoes it back to you just as it would a number. Python usually uses single quotes when it echoes, but that's just an attitude problem and doesn't matter.

Note

In the world of computer programming, any group of characters used to make up a name, a sentence, or anything you can read is called a string. Also, a blank is a character just like any other, except you can't see it if you're a mere mortal.

You can put single quotes inside double quotes and double quotes inside single quotes, like this:

>>> "Girl's clothes?"
"Girl's clothes?"
>>> '"Girl clothes?" he asked'
'"Girl clothes?" he asked'

Hmm. This time Python uses double quotes to display the string that contains a single quote. Attitude meets necessity. Python figures out which kind of quotes it needs to use to be consistent. Don't think about it too much. Let's move on to an example of storing a string in a variable:

>>> fred="Is this a cheese shop? "
>>> fred
'Is this a cheese shop? '

You can stick a string in a variable exactly the way you can a number. You can even add one string to another one, like this:

>>> herbie = fred + "Is this a parrot shop?"
>>> herbie
'Is this a cheese shop? Is this a parrot shop?'

As you can imagine, the strings can get long. You can make them show up on more than one line by inserting a \n (newline) character and using the print function, like this:

>>> herbie = herbie + "\nNo. This is for lumberjacks."
>>> print(herbie)
Is this a cheese shop? Is this a parrot shop?
No. This is for lumberjacks.

The print function translates \n as being the start of a new line. If you just echo the variable, it doesn't work; you just get a backslash and an n in the output.
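
To see the difference between echoing and printing in a script, here's a small sketch. It uses the built-in repr() function to get the same text the prompt would echo back. The names echoed and line_count are just for illustration.

```python
herbie = "Is this a cheese shop? Is this a parrot shop?"
herbie = herbie + "\nNo. This is for lumberjacks."

# repr() gives the echoed form: the newline shows up as a backslash and an n.
echoed = repr(herbie)
print(echoed)

# print() turns the newline character into an actual line break.
print(herbie)

# Splitting on the newline character shows the string holds two lines.
line_count = len(herbie.split("\n"))
```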

Now for something slightly different: Using triple quotes — three sets of either single or double quotes — causes the automatic insertion of newline characters into your string whenever you start a new line. You can organize formatted text with it, like this:

>>> hebert="""
... Algy met a bear
... The bear was bulgy
... The bulge was Algy
... """
>>> print(hebert)
Algy met a bear
The bear was bulgy
The bulge was Algy
>>>

Notice that Python drops the normal >>> prompt while you're entering the triple-quoted string and uses three dots (...) instead. It's not important — it's just another example of Python assuming an attitude.

You can use either single quotes or double quotes to construct your triple quotes. (If that sentence makes any sense to you, you're really getting into this. Let's move on.) I showed you earlier how you can add strings; now I'll show you how they can be multiplied:

>>> essword="spam "
>>> print(essword * 7)
spam spam spam spam spam spam spam

If you want to define a long string, you can break it and enter it on more than one line, like this:

>>> go="Now is the time for all good men to\
... get out of town."
>>> print(go)
Now is the time for all good men toget out of town.

When you are entering a string of characters, you can put a backslash (\) at the end of the line and continue at the beginning of the next line just as if you had continued on the same line. As you can see by toget in the output line, I should have added a space after to and before the backslash. You can also build long strings by placing smaller strings next to one another without putting in a plus sign:

>>> hank="ugly "  'dog'
>>> hank
'ugly dog'

You might want to do it that way and you might not. I think a plus sign between the two makes it a lot clearer, but you might want to leave it out just to show off. (That's what I was doing when I put this example in the book.)

Okay. That's enough about putting strings together. Let's take some apart. It's easy because you can refer directly to each letter by its position number. The letter at the extreme left is number 0, the next one is number 1, one after that is number 2, and so on. For example, to pull the first letter out of the string of the preceding example, you just address it by number, like this:

>>> hank[0]
'u'

If you want to extract a range of characters, just use the number of the first character you want and the number of the character following the last one you want, and put a colon between the two, like this:

>>> hank[2:6]
'ly d'

If you use the colon but leave out the first number, Python assumes 0. If the ending number is missing, it assumes the end of the string. Here's an example:

>>> hank[:4]
'ugly'
>>> hank[5:]
'dog'

You can use extraction to build new strings by adding the pieces together like this:

>>> frank = 'very ' + hank[:4] + ' fat ' + hank[5:]
>>> frank
'very ugly fat dog'
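
Here's a small sketch of why the "number following the last one you want" rule is handy: the two halves of a split always fit back together, and the length of a slice is just the second number minus the first. The variable cut is a made-up name for the split position.

```python
hank = "ugly dog"
cut = 4

# Everything before position 4, and everything from position 4 onward.
left = hank[:cut]
right = hank[cut:]

# The pieces rejoin into the original with nothing lost or doubled.
rejoined = left + right

# A slice from 2 up to (but not including) 6 holds 6 - 2 = 4 characters.
middle = hank[2:6]
print(left, right, rejoined, middle)
```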

One of the questions that always comes up in a program is, "How long is that string?" Here's how to find out:

>>> len(hank)
8
>>> len(frank)
17

You will find lots of functions that do things to strings and hand you back new, different strings. The original string is never changed; you can't really change an existing string, no matter what you do. To make a difference in a string, you have to create a new string and then replace the original. Here are some examples of functions that create modified strings (find is the odd one out: it reports the position where a piece of text starts rather than building a new string):

>>> hank.capitalize()
'Ugly dog'
>>> hank.find("dog")
5
>>> hank.replace('g','x')
'uxly dox'
>>> hank.title()
'Ugly Dog'
>>> hank.upper()
'UGLY DOG'

Remember: None of these examples changed the original. They produced new strings. But this group of functions is just the tip of the iceberg. You'll find a Python function to do just about anything you can imagine to a string.
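
If you want to convince yourself that strings really can't be changed, here's a minimal sketch. It tries to assign to one letter, catches the complaint Python raises (a TypeError), and then does it the approved way by replacing the whole string. The names shout and changed are only for illustration.

```python
hank = "ugly dog"

# upper() builds a new string; hank itself is untouched.
shout = hank.upper()

# Trying to overwrite a single letter is not allowed for strings.
try:
    hank[0] = "U"
    changed = True
except TypeError:
    changed = False

# The approved way: build a new string and store it under the old name.
hank = hank.capitalize()
print(shout, changed, hank)
```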

Understanding the Way Python Handles Lists

You can have a variable hold an arbitrary collection of strings and numbers. You address any specific one by its position number in the list, with the first one in the list being number 0, as in the following:

>>> jam=['a',100,"c",'dee']
>>> jam
['a', 100, 'c', 'dee']
>>> jam[0]
'a'
>>> jam[1]
100
>>> jam[1:3]
[100, 'c']

In this example, you can see where four things were stuffed into the variable named jam. When the variable was displayed, all four items it contained were displayed. You can, however, use a position value to refer to individual items in the list and address them one item at a time, or select a subset of the items in the list.

Note

When you use a pair of position numbers, the first number is the number of the first item you want, but the second number is the number following the last item you want. The first item in the list is always number zero.

Position values work on lists the same way they work on strings. But strings can't be modified, only replaced. Lists can be modified. You can replace one member of a list by simply assigning a new thing to it, like this:

>>> jam = ['a', 100, 'c', 'dee']
>>> jam
['a', 100, 'c', 'dee']
>>> jam[2]='hooha'
>>> jam
['a', 100, 'hooha', 'dee']

You can quickly find out how many things are in a list:

>>> len(jam)
4

Lists are one of the really nice things about Python. If you want to do something to a list, try it — and it will probably work. You can even put lists inside lists, like this:

>>> jam[0] = ['apple', 'pear']
>>> jam
[['apple', 'pear'], 100, 'hooha', 'dee']
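
Here's a short sketch that gathers the list operations so far into one script. It also reaches inside a nested list by using two position numbers in a row, and tacks a new item onto the end with append(), a standard list function that this chapter doesn't otherwise cover.

```python
jam = ['a', 100, 'c', 'dee']

# Replace one member, then swap another member for a whole list.
jam[2] = 'hooha'
jam[0] = ['apple', 'pear']

# Two position numbers in a row: item 0 of jam, then item 1 of that list.
second_fruit = jam[0][1]

# append() adds one more item to the end of the list.
jam.append(3.5)

print(jam, second_fruit, len(jam))
```

The nested list counts as a single item, so len(jam) counts it once.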

Making Functions

Python can remember a set of instructions for you, and you can later call on that set by name. Here's a simple example that divides a number in half and displays the results:

>>> def showhalf(x):
...     print(x/2.0)

The line with the def command names this as a function called showhalf. And don't forget the colon on the end of the def line. This example has one variable, named x, used in the body of the function. All statements following the definition line are included as part of the function, as long as you indent them. When you type a line that is less indented, the function ends. Python then remembers your definition of the function; you can use it as often as you like. Here's an example:

>>> showhalf(10)
5.0
>>> bunch=100
>>> showhalf(bunch)
50.0

Whatever value you include in the parentheses becomes the value of x inside the function. The following example shows that instead of just doing something inside (as in the previous example), the function can return a value to you:

>>> def getthird(value):
...     return(value / 3.0)
...
>>> j = 9
>>> k = getthird(j)
>>> print(k)
3.0

In this example, whatever value is passed to the function is divided by 3 and the result comes back because it is part of a return statement. You can pass anything into a function and return anything else: strings, numbers, lists, whatever you want.

It's normal to have a Python program begin with a bunch of function definitions and then have the body of the program use the functions to do its work. Functions can even call other functions, but be careful. Too much disorganization leads to spaghetti code — a tangle so convoluted that you can't read it.

You should know that although you can get only one value back from a function, you can pass lots of values to one. Here's an example of a function that requires more than one value for its input:

>>> def showsum(a,b,c):
...     print(a+b+c)
...
>>> showsum(3,5,9)
17

Tip

The limit of being able to return only one value from a function is never a problem. If you find that you really need to return more than one value, you can return a list, but in reality you probably need more than one function.
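
In case you do find yourself needing two answers at once, here's a minimal sketch of returning a list. The function name low_and_high is hypothetical; min() and max() are built-in Python functions.

```python
def low_and_high(values):
    # Hypothetical helper: bundle two results into one returned list.
    return [min(values), max(values)]

# The three box dimensions used earlier in the chapter.
result = low_and_high([20.0, 9.0, 12.0])

# Pull the two answers back out by position number.
low = result[0]
high = result[1]
print(low, high)
```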

Here's a nifty trick: You can define your function to have some defaults for some of the values you pass to it. Then, if you leave out any of those values when you invoke the function, the defaults will be used:

>>> def spark(a,b="too big",c=44):
...     if (a > c):
...         print(b)
...
>>> spark(20)
>>> spark(50)
too big
>>> spark(100,"way too large")
way too large

In this example, the function named spark() has three arguments: a, b, and c. The last two have default values. The function simply tests whether the value of a is larger than the value of c; if so, it prints b. In the example, the first call to the function sets the value of a to 20, which is not larger than c, so nothing happens. In the second call, a is set to 50, and that's larger than c, so the default content of b is printed. The last call to spark() has a value for a that is larger than c, but the string printed for b is different because the value passed to the function overrides the default.
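
Here's a variation on spark(), offered as a sketch rather than the one true way. This version returns the message instead of printing it, which makes the result easier to check. The last call names the argument it wants to override (c=60); naming arguments like that is a standard Python feature, though it isn't covered elsewhere in this chapter.

```python
def spark(a, b="too big", c=44):
    # Return the message when a is larger than c; otherwise return None.
    if a > c:
        return b
    return None

r1 = spark(20)                    # 20 is not larger than 44
r2 = spark(50)                    # default message
r3 = spark(100, "way too large")  # message overridden by position
r4 = spark(50, c=60)              # limit overridden by name; b keeps its default
print(r1, r2, r3, r4)
```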

Function definitions are, in a way, the heart of the system. You normally write a program by defining your own functions and using them along with Python's plentiful built-in functions. This program structure becomes particularly convenient when you do the same sort of thing more than once, but the most important characteristic of this program structure is that you can organize your instructions in a logical way. The main problem with programs isn't writing them — it's fixing them later when they don't work the way you want. And the main problem with fixing them is finding out where to make the change. Be organized!

Making Decisions with if/else

Often you'll have a statement or two that you want to execute only under certain conditions. You can use an if statement to ask a question (essentially a true/false test); the indented statements following the if statement are executed only if the answer to your question is true. Here's an example:

>>> x = 3
>>> if x < 5:
...     x = 20
...     print(x)
...
20

Note

You can group statements together and execute them as a single unit by putting them together as a block — two or more consecutive statements indented by the same amount.

Sometimes you'll want to do one thing under some circumstances and something different under other circumstances. That's where you can use else:

>>> x = 10
>>> if x < 8:
...     print('x is less than 8')
... else:
...     print('x is not less than 8')
...
x is not less than 8

Instead of just ending the statements in the if block, this code uses the else keyword, followed by a colon, to start a new block. Result: When the code is executed, it skips one block and runs the other. Using if/else statements this way means that one, and only one, of the two blocks of code executes.

Using if blocks in code is common. If you write a script of any complexity, you will nest such blocks inside one another. With a bit of practice, you will get proficient at doing such things. One odd situation comes up, however — usually when you back up to change something: you find yourself having to put in some code that does nothing at all. Python is persnickety about its syntax, and there are places where you are always required to put something in, but you may find that you don't want the code to do anything at that point. To the rescue comes the keyword pass, which you can use like this:

>>> x = 3
>>> if x < 8:
...     pass
... else:
...     print('x is not less than 8')
...

This example has no output; all it does is execute the pass command, which does absolutely nothing. But perhaps you want to use the if statement to select a single action among several possible choices. You can do that as follows:

>>> x = 8
>>> if x < 8:
...     print('x is less than eight')
... elif x == 8:
...     print('x is equal to eight')
... else:
...     print('x is greater than eight')
...
x is equal to eight

The elif keyword is short for else if — you can use it to add another condition, followed by another block of statements that will be executed only if that second expression is true. You can daisy-chain as many of these elif statements as you want, and only the first one found to be true is executed — the rest are skipped. You can have only one else statement, and it must come last.
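
To watch all three branches do their thing, here's a small sketch that wraps the same test in a function and calls it three times. The function name compare_to_eight is made up for this example.

```python
def compare_to_eight(x):
    # Hypothetical helper: returns a description instead of printing one.
    if x < 8:
        return 'less than eight'
    elif x == 8:
        return 'equal to eight'
    else:
        return 'greater than eight'

# One call for each branch.
answers = [compare_to_eight(3), compare_to_eight(8), compare_to_eight(20)]
print(answers)
```

Only one block runs per call, which is why each call produces exactly one answer.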

When you're telling the computer what to do, be sure to say what you mean. Although the single equal sign (=) is the assignment operator and is used to copy data, the test for a couple of values being equal is the double equal sign (==). You can include the greater than or equal to test with >=, the less than or equal to test with <=, and the not equal to test with !=.

You can also use and, or, and parentheses in expressions. Things can get complicated if you need to ask a hard question. For example, the following is true only if aa is greater than or equal to bb and x is not equal to y:

>>> if (aa >= bb) and (x != y):

Don't think too much about what that statement means. (That just leads to headaches.) I wanted to show it to you so you'd know that that sort of thing is possible if you really need it, or if you find yourself with a sudden urge to do something baroque.
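
If you do feel the urge, here's a minimal sketch of that test as a complete statement. The values given to aa, bb, x, and y are invented for illustration, and so are the names verdict and either.

```python
aa = 10
bb = 7
x = 'left'
y = 'right'

# Both parts must be true for the first block to run.
if (aa >= bb) and (x != y):
    verdict = 'both tests passed'
else:
    verdict = 'at least one test failed'

# With or, one true part is enough.
either = (aa < bb) or (x != y)
print(verdict, either)
```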

Doing It Over Again with for and while

Repetition occurs often in programming because it's often necessary. Having your program go back through the same code again is called looping, or iteration. (You're probably familiar with the word reiterate, which means to repeat something.)

You can iterate in Python by using the for keyword, like this:

>>> bog = ['first',50,'third',800,3.14159]
>>> for x in bog:
...     print(x)
...
first
50
third
800
3.14159

First you create a list and then set up a variable in the for loop to iterate through the list. The loop executes once for each member of the list; for each iteration, the variable assumes the value of a member of the list. It couldn't be easier. (Well, if you think of an easier way, tell the folks at Python, and I'm sure they'll put it in the language.)

Warning

Don't change any of the values in the list while you're inside the loop. The results of doing that are unpredictable, and the last thing you want in your computer is a confused Python. If you absolutely, positively have to change the list inside the loop, use a copy of the list to iterate.
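
Here's a sketch of the copy trick. A slice with both numbers left out, bog[:], makes a copy of the whole list, so the loop can walk through the copy while the original gets changed. The built-in isinstance() function and the list's remove() function are standard Python, but neither is covered elsewhere in this chapter.

```python
bog = ['first', 50, 'third', 800, 3.14159]

# Loop over a copy so that removing items can't confuse the loop.
for item in bog[:]:
    # Throw away the strings and keep the numbers.
    if isinstance(item, str):
        bog.remove(item)

print(bog)
```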

It's common in other programming languages to iterate a specific number of times. You can do that in Python if you feel you must. A built-in function called range() supplies a sequence of numbers (an actual list in Python 2, a range object in Python 3) and lets you iterate a set number of times. You can do it this way if you feel an irresistible urge to count:

>>> for z in range(5):
...     print(z)
...
0
1
2
3
4

Or you can use the range() function for starting at some value other than 0, like this:

>>> for y in range(5,10):
...     print(y)
...
5
6
7
8
9
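
If you'd like to look at the numbers range() hands out without writing a loop, here's a small sketch. Wrapping the result in the built-in list() function gives you a plain list in any version of Python. The variable names are made up.

```python
# The numbers behind the two loops above.
counts = list(range(5))
later = list(range(5, 10))

# The second number is the one following the last one you get.
total = 0
for n in range(1, 11):
    total = total + n

print(counts, later, total)
```

The stopping rule is the same one used for position numbers in strings and lists: the range ends just before the second number.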

Iterating by a count is actually not a different capability of the language; the range() function simply supplies the sequence of numbers needed for the count. A genuinely different capability is found in the other looping statement, named while. It works a lot like if, except it repeats, testing a conditional expression each time around to determine when to stop. while continues to execute its block of statements as long as the condition it tests comes up true. The following is a simple example:

>>> x = 2
>>> while x < 8:
...     print(x)
...     x = x + 1
...
2
3
4
5
6
7

Warning

When inside a while loop, make sure you do something that affects the value of the expression tested by the while command. Otherwise, you could be caught in the loop forever. And that's an embarrassingly long time.

I said earlier that a while statement is sort of like an if statement. In fact, it is so much like an if statement that you can put an else at the end of the block of a while statement, like this:

>>> x = 7
>>> while x < 9:
...     print(x)
...     x = x + 1
... else:
...     print("The loop is done")
...
7
8
The loop is done

The first part of the loop works just like an if statement, except it executes over and over as long as the conditional expression is true. When the expression becomes false, the else part of the statement executes once — and then the while statement is finished.

"But, hold varlet," you shout, drawing your sword. "A statement following the loop would execute once without regard to the presence of else." Whereupon I wisely retort, "Stay your hand. Bear with my discourse but a bit longer and I will show you purpose." Then I cleverly explain the operations of continue and break.

A continue statement anywhere inside a for loop or a while loop will cause the rest of the statements inside the loop to be skipped. That is, the continue keyword jumps immediately to the bottom of the loop, allowing things to come back around again normally.

A break statement inside a loop will cause the while or for loop to be abandoned as if all iterations had completed, regardless of whether that is the case. In fact, when a break statement abandons the execution of a loop, it will also cause any terminating else code to be skipped. This is where you slip your sword back into its scabbard, muttering, "I'll get you next time."
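
Here's a minimal sketch that puts continue, break, and a loop's else to work in one place. The function name find_first_negative is hypothetical. The else block runs only when the loop finishes without hitting break.

```python
def find_first_negative(numbers):
    for n in numbers:
        if n >= 0:
            # Not what we're after: skip the rest and go around again.
            continue
        # Found one: remember it and abandon the loop.
        found = n
        break
    else:
        # Reached only when the loop ran out of numbers without a break.
        found = None
    return found

hit = find_first_negative([3, 5, -2, 7, -9])
miss = find_first_negative([1, 2])
print(hit, miss)
```

That's the purpose of else on a loop: it lets you tell "ran to completion" apart from "bailed out early" without keeping a separate flag.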

"One more thing!" I shout. "It is common to nest for and while loops inside one another. When that happens, the continue and break statements only continue or break the innermost loop." I mention this only because it's the kind of thing that can send you on a long and fruitless bug hunt.
