Appendix C. Python 3: The Evolution of a Programming Language

Matz (the author of Ruby) has a great quote, “Open Source needs to move or die.”

—Guido van Rossum, March 2008
(verbally at PyCon conference)

Python 3 represents an evolution of the language such that it will not execute most older code that was written against the version 2.x interpreters. This doesn’t mean that you won’t recognize the old code any more, or that extensive porting is required to make old code work under version 3.x. In fact, the new syntax is quite similar to that of the past. However, because the print statement no longer exists, even trivial old code can easily break. In this appendix, we discuss print and other version 3.x changes, and we shed some light on the evolution that Python must undergo to be better than it was before. Finally, we present a few migration tools that might help you to make this transition.

C.1. Why Is Python Changing?

Python is currently undergoing its most significant transformation since it was released in the early 1990s. Even the revision change from 1.x to 2.x in 2000 was relatively mild: Python 2.0 ran 1.5.2 software just fine. One of the main reasons for Python’s stability over the years has been the steadfast determination of the core development team to preserve backward compatibility. Over the years, however, certain “sticky” flaws (issues that hang around from release to release) were identified by creator Guido van Rossum, Andrew Kuchling, and other users (refer to the references section at the end of this appendix for links to relevant articles). The persistence of these flaws made it clear that a release with hard changes was needed to ensure that the language evolved. The 3.0 release in 2008 marked the first time that a Python interpreter was released that deliberately breaks backward compatibility.

C.2. What Has Changed?

The changes in Python 3 are not mind-boggling—it’s not as if you’ll no longer recognize Python. The remainder of this appendix provides an overview of some of the major changes:

• print becomes print().

• Strings are Unicode by default.

• There is a single class type.

• The syntax for exceptions has been updated.

• Integers have been updated.

• Iterables are used everywhere.

C.2.1. print Becomes print()

The switch to print() is the change that breaks the greatest amount of existing Python code. Why is Python changing print from a statement to a built-in function (BIF)? Having print as a statement is limiting in many regards, as detailed by Guido in his “Python Regrets” talk, in which he outlined what he feels are shortcomings of the language. In particular, a statement is difficult to improve or extend. When print() is available as a function, however, new keyword parameters can be added, certain standard behaviors can be overridden with keyword parameters, and print() can be replaced if desired, just like any other BIF. Here are before-and-after examples:

Python 2.x

>>> i = 1
>>> print 'Python' 'is', 'number', i
Pythonis number 1

Python 3.x

>>> i = 1
>>> print('Python' 'is', 'number', i)
Pythonis number 1

The omission of a comma between 'Python' and 'is' is deliberate; it was done to show you that direct string literal concatenation has not changed. You can see more examples in the “What’s New in Python 3.0” document (refer to the references section at the end of this appendix). You can find additional information about this change in PEP 3105.
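As a minimal sketch of the flexibility gained by making print() a function, the following Python 3 snippet overrides the default separator and line ending and redirects the output. The io.StringIO buffer is only an illustrative stand-in for any file-like object.

```python
import io

# In-memory text buffer standing in for a real file (illustrative only).
buffer = io.StringIO()

# sep replaces the default single space between arguments,
# end replaces the default trailing newline,
# file redirects the output away from sys.stdout.
print('Python', 'is', 'number', 1, sep='-', end='!\n', file=buffer)

captured = buffer.getvalue()
print(captured, end='')
```

None of these overrides could be expressed as keyword parameters with the old print statement.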

C.2.2. Strings: Unicode by Default

The next “gotcha” that current Python users face is that strings are now Unicode by default. This change couldn’t have come soon enough. Not a day goes by that countless Python developers don’t run into a problem when dealing with Unicode and regular ASCII strings that looks something like this:

UnicodeEncodeError: 'ascii' codec can't encode character
u'\xae' in position 0: ordinal not in range(128)

These types of errors will no longer be an everyday occurrence in 3.x. For more information on using Unicode in Python, see the Unicode HOWTO document (refer to the References section at the end of this appendix for the Web address). With the model adopted by the new version of Python, users shouldn’t even use the terms Unicode and ASCII/non-Unicode strings anymore. The “What’s New in Python 3.0” document sums up this new model pretty explicitly.

Python 3 uses the concepts of text and (binary) data instead of Unicode strings and 8-bit strings. All text is Unicode; however, encoded Unicode is represented as binary data. The type used to hold text is str, and the type used to hold data is bytes.

With regard to syntax, because Unicode is now the default, the leading u or U is deprecated. Similarly, literals for the new bytes objects require a leading b or B (more information can be found in PEP 3112).
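The text and data model can be sketched in a few lines of Python 3. The sample values and variable names here are illustrative assumptions, not taken from the referenced documents.

```python
text = 'caf\u00e9'               # str: Unicode text, no leading u needed
data = text.encode('utf-8')      # bytes: the encoded (binary) form of the text

# The accented character is one character of text but two bytes of UTF-8 data.
print(len(text), len(data))

# Decoding the data recovers the original text.
round_trip = data.decode('utf-8')

# bytes objects are immutable; bytearray is the mutable counterpart.
buf = bytearray(b'spam')
buf[0] = ord('S')

# Text and data no longer mix implicitly.
try:
    'abc' + b'def'
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

The TypeError raised by the last operation shows that converting between text and data is now always an explicit encode() or decode() step.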

Table C-1 compares the various string types, showing how they will change from version 2.x to 3.x. The table also includes a mention of the new mutable bytearray type.

Table C-1. Strings in Python 2 and 3


C.2.3. Single Class Type

Prior to Python 2.2, Python’s objects didn’t behave like classes in other languages: classes were “class” objects and instances were “instance” objects. This is in stark contrast to what people perceive as normal: classes are types and instances are objects of such types. Because of this “flaw,” you could not subclass data types and modify them. In Python 2.2, the core development team came up with new-style classes, which act more like what people expect. Furthermore, this change meant that regular Python types could be subclassed—a change described in Guido’s “Unifying Types and Classes in Python 2.2” essay. Python 3 supports only new-style classes.
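As a brief illustration of what unified types and classes make possible, the sketch below subclasses the built-in dict type in Python 3. CountingDict is a hypothetical class invented for this example.

```python
class CountingDict(dict):
    """Hypothetical dict subclass that counts key lookups."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.lookups = 0

    def __getitem__(self, key):
        # Record the lookup, then defer to the built-in behavior.
        self.lookups += 1
        return super().__getitem__(key)


d = CountingDict(a=1)
value = d['a']

# The instance's type is its class, and the class derives from a built-in type.
print(type(d) is CountingDict)
print(isinstance(d, dict))
```

In Python 3, every class implicitly derives from object, so there is no separate old-style form to opt out of.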

C.2.4. Updated Syntax for Exceptions

Exception Handling

In the past, the syntax to catch an exception and the exception argument/instance had the following form:

except ValueError, e:

To catch multiple exceptions with the same handler, the following syntax was used:

except (ValueError, TypeError), e:

The required parentheses confused some users, who often attempted to write invalid code looking like this:

except ValueError, TypeError, e:

The (new) as keyword is intended to ensure that you do not become confused by the comma in the original syntax; however, the parentheses are still required when you’re trying to catch more than one type of exception using the same handler. Here are two equivalent examples of the new syntax that demonstrate this change:

except ValueError as e:

except (ValueError, TypeError) as e:

The remaining version 2.x releases beginning with 2.6 accept both forms when creating exception handlers, thereby facilitating the porting process. You can find more information about this change in PEP 3110.
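Here is a small, complete sketch of the new handler syntax in use. The to_int() helper is a hypothetical function written for this example; it runs under Python 3 and, because of the backport, under 2.6 and later as well.

```python
def to_int(value):
    """Return int(value), or None if the value cannot be converted."""
    try:
        return int(value)
    except (ValueError, TypeError) as e:
        # One handler for both exception types; e is the exception instance.
        print('conversion failed: %s' % type(e).__name__)
        return None


good = to_int('42')
bad_text = to_int('spam')    # int() raises ValueError
bad_type = to_int(None)      # int() raises TypeError
```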

Raising Exceptions

The most popular syntax for raising exceptions in Python 2.x is as follows:

raise ValueError, e

To truly emphasize that you are creating an instance of an exception, the only syntax supported in Python 3.x is the following:

raise ValueError(e)

This syntax really isn’t new at all. It was introduced over a decade ago in Python 1.5 (yes, you read that correctly) when exceptions changed from strings to classes, and we’re sure you’ll agree that the syntax for class instantiation looks a lot more like the latter than the former.
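The instantiation form looks like this in a complete example. The check_age() function and its message are illustrative assumptions.

```python
def check_age(age):
    """Return age unchanged, rejecting negative values."""
    if age < 0:
        # Create an exception instance explicitly, then raise it.
        raise ValueError('age must be non-negative, got %d' % age)
    return age


try:
    check_age(-1)
except ValueError as e:
    message = str(e)

print(message)
```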

C.2.5. Updates to Integers

Single Integer Type

Python’s two different integer types, int and long, began their unification in Python 2.2. That change is now almost complete, with the new int behaving like a long. As a consequence, OverflowError exceptions no longer occur when you exceed the native integer size, and the trailing L has been dropped. This change is outlined in PEP 237. long still exists in Python 2.x but has disappeared in Python 3.0.
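A short Python 3 sketch of the unified type follows. It uses sys.maxsize, which reflects the platform's native word size, purely as a convenient boundary value.

```python
import sys

# Going past the native word size neither overflows nor changes the type.
big = sys.maxsize + 1
print(type(big).__name__)

# Arbitrarily large values are ordinary ints, written without a trailing L.
googol = 10 ** 100
digits = len(str(googol))
print(digits)
```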

Changes to Division

For users who are new to programming, the classic division operator (/) doesn’t give the expected answer when both operands are integers, so its behavior has been changed. If this change has brought any controversy, it is simply that experienced programmers are used to the floor division functionality. To see how the confusion arises, try to convince a programming newbie that 1 divided by 2 is 0 (1 / 2 == 0). The simplest way to describe this change is with examples. Following are some excerpted from “Keeping Up with Python: The 2.2 Release,” found in the July 2002 issue of Linux Journal. You can also find out more about this update in PEP 238.

Classic Division

The default Python 2.x division operation works this way: given two integer operands, / performs integer floor division (rounding the result down to an integer, as in the earlier example). If there is at least one float involved, true division occurs:

>>> 1 / 2          # floor
0
>>> 1.0 / 2.0      # true
0.5

True Division

In Python 3.x, given two integer or float operands, / always performs true division and returns a float:

>>> 1 / 2          # true
0.5
>>> 1.0 / 2.0      # true
0.5

To try true division starting in Python 2.2, you can either import division from __future__ or use the -Qnew switch.

Floor Division

The double-slash division operator (//) was added in Python 2.2 to always perform floor division, regardless of operand type, and to begin the transition process:

>>> 1 // 2         # floor
0
>>> 1.0 // 2.0     # floor
0.0
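Negative operands show why this operation is called floor division rather than truncation. The following Python 3 sketch compares the two operators on the same values.

```python
# True division always keeps the fractional part.
true_result = -7 / 2

# Floor division rounds toward negative infinity, not toward zero.
floor_result = -7 // 2

# Truncation toward zero, by contrast, requires an explicit int() call.
truncated = int(-7 / 2)

print(true_result, floor_result, truncated)

# With a float operand, // still floors but returns a float.
float_floor = 7.0 // 2
```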

Binary and Octal Literals

Minor changes to integer literals were added in Python 2.6+ to make the nondecimal (hexadecimal, octal, and new binary) formats consistent. Hex representation stayed the same, with its leading 0x or 0X. Octal literals, however, had formerly led with a single 0. This format proved confusing to some users, so the prefix has been changed to 0o for consistency. In Python 3, you must write 0o177 instead of 0177, which is no longer accepted. Finally, the new binary literal lets you provide the bits of an integer value, prefixed with a leading 0b, as in 0b0110. You can find more information on integer literals updates in PEP 3127.
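The three nondecimal prefixes can be compared side by side. This sketch also uses the built-in oct() and bin() functions, which produce strings with the same prefixes.

```python
octal_value = 0o177      # octal literal with the new 0o prefix
binary_value = 0b0110    # binary literal
hex_value = 0x7F         # hexadecimal literal, unchanged

print(octal_value, binary_value, hex_value)

# The string conversions use the same prefixes as the literals.
octal_text = oct(127)
binary_text = bin(6)

# Octal digits in a string can still be parsed with an explicit base.
parsed = int('177', 8)
```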

C.2.6. Iterables Everywhere

Another theme inherent to version 3.x is memory conservation. Using iterators is much more efficient than maintaining entire lists in memory, especially when the target action on the objects in question is iteration. There’s no need to waste memory when it’s not necessary. Thus, in Python 3, code that returned lists in earlier versions of the language no longer does so.

For example, the functions map(), filter(), range(), and zip(), plus the dictionary methods keys(), items(), and values(), all return an iterator or another lazily evaluated iterable (such as a range object or a dictionary view) instead of a list. Yes, this can be more inconvenient if you want to glance at your data, but it’s better in terms of resource consumption. The changes are mostly under the hood: if you only use the functions’ return values to iterate over, you won’t notice a thing.
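The following Python 3 sketch shows the practical consequence: values are produced on demand, and you wrap the result in list() when you do want to look at everything at once. The variable names are illustrative.

```python
# map() returns a lazy iterator; nothing is computed until it is consumed.
squares = map(lambda x: x * x, range(5))
first = next(squares)      # computes only the first value
rest = list(squares)       # materializes whatever remains

# zip() is lazy as well; list() forces it when you want to inspect the pairs.
pairs = list(zip('ab', [1, 2]))

# Dictionary methods return views that track later changes to the dictionary.
d = {'a': 1, 'b': 2}
keys = d.keys()
d['c'] = 3
current_keys = sorted(keys)

print(first, rest, pairs, current_keys)
```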

C.3. Migration Tools

As you have seen, most of the Python 3.x changes do not represent some wild mutation of the familiar Python syntax. Instead, the changes are just enough to break the old code base. Of course, the changes affect users, so a good transition plan is clearly needed—and most good plans come with good tools or aids to smooth the way. Such tools include (but are not limited to) the following: the 2to3 code converter, the latest Python 2.x release (at least 2.6), and the external (non-standard library) 3to2 tool and six library. We’ll cover the first two here and let you investigate the latter pair on your own.

C.3.1. The 2to3 Tool

The 2to3 tool will take Python 2.x code and attempt to generate a working equivalent in Python 3.x. Here are some of the actions it performs:

• Converts a print statement to a print() function

• Removes the L long suffix

• Replaces <> with !=

• Changes backquoted expressions (`...`) to repr(...)

This tool does a lot of the manual labor—but not everything; the rest is up to you. You can read more about porting suggestions and the 2to3 tool in the “What’s New in Python 3.0” document as well as at the tool’s Web page (http://docs.python.org/3.0/library/2to3.html). In Appendix D, “Python 3 Migration with 2.6+,” we’ll also briefly mention a companion tool named 3to2.
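To make the four rewrites listed earlier in this section concrete, the sketch below shows the Python 3 form of each, with the Python 2 original in the preceding comment. It was written by hand for illustration and is not captured output from the 2to3 tool.

```python
# Python 2: print 'total', 3
print('total', 3)

# Python 2: big = 10000000000L
big = 10000000000

# Python 2: nonzero = big <> 0
nonzero = big != 0

# Python 2: s = `big`
s = repr(big)

print(nonzero, s)
```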

C.3.2. Python 2.6+

Because of the compatibility issue, the releases of Python that lead up to 3.0 play a much more significant role in the transition. Of particular note is Python 2.6, the first and most pivotal of such releases. For users, it represents the first time that they can start coding against the version 3.x family of releases, because many 3.x features have been backported to version 2.x.

Whenever possible, the final version 2.x releases (2.6 and newer) incorporate new features and syntax from version 3.x, while remaining compatible with existing code by not removing older features or syntax. Such features are described in the “What’s New in Python 2.x” document for all such releases. We detail some of these migration features in Appendix D.

C.4. Conclusion

Overall, the changes outlined in this appendix do have a high impact in terms of updates required to the interpreter, but they should not radically change the way programmers write their Python code. It’s simply a matter of changing old habits, such as using parentheses with print—thus, print(). Once you’ve gotten these changes under your belt, you’re well on your way to being able to effectively jump to the new platform. It can be a bit startling at first, but these changes have been coming for some time. Don’t panic: Python 2.x will live on for a long time to come. The transition will be slow, deliberate, pain resistant, and even-keeled. Welcome to the dawn of the next generation!

C.5. References

Andrew Kuchling, “Python Warts,” July 2003, http://web.archive.org/web/20070607112039/http://www.amk.ca/python/writing/warts.html.

A. M. Kuchling, “What’s New in Python 2.6,” June 2011 (for 2.6.7), http://docs.python.org/whatsnew/2.6.html.

A. M. Kuchling, “What’s New in Python 2.7,” December 2011 (for 2.7.2), http://docs.python.org/whatsnew/2.7.html.

Wesley J. Chun, “Keeping Up with Python: The 2.2 Release,” July 2002, http://www.linuxjournal.com/article/5597.

PEP Index, http://www.python.org/dev/peps.

“Unicode HOWTO,” December 2008, http://docs.python.org/3.0/howto/unicode.html.

Guido van Rossum, “Python Regrets,” July 2002, http://www.python.org/doc/essays/ppt/regrets/PythonRegrets.pdf.

Guido van Rossum, “Unifying Types and Classes in Python 2.2,” April 2002, http://www.python.org/2.2.3/descrintro.html.

Guido van Rossum, “What’s New in Python 3.0,” December 2008, http://docs.python.org/3.0/whatsnew/3.0.html.
