Creating Strings of Characters

Computers may have been invented to do arithmetic, but these days, most of them spend a lot of their time processing text. Many programs create text, store it, search it, and move it from one place to another.

In Python, text is represented as a string, which is a sequence of characters (letters, digits, and symbols). The type whose values are sequences of characters is str. The characters consist of those from the Latin alphabet found on most North American keyboards, as well as Chinese morphograms, chemical symbols, musical symbols, and much more.

In Python, we indicate that a value is a string by putting either single or double quotes around it. As we will see in Using Special Characters in Strings, single and double quotes are equivalent except for strings that contain quotes. You can use whichever you prefer. (For docstrings, the Python style guidelines say that double quotes are preferred.) Here are two examples:

 >>> 'Aristotle'
 'Aristotle'
 >>> "Isaac Newton"
 'Isaac Newton'

The opening and closing quotes must match:

 >>> 'Charles Darwin"
  File "<stdin>", line 1
    'Charles Darwin"
                   ^
 SyntaxError: EOL while scanning string literal

EOL stands for “end of line.” The error above indicates that the end of the line was reached before the end of the string (which should be marked with a closing single quote) was found. (Python 3.10 and later report the same mistake as an unterminated string literal.)

Strings can contain any number of characters, limited only by computer memory. The shortest string is the empty string, containing no characters at all:

 >>> ''
 ''
 >>> ""
 ''
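As a small check of the claim that the two quote styles are interchangeable, the following sketch compares strings written both ways. The variable names are chosen here for illustration only.

```python
# The same text written with single and with double quotes.
single = 'Isaac Newton'
double = "Isaac Newton"

# Both literals produce equal values of type str.
print(single == double)   # True
print(type(single))       # <class 'str'>

# The two spellings of the empty string are equal as well.
empty_single = ''
empty_double = ""
print(empty_single == empty_double)   # True
```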

Operations on Strings

Python has a built-in function, len, that returns the number of characters between the opening and closing quotes:

 >>> len('Albert Einstein')
 15
 >>> len('123!')
 4
 >>> len(' ')
 1
 >>> len('')
 0
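To see where the 15 comes from, this sketch counts the parts of the name separately. The variable names are illustrative assumptions, not part of the examples above.

```python
full_name = 'Albert Einstein'
first_name = 'Albert'
last_name = 'Einstein'

# len counts every character, including the space between the two names.
expected = len(first_name) + 1 + len(last_name)
print(len(full_name))              # 15
print(len(full_name) == expected)  # True

# Digits and punctuation count as characters too.
print(len('123!'))                 # 4
```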

We can add two strings using the + operator, which produces a new string containing the characters of the left operand followed by the characters of the right operand:

 >>> 'Albert' + ' Einstein'
 'Albert Einstein'

When + has two string operands, it is referred to as the concatenation operator. Operator + is probably the most overloaded operator in Python. So far, we’ve applied it to integers, floating-point numbers, and strings, and we’ll apply it to several more types in later chapters.

As the following example shows, adding an empty string to another string produces a new string that is just like the nonempty operand:

 >>> "Alan Turing" + ''
 'Alan Turing'
 >>> "" + 'Grace Hopper'
 'Grace Hopper'
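Unlike numeric addition, concatenation depends on the order of its operands. The sketch below, using variable names invented for this illustration, shows that swapping the operands changes the result, while the length of the result is always the sum of the operands' lengths.

```python
first = 'Albert'
last = ' Einstein'

full = first + last
swapped = last + first

print(repr(full))     # 'Albert Einstein'
print(repr(swapped))  # ' EinsteinAlbert'

# The result is as long as both operands together.
print(len(full) == len(first) + len(last))  # True

# Concatenating the empty string gives back an equal string.
print(first + '' == first)                  # True
```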

Here is an interesting question: Can operator + be applied to a string and a numeric value? If so, would addition or concatenation occur? We’ll give it a try:

 >>> 'NH' + 3
 Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
 TypeError: Can't convert 'int' object to str implicitly

This is the second time that we have encountered a type error. The first time, in Using Local Variables for Temporary Storage, the problem was that we didn’t pass the right number of arguments to a function. Here, Python took exception to our attempt to combine values of different data types because it didn’t know which version of + we wanted: the one that adds numbers or the one that concatenates strings. Because the first operand was a string, Python expected the second operand to also be a string, but instead it was an integer. Now consider this example:

 >>> 9 + ' planets'
 Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
 TypeError: unsupported operand type(s) for +: 'int' and 'str'

Here, because Python saw a 9 first, it expected the second operand to also be numeric. The order of the operands affects the error message.

The concatenation operator must be applied to two strings. If you want to join a string with a number, you could apply function str to the number to get its string representation, and then apply the concatenation:

 >>> 'Four score and ' + str(7) + ' years ago'
 'Four score and 7 years ago'
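In a program, as opposed to the interactive shell, the same idea might look like the sketch below. The variable names are assumptions made for this example. It catches the TypeError rather than relying on the message text, since the exact wording differs between Python versions.

```python
years = 7

try:
    # Mixing str and int with + is rejected.
    phrase = 'Four score and ' + years + ' years ago'
except TypeError as error:
    # The exact wording of the message varies between Python versions.
    print('TypeError:', error)

# Converting the number with str first makes every operand a string.
phrase = 'Four score and ' + str(years) + ' years ago'
print(phrase)  # Four score and 7 years ago
```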

Function int can be applied to a string whose contents look like an integer, and float can be applied to a string whose contents are numeric:

 >>> int('0')
 0
 >>> int("11")
 11
 >>> int('-324')
 -324
 >>> float('-324')
 -324.0
 >>> float("56.34")
 56.34

It isn’t always possible to get an integer or a floating-point representation of a string, and when an attempt to do so fails, an error occurs:

 >>> int('a')
 Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
 ValueError: invalid literal for int() with base 10: 'a'
 >>> float('b')
 Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
 ValueError: could not convert string to float: 'b'
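When the text comes from a user or a file, a program usually needs to handle a failed conversion rather than stop. One possible approach is sketched below. The helper to_number is hypothetical, written only for this illustration, and returning None on failure is a design choice rather than anything Python prescribes.

```python
def to_number(text):
    """Return text as an int if possible, else as a float, else None."""
    try:
        return int(text)
    except ValueError:
        pass  # Not an integer; try a floating-point number next.
    try:
        return float(text)
    except ValueError:
        return None  # Neither conversion succeeded.


print(to_number('-324'))   # -324
print(to_number('56.34'))  # 56.34
print(to_number('a'))      # None
```

Trying int before float keeps whole numbers as integers instead of turning '-324' into -324.0.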

In addition to +, len, int, and float, operator * can be applied to strings. A string can be repeated using operator * and an integer, like this:

 >>> 'AT' * 5
 'ATATATATAT'
 >>> 4 * '-'
 '----'

If the integer is less than or equal to zero, the operator yields the empty string:

 >>> 'GC' * 0
 ''
 >>> 'TATATATA' * -3
 ''
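A common use of repetition is building a separator whose width matches some other text. The following sketch, with a title chosen arbitrarily for this example, combines * with len to underline a heading.

```python
title = 'Operations on Strings'

# Repeat '-' once for every character in the title.
underline = '-' * len(title)

print(title)
print(underline)

# A count of zero or less yields the empty string.
print('GC' * 0 == '')    # True
print('GC' * -3 == '')   # True
```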

Strings are values, so you can assign a string to a variable. Also, operations on strings can be applied to those variables:

 >>> sequence = 'ATTGTCCCCC'
 >>> len(sequence)
 10
 >>> new_sequence = sequence + 'GGCCTCCTGC'
 >>> new_sequence
 'ATTGTCCCCCGGCCTCCTGC'
 >>> new_sequence * 2
 'ATTGTCCCCCGGCCTCCTGCATTGTCCCCCGGCCTCCTGC'
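It is worth noting that + and * build new strings and leave the values of the variables they are applied to untouched. The sketch below repeats the session above as a script; the name doubled is an addition made for this illustration.

```python
sequence = 'ATTGTCCCCC'
new_sequence = sequence + 'GGCCTCCTGC'
doubled = new_sequence * 2

# The original variables still refer to their original values.
print(sequence)            # ATTGTCCCCC
print(len(new_sequence))   # 20
print(len(doubled))        # 40
```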