Chapter 18
Imagine writing a program that counts the number of times each word appears in a large file. We could store the counts in a list, but then how do we keep track of which word uses which index? We could store the words in a separate second list, so that the count for the word in words[i] is in counts[i], but then we would have to search the entire word list every time we came to another word to count. A dictionary is much more efficient.
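To make the cost of the two-list idea concrete, here is a minimal sketch of it. The function name count_with_lists and the sample sentence are made up for illustration and are not part of the listing that follows.

```python
# Sketch of the parallel-lists approach: words[i] is counted in counts[i].
def count_with_lists(text):
    words = []    # each distinct word, in order of first appearance
    counts = []   # counts[i] is the number of times words[i] was seen
    for word in text.split():
        if word in words:            # linear search through the word list
            i = words.index(word)    # a second linear search for the position
            counts[i] += 1
        else:
            words.append(word)
            counts.append(1)
    return words, counts

words, counts = count_with_lists("the cat saw the other cat")
print(words)    # ['the', 'cat', 'saw', 'other']
print(counts)   # [2, 2, 1, 1]
```

Every word read from the file triggers a search through all the distinct words seen so far, which is the repeated work a dictionary avoids.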
Listing 18.1: Word Frequency
# wordfreq.py

import string

def getwords(fname):
    # Read the whole file, lowercase it, strip punctuation, split into words.
    with open(fname) as f:
        s = f.read().lower()
        s = s.translate(s.maketrans("", "", string.punctuation))
        return s.split()

def frequency(words):
    # Build a dictionary that maps each word to the number of times it occurs.
    d = {}
    for word in words:
        if word in d:
            d[word] += 1
        else:
            d[word] = 1
    return d

def display(d):
    # Print each word and its count, in alphabetical order by word.
    for key in sorted(d):
        print(key, d[key])

def main():
    words = getwords("moby.txt")
    display(frequency(words))

main()
The display() function in Listing 18.1 is an exception to our habit of printing only in main(). However, its name alerts readers that it will produce output.
This program may run more slowly inside IDLE than at the command line. To run a Python program from the command line in Windows, open a command prompt, use cd to navigate to the location of wordfreq.py, and type this, changing “32” to whatever version you have installed:
c:\python32\python wordfreq.py
In Linux or OS X, open a Terminal, use cd to navigate to the location of wordfreq.py, and run
python wordfreq.py
Depending on your installation, you may need to use python3 instead of python.
A dictionary is a mapping that stores keys with associated values. For every key in a dictionary, there is exactly one value. Thus, we think of dictionaries as sets of “key:value” pairs. To look up data in the dictionary, you provide the key, and the dictionary will provide the value.
Dictionaries are written inside curly brackets {}, with a colon between each key and its value and commas separating the key-value pairs. For example, in this dictionary:
ages = {"Alice":38, "Bob":39, "Chuck":35, "Dave":34}
the keys are strings (the characters' names), and each name is associated with one integer value. A pair of empty brackets {} denotes an empty dictionary.
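As a quick check of the notation, the short sketch below builds the ages dictionary from the text and an empty one. The variable name empty is chosen only for this illustration.

```python
ages = {"Alice": 38, "Bob": 39, "Chuck": 35, "Dave": 34}
empty = {}               # a pair of empty brackets makes an empty dictionary

print(ages["Chuck"])     # 35
print(type(ages))        # <class 'dict'>
print(empty == dict())   # True
```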
Dictionaries are also called associative maps because they associate keys with values. Key-value pairs in a dictionary are not accessed by position (and before Python 3.7 they were not stored in any guaranteed order), and so dictionaries support different operations and methods than the sequence types we have been using, such as strings, lists, and range() objects.
Dictionary keys must be immutable, but there is no requirement that all keys have the same type. Values in a dictionary may be of any type. Dictionaries themselves are mutable objects in Python.
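The sketch below illustrates these rules with a made-up dictionary named mixed. It assumes try/except is familiar, and it prints whatever message Python supplies for the failed assignment, since the exact wording varies between versions.

```python
mixed = {}
mixed["pi"] = 3.14159           # string key, float value
mixed[42] = ["any", "list"]     # integer key, list value
mixed[(2, 3)] = "a tuple key"   # a tuple of immutable items can be a key

try:
    mixed[[2, 3]] = "a list key"    # lists are mutable, so this fails
except TypeError as err:
    print("TypeError:", err)

print(len(mixed))               # 3
mixed[42].append("grows")       # values may be mutable and can change in place
print(mixed[42])                # ['any', 'list', 'grows']
```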
The main operations on a dictionary d are to set or to retrieve the value stored for a key:
d[key] = value | Set value associated with key in d to value.
d[key] | Get value associated with key in d. Raises KeyError exception if key not in d.
Dictionaries are specifically designed so that the above operations are very fast.
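A small sketch of setting and retrieving values, including the KeyError case. The keys "apple" and "pear" are arbitrary examples.

```python
d = {}
d["apple"] = 3       # set: adds a new key
d["apple"] = 5       # set again: replaces the value stored for that key
print(d["apple"])    # 5

try:
    print(d["pear"])     # get: "pear" was never stored
except KeyError as err:
    print("No entry for key", err)

print(len(d))        # 1
```

Note that a failed lookup does not add the key. Only assignment creates a new entry.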
Some of the other operations available are:
len(d) | Number of items in d.
del d[key] | Delete entry for key in d.
key in d | True if key is in d.
key not in d | True if key is not in d.
Because dictionaries are mutable, there are operations like del that change the contents of the dictionary.
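These operations can be tried on the ages dictionary from earlier in the chapter, as in this brief sketch:

```python
ages = {"Alice": 38, "Bob": 39, "Chuck": 35, "Dave": 34}

print(len(ages))            # 4
print("Bob" in ages)        # True
print(39 in ages)           # False: in tests keys, not values

del ages["Bob"]             # changes the dictionary itself
print(len(ages))            # 3
print("Bob" not in ages)    # True
```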
Dictionaries are objects and therefore also have their own methods:
d.get(key[, default]) | Get value for key if key in d; otherwise, return default (or None).
d.pop(key[, default]) | Get and delete value for key if key in d; otherwise return default (or raise KeyError if no default is given).
d.keys() | View of keys in d.
d.values() | View of values in d.
d.items() | View of (key, value) pairs in d.
The .get() method is similar to using d[key], except that it returns a default value (or None) instead of raising a KeyError exception. The .keys(), .values(), and .items() methods return special dictionary views that are iterable and dynamic, meaning that they update as the contents of the dictionary change. The list() type converter will convert a view to a list.
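The following sketch exercises these methods. The helper frequency_with_get is a hypothetical variant of frequency() from Listing 18.1, included only to show how .get() can replace the if/else test. It is not the listing's own code.

```python
ages = {"Alice": 38, "Bob": 39}

# .get() never raises KeyError for a missing key.
print(ages.get("Alice"))         # 38
print(ages.get("Zed"))           # None
print(ages.get("Zed", 0))        # 0

# A view tracks later changes to the dictionary.
keys = ages.keys()
ages["Chuck"] = 35
print(len(keys))                 # 3
print(sorted(keys))              # ['Alice', 'Bob', 'Chuck']

# .pop() returns the value and removes the entry.
print(ages.pop("Bob"))           # 39
print(ages.pop("Bob", "gone"))   # gone
print(len(keys))                 # 2

# list() takes a snapshot that no longer follows the dictionary.
snapshot = list(ages.items())
ages["Dave"] = 34
print(len(snapshot))             # 2


def frequency_with_get(words):
    """Hypothetical variant of frequency() that counts with .get()."""
    d = {}
    for word in words:
        d[word] = d.get(word, 0) + 1   # a missing word starts from 0
    return d

print(frequency_with_get(["a", "b", "a"]) == {"a": 2, "b": 1})   # True
```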
Python conveniently allows dictionaries to be used in for loops:
for <variable> in <dict>: # loop over each key
<body>
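Remember that a dictionary's key-value pairs are not stored in sorted order, so this type of loop cannot control the order in which the keys are found. (Since Python 3.7 the loop visits keys in the order they were inserted; earlier versions guarantee no particular order.)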
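As an illustration, this sketch loops over the ages dictionary twice: once over the keys alone, and once over the (key, value) pairs from .items(). The variable names are chosen for this example.

```python
ages = {"Alice": 38, "Bob": 39, "Chuck": 35, "Dave": 34}

# Looping over the dictionary itself yields its keys.
total = 0
for name in ages:
    total += ages[name]
print(total)    # 146

# Looping over .items() yields (key, value) pairs.
oldest = None
for name, age in ages.items():
    if oldest is None or age > ages[oldest]:
        oldest = name
print(oldest)   # Bob
```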
If you need to loop over a dictionary in sorted order, as in Listing 18.1, use the built-in sorted() function:
sorted(x) | Sorted list from iterable x.
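A short sketch of sorted() applied to a dictionary, using made-up counts. The second call, which orders the keys by their values through the key parameter of sorted(), goes slightly beyond what Listing 18.1 needs and is shown only as one possible extension.

```python
counts = {"whale": 3, "ahab": 2, "sea": 5}

# sorted() on a dictionary returns a new list of its keys in sorted order.
alphabetical = sorted(counts)
print(alphabetical)    # ['ahab', 'sea', 'whale']

for word in alphabetical:
    print(word, counts[word])

# Assumption for illustration: order the keys by count, largest first.
by_count = sorted(counts, key=counts.get, reverse=True)
print(by_count)        # ['sea', 'whale', 'ahab']
```

The dictionary itself is unchanged by either call, since sorted() always builds a separate list.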
Doing Exercise 18.5 first may be helpful.
This exercise provides an alternate solution to Exercise 12.9 to count the nucleotide bases in a DNA string.
Welcome to PyDict
[a]dd or change a definition
[l]ookup a word
[q]uit
Your choice: l
Word to look up: python
Not in dictionary
[a]dd or change a definition
[l]ookup a word
[q]uit
Your choice: a
Word: python
Definition: A long, large snake.
[a]dd or change a definition
[l]ookup a word
[q]uit
Your choice: l
Word to look up: python
Definition: A long, large snake.
[a]dd or change a definition
[l]ookup a word
[q]uit
Your choice: q
Thanks!