Though named tuples allow you to reference their members by name, they’re still just tuples, not classes. For some of the benefits of named tuples, plus the capabilities that traditional Python classes provide, you can use Python 3.7’s new data classes from the Python Standard Library’s dataclasses module.
Data classes are among Python 3.7’s most important new features. They help you build classes faster by using more concise notation and by autogenerating “boilerplate” code that’s common in most classes. They could become the preferred way to define many Python classes. In this section, we’ll present data-class fundamentals. At the end of the section, we’ll provide links to more information.
Most classes you’ll define provide an __init__ method to create and initialize an object’s attributes and a __repr__ method to specify an object’s custom string representation. If a class has many data attributes, creating these methods can be tedious.
Data classes autogenerate the data attributes and the __init__ and __repr__ methods for you. This can be particularly useful for classes that primarily aggregate related data items. For example, in an application that processes CSV records, you might want a class that represents each record’s fields as data attributes in an object. You’ll see in an exercise that data classes can be generated dynamically from a list of field names.
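As a minimal sketch of generating a data class dynamically, the dataclasses module’s make_dataclass function accepts a class name and an iterable of field names. The field names and record values below are hypothetical, chosen only to resemble one CSV record:

```python
from dataclasses import make_dataclass

# Hypothetical CSV header; these field names are assumptions for illustration.
field_names = ['name', 'email', 'city']

# When only names are supplied, each field's type defaults to typing.Any.
Record = make_dataclass('Record', field_names)

row = ['Ana', 'ana@example.com', 'Boston']  # hypothetical record values
record = Record(*row)  # the autogenerated __init__ accepts one argument per field

print(record)       # displayed via the autogenerated __repr__
print(record.city)  # each field is an ordinary data attribute
```

The generated class behaves like one written with the @dataclass decorator, so it also receives the autogenerated __eq__ method.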
Data classes also autogenerate method __eq__, which overloads the == operator. Any class that has an __eq__ method also implicitly supports !=. All classes inherit class object’s default __ne__ (not equals) method implementation, which returns the opposite of __eq__ (or NotImplemented if __eq__ returns NotImplemented). By default, data classes do not generate methods for the <, <=, > and >= comparison operators, but you can request them via the @dataclass decorator’s order=True keyword argument.
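The following sketch illustrates the default comparison behavior described above. Point is a hypothetical class used only for this demonstration:

```python
from dataclasses import dataclass

@dataclass
class Point:  # hypothetical class, not part of the chapter's examples
    x: int
    y: int

p1 = Point(1, 2)
p2 = Point(1, 2)

print(p1 == p2)  # autogenerated __eq__ compares the data attributes
print(p1 != p2)  # != is derived from __eq__ by the inherited __ne__

try:
    p1 < p2  # no __lt__ method is generated by default
except TypeError as error:
    ordering_error = error
    print('TypeError:', error)
```

The two objects compare as equal because their data attributes match, while the < comparison raises a TypeError.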
Card Data Class
Let’s reimplement class Card from Section 10.6.2 as a data class. The new class is defined in carddataclass.py. As you’ll see, defining a data class requires some new syntax. In the subsequent subsections, we’ll use our new Card data class in class DeckOfCards to show that it’s interchangeable with the original Card class, then discuss some of the benefits of data classes over named tuples and traditional Python classes.
dataclasses and typing Modules
The Python Standard Library’s dataclasses module defines decorators and functions for implementing data classes. We’ll use the @dataclass decorator (imported at line 4) to specify that a new class is a data class, which causes various code to be written for you. Recall that our original Card class defined class variables FACES and SUITS, which are lists of the strings used to initialize Cards. We use ClassVar and List from the Python Standard Library’s typing module (imported at line 5) to indicate that FACES and SUITS are class variables that refer to lists. We’ll say more about these momentarily:
1 # carddataclass.py
2 """Card data class with class attributes, data attributes,
3 autogenerated methods and explicitly defined methods."""
4 from dataclasses import dataclass
5 from typing import ClassVar, List
6
@dataclass Decorator
To specify that a class is a data class, precede its definition with the @dataclass decorator:
7 @dataclass
8 class Card:
Optionally, the @dataclass decorator may specify parentheses containing arguments that help the data class determine what autogenerated methods to include. For example, the decorator @dataclass(order=True) would cause the data class to autogenerate overloaded comparison operator methods for <, <=, > and >=. This might be useful, for example, if you need to sort your data-class objects.
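Here is a brief sketch of order=True in action. Version is a hypothetical class. With order=True, objects are compared as if they were tuples of their data attributes, in the order the attributes are declared:

```python
from dataclasses import dataclass

@dataclass(order=True)
class Version:  # hypothetical class used only for illustration
    major: int
    minor: int

versions = [Version(3, 7), Version(2, 7), Version(3, 6)]

# sorted relies on the autogenerated __lt__ method
sorted_versions = sorted(versions)
print(sorted_versions)
print(Version(3, 6) < Version(3, 7))
```

Because major is declared first, it is compared first, and minor breaks ties.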
Unlike regular classes, data classes declare both class attributes and data attributes inside the class, but outside the class’s methods. In a regular class, only class attributes are declared this way, and data attributes typically are created in __init__
. Data classes require additional information, or hints, to distinguish class attributes from data attributes, which also affects the autogenerated methods’ implementation details.
Lines 9–11 define and initialize the class attributes FACES and SUITS:
9     FACES: ClassVar[List[str]] = ['Ace', '2', '3', '4', '5', '6', '7',
10         '8', '9', '10', 'Jack', 'Queen', 'King']
11     SUITS: ClassVar[List[str]] = ['Hearts', 'Diamonds', 'Clubs', 'Spades']
12
In lines 9 and 11, the notation : ClassVar[List[str]] is a variable annotation (sometimes called a type hint) specifying that FACES is a class attribute (ClassVar) which refers to a list of strings (List[str]). SUITS also is a class attribute which refers to a list of strings.
Class variables are initialized in their definitions and are specific to the class, not individual objects of the class. Methods __init__, __repr__ and __eq__, however, are for use with objects of the class. When a data class generates these methods, it inspects all the variable annotations and includes only the data attributes in the method implementations.
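To see which attributes a data class treats as data attributes, you can call the dataclasses module’s fields function. The sketch below uses a hypothetical Die class, not the chapter’s Card class, to show that a ClassVar attribute is excluded from the autogenerated methods:

```python
from dataclasses import dataclass, fields
from typing import ClassVar, List

@dataclass
class Die:  # hypothetical class used only for illustration
    SIDES: ClassVar[List[int]] = [1, 2, 3, 4, 5, 6]  # class attribute
    value: int  # data attribute

# fields returns only the data attributes; ClassVar attributes are skipped
field_names = [field.name for field in fields(Die)]
print(field_names)

die = Die(4)  # the autogenerated __init__ receives only value
print(die)    # the autogenerated __repr__ shows only value
print(Die.SIDES is die.SIDES)  # one list shared through the class
```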
Normally, we create an object’s data attributes in the class’s __init__ method (or methods called by __init__) via assignments of the form self.attribute_name = value. Because a data class autogenerates its __init__ method, we need another way to specify data attributes in a data class’s definition. We cannot simply place their names inside the class; doing so raises a NameError, as in:
In [1]: from dataclasses import dataclass
In [2]: @dataclass
   ...: class Demo:
   ...:     x # attempting to create a data attribute x
   ...:
-------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-2-79ffe37b1ba2> in <module>()
----> 1 @dataclass
2 class Demo:
3 x # attempting to create a data attribute x
4
<ipython-input-2-79ffe37b1ba2> in Demo()
1 @dataclass
2 class Demo:
----> 3 x # attempting to create a data attribute x
4
NameError: name 'x' is not defined
Like class attributes, each data attribute must be declared with a variable annotation. Lines 13–14 define the data attributes face and suit. The variable annotation ": str" indicates that each should refer to string objects:
13     face: str
14     suit: str
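For comparison with the earlier failing Demo class, the following sketch adds a variable annotation to x. The int annotation is an assumption chosen for illustration. Python records the annotation in the class’s __annotations__ dictionary, which the @dataclass decorator reads to discover the data attributes:

```python
from dataclasses import dataclass

@dataclass
class Demo:
    x: int  # the annotation declares data attribute x; no NameError occurs

demo = Demo(10)  # the autogenerated __init__ receives x
print(demo)
print(Demo.__annotations__)  # the annotations that @dataclass inspected
```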
Data classes are classes, so they may contain properties and methods and participate in class hierarchies. For this Card data class, we defined the same read-only image_name property and custom special methods __str__ and __format__ as in our original Card class earlier in the chapter:
15     @property
16     def image_name(self):
17         """Return the Card's image file name."""
18         return str(self).replace(' ', '_') + '.png'
19
20     def __str__(self):
21         """Return string representation for str()."""
22         return f'{self.face} of {self.suit}'
23
24     def __format__(self, format):
25         """Return formatted string representation."""
26         return f'{str(self):{format}}'
You can specify variable annotations using built-in type names (like str, int and float), class types or types defined by the typing module (such as ClassVar and List shown earlier). Even with type annotations, Python is still a dynamically typed language, so type annotations are not enforced at execution time. Even though a Card’s face is meant to be a string, you can assign any type of object to face, as you’ll do in a Self Check exercise.
(Fill-In) Data classes require _________ that specify each class attribute’s or data attribute’s data type.
Answer: variable annotations.
(Fill-In) The _________ decorator specifies that a new class is a data class.
Answer: @dataclass.
(True/False) The Python Standard Library’s annotations module defines the variable annotations that are required in data class definitions.
Answer: False. The typing module defines the variable annotations that are required in data-class definitions.
(True/False) Data classes have autogenerated <, <=, > and >= operators, by default.
Answer: False. The == and != operators are autogenerated by default. The <, <=, > and >= operators are autogenerated only if the @dataclass decorator specifies the keyword argument order=True.
Card Data Class
Let’s demonstrate the new Card data class. First, create a Card:
In [1]: from carddataclass import Card
In [2]: c1 = Card(Card.FACES[0], Card.SUITS[3])
Next, let’s use Card’s autogenerated __repr__ method to display the Card:
In [3]: c1
Out[3]: Card(face='Ace', suit='Spades')
Our custom __str__ method, which print calls when passing it a Card object, returns a string of the form 'face of suit':
In [4]: print(c1)
Ace of Spades
Let’s access our data class’s attributes and read-only property:
In [5]: c1.face
Out[5]: 'Ace'
In [6]: c1.suit
Out[6]: 'Spades'
In [7]: c1.image_name
Out[7]: 'Ace_of_Spades.png'
Next, let’s demonstrate that Card objects can be compared via the autogenerated == operator and inherited != operator. First, create two additional Card objects, one identical to the first and one different:
In [8]: c2 = Card(Card.FACES[0], Card.SUITS[3])
In [9]: c2
Out[9]: Card(face='Ace', suit='Spades')
In [10]: c3 = Card(Card.FACES[0], Card.SUITS[0])
In [11]: c3
Out[11]: Card(face='Ace', suit='Hearts')
Now, compare the objects using == and !=:
In [12]: c1 == c2
Out[12]: True
In [13]: c1 == c3
Out[13]: False
In [14]: c1 != c3
Out[14]: True
Our Card data class is interchangeable with the Card class developed earlier in this chapter. To demonstrate this, we created the deck2.py file containing a copy of class DeckOfCards from earlier in the chapter and imported the Card data class into the file. The following snippets import class DeckOfCards, create an object of the class and print it. Recall that print implicitly calls the DeckOfCards __str__ method, which formats each Card in a field of 19 characters, resulting in a call to each Card’s __format__ method. Read each row left-to-right to confirm that all the Cards are displayed in order from each suit (Hearts, Diamonds, Clubs and Spades):
In [15]: from deck2 import DeckOfCards # uses Card data class
In [16]: deck_of_cards = DeckOfCards()
In [17]: print(deck_of_cards)
Ace of Hearts 2 of Hearts 3 of Hearts 4 of Hearts
5 of Hearts 6 of Hearts 7 of Hearts 8 of Hearts
9 of Hearts 10 of Hearts Jack of Hearts Queen of Hearts
King of Hearts Ace of Diamonds 2 of Diamonds 3 of Diamonds
4 of Diamonds 5 of Diamonds 6 of Diamonds 7 of Diamonds
8 of Diamonds 9 of Diamonds 10 of Diamonds Jack of Diamonds
Queen of Diamonds King of Diamonds Ace of Clubs 2 of Clubs
3 of Clubs 4 of Clubs 5 of Clubs 6 of Clubs
7 of Clubs 8 of Clubs 9 of Clubs 10 of Clubs
Jack of Clubs Queen of Clubs King of Clubs Ace of Spades
2 of Spades 3 of Spades 4 of Spades 5 of Spades
6 of Spades 7 of Spades 8 of Spades 9 of Spades
10 of Spades Jack of Spades Queen of Spades King of Spades
(IPython Session) Python is a dynamically typed language, so variable annotations are not enforced on objects of data classes. To prove this, create a Card object, then assign the integer 100 to its face attribute and display the Card. Display the face attribute’s type before and after the assignment.
Answer:
In [1]: from carddataclass import Card
In [2]: c = Card('Ace', 'Spades')
In [3]: c
Out[3]: Card(face='Ace', suit='Spades')
In [4]: type(c.face)
Out[4]: str
In [5]: c.face = 100
In [6]: c
Out[6]: Card(face=100, suit='Spades')
In [7]: type(c.face)
Out[7]: int
Data classes offer several advantages over named tuples:
Although each named tuple technically represents a different type, a named tuple is a tuple and all tuples can be compared to one another. So, objects of different named tuple types could compare as equal if they have the same number of members and the same values for those members. Comparing objects of different data classes always returns False, as does comparing a data class object to a tuple object.
If you have code that unpacks a tuple, adding more members to that tuple breaks the unpacking code. Data class objects cannot be unpacked. So you can add more data attributes to a data class without breaking existing code.
A data class can be a base class or a subclass in an inheritance hierarchy.
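The comparison and unpacking differences listed above can be sketched as follows. All four types are hypothetical and exist only for this demonstration:

```python
from collections import namedtuple
from dataclasses import dataclass

# Two distinct named tuple types (hypothetical names)
PointNT = namedtuple('PointNT', ['x', 'y'])
SizeNT = namedtuple('SizeNT', ['width', 'height'])

@dataclass
class PointDC:  # hypothetical data class
    x: int
    y: int

@dataclass
class SizeDC:  # hypothetical data class
    width: int
    height: int

# Named tuples of different types compare by their members only
print(PointNT(1, 2) == SizeNT(1, 2))
print(PointNT(1, 2) == (1, 2))

# Objects of different data classes, or a data class object and a tuple
print(PointDC(1, 2) == SizeDC(1, 2))
print(PointDC(1, 2) == (1, 2))

x, y = PointNT(1, 2)  # named tuples unpack like any tuple

try:
    x, y = PointDC(1, 2)  # data class objects are not iterable by default
except TypeError as error:
    unpack_error = error
    print('TypeError:', error)
```

The first two comparisons display True and the next two display False, matching the behavior described in the list.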
Data classes also offer various advantages over the traditional Python classes you saw earlier in this chapter:
A data class autogenerates __init__, __repr__ and __eq__, saving you time.
A data class can autogenerate the special methods that overload the <, <=, > and >= comparison operators.
When you change data attributes defined in a data class, then use it in a script or interactive session, the autogenerated code updates automatically. So, you have less code to maintain and debug.
The required variable annotations for class attributes and data attributes enable you to take advantage of static code analysis tools. So, you might be able to eliminate additional errors before they can occur at execution time.
Some static code analysis tools and IDEs can inspect variable annotations and issue warnings if your code uses the wrong type. This can help you locate logic errors in your code before you execute it. In an end-of-chapter exercise, we ask you to use the static code analysis tool MyPy to demonstrate such warnings.
Data classes have additional capabilities, such as creating “frozen” instances which do not allow you to assign values to a data class object’s attributes after the object is created. For a complete list of data class benefits and capabilities, see
https://www.python.org/dev/peps/pep-0557/
and
https://docs.python.org/3/library/dataclasses.html
We’ll ask you to experiment with additional data class features in this chapter’s exercises.
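As a small sketch of the “frozen” capability mentioned above, the @dataclass decorator accepts the keyword argument frozen=True. FrozenCard is a hypothetical class, not part of carddataclass.py. Assigning to an attribute of a frozen object raises the dataclasses module’s FrozenInstanceError:

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class FrozenCard:  # hypothetical class used only for illustration
    face: str
    suit: str

card = FrozenCard('Ace', 'Spades')

try:
    card.face = 'King'  # assignment is not allowed on a frozen object
except FrozenInstanceError as error:
    frozen_error = error
    print('FrozenInstanceError:', error)

print(card)  # the object is unchanged
```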