10.13 A Brief Intro to Python 3.7’s New Data Classes

Though named tuples allow you to reference their members by name, they’re still just tuples, not classes. To get some of the benefits of named tuples, plus the capabilities that traditional Python classes provide, you can use Python 3.7’s new data classes from the Python Standard Library’s dataclasses module.

Data classes are among Python 3.7’s most important new features. They help you build classes faster by using more concise notation and by autogenerating “boilerplate” code that’s common in most classes. They could become the preferred way to define many Python classes. In this section, we’ll present data-class fundamentals. At the end of the section, we’ll provide links to more information.

Data Classes Autogenerate Code

Most classes you’ll define provide an __init__ method to create and initialize an object’s attributes and a __repr__ method to specify an object’s custom string representation. If a class has many data attributes, creating these methods can be tedious.

Data classes autogenerate the data attributes and the __init__ and __repr__ methods for you. This can be particularly useful for classes that primarily aggregate related data items. For example, in an application that processes CSV records, you might want a class that represents each record’s fields as data attributes in an object. You’ll see in an exercise that data classes can be generated dynamically from a list of field names.
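As a minimal sketch of this autogeneration, consider a record type with two fields. The class name StationReading and its field names are hypothetical, chosen only for illustration. They are not part of this chapter's examples:

```python
from dataclasses import dataclass


@dataclass
class StationReading:  # hypothetical record type, for illustration only
    station: str
    temperature: float


# the autogenerated __init__ accepts the fields in declaration order
reading = StationReading('KBOS', 21.5)

# the autogenerated __repr__ shows the class name and each field's value
print(repr(reading))
```

We wrote no __init__ or __repr__ here, yet the object can be created and displayed.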

Data classes also autogenerate method __eq__, which overloads the == operator. Any class that has an __eq__ method also implicitly supports !=. All classes inherit class object’s default __ne__ (not equals) method implementation, which calls __eq__ and returns the opposite of its result (or NotImplemented if __eq__ returns NotImplemented). By default, data classes do not generate methods for the <, <=, > and >= comparison operators, but they can do so if you request it.
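The following sketch illustrates the comparison behavior just described. The Point class is hypothetical and used only for illustration:

```python
from dataclasses import dataclass


@dataclass
class Point:  # hypothetical class, for illustration only
    x: int
    y: int


p1 = Point(1, 2)
p2 = Point(1, 2)
p3 = Point(3, 4)

print(p1 == p2)  # autogenerated __eq__ compares the data attributes
print(p1 != p3)  # inherited __ne__ inverts the __eq__ result

try:
    p1 < p3  # no __lt__ method is generated by default
except TypeError as error:
    print(f'TypeError: {error}')
```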

10.13.1 Creating a Card Data Class

Let’s reimplement class Card from Section 10.6.2 as a data class. The new class is defined in carddataclass.py. As you’ll see, defining a data class requires some new syntax. In the subsequent subsections, we’ll use our new Card data class in class DeckOfCards to show that it’s interchangeable with the original Card class, then discuss some of the benefits of data classes over named tuples and traditional Python classes.

Importing from the dataclasses and typing Modules

The Python Standard Library’s dataclasses module defines decorators and functions for implementing data classes. We’ll use the @dataclass decorator (imported at line 4) to specify that a new class is a data class, which causes various code to be generated for you. Recall that our original Card class defined class variables FACES and SUITS, which are lists of the strings used to initialize Cards. We use ClassVar and List from the Python Standard Library’s typing module (imported at line 5) to indicate that FACES and SUITS are class variables that refer to lists. We’ll say more about these momentarily:

1 # carddataclass.py
2 """Card data class with class attributes, data attributes,
3 autogenerated methods and explicitly defined methods."""
4 from dataclasses import dataclass
5 from typing import ClassVar, List
6

Using the @dataclass Decorator

To specify that a class is a data class, precede its definition with the @dataclass decorator:

7 @dataclass
8 class Card:

Optionally, the @dataclass decorator may specify parentheses containing arguments that help the data class determine what autogenerated methods to include. For example, the decorator @dataclass(order=True) would cause the data class to autogenerate overloaded comparison operator methods for <, <=, > and >=. This might be useful, for example, if you need to sort your data-class objects.
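Here is a brief sketch of order=True in action. The Version class and its fields are hypothetical. With order=True, objects are compared as if they were tuples of their data attributes, in the order the attributes are declared:

```python
from dataclasses import dataclass


@dataclass(order=True)
class Version:  # hypothetical class, for illustration only
    major: int
    minor: int


versions = [Version(3, 7), Version(2, 7), Version(3, 6)]

# sorted relies on the autogenerated __lt__ method
ordered = sorted(versions)
print(ordered)

print(Version(3, 6) < Version(3, 7))
```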

Variable Annotations: Class Attributes

Unlike regular classes, data classes declare both class attributes and data attributes inside the class, but outside the class’s methods. In a regular class, only class attributes are declared this way, and data attributes typically are created in __init__. Data classes require additional information, or hints, to distinguish class attributes from data attributes, which also affects the autogenerated methods’ implementation details.

Lines 9–11 define and initialize the class attributes FACES and SUITS:

 9     FACES: ClassVar[List[str]] = ['Ace', '2', '3', '4', '5', '6', '7',
10                                   '8', '9', '10', 'Jack', 'Queen', 'King']
11     SUITS: ClassVar[List[str]] = ['Hearts', 'Diamonds', 'Clubs', 'Spades']
12

In lines 9 and 11, the notation

         : ClassVar[List[str]]

is a variable annotation (sometimes called a type hint) specifying that FACES is a class attribute (ClassVar) which refers to a list of strings (List[str]). SUITS also is a class attribute which refers to a list of strings.

Class variables are initialized in their definitions and are specific to the class, not individual objects of the class. Methods __init__, __repr__ and __eq__, however, are for use with objects of the class. When a data class generates these methods, it inspects all the variable annotations and includes only the data attributes in the method implementations.
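One way to observe this distinction is the dataclasses module's fields function, which returns only the data attributes that a data class manages. The Die class below is a hypothetical example, not part of carddataclass.py:

```python
from dataclasses import dataclass, fields
from typing import ClassVar, List


@dataclass
class Die:  # hypothetical class, for illustration only
    SIDES: ClassVar[List[int]] = [1, 2, 3, 4, 5, 6]  # class attribute
    value: int  # data attribute


# fields() reports the data attributes only; SIDES is excluded
field_names = [field.name for field in fields(Die)]
print(field_names)

die = Die(4)  # __init__ has a parameter for value, but not for SIDES
print(die)    # __repr__ shows value only

# SIDES belongs to the class and is shared by all Die objects
print(Die.SIDES is die.SIDES)
```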

Variable Annotations: Data Attributes

Normally, we create an object’s data attributes in the class’s __init__ method (or methods called by __init__) via assignments of the form self.attribute_name = value. Because a data class autogenerates its __init__ method, we need another way to specify data attributes in a data class’s definition. We cannot simply place their names inside the class. Doing so raises a NameError, as in:

In [1]: from dataclasses import dataclass

In [2]: @dataclass
   ...: class Demo:
   ...:     x # attempting to create a data attribute x
   ...:
-------------------------------------------------------------------------
NameError                               Traceback (most recent call last)
<ipython-input-2-79ffe37b1ba2> in <module>()
----> 1 @dataclass
      2 class Demo:
      3     x # attempting to create a data attribute x
      4

<ipython-input-2-79ffe37b1ba2> in Demo()
      1 @dataclass
      2 class Demo:
----> 3     x # attempting to create a data attribute x
      4
NameError: name 'x' is not defined

Like class attributes, each data attribute must be declared with a variable annotation. Lines 13–14 define the data attributes face and suit. The variable annotation ": str" indicates that each should refer to string objects:

13     face: str
14     suit: str

Defining a Property and Other Methods

Data classes are classes, so they may contain properties and methods and participate in class hierarchies. For this Card data class, we defined the same read-only image_name property and custom special methods __str__ and __format__ as in our original Card class earlier in the chapter:

15     @property
16     def image_name(self):
17         """Return the Card's image file name."""
18         return str(self).replace(' ', '_') + '.png'
19
20     def __str__(self):
21         """Return string representation for str()."""
22         return f'{self.face} of {self.suit}'
23
24     def __format__(self, format):
25         """Return formatted string representation."""
26         return f'{str(self):{format}}'

Variable Annotation Notes

You can specify variable annotations using built-in type names (like str, int and float), class types or types defined by the typing module (such as ClassVar and List shown earlier). Even with type annotations, Python is still a dynamically typed language, so type annotations are not enforced at execution time. Though a Card’s face is meant to be a string, you can assign any type of object to face, as you’ll do in a Self Check exercise.

Self Check

  1. (Fill-In) Data classes require _________ that specify each class attribute’s or data attribute’s data type.
    Answer: variable annotations.

  2. (Fill-In) The _________ decorator specifies that a new class is a data class.
    Answer: @dataclass.

  3. (True/False) The Python Standard Library’s annotations module defines the variable annotations that are required in data class definitions.
    Answer: False. The typing module defines types, such as ClassVar and List, that are used in data-class variable annotations.

  4. (True/False) Data classes have autogenerated <, <=, > and >= operators by default.
    Answer: False. The == and != operators are autogenerated by default. The <, <=, > and >= operators are autogenerated only if the @dataclass decorator specifies the keyword argument order=True.

10.13.2 Using the Card Data Class

Let’s demonstrate the new Card data class. First, create a Card:

In [1]: from carddataclass import Card

In [2]: c1 = Card(Card.FACES[0], Card.SUITS[3])

Next, let’s use Card’s autogenerated __repr__ method to display the Card:

In [3]: c1
Out[3]: Card(face='Ace', suit='Spades')

Our custom __str__ method, which print calls when you pass it a Card object, returns a string of the form 'face of suit':

In [4]: print(c1)
Ace of Spades

Let’s access our data class’s attributes and read-only property:

In [5]: c1.face
Out[5]: 'Ace'

In [6]: c1.suit
Out[6]: 'Spades'

In [7]: c1.image_name
Out[7]: 'Ace_of_Spades.png'

Next, let’s demonstrate that Card objects can be compared via the autogenerated == operator and inherited != operator. First, create two additional Card objects—one identical to the first and one different:

In [8]: c2 = Card(Card.FACES[0], Card.SUITS[3])

In [9]: c2
Out[9]: Card(face='Ace', suit='Spades')

In [10]: c3 = Card(Card.FACES[0], Card.SUITS[0])

In [11]: c3
Out[11]: Card(face='Ace', suit='Hearts')

Now, compare the objects using == and !=:

In [12]: c1 == c2
Out[12]: True

In [13]: c1 == c3
Out[13]: False

In [14]: c1 != c3
Out[14]: True

Our Card data class is interchangeable with the Card class developed earlier in this chapter. To demonstrate this, we created the deck2.py file containing a copy of class DeckOfCards from earlier in the chapter and imported the Card data class into the file. The following snippets import class DeckOfCards, create an object of the class and print it. Recall that print implicitly calls the DeckOfCards __str__ method, which formats each Card in a field of 19 characters, resulting in a call to each Card’s __format__ method. Read each row left-to-right to confirm that all the Cards are displayed in order from each suit (Hearts, Diamonds, Clubs and Spades):

In [15]: from deck2 import DeckOfCards # uses Card data class

In [16]: deck_of_cards = DeckOfCards()

In [17]: print(deck_of_cards)
Ace of Hearts      2 of Hearts       3 of Hearts      4 of Hearts
5 of Hearts        6 of Hearts       7 of Hearts      8 of Hearts
9 of Hearts        10 of Hearts      Jack of Hearts   Queen of Hearts
King of Hearts     Ace of Diamonds   2 of Diamonds    3 of Diamonds
4 of Diamonds      5 of Diamonds     6 of Diamonds    7 of Diamonds
8 of Diamonds      9 of Diamonds     10 of Diamonds   Jack of Diamonds
Queen of Diamonds  King of Diamonds  Ace of Clubs     2 of Clubs
3 of Clubs         4 of Clubs        5 of Clubs       6 of Clubs
7 of Clubs         8 of Clubs        9 of Clubs       10 of Clubs
Jack of Clubs      Queen of Clubs    King of Clubs    Ace of Spades
2 of Spades        3 of Spades       4 of Spades      5 of Spades
6 of Spades        7 of Spades       8 of Spades      9 of Spades
10 of Spades       Jack of Spades    Queen of Spades  King of Spades

Self Check

  1. (IPython Session) Python is a dynamically typed language, so variable annotations are not enforced on objects of data classes. To prove this, create a Card object, then assign the integer 100 to its face attribute and display the Card. Display the face attribute’s type before and after the assignment.
    Answer:

    In [1]: from carddataclass import Card
    
    In [2]: c = Card('Ace', 'Spades')
    
    In [3]: c
    Out[3]: Card(face='Ace', suit='Spades')
    
    In [4]: type(c.face)
    Out[4]: str
    
    In [5]: c.face = 100
    
    In [6]: c
    Out[6]: Card(face=100, suit='Spades')
    
    In [7]: type(c.face)
    Out[7]: int
    

10.13.3 Data Class Advantages over Named Tuples

Data classes offer several advantages over named tuples:

  • Although each named tuple technically represents a different type, a named tuple is a tuple and all tuples can be compared to one another. So, objects of different named tuple types could compare as equal if they have the same number of members and the same values for those members. Comparing objects of different data classes always returns False, as does comparing a data class object to a tuple object.

  • If you have code that unpacks a tuple, adding more members to that tuple breaks the unpacking code. Data class objects cannot be unpacked. So you can add more data attributes to a data class without breaking existing code.

  • A data class can be a base class or a subclass in an inheritance hierarchy.
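The first two points in the preceding list can be sketched as follows. All four type names are hypothetical and chosen only to contrast named tuples with data classes that have the same members:

```python
from collections import namedtuple
from dataclasses import dataclass

# two hypothetical named tuple types with the same number of members
Point = namedtuple('Point', ['x', 'y'])
Size = namedtuple('Size', ['width', 'height'])


@dataclass
class PointData:  # hypothetical data class counterpart of Point
    x: int
    y: int


@dataclass
class SizeData:  # hypothetical data class counterpart of Size
    width: int
    height: int


print(Point(1, 2) == Size(1, 2))          # True: compared as plain tuples
print(Point(1, 2) == (1, 2))              # True
print(PointData(1, 2) == SizeData(1, 2))  # False: different data classes
print(PointData(1, 2) == (1, 2))          # False

x, y = Point(1, 2)  # a named tuple unpacks like any other tuple

try:
    x, y = PointData(1, 2)  # a data class object is not iterable
except TypeError as error:
    print(f'TypeError: {error}')
```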

10.13.4 Data Class Advantages over Traditional Classes

Data classes also offer various advantages over the traditional Python classes you saw earlier in this chapter:

  • A data class autogenerates __init__, __repr__ and __eq__, saving you time.

  • A data class can autogenerate the special methods that overload the <, <=, > and >= comparison operators.

  • When you change data attributes defined in a data class, then use it in a script or interactive session, the autogenerated code updates automatically. So, you have less code to maintain and debug.

  • The required variable annotations for class attributes and data attributes enable you to take advantage of static code analysis tools. So, you might be able to eliminate additional errors before they can occur at execution time.

  • Some static code analysis tools and IDEs can inspect variable annotations and issue warnings if your code uses the wrong type. This can help you locate logic errors in your code before you execute it. In an end-of-chapter exercise, we ask you to use the static code analysis tool MyPy to demonstrate such warnings.

More Information

Data classes have additional capabilities, such as creating “frozen” instances which do not allow you to assign values to a data class object’s attributes after the object is created. For a complete list of data class benefits and capabilities, see

https://www.python.org/dev/peps/pep-0557/

and

https://docs.python.org/3/library/dataclasses.html

We’ll ask you to experiment with additional data class features in this chapter’s exercises.
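As a brief preview of the “frozen” instances mentioned above, the following sketch uses the decorator's frozen=True keyword argument. The Coordinate class is hypothetical. Attempting to assign to an attribute of a frozen object raises the dataclasses module's FrozenInstanceError:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Coordinate:  # hypothetical class, for illustration only
    latitude: float
    longitude: float


location = Coordinate(42.36, -71.06)

try:
    location.latitude = 0.0  # assignment is not allowed on a frozen object
except FrozenInstanceError as error:
    print(f'FrozenInstanceError: {error}')

print(location)  # the object is unchanged
```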
