This content originally appeared on Level Up Coding - Medium and was authored by Sandipan Dutta
Let us take a look at Python Dataclasses, introduced in Python 3.7.
Dataclasses are a relatively recent addition to Python's standard library that can make your life a lot easier as a Python developer. Even though it is not an advanced topic, we will cover it in this series. The dataclasses module provides a decorator and functions for automatically adding generated special methods such as __init__() and __repr__() to user-defined classes. It was originally described in PEP 557.
So far in this series
- Python Classes, Objects, and MRO
- Dataclasses (current post)
Why and how?
Any new feature exists because something in the existing system lacks functionality or is just not up to the mark (in the case of a replacement). So we need to know why dataclasses were added to Python.
Without using Dataclass:
Assume we are writing a new twitter-like application and this application has a class named “Tweets” for keeping track of tweet details. Let us take a look at what this fast and near production-ready class will be.
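A hand-written version of such a class might look roughly like the sketch below. The attribute names are borrowed from the dataclass version later in this post and only a subset is shown; the helper _as_tuple() and the exact choice of methods are assumptions made for illustration.

```python
from datetime import datetime, timezone
from uuid import uuid4


class Tweets:
    """Hand-written class to store tweets of users (illustrative sketch)."""

    def __init__(self, tweet_body=None, tweet_time=None, tweet_id=None,
                 tweet_lang='en-IN', tweet_retweet_count=0,
                 tweet_hashtags=None):
        self.tweet_body = tweet_body
        # Compute time and id per instance rather than once at definition time.
        self.tweet_time = tweet_time or datetime.now(timezone.utc)
        self.tweet_id = tweet_id or uuid4()
        self.tweet_lang = tweet_lang
        self.tweet_retweet_count = tweet_retweet_count
        # Copy the list so that instances never share one.
        self.tweet_hashtags = list(tweet_hashtags or [])

    def _as_tuple(self):
        # Hypothetical helper: every new attribute has to be added here,
        # in __init__ and in __repr__.
        return (self.tweet_body, self.tweet_time, self.tweet_id,
                self.tweet_lang, self.tweet_retweet_count,
                self.tweet_hashtags)

    def __repr__(self):
        return (f"Tweets(tweet_body={self.tweet_body!r}, "
                f"tweet_time={self.tweet_time!r}, "
                f"tweet_id={self.tweet_id!r}, "
                f"tweet_lang={self.tweet_lang!r}, "
                f"tweet_retweet_count={self.tweet_retweet_count!r}, "
                f"tweet_hashtags={self.tweet_hashtags!r})")

    def __eq__(self, other):
        if not isinstance(other, Tweets):
            return NotImplemented
        return self._as_tuple() == other._as_tuple()
```

Even with only six attributes and three special methods, each attribute name is typed four or five times, and there are no ordering methods yet.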
Oh, wait! Your product manager now wants you to add support for comments on a tweet post, so you need to introduce a new attribute!
Now you need to rewrite almost the entire class. Another two hours wasted.
Using Dataclass:
Okay, now let's try with @dataclass.
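from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional
from uuid import UUID, uuid4

@dataclass(frozen=True, order=True)
class Tweets:
    """Class to store tweets of users."""
    tweet_body: Optional[str] = None
    # default_factory runs once per instance; a bare datetime.utcnow() or
    # uuid4() default would be evaluated once and shared by every Tweets object.
    tweet_time: datetime = field(default_factory=datetime.utcnow)
    tweet_id: UUID = field(default_factory=uuid4)
    tweet_lang: str = 'en-IN'
    tweet_place: str = 'IN'
    tweet_retweet_count: int = 0
    tweet_hashtags: list[str] = field(default_factory=list)  # list[str] needs Python 3.9+
    tweet_user_id: Optional[str] = None
    tweet_user_name: Optional[str] = None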
That's it. We are done. Python has taken care of all the boilerplate code by itself. To add a new field, we just declare it like the ones above and forget about it. Let us take a look at all the methods that are available in this class.
>>> import inspect
>>> inspect.getmembers(Tweets, predicate=inspect.isfunction)
[('__delattr__', <function Tweets.__delattr__ at 0x0000015EE175A440>),
 ('__eq__', <function Tweets.__eq__ at 0x0000015EE1759EA0>),
 ('__ge__', <function Tweets.__ge__ at 0x0000015EE175A320>),
 ('__gt__', <function Tweets.__gt__ at 0x0000015EE175A200>),
 ('__hash__', <function Tweets.__hash__ at 0x0000015EE175A4D0>),
 ('__init__', <function Tweets.__init__ at 0x0000015EE1759C60>),
 ('__le__', <function Tweets.__le__ at 0x0000015EE175A0E0>),
 ('__lt__', <function Tweets.__lt__ at 0x0000015EE1759FC0>),
 ('__repr__', <function Tweets.__repr__ at 0x0000015EE1759BD0>),
 ('__setattr__', <function Tweets.__setattr__ at 0x0000015EE175A3B0>)]
Python has generated __setattr__() and __delattr__(), which raise an error on any attempt to change or delete a field and so make instances immutable. It did this because we passed frozen=True. It added __le__(), __lt__(), __gt__(), and __ge__() because we passed order=True (notice we didn't even implement these four in our old-style class declaration, as the code was already quite lengthy).
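To see these generated methods in action, here is a minimal sketch. MiniTweet is a hypothetical two-field stand-in for Tweets, used only to keep the example short.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True, order=True)
class MiniTweet:  # hypothetical stand-in for Tweets
    tweet_retweet_count: int = 0
    tweet_body: str = ''


a = MiniTweet(1, 'hello')
b = MiniTweet(2, 'hello')

# order=True compares instances as tuples of their fields,
# in the order the fields are declared.
a_is_smaller = a < b
in_order = sorted([b, a])

# frozen=True makes any assignment raise FrozenInstanceError.
try:
    a.tweet_retweet_count = 10
    was_mutated = True
except FrozenInstanceError:
    was_mutated = False

# eq=True (the default) combined with frozen=True also generates __hash__,
# so instances can be stored in sets or used as dict keys.
unique = {a, MiniTweet(1, 'hello'), b}
```

The set ends up with two elements, because the two MiniTweet(1, 'hello') instances compare equal and hash alike.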
Fields with default values must be declared after the fields that have no default. For a mutable default (here, the list object), you need to use a default_factory, which you pass to field(), imported from the dataclasses module. This is very important, as we do not want all the instances of the class to share the same list.
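Both rules can be checked with a short sketch. The class names TaggedTweet, Broken, and AlsoBroken are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedTweet:  # hypothetical name
    tweet_body: str  # no default, so it must come first
    tweet_hashtags: list = field(default_factory=list)


first = TaggedTweet("hello")
second = TaggedTweet("world")
first.tweet_hashtags.append("#python")
# Each instance got its own list, so second.tweet_hashtags is still empty.

# A plain mutable default is rejected when the class is created.
mutable_default_error = None
try:
    @dataclass
    class Broken:
        tweet_hashtags: list = []
except ValueError as exc:
    mutable_default_error = exc

# A field without a default after one with a default is rejected too.
ordering_error = None
try:
    @dataclass
    class AlsoBroken:
        tweet_lang: str = 'en-IN'
        tweet_body: str
except TypeError as exc:
    ordering_error = exc
```

Both mistakes fail at class-definition time rather than when an instance is created, so they are hard to miss.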
If you don't want the generated __init__(), __repr__(), or __eq__() in your class, switch them off in the decorator (unsafe_hash is already False by default):
@dataclass(init=False, repr=False, eq=False, unsafe_hash=False)
class Tweets:
    ...
We also have control over which fields take part in the generated __repr__() and __eq__() methods.
tweet_retweet_count: int = field(repr=False, compare=False, default=0)
Because repr=False, this field is not included in the string returned by the generated __repr__() method, and because compare=False, tweet_retweet_count is excluded from the generated equality and comparison methods (__eq__(), __gt__(), __lt__(), etc.).
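A small sketch of the effect, using a hypothetical CountedTweet class with just two fields:

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class CountedTweet:  # hypothetical name
    tweet_body: str = ''
    tweet_retweet_count: int = field(repr=False, compare=False, default=0)


a = CountedTweet("hello", tweet_retweet_count=5)
b = CountedTweet("hello", tweet_retweet_count=99)

# The retweet count is left out of the generated repr string.
text = repr(a)

# The retweet count is ignored by __eq__ and the ordering methods,
# so a and b are equal and neither is smaller than the other.
same = (a == b)
a_before_b = (a < b)
```

The field is still a normal attribute and an __init__() parameter; it is only hidden from the repr and the comparisons.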
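For post-init processing we have __post_init__(), which the generated __init__() calls as its last step.
from dataclasses import dataclass, field

@dataclass
class ResponseOnTweet:
    # init=False: not an __init__() parameter, computed in __post_init__().
    tweet_total_response_count: int = field(init=False)
    tweet_retweet_count: int = 0
    tweet_comment_count: int = 0
    tweet_like_count: int = 0

    def __post_init__(self):
        # Parentheses let the sum span several lines.
        self.tweet_total_response_count = (
            self.tweet_retweet_count
            + self.tweet_comment_count
            + self.tweet_like_count
        )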
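As a usage sketch, the class is repeated below so that the snippet runs on its own; the counts passed in are made-up values.

```python
from dataclasses import dataclass, field


@dataclass
class ResponseOnTweet:
    tweet_total_response_count: int = field(init=False)
    tweet_retweet_count: int = 0
    tweet_comment_count: int = 0
    tweet_like_count: int = 0

    def __post_init__(self):
        self.tweet_total_response_count = (
            self.tweet_retweet_count
            + self.tweet_comment_count
            + self.tweet_like_count
        )


response = ResponseOnTweet(tweet_retweet_count=3,
                           tweet_comment_count=4,
                           tweet_like_count=10)
total = response.tweet_total_response_count  # 3 + 4 + 10 = 17

# __post_init__() runs only once, at construction time, so the total
# is not recomputed when a count changes afterwards.
response.tweet_like_count += 1
total_after_change = response.tweet_total_response_count  # still 17
```

If the total has to stay in sync with the counts, a property is usually a better fit than a field filled in by __post_init__().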
Conclusion
Isn’t it fun? Well, I certainly enjoy using dataclasses. I think you are also never going back to the old-style methods unless you want something really specific and custom. With this, I will end this post; in the next one, we will look into decorators in Python. As always, if you have any suggestions or thoughts, feel free to reach me on Twitter or LinkedIn. See you soon.
Advanced Python: Dataclasses was originally published in Level Up Coding on Medium, where people are continuing the conversation by highlighting and responding to this story.
Sandipan Dutta | Sciencx (2022-03-08T14:08:20+00:00) Advanced Python: Dataclasses. Retrieved from https://www.scien.cx/2022/03/08/advanced-python-dataclasses/