Python Style
- Global variables
- Exceptions
- Comprehensions & Generator Expressions
- Default Iterators and Operators
- Default Argument Values
- Properties
- True/False Evaluations
- Deprecated Language Features
- Type Annotated Code
- Black
- Comments and Docstrings
- TODO Comments
- Naming
- Names to Avoid
- Naming Conventions
- File Naming
- [Guidelines derived from Guido's Recommendations](#guidelines-derived-from--guido--https---enwikipediaorg-wiki-guido-van-rossum--s-recommendations)
- Function length
- Type Annotations
Avoid global variables.
Variables that are declared at the module level or as class attributes.
Occasionally useful.
Has the potential to change module behavior during the import, because assignments to global variables are done when the module is first imported.
Avoid global variables.
While they are technically variables, module-level constants are permitted and encouraged. For example: MAX_HOLY_HANDGRENADE_COUNT = 3. Constants must be named using all caps with underscores. See Naming below.
If needed, globals should be declared at the module level and made internal to the module by prepending an _ to the name. External access must be done through public module-level functions. See Naming below.
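The rules above can be sketched as follows. This is a minimal illustration, assuming a hypothetical module that tracks a retry count; the names _retry_count, get_retry_count and increment_retry_count are invented for the example.

```python
"""Hypothetical module illustrating constants and internal globals."""

# Module-level constant: permitted and encouraged, named in all caps.
MAX_HOLY_HANDGRENADE_COUNT = 3

# Internal global: the leading underscore marks it as private to this module.
_retry_count = 0


def get_retry_count() -> int:
    """Returns the current retry count."""
    return _retry_count


def increment_retry_count() -> int:
    """Increments the retry count, capped at the constant, and returns it."""
    global _retry_count
    _retry_count = min(_retry_count + 1, MAX_HOLY_HANDGRENADE_COUNT)
    return _retry_count
```

Callers in other modules would go through the two public functions rather than touching _retry_count directly.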
Exceptions are allowed but must be used carefully.
Exceptions are a means of breaking out of the normal flow of control of a code block to handle errors or other exceptional conditions.
The control flow of normal operation code is not cluttered by error-handling code. It also allows the control flow to skip multiple frames when a certain condition occurs, e.g., returning from N nested functions in one step instead of having to carry-through error codes.
May cause the control flow to be confusing. Easy to miss error cases when making library calls.
Exceptions must follow certain conditions:
- Raise exceptions like this: raise MyError('Error message') or raise MyError(). Do not use the two-argument form (raise MyError, 'Error message').
- Make use of built-in exception classes when it makes sense. For example, raise a ValueError to indicate a programming mistake like a violated precondition (such as if you were passed a negative number but required a positive one). Do not use assert statements for validating argument values of a public API. assert is used to ensure internal correctness, not to enforce correct usage nor to indicate that some unexpected event occurred. If an exception is desired in the latter cases, use a raise statement. For example:

  Yes:

      def connect_to_next_port(self, minimum):
          """Connects to the next available port.

          Args:
              minimum: A port value greater or equal to 1024.

          Returns:
              The new minimum port.

          Raises:
              ConnectionError: If no available port is found.
          """
          if minimum < 1024:
              # Note that this raising of ValueError is not mentioned in the doc
              # string's "Raises:" section because it is not appropriate to
              # guarantee this specific behavioral reaction to API misuse.
              raise ValueError(f'Min. port must be at least 1024, not {minimum}.')
          port = self._find_next_open_port(minimum)
          if not port:
              raise ConnectionError(
                  f'Could not connect to service on port {minimum} or higher.')
          assert port >= minimum, (
              f'Unexpected port {port} when minimum was {minimum}.')
          return port

  No:

      def connect_to_next_port(self, minimum):
          """Connects to the next available port.

          Args:
              minimum: A port value greater or equal to 1024.

          Returns:
              The new minimum port.
          """
          assert minimum >= 1024, 'Minimum port must be at least 1024.'
          port = self._find_next_open_port(minimum)
          assert port is not None
          return port

- Libraries or packages may define their own exceptions. When doing so they must inherit from an existing exception class. Exception names should end in Error and should not introduce stutter (foo.FooError).
- Never use catch-all except: statements, or catch Exception or StandardError, unless you are
  - re-raising the exception, or
  - creating an isolation point in the program where exceptions are not propagated but are recorded and suppressed instead, such as protecting a thread from crashing by guarding its outermost block.

  Python is very tolerant in this regard and except: will really catch everything including misspelled names, sys.exit() calls, Ctrl+C interrupts, unittest failures and all kinds of other exceptions that you simply don't want to catch.
- Minimize the amount of code in a try/except block. The larger the body of the try, the more likely that an exception will be raised by a line of code that you didn't expect to raise an exception. In those cases, the try/except block hides a real error.
- Use the finally clause to execute code whether or not an exception is raised in the try block. This is often useful for cleanup, e.g., closing a file.
- When capturing an exception, use as rather than a comma. For example:

      try:
          raise Error()
      except Error as error:
          pass
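Several of the conditions above can be combined in one short sketch. The names ParseError, parse_port and read_first_line are hypothetical and only serve to illustrate a custom exception, a minimal try body, and cleanup in finally.

```python
class ParseError(ValueError):
    """Raised when a port string cannot be parsed (hypothetical example)."""


def parse_port(text: str) -> int:
    """Parses a port number from text.

    Raises:
        ParseError: If text is not a valid integer.
    """
    # Keep the try body minimal: only int() is expected to raise here.
    try:
        port = int(text)
    except ValueError as error:
        raise ParseError(f'Not a port number: {text!r}') from error
    return port


def read_first_line(path: str) -> str:
    """Returns the first line of a file, always closing the handle."""
    handle = open(path)
    try:
        return handle.readline()
    finally:
        # Runs whether or not readline() raised.
        handle.close()
```

In real code a with statement would usually replace the explicit finally; the long form is shown here only to make the cleanup path visible.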
Okay to use for simple cases.
List, Dict, and Set comprehensions as well as generator expressions provide a
concise and efficient way to create container types and iterators without
resorting to the use of traditional loops, map(), filter(), or lambda.
Simple comprehensions can be clearer and simpler than other dict, list, or set creation techniques. Generator expressions can be very efficient, since they avoid the creation of a list entirely.
Complicated comprehensions or generator expressions can be hard to read.
Okay to use for simple cases. Each portion must fit on one line: mapping
expression, for clause, filter expression. Multiple for clauses or filter
expressions are not permitted. Use loops instead when things get more
complicated.
Yes:
    result = [mapping_expr for value in iterable if filter_expr]
    result = [{'key': value} for value in iterable
              if a_long_filter_expression(value)]
    result = [complicated_transform(x)
              for x in iterable if predicate(x)]
    descriptive_name = [
        transform({'key': key, 'value': value}, color='black')
        for key, value in generate_iterable(some_input)
        if complicated_condition_is_met(key, value)
    ]
    result = []
    for x in range(10):
        for y in range(5):
            if x * y > 10:
                result.append((x, y))
    return {x: complicated_transform(x)
            for x in long_generator_function(parameter)
            if x is not None}
    squares_generator = (x**2 for x in range(10))
    unique_names = {user.name for user in users if user is not None}
    eat(jelly_bean for jelly_bean in jelly_beans
        if jelly_bean.color == 'black')
No:
    result = [complicated_transform(
                  x, some_argument=x+1)
              for x in iterable if predicate(x)]
    result = [(x, y) for x in range(10) for y in range(5) if x * y > 10]
    return ((x, y, z)
            for x in range(5)
            for y in range(5)
            if x != y
            for z in range(5)
            if y != z)
Use default iterators and operators for types that support them, like lists, dictionaries, and files.
Container types, like dictionaries and lists, define default iterators and membership test operators ("in" and "not in").
The default iterators and operators are simple and efficient. They express the operation directly, without extra method calls. A function that uses default operators is generic. It can be used with any type that supports the operation.
You can't tell the type of objects by reading the method names (e.g. has_key()
means a dictionary). This is also an advantage.
Use default iterators and operators for types that support them, like lists,
dictionaries, and files. The built-in types define iterator methods, too. Prefer
these methods to methods that return lists, except that you should not mutate a
container while iterating over it. Never use Python 2 specific iteration methods
such as dict.iter*() unless necessary.
Yes:
    for key in adict: ...
    if key not in adict: ...
    if obj in alist: ...
    for line in afile: ...
    for k, v in adict.items(): ...
    for k, v in six.iteritems(adict): ...
No:
    for key in adict.keys(): ...
    if not adict.has_key(key): ...
    for line in afile.readlines(): ...
    for k, v in dict.iteritems(): ...
Okay in most cases.
You can specify values for variables at the end of a function's parameter list,
e.g., def foo(a, b=0):. If foo is called with only one argument, b is set
to 0. If it is called with two arguments, b has the value of the second
argument.
Often you have a function that uses lots of default values, but on rare occasions you want to override the defaults. Default argument values provide an easy way to do this, without having to define lots of functions for the rare exceptions. As Python does not support overloaded methods/functions, default arguments are an easy way of "faking" the overloading behavior.
Default arguments are evaluated once, when the function definition is executed (typically at module load time), not on each call. This may cause problems if the argument is a mutable object such as a list or a dictionary. If the function modifies the object (e.g., by appending an item to a list), the default value is modified.
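The pitfall can be reproduced in a few lines. This is a small sketch with hypothetical function names (append_shared, append_fresh), contrasting a shared mutable default with the None-sentinel idiom.

```python
def append_shared(item, bucket=[]):  # Anti-pattern: one list shared by all calls.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    if bucket is None:
        bucket = []  # A new list is created on every call.
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)
print(second)                            # [1, 2]
print(first is second)                   # True
print(append_fresh(1), append_fresh(2))  # [1] [2]
```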
Okay to use with the following caveat:
Do not use mutable objects as default values in the function or method definition.
Yes:
    def foo(a, b=None):
        if b is None:
            b = []
Yes:
    def foo(a, b: Optional[Sequence] = None):
        if b is None:
            b = []
Yes:
    def foo(a, b: Sequence = ()):  # Empty tuple OK since tuples are immutable
        ...
No:
    def foo(a, b=[]):
        ...
No:
    def foo(a, b=time.time()):  # The time the module was loaded???
        ...
No:
    def foo(a, b=FLAGS.my_thing):  # sys.argv has not yet been parsed...
        ...
No:
    def foo(a, b: Mapping = {}):  # Could still get passed to unchecked code
        ...
Use properties for accessing or setting data where you would normally have used simple, lightweight accessor or setter methods.
A way to wrap method calls for getting and setting an attribute as a standard attribute access when the computation is lightweight.
Readability is increased by eliminating explicit get and set method calls for simple attribute access. Allows calculations to be lazy. Considered the Pythonic way to maintain the interface of a class. In terms of performance, allowing properties bypasses needing trivial accessor methods when a direct variable access is reasonable. This also allows accessor methods to be added in the future without breaking the interface.
Must inherit from object in Python 2. Can hide side-effects much like operator
overloading. Can be confusing for subclasses.
Use properties in new code to access or set data where you would normally have
used simple, lightweight accessor or setter methods. Properties should be
created with the @property decorator.
Inheritance with properties can be non-obvious if the property itself is not overridden. Thus one must make sure that accessor methods are called indirectly to ensure methods overridden in subclasses are called by the property.
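To illustrate why the indirection matters, here is a minimal sketch with hypothetical Shape and Triangle classes: the subclass overrides only the indirect accessor, and the inherited property picks up the new behavior.

```python
class Shape:
    """Base class whose property delegates to an overridable method."""

    def __init__(self, side: float):
        self.side = side

    @property
    def area(self) -> float:
        # Indirect call: looks up _get_area on the instance's actual class.
        return self._get_area()

    def _get_area(self) -> float:
        return self.side ** 2


class Triangle(Shape):
    """Equilateral triangle; overrides only the indirect accessor."""

    def _get_area(self) -> float:
        return (3 ** 0.5 / 4) * self.side ** 2
```

Had area computed self.side ** 2 directly, Triangle would have needed to redefine the property itself to change the result.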
Yes:
    import math

    class Square:
        """A square with two properties: a writable area and a read-only perimeter.

        To use:
        >>> sq = Square(3)
        >>> sq.area
        9
        >>> sq.perimeter
        12
        >>> sq.area = 16
        >>> sq.side
        4.0
        >>> sq.perimeter
        16.0
        """

        def __init__(self, side):
            self.side = side

        @property
        def area(self):
            """Area of the square."""
            return self._get_area()

        @area.setter
        def area(self, area):
            return self._set_area(area)

        def _get_area(self):
            """Indirect accessor to calculate the 'area' property."""
            return self.side ** 2

        def _set_area(self, area):
            """Indirect setter to set the 'area' property."""
            self.side = math.sqrt(area)

        @property
        def perimeter(self):
            return self.side * 4
Use the "implicit" false if at all possible.
Python evaluates certain values as False when in a boolean context. A quick
"rule of thumb" is that all "empty" values are considered false, so 0, None, [], {}, '' all evaluate as false in a boolean context.
Conditions using Python booleans are easier to read and less error-prone. In most cases, they're also faster.
May look strange to C/C++ developers.
Use the "implicit" false if possible, e.g., if foo: rather than if foo != []:. There are a few caveats that you should keep in mind though:
- Always use if foo is None: (or is not None) to check for a None value. E.g., when testing whether a variable or argument that defaults to None was set to some other value. The other value might be a value that's false in a boolean context!
- Never compare a boolean variable to False using ==. Use if not x: instead. If you need to distinguish False from None then chain the expressions, such as if not x and x is not None:.
- For sequences (strings, lists, tuples), use the fact that empty sequences are false, so if seq: and if not seq: are preferable to if len(seq): and if not len(seq): respectively.
- When handling integers, implicit false may involve more risk than benefit (i.e., accidentally handling None as 0). You may compare a value which is known to be an integer (and is not the result of len()) against the integer 0.

  Yes:

      if not users:
          print('no users')
      if foo == 0:
          self.handle_zero()
      if i % 10 == 0:
          self.handle_multiple_of_ten()
      def f(x=None):
          if x is None:
              x = []

  No:

      if len(users) == 0:
          print('no users')
      if foo is not None and not foo:
          self.handle_zero()
      if not i % 10:
          self.handle_multiple_of_ten()
      def f(x=None):
          x = x or []

- Note that '0' (i.e., 0 as string) evaluates to true.
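The caveats above can be checked directly. The following sketch uses a hypothetical describe function to show why None and 0 need to be distinguished explicitly, followed by a few truthiness checks.

```python
def describe(count=None):
    """Distinguishes 'not provided' from a legitimate zero."""
    if count is None:
        return 'not provided'
    if count == 0:
        return 'zero'
    return f'{count} items'


print(describe())            # not provided
print(describe(0))           # zero
print(describe(3))           # 3 items
print(bool('0'))             # True: a non-empty string
print(bool(''))              # False
print(bool([]), bool([0]))   # False True
```

Writing if not count: in describe would have treated both None and 0 as the same case.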
Use string methods instead of the string module where possible. Use function
call syntax instead of apply. Use list comprehensions and for loops instead
of filter and map when the function argument would have been an inlined
lambda anyway. Use for loops instead of reduce.
Current versions of Python provide alternative constructs that people find generally preferable.
We do not use any Python version which does not support these features, so there is no reason not to use the new styles.
Yes:
    words = foo.split(':')
    [x[1] for x in my_list if x[2] == 5]
    map(math.sqrt, data)  # Ok. No inlined lambda expression.
    fn(*args, **kwargs)
No:
    words = string.split(foo, ':')
    map(lambda x: x[1], filter(lambda x: x[2] == 5, my_list))
    apply(fn, args, kwargs)
You can annotate Python 3 code with type hints according to PEP-484, and type-check the code at build time with a type checking tool like pytype or pyright.
Type annotations can be in the source or in a stub pyi file. Whenever possible, annotations should be in the source. Use pyi files for third-party or extension modules.
Type annotations (or "type hints") are for function or method arguments and return values:
    def func(a: int) -> List[int]:
You can also declare the type of a variable using similar PEP-526 syntax:
    a: SomeType = some_func()
Or by using a type comment in code that must support legacy Python versions.
    a = some_func()  # type: SomeType
Type annotations improve the readability and maintainability of your code. The type checker will convert many runtime errors to build-time errors, and reduce your ability to use Power Features.
You will have to keep the type declarations up to date. You might see type errors that you think are valid code. Use of a type checker may reduce your ability to use Power Features.
You are encouraged to enable Python type analysis when updating code. When adding or modifying public APIs, include type annotations and enable checking via pytype or pyright in the build system. As static analysis is relatively new to Python, it is acknowledged that undesired side-effects (such as wrongly inferred types) may prevent adoption by some projects. In those situations, authors are encouraged to add a comment with a TODO or link to a bug describing the issue(s) currently preventing type annotation adoption in the BUILD file or in the code itself as appropriate.
Black is a PEP 8 compliant opinionated formatter. Black reformats entire
files in place. It is not configurable. It doesn't take previous
formatting into account. The main way to influence Black's output is that
it doesn't reformat blocks that start with # fmt: off and
end with # fmt: on. The # fmt: on and # fmt: off markers have to be on the
same level of indentation. To learn more about Black's opinions, go to
the_black_code_style.
Reading the Black README is strongly recommended.
Be sure to use the right style for module, function, method docstrings and inline comments.
Python uses docstrings to document code. A docstring is a string that is the
first statement in a package, module, class or function. These strings can be
extracted automatically through the __doc__ member of the object and are used
by pydoc.
(Try running pydoc on your module to see how it looks.) Always use the three
double-quote """ format for docstrings (per
PEP 257).
A docstring should be organized as a summary line (one physical line not
exceeding 80 characters) terminated by a period, question mark, or exclamation
point. When writing more (encouraged), this must be followed by a blank line,
followed by the rest of the docstring starting at the same cursor position as
the first quote of the first line. There are more formatting guidelines for
docstrings below.
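As a small illustration of the layout described above, the sketch below defines a hypothetical fetch_rows function with a summary line, a blank line, and a longer description, then reads the docstring back at runtime. inspect.getdoc is used because it strips the docstring's indentation.

```python
import inspect


def fetch_rows(keys):
    """Fetches rows for the given keys.

    The longer description starts after a blank line, at the same
    indentation as the opening quotes.
    """
    return {key: None for key in keys}


# The raw docstring is available through __doc__; getdoc() cleans it up.
summary = inspect.getdoc(fetch_rows).splitlines()[0]
print(summary)  # Fetches rows for the given keys.
```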
Files should start with a docstring to make the code easier to understand and maintain. Module headers should follow a standard format and include the following information:
| Name | Value |
|---|---|
| File | Name of the file |
| Project | Name of the repository |
| Created Date | Date the file was created |
| Authors | List of Authors |
| Modified By | Name of the last author to edit the file |
| Last Modified Date | Date of the last modification to the file |
| Summary | A short summary about what the file does |
    """
    File: example.py
    Project: coasteye
    Created Date: Tuesday, 8th September 2020 3:27 PM
    -----
    Authors: Rick Astley
             Gabe Newell
    -----
    Modified By: Rick Astley
    Date Modified: Wednesday, 18th May, 2033 3:33 AM
    -----
    Summary: A summary of the module or program, it should contain an
             overall description of the module or program. Should be
             terminated by a period.
    """
In this section, "function" means a method, function, or generator.
A function must have a docstring, unless it meets all of the following criteria:
- not externally visible
- very short
- obvious
A docstring should give enough information to write a call to the function
without reading the function's code. The docstring should be descriptive-style
("""Fetches rows from a Bigtable.""") rather than imperative-style ("""Fetch rows from a Bigtable."""), except for @property data descriptors, which
should use the same style as attributes. A docstring
should describe the function's calling syntax and its semantics, not its
implementation. For tricky code, comments alongside the code are more
appropriate than using docstrings.
A method that overrides a method from a base class may have a simple docstring
sending the reader to its overridden method's docstring, such as """See base class.""". The rationale is that there is no need to repeat in many places
documentation that is already present in the base method's docstring. However,
if the overriding method's behavior is substantially different from the
overridden method, or details need to be provided (e.g., documenting additional
side effects), a docstring with at least those differences is required on the
overriding method.
Certain aspects of a function should be documented in special sections, listed below. Each section begins with a heading line, which ends with a colon. All sections other than the heading should maintain a hanging indent of two or four spaces (be consistent within a file). These sections can be omitted in cases where the function's name and signature are informative enough that it can be aptly described using a one-line docstring.
Args:
: List each parameter by name. A description should follow the name, and be
separated by a colon followed by either a space or newline. If the
description is too long to fit on a single 80-character line, use a hanging
indent of 2 or 4 spaces more than the parameter name (be consistent with the
rest of the docstrings in the file). The description should include required
type(s) if the code does not contain a corresponding type annotation. If a
function accepts *foo (variable length argument lists) and/or **bar
(arbitrary keyword arguments), they should be listed as *foo and **bar.
Returns: (or Yields: for generators)
: Describe the type and semantics of the return value. If the function only
returns None, this section is not required. It may also be omitted if the
docstring starts with Returns or Yields (e.g. """Returns row from Bigtable as a tuple of strings.""") and the opening sentence is sufficient to
describe return value.
Raises:
: List all exceptions that are relevant to the interface followed by a description. Use a similar exception name + colon + space or newline and hanging indent style as described in Args:. You should not document exceptions that get raised if the API specified in the docstring is violated (because this would paradoxically make behavior under violation of the API part of the API).
    def fetch_smalltable_rows(table_handle: smalltable.Table,
                              keys: Sequence[Union[bytes, str]],
                              require_all_keys: bool = False,
                             ) -> Mapping[bytes, Tuple[str]]:
        """Fetches rows from a Smalltable.

        Retrieves rows pertaining to the given keys from the Table instance
        represented by table_handle. String keys will be UTF-8 encoded.

        Args:
            table_handle: An open smalltable.Table instance.
            keys: A sequence of strings representing the key of each table
                row to fetch. String keys will be UTF-8 encoded.
            require_all_keys: Optional; If require_all_keys is True only
                rows with values set for all keys will be returned.

        Returns:
            A dict mapping keys to the corresponding table row data
            fetched. Each row is represented as a tuple of strings. For
            example:

            {b'Serak': ('Rigel VII', 'Preparer'),
             b'Zim': ('Irk', 'Invader'),
             b'Lrrr': ('Omicron Persei 8', 'Emperor')}

            Returned keys are always bytes. If a key from the keys argument is
            missing from the dictionary, then that row was not found in the
            table (and require_all_keys must have been False).

        Raises:
            IOError: An error occurred accessing the smalltable.
        """
Similarly, this variation on Args: with a line break is also allowed:
    def fetch_smalltable_rows(table_handle: smalltable.Table,
                              keys: Sequence[Union[bytes, str]],
                              require_all_keys: bool = False,
                             ) -> Mapping[bytes, Tuple[str]]:
        """Fetches rows from a Smalltable.

        Retrieves rows pertaining to the given keys from the Table instance
        represented by table_handle. String keys will be UTF-8 encoded.

        Args:
            table_handle:
                An open smalltable.Table instance.
            keys:
                A sequence of strings representing the key of each table row to
                fetch. String keys will be UTF-8 encoded.
            require_all_keys:
                Optional; If require_all_keys is True only rows with values set
                for all keys will be returned.

        Returns:
            A dict mapping keys to the corresponding table row data
            fetched. Each row is represented as a tuple of strings. For
            example:

            {b'Serak': ('Rigel VII', 'Preparer'),
             b'Zim': ('Irk', 'Invader'),
             b'Lrrr': ('Omicron Persei 8', 'Emperor')}

            Returned keys are always bytes. If a key from the keys argument is
            missing from the dictionary, then that row was not found in the
            table (and require_all_keys must have been False).

        Raises:
            IOError: An error occurred accessing the smalltable.
        """
Classes should have a docstring below the class definition describing the class.
If your class has public attributes, they should be documented here in an
Attributes section and follow the same formatting as a
function's Args section.
    class SampleClass:
        """Summary of class here.

        Longer class information....
        Longer class information....

        Attributes:
            likes_spam: A boolean indicating if we like SPAM or not.
            eggs: An integer count of the eggs we have laid.
        """

        def __init__(self, likes_spam=False):
            """Inits SampleClass with blah."""
            self.likes_spam = likes_spam
            self.eggs = 0

        def public_method(self):
            """Performs operation blah."""
The final place to have comments is in tricky parts of the code. If you're going to have to explain it at the next code review, you should comment it now. Complicated operations get a few lines of comments before the operations commence. Non-obvious ones get comments at the end of the line.
    # We use a weighted dictionary search to find out where i is in
    # the array. We extrapolate position based on the largest num
    # in the array and the array size and then do binary search to
    # get the exact number.
    if i & (i-1) == 0:  # True if i is 0 or a power of 2.
To improve legibility, these comments should start at least 2 spaces away from
the code with the comment character #, followed by at least one space before
the text of the comment itself.
On the other hand, never describe the code. Assume the person reading the code knows Python (though not what you're trying to do) better than you do.
    # BAD COMMENT: Now go through the b array and make sure whenever i occurs
    # the next element is i+1
Pay attention to punctuation, spelling, and grammar; it is easier to read well-written comments than badly written ones.
Comments should be as readable as narrative text, with proper capitalization and punctuation. In many cases, complete sentences are more readable than sentence fragments. Shorter comments, such as comments at the end of a line of code, can sometimes be less formal, but you should be consistent with your style.
Although it can be frustrating to have a code reviewer point out that you are using a comma when you should be using a semicolon, it is very important that source code maintain a high level of clarity and readability. Proper punctuation, spelling, and grammar help with that goal.
Use TODO comments for code that is temporary, a short-term solution, or
good-enough but not perfect.
A TODO comment begins with the string TODO in all caps and a parenthesized
name, e-mail address, or other identifier
of the person or issue with the best context about the problem. This is followed
by an explanation of what there is to do.
The purpose is to have a consistent TODO format that can be searched to find
out how to get more details. A TODO is not a commitment that the person
referenced will fix the problem. Thus when you create a
TODO, it is almost always your name
that is given.
    # TODO(Michael): Use a "*" here for string repetition.
    # TODO(YourNameHere) Change this to use relations.
If your TODO is of the form "At a future date do something" make sure that you
either include a very specific date ("Fix by November 2009") or a very specific
event ("Remove this code when all clients can handle XML responses.").
module_name, package_name, ClassName, method_name, ExceptionName,
function_name, GLOBAL_CONSTANT_NAME, global_var_name, instance_var_name,
function_parameter_name, local_var_name.
Function names, variable names, and filenames should be descriptive; eschew abbreviation. In particular, do not use abbreviations that are ambiguous or unfamiliar to readers outside your project, and do not abbreviate by deleting letters within a word.
Always use a .py filename extension. Never use dashes.
- single character names, except for specifically allowed cases:
  - counters or iterators (e.g. i, j, k, v, et al)
  - e as an exception identifier in try/except statements.
  - f as a file handle in with statements

  Please be mindful not to abuse single-character naming. Generally speaking, descriptiveness should be proportional to the name's scope of visibility. For example, i might be a fine name for a 5-line code block but within multiple nested scopes, it is likely too vague.
- dashes (-) in any package/module name
- __double_leading_and_trailing_underscore__ names (reserved by Python)
- offensive terms
- "Internal" means internal to a module, or protected or private within a class.
- Prepending a single underscore (_) has some support for protecting module variables and functions (linters will flag protected member access). While prepending a double underscore (__ aka "dunder") to an instance variable or method effectively makes the variable or method private to its class (using name mangling) we discourage its use as it impacts readability and testability and isn't really private.
- Place related classes and top-level functions together in a module. Unlike Java, there is no need to limit yourself to one class per module.
- Use CapWords for class names, but lower_with_under.py for module names. Although there are some old modules named CapWords.py, this is now discouraged because it's confusing when the module happens to be named after a class. ("wait -- did I write import StringIO or from StringIO import StringIO?")
- Underscores may appear in unittest method names starting with test to separate logical components of the name, even if those components use CapWords. One possible pattern is test<MethodUnderTest>_<state>; for example testPop_EmptyStack is okay. There is no One Correct Way to name test methods.
Python filenames must have a .py extension and must not contain dashes (-).
This allows them to be imported and unittested. If you want an executable to be
accessible without the extension, use a symbolic link or a simple bash wrapper
containing exec "$0.py" "$@".
Guidelines derived from Guido's Recommendations
| Type | Public | Internal |
|---|---|---|
| Packages | lower_with_under | |
| Modules | lower_with_under | _lower_with_under |
| Classes | CapWords | _CapWords |
| Exceptions | CapWords | |
| Functions | lower_with_under() | _lower_with_under() |
| Global/Class Constants | CAPS_WITH_UNDER | _CAPS_WITH_UNDER |
| Global/Class Variables | lower_with_under | _lower_with_under |
| Instance Variables | lower_with_under | _lower_with_under (protected) |
| Method Names | lower_with_under() | _lower_with_under() (protected) |
| Function/Method Parameters | lower_with_under | |
| Local Variables | lower_with_under | |
Prefer small and focused functions.
It is recognized that long functions are sometimes appropriate, so no hard limit is placed on function length. If a function exceeds about 40 lines, think about whether it can be broken up without harming the structure of the program.
Even if your long function works perfectly now, someone modifying it in a few months may add new behavior. This could result in bugs that are hard to find. Keeping your functions short and simple makes it easier for other people to read and modify your code.
You could find long and complicated functions when working with some code. Do not be intimidated by modifying existing code: if working with such a function proves to be difficult, you find that errors are hard to debug, or you want to use a piece of it in several different contexts, consider breaking up the function into smaller and more manageable pieces.
- Familiarize yourself with PEP-484.
- In methods, only annotate self, or cls if it is necessary for proper type information. e.g.,

      @classmethod
      def create(cls: Type[T]) -> T:
          return cls()

- If any other variable or a returned type should not be expressed, use Any.
- You are not required to annotate all the functions in a module.
- At least annotate your public APIs.
- Use judgment to get to a good balance between safety and clarity on the one hand, and flexibility on the other.
- Annotate code that is prone to type-related errors (previous bugs or complexity).
- Annotate code that is hard to understand.
- Annotate code as it becomes stable from a types perspective. In many cases, you can annotate all the functions in mature code without losing too much flexibility.
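The rule about annotating cls can be made concrete with a small sketch. Widget and Button are hypothetical classes; the TypeVar is bound to Widget so that a type checker can infer the subclass as the return type of create.

```python
from typing import Any, Type, TypeVar

T = TypeVar('T', bound='Widget')


class Widget:
    @classmethod
    def create(cls: Type[T]) -> T:
        # Annotating cls lets a checker infer the subclass as the return type.
        return cls()

    def describe(self, extra: Any = None) -> str:
        # self is left unannotated; Any marks a deliberately untyped value.
        return f'{type(self).__name__}({extra!r})'


class Button(Widget):
    pass


button = Button.create()
print(button.describe())  # Button(None)
```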
After annotating, many function signatures will become "one parameter per line".
    def my_method(self,
                  first_var: int,
                  second_var: Foo,
                  third_var: Optional[Bar]) -> int:
        ...
Always prefer breaking between variables, and not, for example, between variable names and type annotations. However, if everything fits on the same line, go for it.
    def my_method(self, first_var: int) -> int:
        ...
If the combination of the function name, the last parameter, and the return type is too long, indent by 4 in a new line.
    def my_method(
            self, first_var: int) -> Tuple[MyLongType1, MyLongType1]:
        ...
When the return type does not fit on the same line as the last parameter, the
preferred way is to indent the parameters by 4 on a new line and align the
closing parenthesis with the def.
Yes:
    def my_method(
        self, other_arg: Optional[MyLongType]
    ) -> Dict[OtherLongType, MyLongType]:
        ...
pylint allows you to move the closing parenthesis to a new line and align with the
opening one, but this is less readable.
No:
    def my_method(self,
                  other_arg: Optional[MyLongType]
                 ) -> Dict[OtherLongType, MyLongType]:
        ...
As in the examples above, prefer not to break types. However, sometimes they are too long to be on a single line (try to keep sub-types unbroken).
def my_method(
    self,
    first_var: Tuple[List[MyLongType1],
                     List[MyLongType2]],
    second_var: List[Dict[
        MyLongType3, MyLongType4]]) -> None:
    ...
If a single name and type is too long, consider using an alias for the type. The last resort is to break after the colon and indent by 4.
Yes:
def my_function(
    long_variable_name:
        long_module_name.LongTypeName,
) -> None:
    ...
No:
def my_function(
    long_variable_name: long_module_name.
        LongTypeName,
) -> None:
    ...
If you need to use a class name from the same module that is not yet defined -- for example, if you need the class inside the class declaration, or if you use a class that is defined below -- use a string for the class name.
class MyClass:
    def __init__(self,
                 stack: List["MyClass"]) -> None:
        ...
As per PEP 8, use spaces around the = only for arguments that have both a type annotation and a default value.
Yes:
def func(a: int = 0) -> int:
    ...
No:
def func(a:int=0) -> int:
    ...
In the Python type system, NoneType is a "first class" type, and for typing purposes, None is an alias for NoneType. If an argument can be None, it has to be declared! You can use Union, but if there is only one other type, use Optional.
Use explicit Optional instead of implicit Optional. Earlier versions of PEP 484 allowed a: Text = None to be interpreted as a: Optional[Text] = None, but that is no longer the preferred behavior.
Yes:
def func(a: Optional[Text], b: Optional[Text] = None) -> Text:
    ...
def multiple_nullable_union(a: Union[None, Text, int]) -> Text:
    ...
No:
def nullable_union(a: Union[None, Text]) -> Text:
    ...
def implicit_optional(a: Text = None) -> Text:
    ...
You can declare aliases of complex types. The name of an alias should be CapWorded. If the alias is used only in this module, it should be _Private.
For example, if the name of the module together with the name of the type is too long:
_ShortName = module_with_long_name.TypeWithLongName
ComplexMap = Mapping[Text, List[Tuple[int, int]]]
Other examples are complex nested types and multiple return variables from a function (as a tuple).
You can disable type checking on a line with the special comment # type: ignore.
pytype has a disable option for specific errors (similar to lint):
# pytype: disable=attribute-error
If an internal variable has a type that is hard or impossible to infer, you can specify its type in a couple of ways.
Type Comments: Use a # type: comment on the end of the line.
a = SomeUndecoratedFunction()  # type: Foo
Annotated Assignments: Use a colon and type between the variable name and value, as with function arguments.
a: Foo = SomeUndecoratedFunction()
Typed lists can only contain objects of a single type. Typed tuples can either have a single repeated type or a set number of elements with different types. The latter is commonly used as the return type from a function.
a = [1, 2, 3]  # type: List[int]
b = (1, 2, 3)  # type: Tuple[int, ...]
c = (1, "2", 3.5)  # type: Tuple[int, Text, float]
The Python type system has generics. The factory function TypeVar is a common way to use them.
Example:
from typing import List, TypeVar
T = TypeVar("T")
...
def next(l: List[T]) -> T:
    return l.pop()
A TypeVar can be constrained:
AddableType = TypeVar("AddableType", int, float, Text)
def add(a: AddableType, b: AddableType) -> AddableType:
    return a + b
A common predefined type variable in the typing module is AnyStr. Use it for multiple annotations that can be bytes or unicode and must all be the same type.
from typing import AnyStr
def check_length(x: AnyStr) -> AnyStr:
    if len(x) <= 42:
        return x
    raise ValueError()
The proper type for annotating strings depends on what versions of Python the code is intended for.
For Python 3 only code, prefer to use str. Text is also acceptable. Be
consistent in using one or the other.
For Python 2 compatible code, use Text. In some rare cases, str may make
sense; typically to aid compatibility when the return types aren't the same
between the two Python versions. Avoid using unicode: it doesn't exist in
Python 3.
This discrepancy exists because str means different things depending on the Python version.
No:
def py2_code(x: str) -> unicode:
    ...
For code that deals with binary data, use bytes.
def deals_with_binary_data(x: bytes) -> bytes:
    ...
For Python 2 compatible code that processes text data (str or unicode in Python 2, str in Python 3), use Text. For Python 3 only code that processes text data, prefer str.
from typing import Text
...
def py2_compatible(x: Text) -> Text:
    ...
def py3_only(x: str) -> str:
    ...
If the type can be either bytes or text, use Union, with the appropriate text type.
from typing import Text, Union
...
def py2_compatible(x: Union[bytes, Text]) -> Union[bytes, Text]:
    ...
def py3_only(x: Union[bytes, str]) -> Union[bytes, str]:
    ...
If all the string types of a function are always the same, for example if the return type is the same as the argument type in the code above, use AnyStr.
Writing it like this will simplify the process of porting the code to Python 3.
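A minimal sketch of the AnyStr case, where the argument and return types always match. The function name is hypothetical. It relies only on slicing and concatenation, which behave the same way for str and bytes.

```python
from typing import AnyStr


def first_and_last(x: AnyStr) -> AnyStr:
    """Returns the first and last element of x joined, preserving its type."""
    # Slicing (rather than indexing) keeps bytes as bytes and str as str.
    return x[:1] + x[-1:]
```

Because the signature uses AnyStr rather than Union, a type checker can tell that passing bytes yields bytes and passing text yields text.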
For classes from the typing module, always import the class itself. You are
explicitly allowed to import multiple specific classes on one line from the
typing module. Ex:
from typing import Any, Dict, Optional
Given that this way of importing from typing adds items to the local namespace, any names in typing should be treated similarly to keywords, and not be defined in your Python code, typed or not. If there is a collision between a type and an existing name in a module, import it using import x as y.
from typing import Any as AnyType
Use conditional imports only in exceptional cases where the additional imports needed for type checking must be avoided at runtime. This pattern is discouraged; alternatives such as refactoring the code to allow top level imports should be preferred.
Imports that are needed only for type annotations can be placed within an if TYPE_CHECKING: block.
- Conditionally imported types need to be referenced as strings, to be forward compatible with Python 3.6 where the annotation expressions are actually evaluated.
- Only entities that are used solely for typing should be defined here; this includes aliases. Otherwise it will be a runtime error, as the module will not be imported at runtime.
- The block should be right after all the normal imports.
- There should be no empty lines in the typing imports list.
- Sort this list as if it were a regular imports list.
import typing
if typing.TYPE_CHECKING:
    import sketch
def f(x: "sketch.Sketch"): ...
Circular dependencies that are caused by typing are code smells. Such code is a good candidate for refactoring. Although technically it is possible to keep circular dependencies, the build system will not let you do so because each module has to depend on the other.
Replace modules that create circular dependency imports with Any. Set an
alias with a meaningful name, and use the real type name from
this module (any attribute of Any is Any). Alias definitions should be separated
from the last import by one line.
from typing import Any
some_mod = Any  # some_mod.py imports this module.
...
def my_method(self, var: "some_mod.SomeType") -> None:
    ...
When annotating, prefer to specify type parameters for generic types; otherwise, the generics' parameters will be assumed to be Any.
def get_names(employee_ids: List[int]) -> Dict[int, Any]:
    ...
# These are both interpreted as get_names(employee_ids: List[Any]) -> Dict[Any, Any]
def get_names(employee_ids: list) -> Dict:
    ...
def get_names(employee_ids: List) -> Dict:
    ...
If the best type parameter for a generic is Any, make it explicit, but remember that in many cases TypeVar might be more appropriate:
def get_names(employee_ids: List[Any]) -> Dict[Any, Text]:
    """Returns a mapping from employee ID to employee name for given IDs."""
T = TypeVar('T')
def get_names(employee_ids: List[T]) -> Dict[T, Text]:
    """Returns a mapping from employee ID to employee name for given IDs."""
BE CONSISTENT.
If you're editing code, take a few minutes to look at the code around you and determine its style. If they use spaces around all their arithmetic operators, you should too. If their comments have little boxes of hash marks around them, make your comments have little boxes of hash marks around them too.
The point of having style guidelines is to have a common vocabulary of coding so people can concentrate on what you're saying rather than on how you're saying it. We present global style rules here so people know the vocabulary, but local style is also important. If code you add to a file looks drastically different from the existing code around it, it throws readers out of their rhythm when they go to read it. Avoid this.