Python type annotations and why you should use them
In this post we’ll take a closer look at Python type annotations and some immediate benefits that come with using them. The first part gives an overview of type annotation basics. The second part shows how to leverage them to find bugs, speed up your code, and create a basic web service for accessing a library in just a few lines of code.
Python type annotations
Python is dynamically typed
Let’s start with the obvious. Python types are checked at runtime, which means that until a given line of code is run you won’t know if there are type errors in it.
if always_false():
    v = "me" + 1
else:
    v = 1 + 1
# up until now it's totally "ok"
v = "U" + v
# and now we have a TypeError
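To see the same effect in a form you can run, here is a minimal sketch. The function `describe` and its parameters are made up for illustration. The point is only that the faulty branch raises an error when it is actually executed, not earlier.

```python
def describe(count, shout=False):
    if shout:
        # Bug: str + int. Python only notices when this line runs.
        return "COUNT: " + count
    return "count: " + str(count)


# The common path works, so the bug stays hidden.
ok_result = describe(3)

# The faulty branch fails only once it is executed.
try:
    describe(3, shout=True)
    error_raised = False
except TypeError:
    error_raised = True

print(ok_result, error_raised)
```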
Dynamic typing has its benefits. It leads to a faster development cycle and less verbose code. Static types have their own benefits too:
- they document the code,
- they allow a static checker to catch some bugs before execution,
- they make autocomplete in code editors work much better,
- they make code meta-analysis much easier,
- they enable better optimization when compiling the code,
- they allow auto-generated APIs and documentation.
Python added support for type hints in version 3.5. They are:
- totally optional,
- not enforced at runtime.
This means you can write your code without them, use them everywhere, or just sprinkle them here and there. You can also ignore them.
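A short sketch of what "not enforced at runtime" means in practice. The function `double` is a hypothetical example. The interpreter stores the annotations on the function object but does not check arguments against them.

```python
def double(x: int) -> int:
    return x * 2


# The annotation says int, but nothing stops us from passing a str.
result = double("ab")

# Annotations are simply stored on the function object.
hints = double.__annotations__

print(result)
print(hints)
```

A static checker such as mypy would flag the `double("ab")` call, while the interpreter runs it without complaint.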
Basic annotations
Let’s now look at some basic type annotation examples (the variable annotation syntax requires Python 3.6 or later).
# variables
pi: float = 3.142
# functions
def coach(name: str, boost_more: bool = False) -> str:
    return name + ', you got it!' + ('!!' if boost_more else '')
# classes
class B():
    pass
b: B = B()
# built-in container types
bases: list = ["A", "T", "C", "G"]
coordinates: tuple = (2, 7)
factor: dict = {0: "WT", 1: "MT"}
Composite types
The next important group of type annotations is composite types. The most commonly used examples are containers:
from typing import List, Tuple, Dict
bases: List[str] = ["A", "T", "C", "G"]
coordinates: Tuple[int, int] = (2, 7)
factor: Dict[int, str] = {0: "WT", 1: "MT"}
In Python 3.9 and later you can simply use list[str], tuple[int, int] and dict[int, str].
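As a small illustration of the built-in generic syntax, here is a sketch that assumes Python 3.9 or later. The helper names `count_bases` and `first_and_last` are invented for this example.

```python
def count_bases(sequence: str) -> dict[str, int]:
    # Map each base to the number of times it occurs.
    counts: dict[str, int] = {}
    for base in sequence:
        counts[base] = counts.get(base, 0) + 1
    return counts


def first_and_last(items: list[str]) -> tuple[str, str]:
    # Return the first and the last element of a non-empty list.
    return items[0], items[-1]


counts = count_bases("ATTG")
ends = first_and_last(["A", "T", "C", "G"])
print(counts, ends)
```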
Functions as arguments and type aliases
Python also allows passing functions as arguments. To annotate a function, use Callable:
from typing import Callable
def coach_kamil(method):
    return method('Kamil', True)
# with types:
def coach_kamil(method: Callable[[str, bool], str]) -> str:
    return method('Kamil', True)
If it seems like annotations are getting too complicated and hard to read you can use type aliases:
CoachMethod = Callable[[str, bool], str]
def coach_kamil(method: CoachMethod) -> str:
    return method('Kamil', True)
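Putting the pieces together, the following sketch passes two different functions through the same alias. `coach` is the function from the basic annotations section, while `coach_quietly` is a hypothetical second implementation added only to show that any function matching the signature is accepted.

```python
from typing import Callable

# Alias for "a function taking (str, bool) and returning str".
CoachMethod = Callable[[str, bool], str]


def coach(name: str, boost_more: bool = False) -> str:
    return name + ', you got it!' + ('!!' if boost_more else '')


def coach_quietly(name: str, boost_more: bool = False) -> str:
    # Hypothetical alternative with the same signature.
    return name.lower() + ' is doing fine'


def coach_kamil(method: CoachMethod) -> str:
    return method('Kamil', True)


loud = coach_kamil(coach)
quiet = coach_kamil(coach_quietly)
print(loud)
print(quiet)
```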
Optional arguments and Union types
A commonly used Python pattern is an optional argument that defaults to None. To annotate a type which can be either something or None, use Optional:
from typing import Optional
# Task stands for some callable type defined elsewhere
def do_work(task: Optional[Task] = None):
    if task is not None:
        task()
Optional is just a Union type, which lets you specify the list of types the variable can be, e.g. Union[float, int]. Here is how you could define the Optional type with Union:
Optional[T] == Union[T, None]
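A runnable sketch of both forms follows. The functions `scale` and `parse_position` are hypothetical and exist only to show where Union and Optional typically appear.

```python
from typing import Optional, Union


def scale(value: Union[int, float], factor: float = 2.0) -> float:
    # Accepts either an int or a float.
    return float(value) * factor


def parse_position(raw: Optional[str] = None) -> Optional[int]:
    # Returns None when no input was given, otherwise the parsed integer.
    if raw is None:
        return None
    return int(raw)


# Optional[int] and Union[int, None] describe the same type.
same_type = Optional[int] == Union[int, None]

print(scale(3), parse_position(), parse_position("7"), same_type)
```

From Python 3.10 the same types can also be written as `int | float` and `str | None`.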
Code example
Now let’s introduce a small, practical code example that demonstrates the use of type annotations and will be used in the rest of the post. You may skip the example for now and come back to it later if you want.
The code below is a trivial implementation of an algorithm for finding open reading frames (ORFs) in a DNA sequence. An ORF is just a fragment of DNA that may be translated into protein, but that knowledge is not necessary to go on. All you need to know is that the algorithm finds substrings that start with one of the given start strings and end with one of the given stop strings, each of length 3.
from collections import namedtuple
from typing import Optional
from typing import List
from typing import Literal # new in python 3.8
# Iterable: for classes that provide the `__iter__()` method
from typing import Iterable
# Collection: for classes that provide the `__iter__()`, `__len__()` and
# `__contains__()` methods
from typing import Collection
# we define a class which will hold found ORFs
Orf = namedtuple('Orf', ['seq', 'start', 'strand'])
BASE_MAP = str.maketrans('ATCG', 'TAGC')
def complementary(sequence: str) -> str:
    return sequence.translate(BASE_MAP)
# method implementing search for the orfs on one dna strand
def find_orf_one_strand(
    sequence: str,
    start_codons: Collection[str],
    stop_codons: Collection[str],
    strand: Literal['f', 'r']
) -> Iterable[Orf]:
    for offset in range(3):  # for all reading frames
        current_sequence_start: Optional[int] = None
        for i in range(offset, len(sequence) - 2, 3):
            if current_sequence_start is not None:
                # extending current sequence until stop_codon
                # is found
                if sequence[i:i+3] in stop_codons:
                    yield Orf(
                        sequence[current_sequence_start:i+3],
                        current_sequence_start,
                        strand
                    )
                    current_sequence_start = None
            elif sequence[i:i+3] in start_codons:
                # starting a new sequence
                current_sequence_start = i
# main method which will call find_orf_one_strand based on `strand` argument,
# either on both strands or just one
def find_orf(
    sequence: str,
    start_codons: List[str] = ['TTG','CTG','ATG'],
    stop_codons: List[str] = ['TAA','TAG','TGA'],
    strand: Literal['b', 'f', 'r'] = 'b'
) -> List[Orf]:
    result: List[Orf] = []
    if strand in ('b', 'f'):
        result.extend(find_orf_one_strand(
            sequence, start_codons, stop_codons, 'f'
        ))
    if strand in ('b', 'r'):
        result.extend(find_orf_one_strand(
            complementary(sequence), start_codons, stop_codons, 'r'
        ))
    return result
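The Orf tuple above is created with collections.namedtuple, so its fields carry no type information. One possible alternative, shown here only as a sketch and not as part of the original library, is typing.NamedTuple, which lets each field be annotated. The field types below are assumptions based on how the tuple is filled in the code above.

```python
from typing import Literal, NamedTuple


class Orf(NamedTuple):
    seq: str                    # the ORF sequence itself
    start: int                  # index of the first base of the ORF
    strand: Literal['f', 'r']   # strand the ORF was found on


orf = Orf("CTGTAA", 9, "r")

# It still behaves like a plain tuple: unpacking and comparison work as before.
seq, start, strand = orf
print(orf.seq, start, strand)
```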
Static code checking with mypy
Now let’s take a look at the basic use of type annotations: static code analysis. We will use the mypy library. You can install it with pip by running pip install mypy. After that, usage is as simple as running mypy python_file_to_check.py.
Let’s put an incorrect call to our library in an example file and run a check on it:
from orf_finder import find_orf
orfs = find_orf(
    "ATGGGGGTAGACATTCAGATGAATATATATTAGATGTTTTTTTAG",
    start_codons="ATG"
)
Do you know what the problem is? mypy does:
snippets.py:5: error: Argument "start_codons" to "find_orf"
has incompatible type "str"; expected "List[str]"
We passed a string instead of a list as the start_codons argument. This error would not break the code at runtime, but it could lead to subtle problems.
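Why does a string slip through at runtime? The ORF finder tests codons with the `in` operator, which performs a membership test on a list but a substring test on a string. The sketch below uses a hypothetical mistake, two start codons glued into one string, to show how the results can differ without any exception being raised.

```python
codon = "TGT"  # not a start codon

start_codons_list = ["ATG", "TTG"]
start_codons_str = "ATGTTG"  # hypothetical mistake: codons glued into one string

# Membership test on a list: is "TGT" one of the elements?
in_list = codon in start_codons_list

# Substring test on a string: does "TGT" occur anywhere in "ATGTTG"?
in_string = codon in start_codons_str

print(in_list, in_string)
```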
Compiling python modules with mypyc
Now let’s move to another part of the mypy project: the mypyc compiler. It’s an experimental program which creates a compiled C extension (a *.so file on Linux and macOS) that you can import in your code like any other Python module. The creators of the library advertise compiled code to be up to 4 times faster than the original. The drawback is that mypyc is a work in progress and many things still do not work.
Let’s put it to the test by compiling our example, the ORF Finder. It’s as simple as running:
mypyc orf_finder.py
Now let’s run a simple test:
# non-compiled version
python -m timeit -s 'from orf_finder import find_orf; seq = "ATGGGGGTAGACATTCAGATGAATATATATTAGATGTTTTTTTAG"' 'find_orf(seq)'
20000 loops, best of 5: 15.3 usec per loop
# compiled version
python -m timeit -s 'from orf_finder import find_orf; seq = "ATGGGGGTAGACATTCAGATGAATATATATTAGATGTTTTTTTAG"' 'find_orf(seq)'
50000 loops, best of 5: 9.23 usec per loop
As you can see, the compiled library runs in about 40% less time (roughly 1.7 times faster), which for me is a bargain for those few annotations and running one command.
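The comparison can be worked out directly from the two timeit results above. This is plain arithmetic on the reported numbers, with variable names chosen just for this sketch.

```python
interpreted_usec = 15.3   # per loop, non-compiled version
compiled_usec = 9.23      # per loop, compiled version

# How many times faster the compiled version is.
speedup = interpreted_usec / compiled_usec

# Fraction of the original running time that was saved.
time_saved = 1 - compiled_usec / interpreted_usec

print(f"speedup: {speedup:.2f}x, time saved: {time_saved:.0%}")
```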
Making an API using FastAPI
Let’s imagine that we wrote the library and now we want to allow other people to use it as a web service. Normally we would need to write some web server code implementing an API to wrap our library and provide some documentation for that API. What if we could just skip it? The FastAPI library lets us leverage type annotations to run our library as a web app. It will also generate request validators and API documentation.
Here’s the listing of all extra code needed to make that happen:
from fastapi import FastAPI, Query
# List, Literal and Orf come from the library code shown earlier
app = FastAPI()
@app.get('/orf/{sequence}')  # here we define an endpoint with one path parameter
def find_orf(
    sequence: str,
    start_codons: List[str] = Query(['TTG','CTG','ATG']),
    # we need to use Query annotation to let library know about our list query argument
    stop_codons: List[str] = Query(['TAA','TAG','TGA']),
    strand: Literal['b', 'f', 'r'] = 'b'
) -> List[Orf]:
    ...
Now we can run our app using an async web server, e.g. uvicorn:
# installing the libraries
pip install fastapi "uvicorn[standard]"
# running the server
uvicorn main:app --reload
Now let’s run a few queries:
curl "localhost:8000/orf/ATGGGGGTAGACATTCAGATGAATATATATTAGATGTTTTTTTAG"
# [["ATGGGGGTAGACATTCAGATGAATATATATTAG",0,"f"],["ATGTTTTTTTAG",33,"f"],["CTGTAA",9,"r"]]
curl "localhost:8000/orf/ATGGGGGTAGACATTCAGATGAATATATATTAGATGTTTTTTTAG?strand=f"
# [["ATGGGGGTAGACATTCAGATGAATATATATTAG",0,"f"],["ATGTTTTTTTAG",33,"f"]]
curl "localhost:8000/orf/ATGGGGGTAGACATGAAATATTAGATGTTTTTTTAG?strand=c"
# {"detail":[{"loc":["query","strand"],"msg":"unexpected value; permitted: 'b', 'f', 'r'","type":"value_error.const","ctx":{"given":"c","permitted":["b","f","r"]}}]}
In the last call to the API you can see the Literal['b', 'f', 'r'] annotation being used to validate the values in the request and to return information about incorrect arguments. Moreover, we can access the API documentation at http://localhost:8000/docs, assuming localhost:8000 is our server address. Now we can simply deploy our app and share a URL to our service with the users (assuming of course we don’t need any access control).
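To get a feeling for how a framework can turn a Literal annotation into a validator, here is a minimal sketch using only the standard library. It is not FastAPI’s actual implementation. The names `find_orf_stub` and `check_literals` are invented for this illustration, and the stub stands in for the real endpoint.

```python
from typing import Literal, get_args, get_origin, get_type_hints


def find_orf_stub(sequence: str, strand: Literal['b', 'f', 'r'] = 'b') -> str:
    # Stand-in for the real endpoint; only the annotations matter here.
    return f"{sequence}:{strand}"


def check_literals(func, **kwargs):
    """Return {argument: permitted values} for each Literal argument with a bad value."""
    hints = get_type_hints(func)
    errors = {}
    for name, value in kwargs.items():
        hint = hints.get(name)
        # Only Literal annotations are checked in this sketch.
        if get_origin(hint) is Literal and value not in get_args(hint):
            errors[name] = get_args(hint)
    return errors


good = check_literals(find_orf_stub, sequence="ATG", strand="f")
bad = check_literals(find_orf_stub, sequence="ATG", strand="c")
print(good)
print(bad)
```

The same idea, reading annotations at runtime and acting on them, is what allows request validation and documentation to be generated from a function signature.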
The end
I hope this post was informative and highlighted the typical pros of using type annotations, along with some not so obvious perks like code compilation and automatic API generation. There’s much more to types in Python and I encourage you to read more about them, starting from the official documentation. The examples of using mypy and FastAPI only scratch the surface too, so take a look at those great libraries as well. Thanks for reading.