Python Catch Up

  • Tracebacks
  • Organizing Python Projects
  • Generators

Tracebacks

When things go wrong in Python you will get a traceback: a wall of text trying to tell you what happened.

The general approach is to read them from bottom to top. The bottom will point to the exact error and each line above it will show where that was called from. In a larger project that will likely include both code that you have written as well as standard library and third-party code.

Example Traceback

In [1]:
# %load ../code/fail.py
import requests

response = requests.get('example.com')
---------------------------------------------------------------------------
MissingSchema                             Traceback (most recent call last)
<ipython-input-1-921a5b7f82e7> in <module>()
      2 import requests
      3 
----> 4 response = requests.get('example.com')

~/.virtualenvs/lecture/lib/python3.6/site-packages/requests/api.py in get(url, params, **kwargs)
     70 
     71     kwargs.setdefault('allow_redirects', True)
---> 72     return request('get', url, params=params, **kwargs)
     73 
     74 

~/.virtualenvs/lecture/lib/python3.6/site-packages/requests/api.py in request(method, url, **kwargs)
     56     # cases, and look like a memory leak in others.
     57     with sessions.Session() as session:
---> 58         return session.request(method=method, url=url, **kwargs)
     59 
     60 

~/.virtualenvs/lecture/lib/python3.6/site-packages/requests/sessions.py in request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    492             hooks=hooks,
    493         )
--> 494         prep = self.prepare_request(req)
    495 
    496         proxies = proxies or {}

~/.virtualenvs/lecture/lib/python3.6/site-packages/requests/sessions.py in prepare_request(self, request)
    435             auth=merge_setting(auth, self.auth),
    436             cookies=merged_cookies,
--> 437             hooks=merge_hooks(request.hooks, self.hooks),
    438         )
    439         return p

~/.virtualenvs/lecture/lib/python3.6/site-packages/requests/models.py in prepare(self, method, url, headers, files, data, params, auth, cookies, hooks, json)
    303 
    304         self.prepare_method(method)
--> 305         self.prepare_url(url, params)
    306         self.prepare_headers(headers)
    307         self.prepare_cookies(cookies)

~/.virtualenvs/lecture/lib/python3.6/site-packages/requests/models.py in prepare_url(self, url, params)
    377             error = error.format(to_native_string(url, 'utf8'))
    378 
--> 379             raise MissingSchema(error)
    380 
    381         if not host:

MissingSchema: Invalid URL 'example.com': No schema supplied. Perhaps you meant http://example.com?
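
The bottom-to-top reading can be tried on a much smaller traceback. This is a minimal sketch using only the standard library; the function names (parse_port, load_config, main) are made up for illustration and have nothing to do with requests.

```python
import traceback


def parse_port(text):
    # Innermost frame: this is where the error is actually raised.
    return int(text)


def load_config(raw):
    # Middle frame: passes the bad value along to parse_port.
    return {"port": parse_port(raw)}


def main():
    # Outermost of our functions: it appears nearest the top of the traceback.
    return load_config("not-a-number")


try:
    main()
except ValueError as exc:
    # extract_tb lists the frames in the order they are printed (top to bottom).
    call_chain = [frame.name for frame in traceback.extract_tb(exc.__traceback__)]
    # format_exception_only gives the final line: the exception type and message.
    last_line = traceback.format_exception_only(type(exc), exc)[-1].strip()

print(call_chain[-3:])  # ['main', 'load_config', 'parse_port']
print(last_line)
```

Reading from the bottom, the last line names the error, parse_port is where it was raised, and each earlier entry is the caller of the one after it.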

Organizing Python Projects

Our examples have all been small and for the most part unrelated. From that it might be unclear how you should organize a cohesive Python project. Python itself isn't very prescriptive, so I'm going to give you some general advice, but you should also do what feels natural to you.

Single File Project

There are more than a few useful Python projects which can do all they want to accomplish in a single file. You'll still want to create a directory for that file and pair it with complementary files, which we'll talk more about.

Example Python Project Structure

~/Projects/hilbert/
                   hilbert.py (Main file)
                   tests.py (Tests)
                   setup.cfg (Flake8 configuration)

Multi-File Package

There is no hard and fast rule about when you should break out a large Python module into multiple files. That's something that you'll need to judge for yourself.

Example Larger Project Structure

~/Projects/hardy/
                   hardy/ (Main Source)
                         __init__.py
                         model.py
                         view.py
                         controller.py
                   tests/ (Tests organized to mirror sub-modules)
                         test_model.py
                         test_view.py
                         test_controller.py
                   setup.cfg (Flake8 configuration)

Namespace Packages

The __init__.py file is optional when turning a directory into an importable Python package. Leaving it out creates a native namespace package. These can be used to split a large package into multiple installable packages (like a plugin system). In general it's a good idea to include the __init__.py unless you intend to use this feature.

If this sounds interesting, you can read more about them here: https://www.python.org/dev/peps/pep-0420/
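
To see the difference for yourself, here is a rough sketch that builds two throwaway directories in a temporary folder and imports both. The names plain_dir_demo and regular_pkg_demo are invented for this example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    # plain_dir_demo has no __init__.py; regular_pkg_demo has one.
    # Both names are hypothetical and exist only for this demo.
    namespace_dir = Path(root) / "plain_dir_demo"
    regular_dir = Path(root) / "regular_pkg_demo"
    for directory in (namespace_dir, regular_dir):
        directory.mkdir()
        (directory / "tools.py").write_text("ANSWER = 42\n")
    (regular_dir / "__init__.py").write_text("")

    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        namespace_pkg = importlib.import_module("plain_dir_demo")
        regular_pkg = importlib.import_module("regular_pkg_demo")
        tools = importlib.import_module("plain_dir_demo.tools")

        # A regular package is backed by its __init__.py file ...
        regular_has_file = getattr(regular_pkg, "__file__", None) is not None
        # ... while a namespace package has no file of its own.
        namespace_has_file = getattr(namespace_pkg, "__file__", None) is not None
        answer = tools.ANSWER
    finally:
        sys.path.remove(root)

print(regular_has_file, namespace_has_file, answer)  # True False 42
```

Both directories import fine, which is why a forgotten __init__.py often goes unnoticed.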

Non-Source Files

Python projects might contain non-Python source files. It's ok for them to live inside of a directory which is also a Python package.

If you don't like that, then don't do it. If you do then go for it.

Tests

The Python community is a little split on whether tests should live inside the package or in their own directory. Django applications tend to put them inside for historical reasons but I've done both. Don't let this decision get in the way of writing tests. Do what feels natural to you.

Open Source Examples - PeeWee

https://github.com/coleifer/peewee

Open Source Examples - Beets

https://github.com/beetbox/beets

Generators

Generators provide a way in Python to create functions which return iterable values where the next value is not known or computed until requested. This can be used to create large (or infinite) series of values without holding the entire set in memory at once.

One of Our First Functions

In [2]:
def fibonacci(n):
    if n <= 2:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)
In [3]:
# %load ../code/fibogen.py
def fibonacci():
    current, prev = None, None
    while True:
        if current is None or prev is None:
            yield 1
            current, prev = 1, current
        else:
            current, prev = current + prev, current
            yield current
In [4]:
values = fibonacci()
type(values)
Out[4]:
generator
In [5]:
next(values)
Out[5]:
1
In [6]:
next(values)
Out[6]:
1
In [7]:
next(values)
Out[7]:
2
In [8]:
next(values)
Out[8]:
3
In [9]:
next(values)
Out[9]:
5
In [10]:
for value in values:
    print(value)
    if value > 256:
        break
8
13
21
34
55
89
144
233
377
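
Breaking out of a loop is one way to stop an infinite generator. Another option, sketched here, is itertools.islice from the standard library, which takes a fixed number of items. The generator is repeated so the block runs on its own.

```python
from itertools import islice


def fibonacci():
    # Same generator as fibogen.py, repeated so this block is self-contained.
    current, prev = None, None
    while True:
        if current is None or prev is None:
            yield 1
            current, prev = 1, current
        else:
            current, prev = current + prev, current
            yield current


# islice stops after 10 items, so the infinite loop is never a problem.
first_ten = list(islice(fibonacci(), 10))
print(first_ten)  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```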

Non-Infinite Generator

In [12]:
def finite():
    yield 1
    yield 2
    yield 3
In [13]:
result = finite()
next(result)
Out[13]:
1
In [14]:
next(result)
Out[14]:
2
In [15]:
next(result)
Out[15]:
3
In [16]:
next(result)
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-16-709396a5599b> in <module>()
----> 1 next(result)

StopIteration: 
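
In practice you rarely see StopIteration yourself. A small sketch of the two usual ways to avoid it: let a for loop handle the end of the generator, or give next a default value to return instead of raising.

```python
def finite():
    yield 1
    yield 2
    yield 3


# A for loop catches StopIteration for you and simply ends.
collected = []
for value in finite():
    collected.append(value)

# next() accepts a default that is returned once the generator is exhausted.
result = finite()
drained = [next(result, None) for _ in range(4)]

print(collected)  # [1, 2, 3]
print(drained)    # [1, 2, 3, None]
```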

Yielding from Another Iterable (nyse.py)
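
The nyse.py file is not reproduced in these notes, so this is only a stand-in sketch of the idea with made-up function names (count_up, chained): yield from hands out every item of another iterable before moving on.

```python
def count_up(limit):
    # A small generator to delegate to.
    for number in range(1, limit + 1):
        yield number


def chained():
    # "yield from" yields each item of another iterable in turn.
    yield from count_up(3)
    yield from ["a", "b"]
    yield "done"


chained_values = list(chained())
print(chained_values)  # [1, 2, 3, 'a', 'b', 'done']
```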

Generator Expressions

Like list comprehensions, you can also create generator expressions. They follow the same syntax as list comprehensions but use () rather than [].

In [17]:
(x ** 2 for x in range(5))
Out[17]:
<generator object <genexpr> at 0x7f5d097061a8>
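
A generator expression is usually handed straight to something that consumes it. This short sketch compares it with the equivalent list comprehension; the exact sizes reported by sys.getsizeof vary between Python versions, so only the comparison is shown.

```python
import sys

squares_list = [x ** 2 for x in range(1000)]
squares_gen = (x ** 2 for x in range(1000))

# The list stores all 1000 results; the generator only stores its own state.
list_is_bigger = sys.getsizeof(squares_list) > sys.getsizeof(squares_gen)

# When a generator expression is the only argument, the extra () can be dropped.
total = sum(x ** 2 for x in range(5))

print(list_is_bigger)  # True
print(total)           # 30
```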

Generator Gotchas

Generators and generator expressions are efficient and interesting but there are a few things to note when using them.

Generators Don't Have Length

Because the values aren't all known until the generator has been run to its StopIteration, you can't get the length of a generator by calling len.

In [18]:
squares = (x ** 2 for x in range(5))
In [19]:
len(squares)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-19-4081c3c4d628> in <module>()
----> 1 len(squares)

TypeError: object of type 'generator' has no len()
In [20]:
list(squares)
Out[20]:
[0, 1, 4, 9, 16]
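
If you only need the count, one approach is to consume the generator and tally as you go. The sketch below also shows the catch: a generator can only be walked once, so nothing is left afterwards.

```python
squares = (x ** 2 for x in range(5))

# Counting consumes the generator one item at a time.
count = sum(1 for _ in squares)

# The generator is now exhausted, so a second pass yields nothing.
leftover = list(squares)

print(count)     # 5
print(leftover)  # []
```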

Generators and Short-Circuiting

Because generators are lazily evaluated and boolean operations short circuit, you can't be sure that every item was yielded/evaluated (for better or worse).

In [21]:
from unittest.mock import Mock

a, b, c = Mock(), Mock(), Mock()

any(x.check() for x in [a, b, c])
Out[21]:
True
In [22]:
a.check.called, b.check.called, c.check.called
Out[22]:
(True, False, False)
In [23]:
a.reset_mock(), b.reset_mock(), c.reset_mock()

any([x.check() for x in [a, b, c]])
Out[23]:
True
In [24]:
a.check.called, b.check.called, c.check.called
Out[24]:
(True, True, True)
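
The same effect shows up with all, which stops at the first falsy value. Here is a sketch without Mock, using a hypothetical check function that records each call in a list.

```python
calls = []


def check(name, result):
    # Record that this check ran, then return its result.
    calls.append(name)
    return result


checks = [("a", True), ("b", False), ("c", True)]

# Generator expression: all() stops at the first falsy result ("b").
all(check(name, result) for name, result in checks)
lazy_calls = list(calls)

calls.clear()

# List comprehension: every check runs before all() sees any result.
all([check(name, result) for name, result in checks])
eager_calls = list(calls)

print(lazy_calls)   # ['a', 'b']
print(eager_calls)  # ['a', 'b', 'c']
```

If the checks have side effects you rely on, build the list first; if they are expensive, the generator version saves work.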

That's It

Thank you!