random
datetime
urllib
glob
re
logging
timeit
unittest
Python has a “batteries included” philosophy, meaning the standard library ships with a number of robust modules. Today we're going to look at a few of them.
Being a good Python programmer isn't just about being able to write any program in Python. It's also about knowing which things to write and which things are already written, much like being a good mathematician.
math/cmath
sys/os/pathlib
xml/csv/json
itertools/functools/contextlib
random
The random module contains classes and functions for generating random data of various types and distributions.
import random
random.random() # Random float from [0.0, 1.0)
0.9461393931961889
random.randint(1, 10) # Random integer from [1, 10]
8
random.uniform(3, 5) # Uniform float from [3, 5]
4.93753632999579
# Randomly selected item from the given non-empty sequence (list, tuple, str, ...)
random.choice(['a', 'b', 'c'])
'c'
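When results need to be reproducible (for tests or simulations), a seeded generator helps. The following is a minimal sketch, assuming a hypothetical seed of 42 and illustrative variable names; it shows that two random.Random instances with the same seed produce the same draws, and also uses shuffle and sample.

```python
import random

# Two independent generators seeded identically (42 is an arbitrary choice).
rng_a = random.Random(42)
rng_b = random.Random(42)

# The same seed yields the same sequence of draws.
draws_a = [rng_a.randint(1, 10) for _ in range(5)]
draws_b = [rng_b.randint(1, 10) for _ in range(5)]
print(draws_a == draws_b)

# shuffle() reorders a list in place; sample() picks unique items.
deck = list(range(10))
rng_a.shuffle(deck)
hand = rng_a.sample(deck, 3)
print(deck, hand)
```

Using a dedicated random.Random instance keeps the seeded stream separate from the module-level functions, so other code calling random.random() does not disturb it.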
datetime
The datetime module defines datetime, date, time, and timedelta types for handling date arithmetic and date comparison. Note all of these types are immutable.
import datetime
today = datetime.date.today()
tomorrow = today + datetime.timedelta(days=1)
if today < tomorrow:
    print(today.isoformat())
2018-04-15
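A slightly larger sketch of date arithmetic and immutability follows. The dates and variable names are made up for illustration. Subtracting two dates gives a timedelta, and replace() returns a new object instead of modifying the original.

```python
import datetime

# Hypothetical dates chosen for illustration.
start = datetime.date(2018, 4, 15)
deadline = datetime.date(2018, 5, 1)

# Subtracting two dates yields a timedelta.
remaining = deadline - start
print(remaining.days)  # 16

# Adding a timedelta yields a new date.
next_week = start + datetime.timedelta(weeks=1)
print(next_week.isoformat())  # 2018-04-22

# Dates are immutable: replace() returns a new object.
first_of_month = start.replace(day=1)
print(start, first_of_month)
```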
These objects have a strftime method which allows you to represent the date/time in various formats. All allowable formatting directives can be found here: http://docs.python.org/library/datetime.html#strftime-strptime-behavior
import datetime
today = datetime.date.today()
now = datetime.datetime.now()
today.strftime('%a %B %d, %Y')
'Sun April 15, 2018'
now.strftime('%m-%d-%y %I:%M %p')
'04-15-18 10:47 AM'
The datetime.datetime.strptime class method takes a string representing a date and another string representing the format, and returns a datetime object if the string matches the format. If it does not match, a ValueError is raised.
import datetime
a = datetime.datetime.strptime('02-26-02','%m-%d-%y')
type(a)
datetime.datetime
a.date()
datetime.date(2002, 2, 26)
# Raises a ValueError if the string doesn't match the format
a = datetime.datetime.strptime('02-26-02','%d/%m/%Y')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-10-3fab116d6009> in <module>()
      1 # Raises a ValueError if the string doesn't match the format
----> 2 a = datetime.datetime.strptime('02-26-02','%d/%m/%Y')

/usr/lib/python3.6/_strptime.py in _strptime_datetime(cls, data_string, format)
    563     """Return a class cls instance based on the input string and the
    564     format string."""
--> 565     tt, fraction = _strptime(data_string, format)
    566     tzname, gmtoff = tt[-2:]
    567     args = tt[:6] + (fraction,)

/usr/lib/python3.6/_strptime.py in _strptime(data_string, format)
    360     if not found:
    361         raise ValueError("time data %r does not match format %r" %
--> 362                          (data_string, format))
    363     if len(data_string) != found.end():
    364         raise ValueError("unconverted data remains: %s" %

ValueError: time data '02-26-02' does not match format '%d/%m/%Y'
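Since a mismatch raises ValueError, one common pattern is to try several candidate formats in turn. The sketch below assumes a hypothetical helper named parse_date and an illustrative FORMATS tuple; neither comes from the original notebook.

```python
import datetime

# Hypothetical list of formats to try, in order.
FORMATS = ('%m-%d-%y', '%d/%m/%Y', '%Y-%m-%d')

def parse_date(text, formats=FORMATS):
    """Return a date for the first format that matches, or None."""
    for fmt in formats:
        try:
            return datetime.datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # this format did not match; try the next one
    return None

print(parse_date('02-26-02'))
print(parse_date('26/02/2002'))
print(parse_date('not a date'))
```

Returning None keeps the example short; real code might prefer to re-raise with a message listing the formats that were tried.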
urllib
urllib is a standard package for accessing data over the internet. There are functions for retrieving data as well as building URLs and encoding query parameters.
# %load '../code/github.py'
import json
import urllib.error
import urllib.parse
import urllib.request

search_url = 'https://api.github.com/search/repositories'
params = urllib.parse.urlencode({
    'q': 'language:python',
    'sort': 'stars',
    'order': 'desc',
    'per_page': 3,
})
url = '%s?%s' % (search_url, params)

try:
    response = urllib.request.urlopen(url, timeout=10)
except urllib.error.HTTPError as e:
    # HTTPError is a subclass of URLError, so it must be caught first
    print('HTTPError getting GitHub data: %s' % e)
    print(e.headers)
except urllib.error.URLError as e:
    print('URLError getting GitHub data: %s' % e)
else:
    content = response.read()
    data = json.loads(content)
    for repository in data['items']:
        print('{full_name}: \u2605 {stargazers_count}'.format(**repository))
vinta/awesome-python: ★ 48457
rg3/youtube-dl: ★ 35917
toddmotto/public-apis: ★ 35557
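The URL-building half of urllib can be explored without touching the network. This sketch reuses the search URL from the example above; the query values are illustrative assumptions. It shows urlencode escaping special characters and parse_qs reversing the process.

```python
from urllib.parse import parse_qs, urlencode, urlparse

base = 'https://api.github.com/search/repositories'

# urlencode escapes characters such as ':', '>' and spaces.
query = urlencode({'q': 'language:python stars:>1000', 'per_page': 3})
url = '%s?%s' % (base, query)
print(url)

# urlparse splits the URL; parse_qs decodes the query string.
parts = urlparse(url)
decoded = parse_qs(parts.query)
print(parts.netloc, decoded)
```

Note that parse_qs returns every value as a list of strings, because a key may legally appear more than once in a query string.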
glob
The glob module is used to find filenames matching a given pattern. You can use * and ? wildcard characters as well as [] character ranges. The matching rules follow those of the Unix shell.
import glob
glob.glob('*.ipynb')
['MA792-002-Python-2.ipynb', 'MA792-002-Python-1.ipynb', 'MA792-002-Python-6.ipynb', 'MA792-002-Python-5.ipynb', 'MA792-002-Python-3.ipynb', 'MA792-002-Python-4.ipynb']
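To see the three kinds of wildcard side by side, the sketch below creates a few empty files in a temporary directory and matches them. The file names and the names helper are hypothetical. Results are sorted explicitly because glob.glob makes no promise about ordering.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create some empty, made-up files to match against.
    for name in ['notes-1.txt', 'notes-2.txt', 'notes-10.txt', 'data.csv']:
        open(os.path.join(tmp, name), 'w').close()

    def names(pattern):
        """Return sorted base names matching pattern inside tmp."""
        # glob.escape protects any special characters in the directory path.
        full_pattern = os.path.join(glob.escape(tmp), pattern)
        return sorted(os.path.basename(p) for p in glob.glob(full_pattern))

    star = names('*.txt')                # any run of characters
    question = names('notes-?.txt')      # exactly one character
    bracket = names('notes-[2-9].txt')   # one character from a range

print(star)
print(question)
print(bracket)
```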
re
Python's support for regular expressions is contained in the re module. Regular expressions (or regex) are used for matching character patterns in strings. While they are very useful and powerful, they can also get quite complicated.
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
—Jamie Zawinski
import re
simple_phone = re.compile(r'[1-9]\d{2}(-|\.|\s)\d{4}')
sample_text = '''
Selling all of my old textbooks. $20 or best offer.
Call 555-1234 or email books@example.com if interested.
'''
match = simple_phone.search(sample_text)
match.group(0)
'555-1234'
match.start(), match.end()
(59, 67)
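Beyond a single search, compiled patterns can find every match and rewrite text. The sketch below is a variation on the phone pattern above with named groups added; the sample text and group names are assumptions made for illustration.

```python
import re

# Same idea as simple_phone, with named groups for the two parts.
phone = re.compile(r'(?P<prefix>[1-9]\d{2})[-.\s](?P<line>\d{4})')

text = 'Call 555-1234 or 555.9876; do not call 055-0000.'

# finditer yields one match object per non-overlapping match.
numbers = [m.group(0) for m in phone.finditer(text)]
print(numbers)

# Named groups can be read by name.
first = phone.search(text)
print(first.group('prefix'), first.group('line'))

# sub replaces every match with the given string.
masked = phone.sub('XXX-XXXX', text)
print(masked)
```

The number 055-0000 is left alone because the pattern requires the first digit to be between 1 and 9.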
logging
print calls are great, but sometimes you need something a little more robust. The logging module supports logging from programs to console output, file output (including file rotation), socket output, email output via SMTP, and HTTP output.
You can configure multiple loggers for a program each with different formats, outputs, and logging levels.
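As a rough illustration of per-handler levels, here is a minimal sketch that configures a logger in code rather than from a file. The logger name demo.app is hypothetical, and two in-memory streams stand in for the console and a log file so the example leaves nothing on disk.

```python
import io
import logging

# Hypothetical logger name; any dotted name works.
logger = logging.getLogger('demo.app')
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep demo records away from the root logger

formatter = logging.Formatter('%(levelname)s - %(message)s')

# In-memory streams stand in for the console and a log file.
console_stream = io.StringIO()
file_stream = io.StringIO()

console_handler = logging.StreamHandler(console_stream)
console_handler.setLevel(logging.DEBUG)   # sees everything
console_handler.setFormatter(formatter)

file_handler = logging.StreamHandler(file_stream)
file_handler.setLevel(logging.INFO)       # drops DEBUG records
file_handler.setFormatter(formatter)

logger.addHandler(console_handler)
logger.addHandler(file_handler)

logger.debug('debug message')
logger.info('info message')

console_lines = console_stream.getvalue().splitlines()
file_lines = file_stream.getvalue().splitlines()
print(console_lines)
print(file_lines)
```

The same record reaches both handlers, and each handler applies its own level, which is why the DEBUG message appears in only one of the two outputs.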
!cat ../code/log.conf
[loggers]
keys=root

[handlers]
keys=consoleHandler,fileHandler

[formatters]
keys=simpleFormatter

[logger_root]
level=DEBUG
handlers=consoleHandler,fileHandler
propagate=0

[handler_consoleHandler]
class=StreamHandler
level=DEBUG
formatter=simpleFormatter
args=(sys.stdout,)

[handler_fileHandler]
class=FileHandler
level=INFO
formatter=simpleFormatter
args=('example.log',)

[formatter_simpleFormatter]
format="%(asctime)s - %(levelname)s - %(message)s"
import logging
import logging.config
logging.config.fileConfig("../code/log.conf")
logger = logging.getLogger() # root by default
logger.debug("debug message")
logger.info("info message")
logger.warning("warning message")
logger.error("error message")
logger.critical("critical message")
"2018-04-15 10:48:26,724 - DEBUG - debug message"
"2018-04-15 10:48:26,726 - INFO - info message"
"2018-04-15 10:48:26,728 - WARNING - warning message"
"2018-04-15 10:48:26,729 - ERROR - error message"
"2018-04-15 10:48:26,731 - CRITICAL - critical message"
timeit/cProfile
timeit is a module for timing small pieces of Python code. cProfile is a more robust module for profiling; it is typically combined with pstats to sort and format the profiler's output statistics. profile is a pure-Python module with the same API as cProfile, but it introduces additional overhead compared to cProfile, which is written as a C extension.
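A minimal timeit sketch follows. The two string-building functions are hypothetical stand-ins for whatever small piece of code you want to compare, and the repetition count of 200 is arbitrary.

```python
import timeit

def join_with_loop(n=1000):
    """Build a string by repeated concatenation."""
    result = ''
    for i in range(n):
        result += str(i)
    return result

def join_with_str_join(n=1000):
    """Build the same string with str.join."""
    return ''.join(str(i) for i in range(n))

# timeit.timeit accepts a callable and returns total seconds for `number` runs.
loop_time = timeit.timeit(join_with_loop, number=200)
join_time = timeit.timeit(join_with_str_join, number=200)

print('loop: %.4fs  join: %.4fs' % (loop_time, join_time))
```

The absolute numbers depend on the machine, so only the comparison between the two is meaningful.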
Note there is a bug/license issue which excludes pstats from the default Python install on Ubuntu. You'll need to install the python-profiler package. See https://bugs.launchpad.net/ubuntu/+source/python-defaults/+bug/123755
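Where pstats is available, pairing it with cProfile might look like the sketch below. The function slow_sum is a hypothetical workload, and the report is written to an in-memory stream so it can be inspected as a string.

```python
import cProfile
import io
import pstats

def slow_sum(n):
    """Hypothetical workload: sum of squares below n."""
    return sum(i * i for i in range(n))

# Collect profiling data only around the call of interest.
profiler = cProfile.Profile()
profiler.enable()
slow_sum(100000)
profiler.disable()

# pstats sorts and formats the collected statistics.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats(5)  # show at most 5 entries

report = stream.getvalue()
print(report)
```

Sorting by cumulative time tends to surface the top-level functions where time is spent, which is usually the first question when profiling.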
timeit Example (timeme.py)
profile Example (profileme.py)
unittest
The unittest module is a testing framework based on Kent Beck's Smalltalk testing framework. This same style of testing framework, called xUnit, has been written in almost every language.
Sample program with some known flaws
Test suite to expose those flaws
Be sure to check the documentation on the Python website at http://docs.python.org/library/
Common third-party Python libraries