lib.util Module

Utility functions that are not generally used to solve Project Euler problems directly but provide low-level support for this Python project.

lib.util – A collection of utility functions for this Python project

lib.util.load_dataset(root, dataset, separator='n', data_type=<class 'str'>)

Load an existing dataset from a file

The existing datasets are grouped by root:

  • "general" - problem agnostic datasets (e.g. famous numeric sequences)
  • "problems" - problem specific datasets (i.e. provided by Project Euler)

The dataset is the name of a file that is expected to be found at data/root/dataset with a .txt extension. The contents of the file are split on separator. Finally, the data_type parameter can be used to typecast each element of the dataset.

Parameters:
  • root (str) – the root, or type, of dataset to load from
  • dataset (str) – the specific dataset (without a .txt extension) to load
  • separator (str) – the string to split elements on
  • data_type (Union[Type[str], Type[int], Type[float]]) – the underlying data-type of the elements in this dataset
Return type:

List[~DT]

Returns:

a list of elements from the specified dataset with typecasting

Raises:
  • TypeError – if root, dataset, separator or data_type are not str variables
  • FileNotFoundError – if the requested file (root, dataset) doesn’t exist
  • ValueError – if an element of the dataset cannot be typecast to data_type

Note

If separator is "" (i.e. the empty string), no splitting occurs.

Note

DT is the type that was specified for data_type. That is, this function returns a list of elements, the type of which is set by data_type.

Warning

There is a bug in the generation of this documentation. The default value of separator is the newline character (i.e. "\n"), not the literal "n" as reported above.
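The behaviour described above can be sketched roughly as follows. This is not the project's implementation: load_dataset_sketch is a hypothetical stand-in, the base parameter is an assumption added so the example can run against a temporary directory, and the primes dataset is made up for illustration.

```python
import tempfile
from pathlib import Path


def load_dataset_sketch(root, dataset, separator="\n", data_type=str, base="data"):
    """Hypothetical stand-in for lib.util.load_dataset (not the real code)."""
    # Only the three string parameters are checked here; how the real
    # function validates data_type is not shown in this documentation.
    for name, value in (("root", root), ("dataset", dataset), ("separator", separator)):
        if not isinstance(value, str):
            raise TypeError(f"{name} must be a str")
    path = Path(base) / root / f"{dataset}.txt"
    text = path.read_text()  # raises FileNotFoundError if the file is missing
    # An empty separator means "do not split"
    parts = [text] if separator == "" else text.split(separator)
    # int(...) and float(...) raise ValueError for elements they cannot parse
    return [data_type(part) for part in parts]


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "general"
    folder.mkdir()
    (folder / "primes.txt").write_text("2\n3\n5\n7")  # made-up dataset
    primes = load_dataset_sketch("general", "primes", data_type=int, base=tmp)
    whole = load_dataset_sketch("general", "primes", separator="", base=tmp)
```

Here primes holds the four values as int objects, while whole is a single-element list containing the unsplit file contents.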

lib.util.memoize(func, maxsize=128, typed=False)

Convenience function to memoize an existing function

This is a simple wrapper around Python’s functools.lru_cache which provides memoization to arbitrary functions. The purpose of memoization is to cache the result of evaluating func on the provided arguments so that repeated calls do not re-evaluate the function unnecessarily. It is a useful building block in dynamic programming algorithms.
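As an illustration of the dynamic programming use case, the sketch below applies functools.lru_cache directly to a naive recursive Fibonacci function. It does not use lib.util.memoize itself, and the fib function is a hypothetical example rather than part of this project.

```python
import functools


@functools.lru_cache(maxsize=None)
def fib(n):
    """Naive recursive Fibonacci; the cache removes the repeated work."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


result = fib(30)         # each value of n from 0 to 30 is evaluated once
info = fib.cache_info()  # hit/miss statistics kept by lru_cache
```

Without the cache, the same recursion would re-evaluate small arguments an exponential number of times.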

Note

It only makes sense to apply memoization to deterministic functions.

Python’s functools.lru_cache employs the least recently used, or LRU, paradigm. There are two optional arguments.

First, the maxsize argument sets the size of the cache. Powers of two are optimal values. A value of None means there is no limit and all input values will result in a cache entry.

Warning

Having no restriction on the cache size means memory use can grow without bound. Caution is advised.

Second, the typed argument controls whether argument types form part of the cache key. If typed is True, then even if two arguments compare equal, arguments of different types are treated as separate inputs and thus have separate cache records. For example, if typed is False (the default), then func(3) and func(3.0) would usually share a cache value, whereas if typed is True they would have individual cache values.
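The effect of typed can be seen with functools.lru_cache directly. The add function below is a hypothetical example. A two-argument function is used on purpose, because the Python documentation notes that some types, such as str and int, may be cached separately even when typed is false.

```python
import functools


def add(a, b):
    return a + b


loose = functools.lru_cache(maxsize=128, typed=False)(add)
strict = functools.lru_cache(maxsize=128, typed=True)(add)

loose(3, 1)
loose_result = loose(3.0, 1)    # (3.0, 1) equals (3, 1), so the entry is reused

strict(3, 1)
strict_result = strict(3.0, 1)  # cached separately because int and float differ

loose_info = loose.cache_info()
strict_info = strict.cache_info()
```

With typed left at False, the second call is answered from the cache and so returns the int computed by the first call. With typed set to True, both calls are evaluated and the second returns a float.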

Parameters:
  • func (Callable) – the function to memoize
  • maxsize (int) – the size of the LRU cache
  • typed (bool) – whether to strictly enforce type-checking or not
Return type:

Callable

Returns:

a memoized function, functionally equivalent to func
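A minimal sketch of how such a wrapper might look and be used, assuming it simply forwards its arguments to functools.lru_cache as described above. The memoize_sketch and slow_square names are hypothetical and stand in for the real function and for caller code.

```python
import functools


def memoize_sketch(func, maxsize=128, typed=False):
    """Hypothetical stand-in for lib.util.memoize, assumed to wrap lru_cache."""
    return functools.lru_cache(maxsize=maxsize, typed=typed)(func)


calls = []


def slow_square(x):
    calls.append(x)  # record each real evaluation
    return x * x


fast_square = memoize_sketch(slow_square)
first = fast_square(4)
second = fast_square(4)  # served from the cache; slow_square is not re-run
info = fast_square.cache_info()
```

The calls list ends up with a single entry, which shows that the second call never reached slow_square.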

lib.util.wrap(para, m, n)

Wrap a paragraph to a width of \(n\) characters, indented by \(m\) spaces

Parameters:
  • para (str) – the paragraph to wrap on a given width
  • m (int) – the number of spaces of indentation
  • n (int) – the maximum line width to work to
Return type:

str

Returns:

the paragraph wrapped as specified

Raises:
  • TypeError – if para is not a str variable
  • TypeError – if \(m,n\) are not int variables
  • ValueError – if \(m \lt 0\)
  • ValueError – if \(n \le 0\)
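One way to get the documented behaviour is with the standard textwrap module, as sketched below. This is not the project's implementation: wrap_sketch is a hypothetical stand-in, and it assumes that the width \(n\) includes the \(m\) spaces of indentation, which the documentation above does not specify.

```python
import textwrap


def wrap_sketch(para, m, n):
    """Hypothetical stand-in for lib.util.wrap (not the project's code)."""
    if not isinstance(para, str):
        raise TypeError("para must be a str")
    if not isinstance(m, int) or not isinstance(n, int):
        raise TypeError("m and n must be int")
    if m < 0:
        raise ValueError("m must be non-negative")
    if n <= 0:
        raise ValueError("n must be positive")
    indent = " " * m
    # Assumption: the width n counts the indentation as part of each line
    return textwrap.fill(para, width=n, initial_indent=indent, subsequent_indent=indent)


text = "the quick brown fox jumps over the lazy dog"
wrapped = wrap_sketch(text, 2, 20)
lines = wrapped.split("\n")
```

Every entry in lines starts with two spaces and is at most 20 characters long.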