Utility functions used in the fastai library
from fastcore.test import *
from nbdev.showdoc import *
from fastcore.nb_imports import *

Basics

ifnone[source]

ifnone(a, b)

b if a is None else a

Since b if a is None else a is such a common pattern, we wrap it in a function. However, be careful: since ifnone is a regular function, Python evaluates both a and b before the call (which it doesn't do when using the if expression directly).

test_eq(ifnone(None,1), 1)
test_eq(ifnone(2   ,1), 2)
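
The caveat above can be demonstrated directly. This is a self-contained sketch that re-defines ifnone inline, with a hypothetical expensive_default helper so the extra evaluation is visible:

```python
def ifnone(a, b):
    "`b` if `a` is None else `a`"
    return b if a is None else a

calls = []
def expensive_default():
    "Hypothetical fallback whose invocations we record"
    calls.append(1)
    return 42

# Both arguments are evaluated before `ifnone` runs, so the fallback
# is computed even though `a` is not None:
assert ifnone(7, expensive_default()) == 7
assert len(calls) == 1

# The inline conditional short-circuits, skipping the fallback:
calls.clear()
a = 7
assert (expensive_default() if a is None else a) == 7
assert calls == []
```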

maybe_attr[source]

maybe_attr(o, attr)

getattr(o,attr,o)

Return the attribute attr for object o. If the attribute doesn't exist, then return the object o instead.

class myobj: myattr='foo'

test_eq(maybe_attr(myobj, 'myattr'), 'foo')
test_eq(maybe_attr(myobj, 'another_attr'), myobj)

basic_repr[source]

basic_repr(flds=None)

Look up a user-supplied list of attributes (flds) of an object and generate a string with the name of each attribute and its corresponding value. The format of this string is key=value, where key is the name of the attribute, and value is the value of the attribute. For each value, attempt to use the __name__ attribute, otherwise fall back to the value's __repr__ when constructing the string.

class SomeClass:
    a=1
    b='foo'
    __repr__=basic_repr('a,b')
    __name__='some-class'
    
class AnotherClass:
    c=SomeClass()
    d='bar'
    __repr__=basic_repr(['c', 'd'])
    
sc = SomeClass()    
ac = AnotherClass()

test_eq(repr(sc), 'SomeClass(a=1, b=foo)')
test_eq(repr(ac), 'AnotherClass(c=some-class, d=bar)')

get_class[source]

get_class(nm, *fld_names, sup=None, doc=None, funcs=None, **flds)

Dynamically create a class, optionally inheriting from sup, containing fld_names

_t = get_class('_t', 'a', b=2)
t = _t()
test_eq(t.a, None)
test_eq(t.b, 2)
t = _t(1, b=3)
test_eq(t.a, 1)
test_eq(t.b, 3)
t = _t(1, 3)
test_eq(t.a, 1)
test_eq(t.b, 3)
test_eq(repr(t), '_t(a=1, b=3)')
test_eq(t, pickle.loads(pickle.dumps(t)))

Most often you'll want to call mk_class, since it adds the class to your module. See mk_class for more details and examples of use (which also apply to get_class).

mk_class[source]

mk_class(nm, *fld_names, sup=None, doc=None, funcs=None, mod=None, **flds)

Create a class using get_class and add to the caller's module

Any kwargs will be added as class attributes, and sup is an optional (tuple of) base classes.

mk_class('_t', a=1, sup=GetAttr)
t = _t()
test_eq(t.a, 1)
assert(isinstance(t,GetAttr))

An __init__ is provided that sets attrs for any kwargs, and for any args (matching by position to fields), along with a __repr__ which prints all attrs. The docstring is set to doc. You can pass funcs which will be added as attrs with the function names.

def foo(self): return 1
mk_class('_t', 'a', sup=GetAttr, doc='test doc', funcs=foo)

t = _t(3, b=2)
test_eq(t.a, 3)
test_eq(t.b, 2)
test_eq(t.foo(), 1)
test_eq(t.__doc__, 'test doc')
t
<__main__._t at 0x7efe2eb5ef90>

wrap_class[source]

wrap_class(nm, *fld_names, sup=None, doc=None, funcs=None, **flds)

Decorator: makes function a method of a new class nm passing parameters to mk_class

@wrap_class('_t', a=2)
def bar(self,x): return x+1

t = _t()
test_eq(t.a, 2)
test_eq(t.bar(3), 4)

class ignore_exceptions[source]

ignore_exceptions()

Context manager to ignore exceptions

with ignore_exceptions(): 
    # Exception will be ignored
    raise Exception

NoOp

These are used when you need a pass-through function.

noop[source]

noop(x=None, *args, **kwargs)

Do nothing

noop()
test_eq(noop(1),1)
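
One common use of a pass-through function is as a default callback. The process function below is a hypothetical illustration, not part of the library:

```python
def noop(x=None, *args, **kwargs):
    "Do nothing (return the first argument unchanged)"
    return x

def process(items, tfm=noop):
    "Hypothetical pipeline step: `tfm` defaults to the identity"
    return [tfm(o) for o in items]

# With no transform supplied, items pass through unchanged:
assert process([1, 2, 3]) == [1, 2, 3]
# Callers only supply a transform when they need one:
assert process([1, 2, 3], tfm=lambda o: o * 10) == [10, 20, 30]
```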

noops[source]

noops(x=None, *args, **kwargs)

Do nothing (method)

mk_class('_t', foo=noops)
test_eq(_t().foo(1),1)

Attribute Helpers

These functions reduce boilerplate when setting or manipulating attributes or properties of objects.

dict2obj[source]

dict2obj(d)

Convert (possibly nested) dicts (or lists of dicts) to SimpleNamespace

This is a convenience to give you "dotted" access to (possibly nested) dictionaries, e.g.:

d1 = dict(a=1, b=dict(c=2,d=3))
d2 = dict2obj(d1)
test_eq(d2.b.c, 2)
d2
namespace(a=1, b=namespace(c=2, d=3))

It can also be used on lists of dicts.

ds = L(d1, d1)
test_eq(dict2obj(ds)[0].b.c, 2)

store_attr[source]

store_attr(names=None, but=None, **attrs)

Store params named in comma-separated names from calling context into attrs in self

In its most basic form, you can use store_attr to shorten code like this:

class T:
    def __init__(self, a,b,c): self.a,self.b,self.c = a,b,c

...to this:

class T:
    def __init__(self, a,b,c): store_attr('a,b,c', self)

This class behaves as if we'd used the first form:

t = T(1,c=2,b=3)
assert t.a==1 and t.b==3 and t.c==2

In addition, it stores the attrs as a dict in __stored_args__, which you can use for display, logging, and so forth.

test_eq(t.__stored_args__, {'a':1, 'b':3, 'c':2})

Since you normally want to use the first argument (often called self) for storing attributes, it's optional:

class T:
    def __init__(self, a,b,c): store_attr('a,b,c')

t = T(1,c=2,b=3)
assert t.a==1 and t.b==3 and t.c==2

You can inherit from a class using store_attr, and just call it again to add in any new attributes added in the derived class:

class T2(T):
    def __init__(self, d, **kwargs):
        super().__init__(**kwargs)
        store_attr('d')

t = T2(d=1,a=2,b=3,c=4)
assert t.a==2 and t.b==3 and t.c==4 and t.d==1

You can skip passing a list of attrs to store. In this case, all arguments passed to the method are stored:

class T:
    def __init__(self, a,b,c): store_attr()

t = T(1,c=2,b=3)
assert t.a==1 and t.b==3 and t.c==2
class T4(T):
    def __init__(self, d, **kwargs):
        super().__init__(**kwargs)
        store_attr()

t = T4(4, a=1,c=2,b=3)
assert t.a==1 and t.b==3 and t.c==2 and t.d==4

You can skip some attrs by passing but:

class T:
    def __init__(self, a,b,c): store_attr(but=['a'])

t = T(1,c=2,b=3)
assert t.b==3 and t.c==2
assert not hasattr(t,'a')

You can also pass keywords to store_attr, which is identical to setting the attrs directly, but also stores them in __stored_args__.

class T:
    def __init__(self): store_attr(a=1)

t = T()
assert t.a==1

You can also use store_attr inside functions.

def create_T(a, b):
    t = SimpleNamespace()
    store_attr(self=t)
    return t

t = create_T(a=1, b=2)
assert t.a==1 and t.b==2

attrdict[source]

attrdict(o, *ks)

Dict from each k in ks to getattr(o,k)

class T:
    def __init__(self, a,b,c): store_attr()

t = T(1,c=2,b=3)
test_eq(attrdict(t,'b','c'), {'b':3, 'c':2})

properties[source]

properties(*ps)

Change attrs in cls with names in ps to properties

class T:
    def a(self): return 1
    def b(self): return 2
properties(T,'a')

test_eq(T().a,1)
test_eq(T().b(),2)

camel2snake[source]

camel2snake(name)

Convert CamelCase to snake_case

test_eq(camel2snake('ClassAreCamel'), 'class_are_camel')
test_eq(camel2snake('Already_Snake'), 'already__snake')

snake2camel[source]

snake2camel(s)

Convert snake_case to CamelCase

test_eq(snake2camel('a_b_cc'), 'ABCc')

class2attr[source]

class2attr(cls_name)

Return the snake-cased name of the class. Additionally, remove the substring cls_name, but only if it appears at the end of the name.

class Parent:
    @property
    def name(self): return class2attr(self, 'Parent')

class ChildOfParent(Parent): pass
class ParentChildOf(Parent): pass

p = Parent()
cp = ChildOfParent()
cp2 = ParentChildOf()

test_eq(p.name, 'parent')
test_eq(cp.name, 'child_of')
test_eq(cp2.name, 'parent_child_of')

hasattrs[source]

hasattrs(o, attrs)

Test whether o contains all attrs

assert hasattrs(1,('imag','real'))
assert not hasattrs(1,('imag','foo'))

Extensible Types

ShowPrint is a base class that defines a show method, which is used primarily for callbacks in fastai that expect this method to be defined.

Int, Float, and Str extend int, float and str respectively by adding an additional show method by inheriting from ShowPrint.

The code for Int is shown below:
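
A minimal sketch of the pattern (assuming ShowPrint.show simply prints self, which matches the examples below):

```python
class ShowPrint:
    "Base class that defines `show`, which prints `self`"
    def show(self, *args, **kwargs): print(self)

class Int(int, ShowPrint):
    "An extensible `int` with a `show` method"
    pass

# `Int` behaves like an ordinary `int`, but also has `show`:
assert Int(2) + 3 == 5
Int(0).show()   # prints 0
```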

Examples:

Int(0).show()
Float(2.0).show()
Str('Hello').show()
0
2.0
Hello

Collection functions

Functions that manipulate popular python collections.

tuplify[source]

tuplify(o, use_list=False, match=None)

Make o a tuple

test_eq(tuplify(None),())
test_eq(tuplify([1,2,3]),(1,2,3))
test_eq(tuplify(1,match=[1,2,3]),(1,1,1))

detuplify[source]

detuplify(x)

If x is a tuple with one thing, extract it

test_eq(detuplify(()),None)
test_eq(detuplify([1]),1)
test_eq(detuplify([1,2]), [1,2])
test_eq(detuplify(np.array([[1,2]])), np.array([[1,2]]))

replicate[source]

replicate(item, match)

Create tuple of item copied len(match) times

t = [1,1]
test_eq(replicate([1,2], t),([1,2],[1,2]))
test_eq(replicate(1, t),(1,1))

uniqueify[source]

uniqueify(x, sort=False, bidir=False, start=None)

Return the unique elements in x, optionally sort-ed, optionally return the reverse correspondence, optionally prepended with a list or tuple of elements.

test_eq(set(uniqueify([1,1,0,5,0,3])),{0,1,3,5})
test_eq(uniqueify([1,1,0,5,0,3], sort=True),[0,1,3,5])
test_eq(uniqueify([1,1,0,5,0,3], start=[7,8,6]), [7,8,6,1,0,5,3])
v,o = uniqueify([1,1,0,5,0,3], bidir=True)
test_eq(v,[1,0,5,3])
test_eq(o,{1:0, 0: 1, 5: 2, 3: 3})
v,o = uniqueify([1,1,0,5,0,3], sort=True, bidir=True)
test_eq(v,[0,1,3,5])
test_eq(o,{0:0, 1: 1, 3: 2, 5: 3})

setify[source]

setify(o)

Turn any list-like object into a set.

test_eq(setify(None),set())
test_eq(setify('abc'),{'abc'})
test_eq(setify([1,2,2]),{1,2})
test_eq(setify(range(0,3)),{0,1,2})
test_eq(setify({1,2}),{1,2})

merge[source]

merge(*ds)

Merge all dictionaries in ds

test_eq(merge(), {})
test_eq(merge(dict(a=1,b=2)), dict(a=1,b=2))
test_eq(merge(dict(a=1,b=2), dict(b=3,c=4), None), dict(a=1, b=3, c=4))

is_listy[source]

is_listy(x)

isinstance(x, (tuple,list,L,slice,Generator))

assert is_listy((1,))
assert is_listy([1])
assert is_listy(L([1]))
assert is_listy(slice(2))
assert not is_listy(array([1]))

range_of[source]

range_of(x)

All indices of collection x (i.e. list(range(len(x))))

test_eq(range_of([1,1,1,1]), [0,1,2,3])

groupby[source]

groupby(x, key)

Like itertools.groupby, but the input doesn't need to be sorted, and the result isn't lazy

test_eq(groupby('aa ab bb'.split(), itemgetter(0)), {'a':['aa','ab'], 'b':['bb']})

last_index[source]

last_index(x, o)

Find the last index at which x occurs in o (returns -1 if there is no occurrence)

test_eq(last_index(9, [1, 2, 9, 3, 4, 9, 10]), 5)
test_eq(last_index(6, [1, 2, 9, 3, 4, 9, 10]), -1)

shufflish[source]

shufflish(x, pct=0.04)

Randomly relocate items of x up to pct of len(x) from their starting location

l = list(range(100))
l2 = array(shufflish(l))
test_close(l2[:50 ].mean(), 25, eps=5)
test_close(l2[-50:].mean(), 75, eps=5)
test_ne(l,l2)

Reindexing Collections

class ReindexCollection[source]

ReindexCollection(coll, idxs=None, cache=None, tfm=noop) :: GetAttr

Reindexes collection coll with indices idxs and optional LRU cache of size cache

This is useful when constructing batches or organizing data in a particular manner (e.g. for deep learning). This class is primarily used in organizing data for language models in fastai.

Reindexing

You can supply a custom index upon instantiation with the idxs argument, or you can call the reindex method to supply a new index for your collection.

Here is how you can reindex a list such that the elements are reversed:

rc=ReindexCollection(['a', 'b', 'c', 'd', 'e'], idxs=[4,3,2,1,0])
list(rc)
['e', 'd', 'c', 'b', 'a']

Alternatively, you can use the reindex method:

ReindexCollection.reindex[source]

ReindexCollection.reindex(idxs)

Replace self.idxs with idxs

rc=ReindexCollection(['a', 'b', 'c', 'd', 'e'])
rc.reindex([4,3,2,1,0])
list(rc)
['e', 'd', 'c', 'b', 'a']

LRU Cache

You can optionally specify a LRU cache, which uses functools.lru_cache upon instantiation:

sz = 50
t = ReindexCollection(L.range(sz), cache=2)

#trigger a cache hit by indexing into the same element multiple times
t[0], t[0]
t._get.cache_info()
CacheInfo(hits=1, misses=1, maxsize=2, currsize=1)

You can optionally clear the LRU cache by calling the cache_clear method:

ReindexCollection.cache_clear[source]

ReindexCollection.cache_clear()

Clear LRU cache

sz = 50
t = ReindexCollection(L.range(sz), cache=2)

#trigger a cache hit by indexing into the same element multiple times
t[0], t[0]
t.cache_clear()
t._get.cache_info()
CacheInfo(hits=0, misses=0, maxsize=2, currsize=0)
ReindexCollection.shuffle[source]

ReindexCollection.shuffle()

Randomly shuffle indices

Note that an ordered index is automatically constructed for the data structure even if one is not supplied.

rc=ReindexCollection(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
rc.shuffle()
list(rc)
['a', 'e', 'b', 'h', 'c', 'd', 'g', 'f']

Tests

sz = 50
t = ReindexCollection(L.range(sz), cache=2)
test_eq(list(t), range(sz))
test_eq(t[sz-1], sz-1)
test_eq(t._get.cache_info().hits, 1)
t.shuffle()
test_eq(t._get.cache_info().hits, 1)
test_ne(list(t), range(sz))
test_eq(set(t), set(range(sz)))
t.cache_clear()
test_eq(t._get.cache_info().hits, 0)
test_eq(t.count(0), 1)

fastuple

A tuple with extended functionality.

class fastuple[source]

fastuple(x=None, *rest) :: tuple

A tuple with elementwise ops and more friendly init behavior

Friendly init behavior

Common failure modes when trying to initialize a tuple in Python:

tuple(3)
> TypeError:'int' object is not iterable

or

tuple(3, 4)
> TypeError:tuple expected at most 1 arguments, got 2

However, fastuple allows you to define tuples like this and in the usual way:

test_eq(fastuple(3), (3,))
test_eq(fastuple(3,4), (3, 4))
test_eq(fastuple((3,4)), (3, 4))

Elementwise operations

fastuple.add[source]

fastuple.add(*args)

+ is already defined in tuple for concat, so use add instead

test_eq(fastuple.add((1,1),(2,2)), (3,3))
test_eq_type(fastuple(1,1).add(2), fastuple(3,3))
test_eq(fastuple('1','2').add('2'), fastuple('12','22'))
fastuple.mul[source]

fastuple.mul(*args)

* is already defined in tuple for replicating, so use mul instead

test_eq_type(fastuple(1,1).mul(2), fastuple(2,2))

Other Elementwise Operations

Additionally, the following elementwise operations are available:

  • le: less than
  • eq: equal
  • gt: greater than
  • min: minimum of
test_eq(fastuple(3,1).le(1), (False, True))
test_eq(fastuple(3,1).eq(1), (False, True))
test_eq(fastuple(3,1).gt(1), (True, False))
test_eq(fastuple(3,1).min(2), (2,1))

You can also do other elementwise operations, like negating a fastuple or subtracting two fastuples:

test_eq(-fastuple(1,2), (-1,-2))
test_eq(~fastuple(1,0,1), (False,True,False))

test_eq(fastuple(1,1)-fastuple(2,2), (-1,-1))

Other Tests

test_eq(type(fastuple(1)), fastuple)
test_eq_type(fastuple(1,2), fastuple(1,2))
test_ne(fastuple(1,2), fastuple(1,3))
test_eq(fastuple(), ())

Infinite Lists

These lists are useful for things like padding an array or adding index column(s) to arrays.

class Inf[source]

Inf()

Infinite lists

Inf defines the following properties:

  • count: itertools.count()
  • zeros: itertools.cycle([0])
  • ones : itertools.cycle([1])
  • nones: itertools.cycle([None])
test_eq([o for i,o in zip(range(5), Inf.count)],
        [0, 1, 2, 3, 4])

test_eq([o for i,o in zip(range(5), Inf.zeros)],
        [0]*5)

test_eq([o for i,o in zip(range(5), Inf.ones)],
        [1]*5)

test_eq([o for i,o in zip(range(5), Inf.nones)],
        [None]*5)

Operator Functions

in_[source]

in_(a, b=nan)

Same as operator.in_, or returns partial if 1 arg

assert in_('c', ('b', 'c', 'a'))
assert in_(4, [2,3,4,5])
assert in_('t', 'fastai')
test_fail(in_('h', 'fastai'))

# use in_ as a partial
assert in_('fastai')('t')
assert in_([2,3,4,5])(4)
test_fail(in_('fastai')('h'))

In addition to in_, the following functions are provided matching the behavior of the equivalent versions in operator: lt gt le ge eq ne add sub mul truediv is_ is_not.

lt(3,5),gt(3,5),is_(None,None),in_(0,[1,2])
(True, False, True, False)

Similarly to in_, they also have additional functionality: if you only pass one param, they return a partial function that passes that param as the second positional parameter.

lt(5)(3),gt(5)(3),is_(None)(None),in_([1,2])(0)
(True, False, True, False)

true[source]

true(*args, **kwargs)

Predicate: always True

assert true(1,2,3)
assert true(False)
assert true(None)
assert true([])

stop[source]

stop(e=StopIteration)

Raises exception e (by default StopIteration) even if in an expression

def tst():
    try: 
        stop() 
    except StopIteration: 
        return True
    
def tst2():
    try: 
        stop(e=ValueError) 
    except ValueError: 
        return True


assert tst()
assert tst2()

gen[source]

gen(func, seq, cond=true)

Like (func(o) for o in seq if cond(func(o))) but handles StopIteration

test_eq(gen(noop, Inf.count, lt(5)),
        range(5))
test_eq(gen(operator.neg, Inf.count, gt(-5)),
        [0,-1,-2,-3,-4])
test_eq(gen(lambda o:o if o<5 else stop(), Inf.count),
        range(5))

chunked[source]

chunked(it, chunk_sz=None, drop_last=False, n_chunks=None)

Return batches from iterator it of size chunk_sz (or return n_chunks total)

Note that you must pass either chunk_sz, or n_chunks, but not both.

t = L.range(10)
test_eq(chunked(t,3),      [[0,1,2], [3,4,5], [6,7,8], [9]])
test_eq(chunked(t,3,True), [[0,1,2], [3,4,5], [6,7,8],    ])

t = map(lambda o:stop() if o==6 else o, Inf.count)
test_eq(chunked(t,3), [[0, 1, 2], [3, 4, 5]])
t = map(lambda o:stop() if o==7 else o, Inf.count)
test_eq(chunked(t,3), [[0, 1, 2], [3, 4, 5], [6]])

t = np.arange(10)
test_eq(chunked(t,3),      L([0,1,2], [3,4,5], [6,7,8], [9]))
test_eq(chunked(t,3,True), L([0,1,2], [3,4,5], [6,7,8],    ))
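
The tests above all pass chunk_sz; the n_chunks alternative can be sketched with a minimal list-based reimplementation (the real chunked also consumes lazy iterators incrementally):

```python
import math

def chunked(it, chunk_sz=None, drop_last=False, n_chunks=None):
    "Minimal sketch: split `it` into chunks of `chunk_sz`, or into `n_chunks` chunks"
    assert bool(chunk_sz) ^ bool(n_chunks), "pass exactly one of chunk_sz/n_chunks"
    it = list(it)
    # With `n_chunks`, derive the chunk size from the total length:
    if n_chunks: chunk_sz = math.ceil(len(it) / n_chunks)
    chunks = [it[i:i + chunk_sz] for i in range(0, len(it), chunk_sz)]
    if drop_last and chunks and len(chunks[-1]) < chunk_sz: chunks.pop()
    return chunks

assert chunked(range(10), chunk_sz=3) == [[0,1,2], [3,4,5], [6,7,8], [9]]
assert chunked(range(10), n_chunks=3) == [[0,1,2,3], [4,5,6,7], [8,9]]
```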

Functions on Functions

Utilities for functional programming or for defining, modifying, or debugging functions.

trace[source]

trace(f)

Add set_trace to an existing function f

You can facilitate debugging with the Python debugger (pdb) by annotating a function with the @trace decorator:

@trace
def myfunc(x):return x +1

Now, when the function is called it will drop you into the debugger. Note, you must issue the s command when you begin to step into the function that is being traced:

> myfunc(3)
> <ipython-input-257-13ccd7456a8e>(6)_inner()
      3     "Add `set_trace` to an existing function `f`"
      4     def _inner(*args,**kwargs):
      5         set_trace()
----> 6         return f(*args,**kwargs)
      7     return _inner

ipdb> s
--Call--
> <ipython-input-260-b3d1b2a13afb>(1)myfunc()
----> 1 @trace
      2 def myfunc(x):return x +1

compose[source]

compose(*funcs, order=None)

Create a function that composes all functions in funcs, passing along remaining *args and **kwargs to all

f1 = lambda o,p=0: (o*2)+p
f2 = lambda o,p=1: (o+1)/p
test_eq(f2(f1(3)), compose(f1,f2)(3))
test_eq(f2(f1(3,p=3),p=3), compose(f1,f2)(3,p=3))
test_eq(f2(f1(3,  3),  3), compose(f1,f2)(3,  3))

f1.order = 1
test_eq(f1(f2(3)), compose(f1,f2, order="order")(3))

maps[source]

maps(*args, retain=noop)

Like map, except funcs are composed first

test_eq(maps([1]), [1])
test_eq(maps(operator.neg, [1,2]), [-1,-2])
test_eq(maps(operator.neg, operator.neg, [1,2]), [1,2])

partialler[source]

partialler(f, *args, order=None, **kwargs)

Like functools.partial but also copies over docstring

def _f(x,a=1):
    "test func"
    return x+a
_f.order=1

f = partialler(_f, a=2)
test_eq(f.order, 1)
f = partialler(_f, a=2, order=3)
test_eq(f.__doc__, "test func")
test_eq(f.order, 3)
test_eq(f(3), _f(3,2))

mapped[source]

mapped(f, it)

map f over it if it's listy, otherwise return f(it)

test_eq(mapped(_f,1),2)
test_eq(mapped(_f,[1,2]),[2,3])
test_eq(mapped(_f,(1,)),(2,))

instantiate[source]

instantiate(t)

Instantiate t if it's a type, otherwise do nothing

test_eq_type(instantiate(int), 0)
test_eq_type(instantiate(1), 1)

using_attr[source]

using_attr(f, attr)

Change function f to operate on attr

t = Path('/a/b.txt')
f = using_attr(str.upper, 'name')
test_eq(f(t), 'B.TXT')

Self (with an uppercase S)

A Concise Way To Create Lambdas

This is a concise way to create lambdas that call methods on an object (note the capitalization!).

Self.sum(), for instance, is a shortcut for lambda o: o.sum().

f = Self.sum()
x = array([3.,1])
test_eq(f(x), 4.)

# This is equivalent to above
f = lambda o: o.sum()
x = array([3.,1])
test_eq(f(x), 4.)

f = Self.argmin()
arr = np.array([1,2,3,4,5])
test_eq(f(arr), arr.argmin())

f = Self.sum().is_integer()
x = array([3.,1])
test_eq(f(x), True)

f = Self.sum().real.is_integer()
x = array([3.,1])
test_eq(f(x), True)

f = Self.imag()
test_eq(f(3), 0)

f = Self[1]
test_eq(f(x), 1)

Extensions to pathlib.Path

An extension of the standard Python library pathlib.Path. These extensions are accomplished by monkey patching additional methods onto pathlib.Path.
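
To illustrate the monkey-patching approach, here's how one might attach an extra method to pathlib.Path by hand (fastcore itself uses its @patch decorator; stem_upper is a made-up example, not part of the library):

```python
from pathlib import Path

def stem_upper(self):
    "Return the file stem in upper case"
    return self.stem.upper()

# Assigning to the class makes the method available on every Path instance:
Path.stem_upper = stem_upper

assert Path('a/b.txt').stem_upper() == 'B'
```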

Path.readlines[source]

Path.readlines(hint=-1, encoding='utf8')

Read the content of fname

Path.read[source]

Path.read(size=-1, encoding='utf8')

Read the content of fname

Path.write[source]

Path.write(txt, encoding='utf8')

Write txt to self, creating directories as needed

with tempfile.NamedTemporaryFile() as f:
    fn = Path(f.name)
    fn.write('t')
    t = fn.read()
    test_eq(t,'t')
    t = fn.readlines()
    test_eq(t,['t'])

Path.save[source]

Path.save(fn:Path, o)

Save a pickle file, to a file name or opened file

Path.load[source]

Path.load(fn:Path)

Load a pickle file from a file name or opened file

with tempfile.NamedTemporaryFile() as f:
    fn = Path(f.name)
    fn.save('t')
    t = fn.load()
test_eq(t,'t')

Path.ls[source]

Path.ls(n_max=None, file_type=None, file_exts=None)

Contents of path as a list

We add an ls() method to pathlib.Path which is simply defined as list(Path.iterdir()), mainly for convenience in REPL environments such as notebooks.

path = Path()
t = path.ls()
assert len(t)>0
t1 = path.ls(10)
test_eq(len(t1), 10)
t2 = path.ls(file_exts='.ipynb')
assert len(t)>len(t2)
t[0]
Path('.ipynb_checkpoints')

You can also pass an optional file_type MIME prefix and/or a list of file extensions.

lib_path = (path/'../fastcore')
txt_files=lib_path.ls(file_type='text')
assert len(txt_files) > 0 and txt_files[0].suffix=='.py'
ipy_files=path.ls(file_exts=['.ipynb'])
assert len(ipy_files) > 0 and ipy_files[0].suffix=='.ipynb'
txt_files[0],ipy_files[0]
(Path('../fastcore/logargs.py'), Path('04_transform.ipynb'))

Path.__repr__[source]

Path.__repr__()

Return repr(self).

fastcore also updates the repr of Path such that, if Path.BASE_PATH is defined, all paths are printed relative to that path (as long as they are contained in Path.BASE_PATH):

t = ipy_files[0].absolute()
try:
    Path.BASE_PATH = t.parent.parent
    test_eq(repr(t), f"Path('nbs/{t.name}')")
finally: Path.BASE_PATH = None

remove_patches_path[source]

remove_patches_path()

A context manager for disabling Path extensions.

You can temporarily disable the various Path extensions by using remove_patches_path as a context manager as illustrated below:

with remove_patches_path():
    assert not hasattr(Path, 'write')
assert hasattr(Path, 'write')

File Functions

Utilities (other than extensions to Pathlib.Path) for dealing with IO.

bunzip[source]

bunzip(fn)

bunzip fn, raising exception if output already exists

f = Path('files/test.txt')
if f.exists(): f.unlink()
bunzip('files/test.txt.bz2')
t = f.open().readlines()
test_eq(len(t),1)
test_eq(t[0], 'test\n')
f.unlink()

join_path_file[source]

join_path_file(file, path, ext='')

Return path/file if file is a string or a Path, file otherwise

path = Path.cwd()/'_tmp'/'tst'
f = join_path_file('tst.txt', path)
assert path.exists()
test_eq(f, path/'tst.txt')
with open(f, 'w') as f_: assert join_path_file(f_, path) == f_
shutil.rmtree(Path.cwd()/'_tmp')

urlread[source]

urlread(url)

Retrieve url

urljson[source]

urljson(url)

Retrieve url and decode json

run_proc[source]

run_proc(*args)

Pass args to subprocess.run, returning stdout, or raise IOError on failure
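
The described behavior can be sketched directly on top of subprocess.run (assuming a POSIX environment where echo and false are available):

```python
import subprocess

def run_proc(*args):
    "Minimal sketch: run a command, return its stdout, raise IOError on failure"
    res = subprocess.run(args, capture_output=True)
    if res.returncode != 0: raise IOError(res.stderr.decode())
    return res.stdout

assert run_proc('echo', 'hello').strip() == b'hello'

try:
    run_proc('false')   # exits non-zero, so an IOError is raised
    assert False, "should have raised"
except IOError:
    pass
```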

do_request[source]

do_request(url, post=False, headers=None, **data)

Call GET or json-encoded POST on url, depending on post

Sorting Objects From Before/After

Transforms and callbacks may have run_after/run_before attributes; this function sorts them to respect those requirements (when possible). Sometimes we also want a transform/callback to run at the end, while still being able to use the run_after/run_before behaviors; for those, the function checks for a toward_end attribute (which needs to be True).

sort_by_run[source]

sort_by_run(fs)

class Tst(): pass    
class Tst1():
    run_before=[Tst]
class Tst2():
    run_before=Tst
    run_after=Tst1
    
tsts = [Tst(), Tst1(), Tst2()]
test_eq(sort_by_run(tsts), [tsts[1], tsts[2], tsts[0]])

Tst2.run_before,Tst2.run_after = Tst1,Tst
test_fail(lambda: sort_by_run([Tst(), Tst1(), Tst2()]))

def tst1(x): return x
tst1.run_before = Tst
test_eq(sort_by_run([tsts[0], tst1]), [tst1, tsts[0]])
    
class Tst1():
    toward_end=True
class Tst2():
    toward_end=True
    run_before=Tst1
tsts = [Tst(), Tst1(), Tst2()]
test_eq(sort_by_run(tsts), [tsts[0], tsts[2], tsts[1]])

Other Helpers

class PrettyString[source]

PrettyString() :: str

Little hack to get strings to show properly in Jupyter.

Allow strings with special characters to render properly in Jupyter. Without calling print(), strings with special characters are displayed like so:

with_special_chars='a string\nwith\nnew\nlines and\ttabs'
with_special_chars
'a string\nwith\nnew\nlines and\ttabs'

We can correct this with PrettyString:

PrettyString(with_special_chars)
a string
with
new
lines and	tabs

round_multiple[source]

round_multiple(x, mult, round_down=False)

Round x to nearest multiple of mult

test_eq(round_multiple(63,32), 64)
test_eq(round_multiple(50,32), 64)
test_eq(round_multiple(40,32), 32)
test_eq(round_multiple( 0,32),  0)
test_eq(round_multiple(63,32, round_down=True), 32)
test_eq(round_multiple((63,40),32), (64,32))

even_mults[source]

even_mults(start, stop, n)

Build log-stepped array from start to stop in n steps.

test_eq(even_mults(2,8,3), [2,4,8])
test_eq(even_mults(2,32,5), [2,4,8,16,32])
test_eq(even_mults(2,8,1), 8)

num_cpus[source]

num_cpus()

Get number of cpus

num_cpus()
64

add_props[source]

add_props(f, g=None, n=2)

Create properties passing each of range(n) to f

class _T(): a,b = add_props(lambda i,x:i*2)

t = _T()
test_eq(t.a,0)
test_eq(t.b,2)
class _T(): 
    def __init__(self, v): self.v=v
    def _set(i, self, v): self.v[i] = v
    a,b = add_props(lambda i,x: x.v[i], _set)

t = _T([0,2])
test_eq(t.a,0)
test_eq(t.b,2)
t.a = t.a+1
t.b = 3
test_eq(t.a,1)
test_eq(t.b,3)

class ContextManagers[source]

ContextManagers(mgrs) :: GetAttr

Wrapper for contextlib.ExitStack which enters a collection of context managers
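
A minimal sketch of the idea on top of contextlib.ExitStack (the real class also inherits from GetAttr):

```python
from contextlib import ExitStack, contextmanager

class ContextManagers:
    "Minimal sketch: enter a collection of context managers together"
    def __init__(self, mgrs): self.mgrs, self.stack = list(mgrs), ExitStack()
    def __enter__(self):
        for mgr in self.mgrs: self.stack.enter_context(mgr)
        return self
    def __exit__(self, *args): self.stack.close()

events = []
@contextmanager
def tracked(name):
    "Record enter/exit order so we can observe the stack behavior"
    events.append(f'enter {name}')
    yield
    events.append(f'exit {name}')

with ContextManagers([tracked('a'), tracked('b')]): pass
# Managers are entered in order and exited in reverse, like nested `with`s:
assert events == ['enter a', 'enter b', 'exit b', 'exit a']
```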

Multiprocessing

set_num_threads[source]

set_num_threads(nt)

Get numpy (and others) to use nt threads

This sets the number of threads consistently for many tools, by:

  1. Setting the following environment variables equal to nt: OPENBLAS_NUM_THREADS, NUMEXPR_NUM_THREADS, OMP_NUM_THREADS, MKL_NUM_THREADS
  2. Setting nt threads for numpy and PyTorch.
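
Step 1 can be sketched as follows (a simplified stand-in; the real function also configures numpy and torch directly, as noted above):

```python
import os

def set_num_threads(nt):
    "Minimal sketch: point the common BLAS/OpenMP env vars at `nt` threads"
    for var in ('OPENBLAS_NUM_THREADS', 'NUMEXPR_NUM_THREADS',
                'OMP_NUM_THREADS', 'MKL_NUM_THREADS'):
        os.environ[var] = str(nt)

set_num_threads(1)
assert os.environ['OMP_NUM_THREADS'] == '1'
```

Note that these environment variables generally only take effect if they are set before the relevant numerical libraries are imported.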

class ProcessPoolExecutor[source]

ProcessPoolExecutor(max_workers=2, on_exc=print, pause=0, **kwargs) :: ProcessPoolExecutor

Same as Python's ProcessPoolExecutor, except you can pass max_workers=0 for serial execution

parallel[source]

parallel(f, items, *args, n_workers=2, total=None, progress=None, pause=0, timeout=None, chunksize=1, **kwargs)

Applies func in parallel to items, using n_workers

def add_one(x, a=1): 
    time.sleep(random.random()/80)
    return x+a

inp,exp = range(50),range(1,51)
test_eq(parallel(add_one, inp, n_workers=2), exp)
test_eq(parallel(add_one, inp, n_workers=0), exp)
test_eq(parallel(add_one, inp, n_workers=1, a=2), range(2,52))
test_eq(parallel(add_one, inp, n_workers=0, a=2), range(2,52))

Use the pause parameter to ensure a pause of pause seconds between processes starting. This is in case there are race conditions in starting some process, or to stagger the time each process starts, for example when making many requests to a webserver.

from datetime import datetime
def print_time(i): 
    time.sleep(random.random()/1000)
    print(i, datetime.now())

parallel(print_time, range(5), n_workers=2, pause=0.25);
0 2020-09-14 13:38:06.317966
1 2020-09-14 13:38:06.568367
2 2020-09-14 13:38:06.818557
3 2020-09-14 13:38:07.069433
4 2020-09-14 13:38:07.319457

Note that f should accept a collection of items.

run_procs[source]

run_procs(f, f_done, args)

Call f for each item in args in parallel, yielding f_done

parallel_gen[source]

parallel_gen(items, n_workers=2, **kwargs)

Instantiate cls in n_workers procs & call each on a subset of items in parallel.

class _C:
    def __call__(self, o): return ((i+1) for i in o)

items = range(5)

res = L(parallel_gen(_C, items, n_workers=3))
idxs,dat1 = zip(*res.sorted(itemgetter(0)))
test_eq(dat1, range(1,6))

res = L(parallel_gen(_C, items, n_workers=0))
idxs,dat2 = zip(*res.sorted(itemgetter(0)))
test_eq(dat2, dat1)

cls is any class with __call__. It will be passed args and kwargs when initialized. Note that n_workers instances of cls are created, one in each process. items are then split in n_workers batches and one is sent to each cls. The function then returns a generator of tuples of item indices and results.

class TestSleepyBatchFunc:
    "For testing parallel processes that run at different speeds"
    def __init__(self): self.a=1
    def __call__(self, batch):
        for k in batch:
            time.sleep(random.random()/4)
            yield k+self.a

x = np.linspace(0,0.99,20)
res = L(parallel_gen(TestSleepyBatchFunc, x, n_workers=2))
test_eq(res.sorted().itemgot(1), x+1)

Notebook Functions

ipython_shell[source]

ipython_shell()

Same as get_ipython but returns False if not in IPython

in_ipython[source]

in_ipython()

Check if code is running in some kind of IPython environment

in_colab[source]

in_colab()

Check if the code is running in Google Colaboratory

in_jupyter[source]

in_jupyter()

Check if the code is running in a jupyter notebook

in_notebook[source]

in_notebook()

Check if the code is running in a jupyter notebook

These variables are available as booleans in fastcore.utils as IN_IPYTHON, IN_JUPYTER, IN_COLAB and IN_NOTEBOOK.

IN_IPYTHON, IN_JUPYTER, IN_COLAB, IN_NOTEBOOK
(True, True, False, True)