Navigation
- README
- Pure-Python feature set
- Syntactic macro feature set
- Examples of creating dialects using
mcpyrate
- REPL server
- Troubleshooting
- Design notes
- Essays
- Additional reading
- Contribution guidelines
This is the pure-Python API of unpythonic
. Most features listed here need no macros, and are intended to be used directly.
The exception are the features marked [M], which are primarily intended as a code generation target API for macros. See the documentation for syntactic macros for details. Usually the relevant macro has the same name as the underlying implementation; for example, unpythonic.do
is the implementation, while unpythonic.syntax.do
is the macro. The purpose of the macro layer is to improve ease of use by removing accidental complexity, thus providing a more human-readable source code representation that compiles to calls to the underlying API. If you don't want to depend on mcpyrate
, feel free to use also these APIs as defined below (though, this may be less convenient).
let
,letrec
: local bindings in an expression [M]let
dlet
,blet
: let-over-def, like the classic let-over-lambda.letrec
- Lispylet: alternative syntax [M]
env
: the environmentassignonce
, a relative ofenv
.dyn
: dynamic assignment a.k.a. parameterize, special variables, fluid variables, "dynamic scoping".
frozendict
: an immutable dictionarycons
and friends: pythonic lispy linked listsbox
: a mutable single-item containerbox
Some
: immutable box, to explicitly indicate the presence of a value.ThreadLocalBox
Shim
: redirect attribute accesses- Container utilities:
get_abcs
,in_slice
,index_in_slice
Sequencing, run multiple expressions in any expression position (incl. inside a lambda
).
begin
: sequence side effectsdo
: stuff imperative code into an expression [M]pipe
,piped
,lazy_piped
: sequence functions
Batteries missing from the standard library.
- Batteries for functools:
curry
,compose
,withself
, and more.memoize
: a detailed explanation of the memoizer.curry
: a detailed explanation of the curry utility and its haskelly extra features.fix
: break infinite recursion cycles
- Batteries for itertools: multi-input folds, scans (lazy partial folds); unfold; lazy partial unpacking of iterables, etc.
- Batteries for network programming: message protocol, PTY/socket proxy, etc.
unpythonic.net.msg
: message protocol.
islice
: slice syntax support foritertools.islice
gmemoize
,imemoize
,fimemoize
: memoize generators, iterables and iterator factories.fup
: functional update;ShadowedSequence
: likecollections.ChainMap
, but for sequences.view
: writable, sliceable view into a sequence with scalar broadcast on assignment.mogrify
: update a mutable container in-places
,imathify
,gmathify
: lazy mathematical sequences with infix arithmeticsym
,gensym
,Singleton
: symbols and singletons
trampolined
,jump
: tail call optimization (TCO) / explicit continuationslooped
,looped_over
: loops in FP style (with TCO)- Relation to the TCO system
- FP loop over an iterable: the
looped_over
parametric decorator - Accumulator type and runtime cost
break
continue
- Prepackaged
break
andcontinue
- FP loops using a lambda as body
gtrampolined
: generators with TCO: tail-chaining; likeitertools.chain
, but from inside a generator.catch
,throw
: escape continuations (ec) (as in Lisp'scatch
/throw
, unlike C++ or Java)call_ec
: first-class escape continuations, like Racket'scall/ec
.
forall
: nondeterministic evaluation, a tuple comprehension with multiple body expressions.handlers
,restarts
: conditions and restarts, a.k.a. resumable exceptions.generic
,typed
,isoftype
: multiple dispatch: create generic functions with type annotation syntax; also some friendly utilities.
raisef
,tryf
:raise
andtry
as functions, useful inside a lambda.equip_with_traceback
, equip a manually created exception instance with a traceback.async_raise
: inject an exception to another thread (CPython only)reraise_in
,reraise
: automatically convert exception types
Function call and return value tools
def
as a code block:@call
: run a block of code immediately, in a new lexical scope.@callwith
: freeze arguments, choose function laterValues
: multiple and named return valuesvaluify
: convert pythonic multiple-return-values idiom oftuple
intoValues
.
almosteq
: floating-point almost-equalityfixpoint
: arithmetic fixed-point finderpartition_int
: partition integersulp
: unit in last place
callsite_filename
safeissubclass
, convenience function.pack
: multi-arg constructor for tuplenamelambda
: rename a functiontimer
: a context manager for performance testingformat_human_time
: seconds to days, hours, minutes, secondsETAEstimator
: estimate the time of completion of a long-running taskgetattrrec
,setattrrec
: access underlying data in an onion of wrappersarities
,kwargs
,resolve_bindings
: Function signature inspection utilitiesPopper
: a pop-while iterator
For many examples, see the unit tests, the docstrings of the individual features, and this guide.
This document doubles as the API reference, but despite maintenance on a best-effort basis, may occasionally be out-of-date at places. In case of conflicts in documentation, believe the unit tests first; specifically the code, not necessarily the comments. Everything else (comments, docstrings and this guide) should agree with the unit tests. So if something fails to work as advertised, check what the tests do - and optionally file an issue on GitHub so that the documentation can be fixed.
This document is up-to-date for v0.15.0.
Tools to bind identifiers in ways not ordinarily supported by Python.
NOTE: This is primarily a code generation target API for the let[]
family of macros, which make the constructs easier to use, and make the code look almost like normal Python. Below is the documentation for the raw API.
The let
constructs introduce bindings local to an expression, like Scheme's let
and letrec
.
In let
, the bindings are independent (do not see each other). A binding is of the form name=value
, where name
is a Python identifier, and value
is any expression.
Use a lambda e: ...
to supply the environment to the body:
# These six are the constructs covered in this section of documentation.
from unpythonic import let, letrec, dlet, dletrec, blet, bletrec
u = lambda lst: let(seen=set(),
body=lambda e:
[e.seen.add(x) or x for x in lst if x not in e.seen])
L = [1, 1, 3, 1, 3, 2, 3, 2, 2, 2, 4, 4, 1, 2, 3]
u(L) # --> [1, 3, 2, 4]
Generally speaking, body
is a one-argument function, which takes in the environment instance as the first positional parameter (by convention, named e
or env
). In typical inline usage, body
is lambda e: expr
.
Let over lambda. Here the inner lambda
is the definition of the function counter
:
from unpythonic import let, begin
counter = let(x=0,
body=lambda e:
lambda:
begin(e.set("x", e.x + 1), # can also use e << ("x", e.x + 1)
e.x))
counter() # --> 1
counter() # --> 2
For comparison, with the macro API, this becomes:
from unpythonic.syntax import macros, let, do
counter = let[[x << 0] in
(lambda:
do[x << x + 1,
x])]
counter() # --> 1
counter() # --> 2
(The parentheses around the lambda are just to make the expression into syntactically valid Python. You can also use brackets instead, denoting a multiple-expression let
body - which is also valid even if there is just one expression. The do
makes a multiple-expression lambda
body. For more, see the macro documentation.)
Compare the sweet-exp Racket (see SRFI-110 and sweet):
define counter
let ([x 0]) ; In Racket, the (λ (e) (...)) in "let" is implicit, and needs no explicit "e".
λ () ; Racket's λ has an implicit (begin ...), so we don't need a begin.
set! x {x + 1}
x
counter() ; --> 1
counter() ; --> 2
Let over def decorator @dlet
, to let over lambda more pythonically:
from unpythonic import dlet
@dlet(x=0)
def counter(*, env=None): # named argument "env" filled in by decorator
env.x += 1
return env.x
counter() # --> 1
counter() # --> 2
For comparison, with the macro API, this becomes:
from unpythonic.syntax import macros, dlet
@dlet(x << 0)
def counter():
x << x + 1
return x
counter() # --> 1
counter() # --> 2
The @blet
decorator is otherwise the same as @dlet
, but instead of decorating a function definition in the usual manner, it runs the def
block immediately, and upon exit, replaces the function definition with the return value. The name blet
is an abbreviation of block let, since the role of the def
is just a code block to be run immediately.
The name of this construct comes from the Scheme family of Lisps, and stands for let (mutually) recursive. The "mutually recursive" refers to the kind of scoping between the bindings in the same letrec
.
In plain English, in letrec
, the value of a binding may depend on other bindings in the same letrec
. The raw API in unpythonic
uses a lambda e: ...
to provide the environment:
from unpythonic import letrec
x = letrec(a=1,
b=lambda e:
e.a + 1,
body=lambda e:
e.b) # --> 2
The ordering of the definitions is respected, because Python 3.6 and later preserve the ordering of named arguments passed in a function call. See PEP 468.
For comparison, with the macro API, this becomes:
from unpythonic.syntax import macros, letrec
x = letrec[[a << 1,
b << a + 1] in
b]
In the non-macro letrec
, the value
of each binding is either a simple value (non-callable, and doesn't use the environment), or an expression of the form lambda e: valexpr
, providing access to the environment as e
. If valexpr
itself is callable, the binding must have the lambda e: ...
wrapper to prevent misinterpretation by the machinery when the environment initialization procedure runs.
In a non-callable valexpr
, trying to depend on a binding below it raises AttributeError
.
A callable valexpr
may depend on any bindings (also later ones) in the same letrec
. For example, here is a pair of mutually recursive functions:
from unpythonic import letrec
letrec(evenp=lambda e:
lambda x:
(x == 0) or e.oddp(x - 1),
oddp=lambda e:
lambda x:
(x != 0) and e.evenp(x - 1),
body=lambda e:
e.evenp(42)) # --> True
For comparison, with the macro API, this becomes:
from unpythonic.syntax import macros, letrec
letrec[[evenp << (lambda x:
(x == 0) or oddp(x - 1)),
oddp << (lambda x:
(x != 0) and evenp(x - 1))] in
evenp(42)] # --> True
Order-preserving list uniqifier:
from unpythonic import letrec, begin
u = lambda lst: letrec(seen=set(),
see=lambda e:
lambda x:
begin(e.seen.add(x),
x),
body=lambda e:
[e.see(x) for x in lst if x not in e.seen])
For comparison, with the macro API, this becomes:
from unpythonic.syntax import macros, letrec, do
u = lambda lst: letrec[[seen << set(),
see << (lambda x:
do[seen.add(x),
x])] in
[[see(x) for x in lst if x not in seen]]]
(The double brackets around the letrec
body are needed because brackets denote a multiple-expression letrec
body. So it is a multiple-expression body that contains just one expression, which is a list comprehension.)
The decorators @dletrec
and @bletrec
work otherwise exactly like @dlet
and @blet
, respectively, but the bindings are scoped like in letrec
(mutually recursive scope).
NOTE: This is primarily a code generation target API for the let[]
family of macros, which make the constructs easier to use. Below is the documentation for the raw API.
The lispylet
module was originally created to allow guaranteed left-to-right initialization of letrec
bindings in Pythons older than 3.6, hence the positional syntax and more parentheses. The only difference is the syntax; the behavior is identical with the other implementation. As of 0.15, the main role of lispylet
is to act as the run-time backend for the let
family of macros.
These constructs are available in the top-level unpythonic
namespace, with the ordered_
prefix: ordered_let
, ordered_letrec
, ordered_dlet
, ordered_dletrec
, ordered_blet
, ordered_bletrec
.
It is also possible to override the default let
constructs by the ordered_
variants, like this:
from unpythonic.lispylet import * # override the default "let" implementation
letrec((('a', 1),
('b', lambda e:
e.a + 1)), # may refer to any bindings above it in the same letrec
lambda e:
e.b) # --> 2
letrec((("evenp", lambda e:
lambda x: # callable, needs the lambda e: ...
(x == 0) or e.oddp(x - 1)),
("oddp", lambda e:
lambda x:
(x != 0) and e.evenp(x - 1))),
lambda e:
e.evenp(42)) # --> True
The syntax is let(bindings, body)
(respectively letrec(bindings, body)
), where bindings
is ((name, value), ...)
, and body
is like in the default variants. The same rules concerning name
and value
apply.
For comparison, with the macro API, the above becomes:
from unpythonic.syntax import macros, letrec
letrec[[a << 1,
b << a + 1] in
b]
letrec[[evenp << (lambda x:
(x == 0) or oddp(x - 1)),
oddp << (lambda x:
(x != 0) and evenp(x - 1))] in
evenp(42)] # --> True
(The transformations made by the macros may be the most apparent when comparing these examples. Note that the macros scope the let-bindings lexically, automatically figuring out which let
environment, if any, to refer to.)
Changed in v0.15.2. env
objects are now pickleable.
The environment used by all the let
constructs and assignonce
(but not by dyn
) is essentially a bunch with iteration, subscripting and context manager support. It is somewhat similar to types.SimpleNamespace
, but with many extra features. For details, see unpythonic.env.env
(and note the unfortunate module name).
Our env
allows things like:
let(x=1, y=2, z=3,
body=lambda e:
[(name, 2*e[name]) for name in e]) # --> [('y', 4), ('z', 6), ('x', 2)]
It also works as a bare bunch, and supports printing for debugging:
from unpythonic.env import env
e = env(s="hello", orange="fruit", answer=42)
print(e) # --> <env object at 0x7ff784bb4c88: {orange='fruit', s='hello', answer=42}>
print(e.s) # --> hello
d = {'monty': 'python', 'pi': 3.14159}
e = env(**d)
print(e) # --> <env object at 0x7ff784bb4c18: {pi=3.14159, monty='python'}>
print(e.monty) # --> python
Finally, it supports the context manager:
with env(x=1, y=2, z=3) as e:
print(e) # --> <env object at 0x7fde7411b080: {x=1, z=3, y=2}>
print(e.x) # --> 1
print(e) # empty!
When the with
block exits, the environment clears itself. The environment instance itself remains alive due to Python's scoping rules.
(This allows using with env(...) as e:
as a poor man's let
, if you have a block of statements you want to locally scope some names to, but don't want to introduce a def
.)
env
provides the collections.abc.Mapping
and collections.abc.MutableMapping
APIs.
As of v0.15.0, assignonce
is mostly a standalone curiosity that has never been integrated with the rest of unpythonic
. But anything that works with arbitrary subclasses of env
, for example mogrify
, works with it, too.
In Scheme terms, make define
and set!
look different:
from unpythonic import assignonce
with assignonce() as e:
e.foo = "bar" # new definition, ok
e.set("foo", "tavern") # explicitly rebind e.foo, ok
e << ("foo", "tavern") # same thing (but return e instead of new value, suitable for chaining)
e.foo = "quux" # AttributeError, e.foo already defined.
The assignonce
construct is a subclass of env
, so it shares most of the same features and allows similar usage.
The fact that in Python creating bindings and updating (rebinding) them look the same was already noted in 2000, in PEP 227, which introduced true closures to Python 2.1. For related history concerning the nonlocal
keyword, see PEP 3104.
Changed in v0.14.2. To bring this in line with SRFI-39, dyn
now supports rebinding, using assignment syntax such as dyn.x = 42
, and the function dyn.update(x=42, y=17, ...)
.
(As termed by Felleisen. Other names seen in the wild for variants of this feature include parameters (Scheme and Racket; not to be confused with function parameters), special variables (Common Lisp), fluid variables, fluid let (e.g. Emacs Lisp), and even the misnomer "dynamic scoping".)
The feature itself is dynamic assignment; the things it creates are dynamic variables (a.k.a. dynvars).
Dynvars are like global variables, but better-behaved. Useful for sending some configuration parameters through several layers of function calls without changing their API. Best used sparingly.
There's a singleton, dyn
:
from unpythonic import dyn, make_dynvar
make_dynvar(c=17) # top-level default value
def f(): # no "a" in lexical scope here
assert dyn.a == 2
def g():
with dyn.let(a=2, b="foo", c=42):
assert dyn.a == 2
assert dyn.c == 42
f()
with dyn.let(a=3): # dynamic assignments can be nested
assert dyn.a == 3
# now "a" has reverted to its previous value
assert dyn.a == 2
assert dyn.c == 17 # "c" has reverted to its default value
print(dyn.b) # AttributeError, dyn.b no longer exists
g()
Dynvars are created using with dyn.let(k0=v0, ...)
. The syntax is in line with the nature of the assignment, which is in effect for the dynamic extent of the with
. Exiting the with
block pops the dynamic environment stack. Inner dynamic environments shadow outer ones.
The point of dynamic assignment is that dynvars are seen also by code that is outside the lexical scope where the with dyn.let
resides. The use case is to avoid a function parameter definition cascade, when you need to pass some information through several layers that do not care about it. This is especially useful for passing "background" information, such as plotter settings in scientific visualization, or the macro expander instance in metaprogramming.
To give a dynvar a top-level default value, use make_dynvar(k0=v0, ...)
. Usually this is done at the top-level scope of the module for which that dynvar is meaningful. Each dynvar, of the same name, should only have one default set; the (dynamically) latest definition always overwrites. However, we do not prevent overwrites, because in some codebases the same module may run its top-level initialization code multiple times (e.g. if a module has a main()
for tests, and the file gets loaded both as a module and as the main program).
To rebind existing dynvars, use dyn.k = v
, or dyn.update(k0=v0, ...)
. Rebinding occurs in the closest enclosing dynamic environment that has the target name bound. If the name is not bound in any dynamic environment (including the top-level one), AttributeError
is raised.
CAUTION: Use rebinding of dynvars carefully, if at all. Stealth updates of dynvars defined in an enclosing dynamic extent can destroy any chance of statically reasoning about your code.
There is no set
function or <<
operator, unlike in the other unpythonic
environments.
Each thread has its own dynamic scope stack. There is also a global dynamic scope for default values, shared between threads.
A newly spawned thread automatically copies the then-current state of the dynamic scope stack from the main thread (not the parent thread!). Any copied bindings will remain on the stack for the full dynamic extent of the new thread. Because these bindings are not associated with any with
block running in that thread, and because aside from the initial copying, the dynamic scope stacks are thread-local, any copied bindings will never be popped, even if the main thread pops its own instances of them.
The source of the copy is always the main thread mainly because Python's threading
module gives no tools to detect which thread spawned the current one. (If someone knows a simple solution, a PR is welcome!)
Finally, there is one global dynamic scope shared between all threads, where the default values of dynvars live. The default value is used when dyn
is queried for the value outside the dynamic extent of any with dyn.let()
blocks. Having a default value is convenient for eliminating the need for if "x" in dyn
checks, since the variable will always exist (at any time after the global definition has been executed).
For more details, see the methods of dyn
; particularly noteworthy are asdict
and items
, which give access to a live view to dyn's contents in a dictionary format (intended for reading only!). The asdict
method essentially creates a collections.ChainMap
instance, while items
is an abbreviation for asdict().items()
. The dyn
object itself can also be iterated over; this creates a ChainMap
instance and redirects to iterate over it. dyn
also provides the collections.abc.Mapping
API.
To support dictionary-like idioms in iteration, dynvars can alternatively be accessed by subscripting; dyn["x"]
has the same meaning as dyn.x
, to allow things like:
print(tuple((k, dyn[k]) for k in dyn))
Finally, dyn
supports membership testing as "x" in dyn
, "y" not in dyn
, where the string is the name of the dynvar whose presence is being tested.
For some more details, see the unit tests.
This is essentially SRFI-39: Parameter objects for Python, using the MzScheme approach in the presence of multiple threads.
Racket's parameterize
behaves similarly. However, Racket seems to be the state of the art in many lispy language design related things, so its take on the feature may have some finer points I have not thought of.
On Common Lisp's special variables, see Practical Common Lisp by Peter Seibel, especially footnote 10 in the linked chapter, for a definition of terms. Similarly, dynamic variables in our dyn
have indefinite scope (because dyn
is implemented as a module-level global, accessible from anywhere), but dynamic extent.
So what we have in dyn
is almost exactly like Common Lisp's special variables, except we are missing convenience features such as setf
and a smart let
that auto-detects whether a variable is lexical or dynamic (if the name being bound is already in scope).
We provide some additional low-level containers beyond those provided by Python itself.
The class names are lowercase, because these are intended as low-level utility classes in principle on par with the builtins. The immutable containers are hashable. All containers are pickleable (if their contents are).
Changed in 0.14.2. A bug in frozendict
pickling has been fixed. Now also the empty frozendict
pickles and unpickles correctly.
Given the existence of dict
and frozenset
, this one is oddly missing from the language.
from unpythonic import frozendict
d = frozendict({'a': 1, 'b': 2})
d['a'] # OK
d['c'] = 3 # TypeError, not writable
Functional updates are supported:
d2 = frozendict(d, a=42)
assert d2['a'] == 42 and d2['b'] == 2
assert d['a'] == 1 # original not mutated
d3 = frozendict({'a': 1, 'b': 2}, {'a': 42}) # rightmost definition of each key wins
assert d3['a'] == 42 and d3['b'] == 2
# ...also using unpythonic.fupdate
d4 = fupdate(d3, a=23)
assert d4['a'] == 23 and d4['b'] == 2
assert d3['a'] == 42 and d3['b'] == 2 # ...of course without touching the original
Any mappings used when creating an instance are shallow-copied, so that the bindings of the frozendict
do not change even if the original input is later mutated:
d = {1:2, 3:4}
fd = frozendict(d)
d[5] = 6
assert d == {1: 2, 3: 4, 5: 6}
assert fd == {1: 2, 3: 4}
The usual caution concerning immutable containers in Python applies: the container protects only the bindings against changes. If the values themselves are mutable, the container cannot protect from mutations inside them.
All the usual read-access features work:
d7 = frozendict({1:2, 3:4})
assert 3 in d7
assert len(d7) == 2
assert set(d7.keys()) == {1, 3}
assert set(d7.values()) == {2, 4}
assert set(d7.items()) == {(1, 2), (3, 4)}
assert d7 == frozendict({1:2, 3:4})
assert d7 != frozendict({1:2})
assert d7 == {1:2, 3:4} # like frozenset, __eq__ doesn't care whether mutable or not
assert d7 != {1:2}
assert {k for k in d7} == {1, 3}
assert d7.get(3) == 4
assert d7.get(5, 0) == 0
assert d7.get(5) is None
In terms of collections.abc
, a frozendict
is a hashable immutable mapping:
assert issubclass(frozendict, Mapping)
assert not issubclass(frozendict, MutableMapping)
assert issubclass(frozendict, Hashable)
assert hash(d7) == hash(frozendict({1:2, 3:4}))
assert hash(d7) != hash(frozendict({1:2}))
The abstract superclasses are virtual, just like for dict
. We mean virtual in the sense of abc.ABCMeta
, i.e. a virtual superclass does not appear in the MRO.
Finally, frozendict
obeys the empty-immutable-container singleton invariant:
assert frozendict() is frozendict()
Laugh, it's funny.
Changed in v0.14.2. nil
is now a Singleton
, so it is treated correctly by pickle
. The nil
instance refresh code inside the cons
class has been removed, so the previous caveat about pickling a standalone nil
value no longer applies.
from unpythonic import (cons, nil, ll, llist,
car, cdr, caar, cdar, cadr, cddr,
member, lreverse, lappend, lzip,
BinaryTreeIterator)
c = cons(1, 2)
assert car(c) == 1 and cdr(c) == 2
# ll(...) is like [...] or (...), but for linked lists:
assert ll(1, 2, 3) == cons(1, cons(2, cons(3, nil)))
t = cons(cons(1, 2), cons(3, 4)) # binary tree
assert [f(t) for f in [caar, cdar, cadr, cddr]] == [1, 2, 3, 4]
# default iteration scheme is "single cell or linked list":
a, b = cons(1, 2) # unpacking a cons cell
a, b, c = ll(1, 2, 3) # unpacking a linked list
a, b, c, d = BinaryTreeIterator(t) # unpacking a binary tree: use a non-default iteration scheme
assert list(ll(1, 2, 3)) == [1, 2, 3]
assert tuple(ll(1, 2, 3)) == (1, 2, 3)
assert llist((1, 2, 3)) == ll(1, 2, 3) # llist() is like list() or tuple(), but for linked lists
l = ll(1, 2, 3)
assert member(2, l) == ll(2, 3)
assert not member(5, l)
assert lreverse(ll(1, 2, 3)) == ll(3, 2, 1)
assert lappend(ll(1, 2), ll(3, 4), ll(5, 6)) == ll(1, 2, 3, 4, 5, 6)
assert lzip(ll(1, 2, 3), ll(4, 5, 6)) == ll(ll(1, 4), ll(2, 5), ll(3, 6))
Cons cells are immutable à la Racket (no set-car!
/rplaca
, set-cdr!
/rplacd
). Accessors are provided up to caaaar
, ..., cddddr
.
Although linked lists are created with the functions ll
or llist
, the data type (for e.g. isinstance
) is cons
.
Iterators are supported, to walk over linked lists. This also gives sequence unpacking support. When next()
is called, we return the car
of the current cell the iterator points to, and the iterator moves to point to the cons cell in the cdr
, if any. When the cdr
is not a cons cell, it is the next (and last) item returned; except if it is nil
, then iteration ends without returning the nil
.
Python's builtin reversed
can be applied to linked lists; it will internally lreverse
the list (which is O(n)), then return an iterator to that. The llist
constructor is special-cased so that if the input is reversed(some_ll)
, it just returns the internal already reversed list. (This is safe because cons cells are immutable.)
Cons structures, by default, print in a pythonic format suitable for eval
(if all elements are):
print(cons(1, 2)) # --> cons(1, 2)
print(ll(1, 2, 3)) # --> ll(1, 2, 3)
print(cons(cons(1, 2), cons(3, 4)) # --> cons(cons(1, 2), cons(3, 4))
Cons structures can optionally print like in Lisps:
print(cons(1, 2).lispyrepr()) # --> (1 . 2)
print(ll(1, 2, 3).lispyrepr()) # --> (1 2 3)
print(cons(cons(1, 2), cons(3, 4)).lispyrepr()) # --> ((1 . 2) . (3 . 4))
For more, see the llist
submodule.
There is no copy
method or lcopy
function, because cons cells are immutable; which makes cons structures immutable.
However, for example, it is possible to cons
a new item onto an existing linked list; that is fine, because it produces a new cons structure - which shares data with the original, just like in Racket.
In general, copying cons structures can be error-prone. Given just a starting cell it is impossible to tell if a given instance of a cons structure represents a linked list, or something more general (such as a binary tree) that just happens to locally look like one, along the path that would be traversed if it was indeed a linked list.
The linked list iteration strategy does not recurse in the car
half, which could lead to incomplete copying. The tree strategy that recurses on both halves, on the other hand, will flatten nested linked lists and produce also the final nil
.
We provide a JackOfAllTradesIterator
as a compromise that understands both trees and linked lists. Nested lists will be flattened, and in a tree any nil
in a cdr
position will be omitted from the output. BinaryTreeIterator
and JackOfAllTradesIterator
use an explicit data stack instead of implicitly using the call stack for keeping track of the recursion. All cons
iterators work for arbitrarily deep cons structures without causing Python's call stack to overflow, and without the need for TCO.
cons
has no collections.abc
virtual superclasses (except the implicit Hashable
since cons
provides __hash__
and __eq__
), because general cons structures do not fit into the contracts represented by membership in those classes. For example, size cannot be known without iterating, and depends on which iteration scheme is used (e.g. nil
dropping, flattening); which scheme is appropriate depends on the content.
Changed in v0.14.2. The box
container API is now b.set(newvalue)
to rebind, returning the new value as a convenience. The equivalent syntactic sugar is b << newvalue
. The item inside the box can be extracted with b.get()
. The equivalent syntactic sugar is unbox(b)
.
Added in v0.14.2. ThreadLocalBox
: like box
, but with thread-local contents. It also holds a default object, which is used when a particular thread has not placed any object into the box.
Added in v0.14.2. Some
: like box
, but immutable. Useful to mark an optional attribute. Some(None)
indicates a value that is set to None
, in contrast to a bare None
, which can then be used indicate the absence of a value.
Changed in v0.14.2. Accessing the .x
attribute of a box
directly is now deprecated. It will continue to work with box
at least until 0.15, but it does not and cannot work with ThreadLocalBox
, which must handle things differently due to implementation reasons. Use the API mentioned above; it supports both kinds of boxes with the same syntax.
Consider this highly artificial example:
animal = "dog"
def f(x):
animal = "cat" # but I want to update the existing animal!
f(animal)
assert animal == "dog"
Many solutions exist. Common pythonic ones are abusing a list
to represent a box (and then trying to remember that it is supposed to hold only a single item), or (if the lexical structure of the particular piece of code allows it) using the global
or nonlocal
keywords to tell Python, on assignment, to overwrite a name that already exists in a surrounding scope.
As an alternative to the rampant abuse of lists, we provide a rackety box
, which is a minimalistic mutable container that holds exactly one item. Any code that has a reference to the box can update the data in it:
from unpythonic import box, unbox
cardboardbox = box("dog")
def f(thebox):
thebox << "cat" # send a cat into the box
f(cardboardbox)
assert unbox(cardboardbox) == "cat"
This simple example could have been handled by declaring global animal
in the body of f
, and then just assigning to animal
. The similar case with nested functions can be handled similarly, by declaring nonlocal animal
. But consider this:
from unpythonic import box, unbox
def f(x):
b = box(x)
g(b)
assert unbox(b) == "bobcat" # https://xkcd.com/325/
def g(thebox):
thebox << "bobcat"
f("dog")
Here g
effectively rebinds a local variable of f
- whether that is a good idea is a separate question, but technically speaking, this would not be possible without a container. As mentioned, abusing a list
is the standard Python (but not very pythonic!) solution. Using specifically a box
makes the intent explicit.
The box
API is summarized by:
from unpythonic import box, unbox
box1 = box("cat")
box2 = box("cat")
box3 = box("dog")
assert box1.get() == "cat" # .get() retrieves the current value
assert unbox(box1) == "cat" # unbox() is syntactic sugar, does the same thing
assert "cat" in box1 # content is "in" the box, also syntactically
assert "dog" not in box1
assert [x for x in box1] == ["cat"] # a box is iterable
assert len(box1) == 1 # a box always has length 1
assert box1 == "cat" # for equality testing, a box is considered equal to its content
assert unbox(box1) == "cat" # can also unbox the content before testing (good practice)
assert box2 == box1 # contents are equal, but
assert box2 is not box1 # different boxes
assert box3 != box1 # different contents
box3 << "fox" # replacing the item in the box (rebinding the contents)
assert "fox" in box3
assert "dog" not in box3
box3.set("fox") # same without syntactic sugar
assert "fox" in box3
The expression item in b
has the same meaning as unbox(b) == item
. Note box
is a mutable container, so it is not hashable.
The expression unbox(b)
has the same meaning as b.get()
, but because it is a function (instead of a method), it additionally sanity-checks that b
is a box, and if not, raises TypeError
.
The expression b << newitem
has the same meaning as b.set(newitem)
. In both cases, the new value is returned as a convenience.
We also provide an immutable box, Some
. This can be useful to represent optional data.
The idea is that the value, when present, is placed into a Some
, such as Some(42)
, Some("cat")
, Some(myobject)
. Then, the situation where the value is absent can be represented as a bare None
. So specifically, Some(None)
means that a value is present and this value is None
, whereas a bare None
means that there is no value.
It is like the Some
constructor of a Maybe
monad, but with no monadic magic. In this interpretation, the bare constant None
plays the role of Nothing
.
ThreadLocalBox
is otherwise exactly like box
, but magical: its contents are thread-local. It also holds a default object, which is set initially when the ThreadLocalBox
is instantiated. The default object is seen by threads that have not placed any object into the box.
from unpythonic import ThreadLocalBox, unbox
tlb = ThreadLocalBox("cat") # This `"cat"` becomes the default object.
assert unbox(tlb) == "cat"
def test_threadlocalbox_worker():
# This thread hasn't sent anything into the box yet,
# so it sees the default object.
assert unbox(tlb) == "cat"
tlb << "hamster" # Sending another object into the box...
assert unbox(tlb) == "hamster" # ...in this thread, now the box holds the new object.
t = threading.Thread(target=test_threadlocalbox_worker)
t.start()
t.join()
# But in the main thread, the box still holds the original object.
assert unbox(tlb) == "cat"
The method .setdefault(x)
changes the default object, and .getdefault()
retrieves the current default object. The method .clear()
clears the value sent to the box by the current thread, thus unshadowing the default for the current thread.
from unpythonic import ThreadLocalBox, unbox
tlb = ThreadLocalBox("gerbil")
# We haven't sent any object to the box, so we see the default object.
assert unbox(tlb) == "gerbil"
tlb.setdefault("cat") # change the default (cats always fill available boxes)
assert unbox(tlb) == "cat"
tlb << "tortoise" # Send an object to the box *for this thread*.
assert unbox(tlb) == "tortoise" # Now we see the object we sent. The default is shadowed.
def test_threadlocalbox_worker():
# Since this thread hasn't sent anything into the box yet,
# we see the current default object.
assert unbox(tlb) == "cat"
tlb << "dog" # But after we send an object into the box...
assert unbox(tlb) == "dog" # ...that's the object this thread sees.
t = threading.Thread(target=test_threadlocalbox_worker)
t.start()
t.join()
# In the main thread, this box still has the value the main thread sent there.
assert unbox(tlb) == "tortoise"
# But we can still see the default, if we want, by explicitly requesting it.
assert tlb.getdefault() == "cat"
tlb.clear() # When we clear the box in this thread...
assert unbox(tlb) == "cat" # ...this thread sees the current default object again.
Added in v0.14.2.
A Shim
is an attribute access proxy. The shim holds a box
(or a ThreadLocalBox
; your choice), and redirects attribute accesses on the shim to whatever object happens to currently be in the box. The point is that the object in the box can be replaced with a different one later (by sending another object into the box), and the code accessing the proxied object through the shim does not need to be aware that anything has changed.
For example, Shim
can combo with ThreadLocalBox
to redirect standard output only in particular threads. Place the stream object in a ThreadLocalBox
, shim that box, then replace sys.stdout
with the shim. See the source code of unpythonic.net.server
for an example that actually does (and cleanly undoes) this.
Since deep down, attribute access is the whole point of objects, Shim
is essentially a transparent object proxy. (For example, a method call is an attribute read (via a descriptor), followed by a function call.)
from unpythonic import Shim, box, unbox
class TestTarget:
def __init__(self, x):
self.x = x
def getme(self): # not very pythonic; just for demonstration.
return self.x
b = box(TestTarget(21))
s = Shim(b) # We could use a ThreadLocalBox just as well.
assert hasattr(s, "x")
assert hasattr(s, "getme")
assert s.x == 21
assert s.getme() == 21
# We can also add or rebind attributes through the shim.
s.y = "hi from injected attribute"
assert unbox(b).y == "hi from injected attribute"
s.y = "hi again"
assert unbox(b).y == "hi again"
b << TestTarget(42) # After we send a different object into the box held by the shim...
assert s.x == 42 # ...the shim accesses the new object.
assert s.getme() == 42
assert not hasattr(s, "y") # The new TestTarget instance doesn't have "y".
A shim can have an optional fallback object. It can be either any object, or a box
(or ThreadLocalBox
) if you want to replace the fallback later. For attribute reads (i.e. __getattr__
), if the object in the primary box does not have the requested attribute, Shim
will try to get it from the fallback. If fallback
is boxed, the attribute read takes place on the object in the box. If it is not boxed, the attribute read takes place directly on fallback
.
Any attribute writes (i.e. __setattr__
, binding or rebinding an attribute) always take place on the object in the primary box. That is, binding or rebinding of attributes is never performed on the fallback object.
from unpythonic import Shim, box, unbox
class Ex:
x = "hi from Ex"
class Wai:
y = "hi from Wai"
x, y = [box(obj) for obj in (Ex(), Wai())]
s = Shim(x, fallback=y)
assert s.x == "hi from Ex"
assert s.y == "hi from Wai" # no such attribute on Ex, fallback tried.
# Attribute writes (binding) always take place on the object in the primary box.
s.z = "hi from Ex again"
assert unbox(x).z == "hi from Ex again"
assert not hasattr(unbox(y), "z")
If you need to chain fallbacks, this can be done with foldr
:
from unpythonic import Shim, box, unbox, foldr
class Ex:
x = "hi from Ex"
class Wai:
x = "hi from Wai"
y = "hi from Wai"
class Zee:
x = "hi from Zee"
y = "hi from Zee"
z = "hi from Zee"
# There will be tried from left to right.
boxes = [box(obj) for obj in (Ex(), Wai(), Zee())]
*others, final_fallback = boxes
s = foldr(Shim, final_fallback, others) # Shim(box, fallback) <-> op(elt, acc)
assert s.x == "hi from Ex"
assert s.y == "hi from Wai"
assert s.z == "hi from Zee"
Or, since the operation takes just one elt
and an acc
, we can also use reducer
instead of foldr
, shortening this by one line:
from unpythonic import Shim, box, unbox, reducer
class Ex:
x = "hi from Ex"
class Wai:
x = "hi from Wai"
y = "hi from Wai"
class Zee:
x = "hi from Zee"
y = "hi from Zee"
z = "hi from Zee"
# There will be tried from left to right.
boxes = [box(obj) for obj in (Ex(), Wai(), Zee())]
s = reducer(Shim, boxes) # Shim(box, fallback) <-> op(elt, acc)
assert s.x == "hi from Ex"
assert s.y == "hi from Wai"
assert s.z == "hi from Zee"
Changed in v0.15.0. The sequence length argument in in_slice
, index_in_slice
is now named length
, not l
(ell). This avoids an E741 warning in flake8
, and is more descriptive.
Inspect the superclasses that a particular container type has:
from unpythonic import get_abcs
print(get_abcs(list))
This includes virtual superclasses, i.e. those that are not part of the MRO. This works by issubclass(cls, v)
on all classes defined in collections.abc
.
Reflection on slices:
from unpythonic import in_slice, index_in_slice
s = slice(1, 11, 2) # 1, 3, 5, 7, 9
assert in_slice(5, s)
assert not in_slice(6, s)
assert index_in_slice(5, s) == 2
An optional length argument can be given to interpret negative indices. See the docstrings for details.
Sequencing refers to running multiple expressions, in sequence, in place of one expression.
Keep in mind the only reason to ever need multiple expressions: side effects. Assignment is a side effect, too; it modifies the environment. In functional style, intermediate named definitions to increase readability are perhaps the most useful kind of side effect.
See also multilambda
in macros.
CAUTION: the begin
family of forms are provided for use in pure-Python projects only, and are a permanent part of the unpythonic
API for that purpose. They are somewhat simpler and less flexible than the do
family, described further below.
If your project uses macros, prefer the do[]
and do0[]
macros; those are the only sequencing constructs understood by other macros in unpythonic.syntax
that need to perform tail-position analysis (e.g. tco
, autoreturn
, continuations
). The do[]
and do0[]
macros also provide some convenience features, such as expression-local variables.
from unpythonic import begin, begin0
f1 = lambda x: begin(print("cheeky side effect"),
42 * x)
f1(2) # --> 84
f2 = lambda x: begin0(42 * x,
print("cheeky side effect"))
f2(2) # --> 84
The begin
and begin0
forms are actually tuples in disguise; evaluation of all items occurs before the begin
or begin0
form gets control. Items are evaluated left-to-right due to Python's argument passing rules.
We provide also lazy_begin
and lazy_begin0
, which use loops. The price is the need for a lambda wrapper for each expression to delay evaluation. See the module unpythonic.seq
for details.
NOTE: This is primarily a code generation target API for the do[]
and do0[]
macros, which make the constructs easier to use, and make the code look almost like normal Python. Below is the documentation for the raw API.
Basically, the do
family is a more advanced and flexible variant of the begin
family.
-
do
can bind names to intermediate results and then use them in later items. -
do
is effectively alet*
(technically,letrec
) where making a binding is optional, so that some items can have only side effects if so desired. There is no semantically distinctbody
; all items play the same role. -
Despite the name, there is no monadic magic.
Like in letrec
, use lambda e: ...
to access the environment, and to wrap callable values (to prevent misinterpretation by the machinery).
Unlike begin
(and begin0
), there is no separate lazy_do
(lazy_do0
), because using a lambda e: ...
wrapper for an item will already delay its evaluation; and the main point of do
/do0
is that there is an environment that holds local definitions. If you want a lazy variant, just wrap each item with a lambda e: ...
, also those that don't otherwise need it.
Like begin
and lazy_begin
, the do
form evaluates all items in order, and then returns the value of the last item.
from unpythonic import do, assign
y = do(assign(x=17), # create and set e.x
lambda e: print(e.x), # 17; uses environment, needs lambda e: ...
assign(x=23), # overwrite e.x
lambda e: print(e.x), # 23
42) # return value
assert y == 42
y = do(assign(x=17),
assign(z=lambda e: 2 * e.x),
lambda e: e.z)
assert y == 34
y = do(assign(x=5),
assign(f=lambda e: lambda x: x**2), # callable, needs lambda e: ...
print("hello from 'do'"), # value is None; not callable
lambda e: e.f(e.x))
assert y == 25
For comparison, with the macro API, this becomes:
from unpythonic.syntax import macros, do, local
y = do[local[x << 17], # create and set an x local to the environment
print(x),
x << 23, # overwrite x
print(x),
42] # return value
assert y == 42
y = do[local[x << 17],
local[z << 2 * x],
z]
assert y == 34
y = do[local[x << 5],
local[f << (lambda x: x**2)],
print("hello from 'do'"),
f(x)]
assert y == 25
In the macro version, all items are delayed automatically; that is, every item has an implicit lambda e: ...
. Note that instead of the assign
function, the macro version uses the syntax local[name << value]
to create an expression-local variable. Updating an existing variable in the do
environment is just name << value
. Finally, there is also delete[name]
.
When using the raw API, beware of this pitfall:
from unpythonic import do
do(lambda e: print("hello 2 from 'do'"), # delayed because lambda e: ...
print("hello 1 from 'do'"), # Python prints immediately before do()
"foo") # gets control, because technically, it is
# **the return value** that is an argument
# for do().
The above pitfall also applies to using escape continuations inside a do
. To do that, wrap the ec call into a lambda e: ...
to delay its evaluation until the do
actually runs:
from unpythonic import call_ec, do, assign
call_ec(
lambda ec:
do(assign(x=42),
lambda e: ec(e.x), # IMPORTANT: must delay this!
lambda e: print("never reached"))) # and this (as above)
This way, any assignments made in the do
(which occur only after do
gets control), performed above the line with the ec
call, will have been performed when the ec
is called.
For comparison, with the macro API, the last example becomes:
from unpythonic.syntax import macros, do, local
from unpythonic import call_ec
call_ec(
lambda ec:
do[local[x << 42],
ec(x),
print("never reached")])
In the macro version, all items are delayed automatically, so there do
/do0
gets control before any items are evaluated. The ec
fires when the do
evaluates that item, and the print
is indeed never reached.
Like begin0
and lazy_begin0
, the do0
form evaluates all items in order, and then returns the value of the first item.
It effectively does this internally:
from unpythonic import do, assign
y = do(assign(result=17),
print("assigned 'result' in env"),
lambda e: e.result) # return value
assert y == 17
So we can write:
from unpythonic import do0, assign
y = do0(17,
assign(x=42),
lambda e: print(e.x),
print("hello from 'do0'"))
assert y == 17
y = do0(assign(x=17), # the first item of do0 can be an assignment, too
lambda e: print(e.x))
assert y == 17
For comparison, with the macro API, this becomes:
from unpythonic.syntax import macros, do, local
y = do[local[result << 17],
print("assigned 'result' in env"),
result]
assert y == 17
y = do0[17,
local[x << 42],
print(x),
print("hello from 'do0'")]
assert y == 17
y = do0[local[x << 17],
print(x)]
assert y == 17
Changed in v0.15.0. Multiple return values and named return values, for unpacking to the args and kwargs of the next function in the pipe, as well as in the final return value from the pipe, are now represented as a Values
.
The variants pipe
and pipec
now expect a Values
initial value if you want to unpack it into the args and kwargs of the first function in the pipe. Otherwise, the initial value is sent as a single positional argument (notably tuples too).
The variants piped
and lazy_piped
automatically pack the initial arguments into a Values
.
The deprecated names getvalue
and runpipe
have been removed.
Changed in v0.14.2. Both getvalue
and runpipe
, used in the shell-like syntax, are now known by the single unified name exitpipe
. This is just a rename, with no functionality changes. The old names are now deprecated.
Similar to Racket's threading macros, but no macros. A pipe performs a sequence of operations, starting from an initial value, and then returns the final value. It is just function composition, but with an emphasis on data flow, which helps improve readability.
Both one-in-one-out (1-to-1) and n-in-m-out (n-to-m) pipes are provided. The 1-to-1 versions have names suffixed with 1
, and they are slightly faster than the general versions. The use case is one-argument functions that return one value.
In the n-to-m versions, when a function returns a Values
, it is unpacked to the args and kwargs of the next function in the pipeline. When a pipe exits, the Values
wrapper (if any) around the final result is discarded if it contains only one positional value. The main use case is computations that deal with multiple values, the number of which may also change during the computation (as long as the args/kwargs of each output Values
can be accepted as input by the next function in the pipe).
Additional examples can be found in the unit tests.
The function pipe
represents a self-contained pipeline that starts from a given value (or values), applies some operations in sequence, and then exits:
from unpythonic import pipe, Values
double = lambda x: 2 * x
inc = lambda x: x + 1
x = pipe(42, double, inc)
assert x == 85
To pass several positional values and/or named values, use a Values
object:
from unpythonic import pipe, Values
a, b = pipe(Values(2, 3),
lambda x, y: Values(x=(x + 1), y=(2 * y)),
lambda x, y: Values(x * 2, y + 1))
assert (a, b) == (6, 7)
In this example, we pass the initial values positionally into the first function in the pipeline; that function passes its return values by name; and the second function in the pipeline passes the final results positionally. Because there are only positional values in the final Values
object, it can be unpacked like a tuple.
The function pipec
is otherwise exactly like pipe
, but it curries the functions before applying them. This is useful with the passthrough feature of curry
.
With pipec
you can do things like:
from unpythonic import pipec, Values
a, b = pipec(Values(1, 2),
lambda x: x + 1, # extra values passed through by curry (positionals on the right)
lambda x, y: Values(x * 2, y + 1))
assert (a, b) == (4, 3)
For more on passthrough, see the section on curry
.
We also provide a shell-like syntax, with purely functional updates.
To set up a pipeline for use with the shell-like syntax, call piped
to load the initial value(s). It is possible to provide both positional and named values. Each use of the pipe operator applies the given function, but keeps the result inside the pipeline, ready to accept another function.
When done, pipe into the sentinel exitpipe
to exit the pipeline and return the current value(s):
from unpythonic import piped, exitpipe
x = piped(42) | double | inc | exitpipe
assert x == 85
p = piped(42) | double
assert p | inc | exitpipe == 85
assert p | exitpipe == 84 # p itself is never modified by the pipe system
Multiple values work like in pipe
, except the initial value(s) passed to piped
are automatically packed into a Values
. The pipe system then automatically unpacks a Values
object into the args/kwargs of the next function in the pipeline.
To return multiple positional values and/or named values, return a Values
object from your function.
When exitpipe
is applied, if the last function returned anything other than one positional value, you will get a Values
object.
from unpythonic import piped, exitpipe, Values
f = lambda x, y: Values(2 * x, y + 1)
g = lambda x, y: Values(x + 1, 2 * y)
x = piped(2, 3) | f | g | exitpipe # --> (5, 8)
assert x == Values(5, 8)
Unpacking works also here, because in the final result, there are only positional values:
from unpythonic import piped, exitpipe
a, b = piped(2, 3) | f | g | exitpipe # --> (5, 8)
assert (a, b) == (5, 8)
Lazy pipes are useful when you have mutable initial values. To perform the planned computation, pipe into the sentinel exitpipe
:
from unpythonic import lazy_piped1, exitpipe
lst = [1]
def append_succ(lis):
lis.append(lis[-1] + 1)
return lis # this return value is handed to the next function in the pipe
p = lazy_piped1(lst) | append_succ | append_succ # plan a computation
assert lst == [1] # nothing done yet
p | exitpipe # run the computation
assert lst == [1, 2, 3] # now the side effect has updated lst.
Lazy pipe as an unfold:
from unpythonic import lazy_piped, exitpipe
fibos = []
def nextfibo(a, b): # multiple arguments allowed
fibos.append(a) # store result by side effect
# New state, handed to the next function in the pipe.
# As of v0.15.0, use `Values(...)` to represent multiple return values.
# Positional args will be passed positionally, named ones by name.
return Values(a=b, b=(a + b))
p = lazy_piped(1, 1) # load initial state
for _ in range(10): # set up pipeline
p = p | nextfibo
assert (p | exitpipe) == Values(a=89, b=144) # run; check final state
assert fibos == [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
Things missing from the standard library.
memoize
, with exception caching.curry
, with passthrough like in Haskell.fix
: detect and break infinite recursion cycles. Added in v0.14.2.partial
with run-time type checking, which helps a lot with fail-fast in code that uses partial application. This function type-checks arguments against type annotations, then delegates tofunctools.partial
. Supportsunpythonic
's@generic
and@typed
functions, too. Added in v0.15.0.composel
,composer
: both left-to-right and right-to-left function composition, to help readability.- Changed in v0.15.0. For the benefit of code using the
with lazify
macro, the compose functions are now marked lazy. Arguments will be forced only when a lazy function in the chain actually uses them, or when an eager (not lazy) function is encountered in the chain. - Any number of positional and keyword arguments are supported, with the same rules as in the pipe system. Multiple return values, or named return values, represented as a
Values
, are automatically unpacked to the args and kwargs of the next function in the chain. composelc
,composerc
: curry each function before composing them. This comboes well with the passthrough of extra args/kwargs incurry
.- An implicit top-level curry context is inserted around all the functions except the one that is applied last, to allow passthrough to the top level while applying the composed function.
composel1
,composer1
: 1-in-1-out chains (faster).- suffix
i
to use with an iterable that contains the functions (composeli
,composeri
,composelci
,composerci
,composel1i
,composer1i
)
- Changed in v0.15.0. For the benefit of code using the
withself
: essentially, the Y combinator trick as a decorator. Allows a lambda to refer to itself.- The
self
argument is declared explicitly, but passed implicitly (as the first positional argument), just like theself
argument of a method.
- The
apply
: the lispy approach to starargs. Mainly useful with theprefix
macro.andf
,orf
,notf
: compose predicates (like Racket'sconjoin
,disjoin
,negate
).- Changed in v0.15.0. For the benefit of code using the
with lazify
macro,andf
andorf
are now marked lazy. Arguments will be forced only when a lazy predicate in the chain actually uses them, or when an eager (not lazy) predicate is encountered in the chain.
- Changed in v0.15.0. For the benefit of code using the
flip
: reverse the order of positional arguments.rotate
: a cousin offlip
. Permute the order of positional arguments in a cycle.to1st
,to2nd
,tokth
,tolast
,to
to help inserting 1-in-1-out functions into m-in-n-out compose chains. (Currying can eliminate the need for these.)identity
,const
which sometimes come in handy when programming with higher-order functions.
We will discuss memoize
, curry
and fix
in more detail shortly; but first, we will give some examples of the other utilities. Note that as always, more examples can be found in the unit tests.
from typing import NoReturn
from unpythonic import (fix, andf, orf, rotate,
foldl, foldr,
withself,
composel)
# detect and break infinite recursion cycles:
# a(0) -> b(1) -> a(2) -> b(0) -> a(1) -> b(2) -> a(0) -> ...
@fix()
def a(k):
return b((k + 1) % 3)
@fix()
def b(k):
return a((k + 1) % 3)
assert a(0) is NoReturn # the call does return, saying the original function wouldn't.
# andf, orf: short-circuiting predicate combinators
isint = lambda x: isinstance(x, int)
iseven = lambda x: x % 2 == 0
isstr = lambda s: isinstance(s, str)
assert andf(isint, iseven)(42) is True
assert andf(isint, iseven)(43) is False
pred = orf(isstr, andf(isint, iseven))
assert pred(42) is True
assert pred("foo") is True
assert pred(None) is False
# lambda that refers to itself
fact = withself(lambda self, n: n * self(n - 1) if n > 1 else 1)
assert fact(5) == 120
@rotate(-1) # cycle the argument slots to the left by one place, so "acc" becomes last
def zipper(acc, *rest): # so that we can use the *args syntax to declare this
return acc + (rest,) # even though the input is (e1, ..., en, acc).
myzipl = curry(foldl, zipper, ()) # same as (curry(foldl))(zipper, ())
myzipr = curry(foldr, zipper, ())
assert myzipl((1, 2, 3), (4, 5, 6), (7, 8)) == ((1, 4, 7), (2, 5, 8))
assert myzipr((1, 2, 3), (4, 5, 6), (7, 8)) == ((2, 5, 8), (1, 4, 7))
# composel: compose functions, applying the leftmost first
with_n = lambda *args: (partial(f, n) for n, f in args)
clip = lambda n1, n2: composel(*with_n((n1, drop), (n2, take)))
assert tuple(clip(5, 10)(range(20))) == tuple(range(5, 15))
In the last example, essentially we just want to clip 5 10 (range 20)
, the grouping of the parentheses being pretty much an implementation detail. Using the passthrough in curry
(more on which in the section on curry
, below), we can rewrite the last line as:
assert tuple(curry(clip, 5, 10, range(20)) == tuple(range(5, 15))
Changed in v0.15.0. Fix bug: memoize
is now thread-safe. Even when the same memoized function instance is called concurrently from multiple threads, exactly one thread will compute the result. If f
is recursive, the thread that acquired the lock is the one that is allowed to recurse into the memoized f
.
Memoization is a functional programming technique, meant to be used with pure functions. It caches the return value, so that for each unique set of arguments, the original function will be evaluated only once. All arguments must be hashable.
Our memoize
caches also exceptions, à la the Mischief package in Racket. If the memoized function is called again with arguments with which it raised an exception the first time, that same exception instance is raised again.
The decorator works also on instance methods, with results cached separately for each instance. This is essentially because self
is an argument, and custom classes have a default __hash__
. Hence it doesn't matter that the memo lives in the memoized
closure on the class object (type), where the method is, and not directly on the instances. The memo itself is shared between instances, but calls with a different value of self
will create unique entries in it. (This approach does have the expected problem: if lots of instances are created and destroyed, and a memoized method is called for each, the memo will grow without bound.)
For a solution that performs memoization at the instance level, see this ActiveState recipe (and to demystify the magic contained therein, be sure you understand descriptors).
There are some important differences to the nearest equivalents in the standard library, functools.cache
(Python 3.9+) and functools.lru_cache
:
-
memoize
binds arguments like Python itself does, so given this definition:from unpythonic import memoize @memoize def f(a, b): return a + b
the calls
f(1, 2)
,f(1, b=2)
,f(a=1, b=2)
, andf(b=2, a=1)
all hit the same cache key.As of Python 3.9, in
functools.lru_cache
this is not so; see the internal functionfunctools._make_key
infunctools.py
, where the comments explicitly say so. -
memoize
caches exceptions, too. A pure function that crashed for some combination of arguments, if given the same inputs again, will just crash again with the same error, so there is no reason to run it again. -
memoize
has no maximum cache size or hit/miss statistics counting. -
memoize
does not have atyped
mode to treat42
and42.0
as different keys to the memo. The function arguments are hashed, and both anint
and an equalfloat
happen to hash to the same value.The
typed
mode of the standard library functions is actually a form of dispatch. Hence, you can use@generic
(which see), and@memoize
each individual multimethod:from unpythonic import generic, memoize @generic @memoize def thrice(x: int): return 3 * x @generic @memoize def thrice(x: float): return 3.0 * x
Without using
@generic
, the essential idea is:from unpythonic import memoize def thrice(x): # the dispatcher if isinstance(x, int): return thrice_int(x) elif isinstance(x, float): return thrice_float(x) raise TypeError(f"unsupported argument: {type(x)} with value {repr(x)}") @memoize def thrice_int(x): return 3 * x @memoize def thrice_float(x): return 3.0 * x
Observe that we memoize each implementation, not the dispatcher.
This solution keeps dispatching and memoization orthogonal.
Examples:
from unpythonic import memoize
ncalls = 0
@memoize # <-- important part
def square(x):
global ncalls
ncalls += 1
return x**2
assert square(2) == 4
assert ncalls == 1
assert square(3) == 9
assert ncalls == 2
assert square(3) == 9
assert ncalls == 2 # called only once for each unique set of arguments
assert square(x=3) == 9
assert ncalls == 2 # only the resulting bindings matter, not how you pass the args
# "memoize lambda": classic evaluate-at-most-once thunk
# See also the `lazy[]` macro.
thunk = memoize(lambda: print("hi from thunk"))
thunk() # the message is printed only the first time
thunk()
Changed in v0.15.0. curry
supports both positional and named arguments, and binds arguments to function parameters like Python itself does. The call triggers when all parameters are bound, regardless of whether they were passed by position or by name, and at which step of the currying process they were passed.
unpythonic
's multiple-dispatch system (@generic
, @typed
) is supported. curry
looks for an exact match first, then a match with extra args/kwargs, and finally a partial match. If there is still no match, this implies that at least one parameter would get a binding that fails the type check. In such a case TypeError
regarding failed multiple dispatch is raised.
If the function being curried is @generic
or @typed
, or has type annotations on its parameters, the parameters being passed in are type-checked. A type mismatch immediately raises TypeError
. This helps support fail-fast in code using curry
.
Currying is a technique in functional programming, where a function that takes multiple arguments is converted to a sequence of nested one-argument functions, each one specializing (fixing the value of) the leftmost remaining positional parameter. Each such function returns another function that takes the next parameter. The last function, when no more parameters remain, then performs the actual computation and returns the result.
Some languages, such as Haskell, curry all functions natively. In languages that do not, like Python or Racket, when currying is implemented as a library function, this is often done as a form of partial application, which is a subtly different concept, but encompasses the curried behavior as a special case. In practice this means that you can pass several arguments in a single step, and the original function will be called when all parameters have been bound.
Our curry
can be used both as a decorator and as a regular function. As a decorator, curry
takes no decorator arguments. As a regular function, curry
itself is curried à la Racket. If any args or kwargs are given (beside the function to be curried), they are the first step. This helps eliminate many parentheses.
CAUTION: If the signature of f
cannot be inspected, currying fails, raising ValueError
, like inspect.signature
does. This may happen with builtins such as list.append
, operator.add
, print
, or range
, depending on which version of Python is used (and whether it is CPython or PyPy3).
Like Haskell, and spicy
for Racket, our curry
supports passthrough; but we pass through both positional and named arguments.
Any args and/or kwargs that are incompatible with the target function's call signature, are passed through in the sense that the function is called with the args and kwargs compatible with its call signature, and then its return value is merged with the remaining args and kwargs.
If the first positional return value of the result of passthrough is callable, it is (curried and) invoked on the remaining args and kwargs, after the merging. This helps with some instances of point-free style.
Some finer points concerning the passthrough feature:
-
Incompatible means too many positional args, or named args that have no corresponding parameter. Note that if the function has a
**kwargs
parameter, then all named args are considered compatible, because it absorbs anything. -
Multiple return values (both positional and named) are denoted using
Values
(which see). A standard return value is considered to consist of one positional return value only (even if it is atuple
). -
Extra positional args are passed through on the right. Any positional return values of the curried function are prepended, on the left.
-
Extra named args are passed through by name. They may be overridden by named return values (with the same name) from the curried function.
-
If more args/kwargs are still remaining when the top-level curry context exits, by default
TypeError
is raised.- To override this behavior, set the dynvar
curry_context
. It is a list representing the stack of currently active curry contexts. A context is any object, a human-readable label is fine. See below for an example.- To set the dynvar,
from unpythonic import dyn
, and thenwith dyn.let(curry_context=["whatever"]):
.
- To set the dynvar,
- To override this behavior, set the dynvar
Examples:
from operator import add, mul
from unpythonic import curry, foldl, foldr, composer, to1st, cons, nil, ll, dyn, Values
mysum = curry(foldl, add, 0)
myprod = curry(foldl, mul, 1)
a = ll(1, 2)
b = ll(3, 4)
c = ll(5, 6)
append_two = lambda a, b: foldr(cons, b, a)
append_many = lambda *lsts: foldr(append_two, nil, lsts) # see unpythonic.lappend
assert mysum(append_many(a, b, c)) == 21
assert myprod(b) == 12
# curry with passthrough
double = lambda x: 2 * x
with dyn.let(curry_context=["whatever"]): # set a context to allow passthrough to the top level
# positionals are passed through on the right
assert curry(double, 2, "foo") == Values(4, "foo") # arity of double is 1
# named args are passed through by name
assert curry(double, 2, nosucharg="foo") == Values(4, nosucharg="foo")
# actual use case for passthrough
map_one = lambda f: curry(foldr, composer(cons, to1st(f)), nil)
doubler = map_one(double)
assert doubler((1, 2, 3)) == ll(2, 4, 6)
assert curry(map_one, double, ll(1, 2, 3)) == ll(2, 4, 6)
We could also write the last example as:
from unpythonic import curry, foldl, composer, const, to1st, nil, lreverse
double = lambda x: 2 * x
rmap_one = lambda f: curry(foldl, composer(cons, to1st(f)), nil) # essentially reversed(map(...))
map_one = lambda f: composer(rmap_one(f), lreverse)
assert curry(map_one, double, ll(1, 2, 3)) == ll(2, 4, 6)
which may be a useful pattern for lengthy iterables that could overflow the call stack (although not in foldr
, since our implementation uses a linear process).
In the example, in rmap_one
, we can use either curry
or partial
. In this case it does not matter which, since we want just one partial application anyway. We provide two arguments, and the minimum arity of foldl
is 3, so curry
will trigger the call as soon as (and only as soon as) it gets at least one more argument.
The final curry
in the example uses the passthrough features. The function map_one
has arity 1, but two positional arguments are given. It also invokes a call to the callable returned by map_one
, with the remaining arguments (in this case just one, the ll(1, 2, 3)
).
Yet another way to write map_one
is:
from unpythonic import curry, foldr, composer, cons, nil
mymap = lambda f: curry(foldr, composer(cons, curry(f)), nil)
The curried f
uses up one argument (provided it is a one-argument function!), and the second argument is passed through on the right; these two values then end up as the arguments to cons
.
Using a currying compose function (name suffixed with c
), we can drop the inner curry:
from unpythonic import curry, foldr, composerc, cons, nil
mymap = lambda f: curry(foldr, composerc(cons, f), nil)
myadd = lambda a, b: a + b
assert curry(mymap, myadd, ll(1, 2, 3), ll(2, 4, 6)) == ll(3, 6, 9)
This is as close to (define (map f) (foldr (compose cons f) empty)
(in #lang
spicy
) as we're gonna get in pure Python.
Notice how the last two versions accept multiple input iterables; this is thanks to currying f
inside the composition. An element from each of the iterables is taken by the processing function f
. Being the last argument, acc
is passed through on the right. The output from the processing function - one new item - and acc
then become two arguments, passed into cons.
Finally, keep in mind the mymap
example is intended as a feature demonstration. In production code, the builtin map
is much better. It produces a lazy iterable, so it does not care which kind of actual data structure the items will be stored in (once they are computed). In other words, a lazy iterable is a much better model for a process that produces a sequence of values; how, and whether, to store that sequence is an orthogonal concern.
The example we have here evaluates all items immediately, and specifically produces a linked list. It is just a nice example of function composition involving incompatible positional arities, thus demonstrating the kind of situation where the passthrough feature of curry
is useful. It is taken from a paper by John Hughes (1984).
Our curry
, beside what it says on the tin, is effectively an explicit local modifier to Python's reduction rules, which allows some Haskell-like idioms. Let's consider a simple example with positional arguments only. When we say:
curry(f, a0, a1, ..., a[n-1])
it means the following. Let m1
and m2
be the minimum and maximum positional arity of the callable f
, respectively.
- If
n > m2
, callf
with the firstm2
arguments.- If the result is a callable, curry it, and recurse.
- Else form a tuple, where first item is the result, and the rest are the remaining arguments
a[m2]
,a[m2+1]
, ...,a[n-1]
. Return it.- If more positional args are still remaining when the top-level curry context exits, by default
TypeError
is raised. Use the dynvarcurry_context
to override; see above for an example.
- If more positional args are still remaining when the top-level curry context exits, by default
- If
m1 <= n <= m2
, callf
and return its result (like a normal function call).- Any positional arity accepted by
f
triggers the call; beware when working with variadic functions.
- Any positional arity accepted by
- If
n < m1
, partially applyf
to the given arguments, yielding a new function with smallerm1
,m2
. Then curry the result and return it.- Internally we stack
functools.partial
applications, but there will be only onecurried
wrapper no matter how many invocations are used to build up arguments beforef
eventually gets called.
- Internally we stack
As of v0.15.0, the actual algorithm by which curry
decides what to do, in the presence of kwargs, @generic
functions, and Values
multiple-return-values (and named return values), is:
- If
f
is not@generic
or@typed
:- Compute parameter bindings of the args and kwargs collected so far, against the call signature of
f
.- Note we keep track of which arguments were passed positionally and which by name. To avoid subtle errors, they are eventually passed to
f
the same way they were passed tocurry
. (Positional args are passed positionally, and kwargs are passed by name.)
- Note we keep track of which arguments were passed positionally and which by name. To avoid subtle errors, they are eventually passed to
- If there are no unbound parameters, and no args/kwargs are left over, we have an exact match. Call
f
and return its result, like a normal function call.- Any sequence of curried calls that ends up binding all parameters of
f
triggers the call. - Beware when working with variadic functions. Particularly, keep in mind that
*args
matches zero or more positional arguments (as the Kleene star-ish notation indeed suggests).
- Any sequence of curried calls that ends up binding all parameters of
- If there are no unbound parameters, but there are args/kwargs left over, arrange passthrough for the leftover args/kwargs (that were rejected by the call signature of
f
), and callf
. Any leftover positional arguments are passed through on the right.- Merge the return value of
f
with the leftover args/kwargs, thus forming updated leftover args/kwargs.- If the return value of
f
is aValues
: prepend positional return values into the leftover args (i.e. insert them on the left), and update the leftover kwargs with the named return values. (I.e. a key name conflict causes an overwrite in the leftover kwargs.) - Else: there is just one positional return value. Prepend it to the leftover args.
- If the return value of
- If the first positional return value is a callable: remove it from the leftover args, curry it, and recurse with the (updated) leftover args/kwargs.
- Else: form a
Values
from the leftover args/kwargs, and return it. (This return goes to the next outer curry context, or at the top level, to the original caller.)
- Merge the return value of
- If neither of the above match, we know there is at least one unbound parameter, i.e. we have a partial match. Keep currying.
- Compute parameter bindings of the args and kwargs collected so far, against the call signature of
- If
f
is@generic
or@typed
:- Iterate over multimethods registered on
f
, up to three times. - First, try for an exact match that passes the type check. If any such match is found, pick that multimethod. Call it and return its result (as above).
- Then, try for a match that passes the type check, but has extra args/kwargs. If any such match is found, pick that multimethod. Arrange passthrough... (as above).
- Then, try for a partial match that passes the type check. If any such match is found, keep currying.
- If none of the above match, it implies that no matter which multimethod we pick, at least one parameter will get a binding that fails the type check. Raise
TypeError
.
- Iterate over multimethods registered on
If interested in the gritty details, see the source code of unpythonic.curry
, in the module unpythonic.fun
. It calls some functions from the module unpythonic.dispatch
for its @generic
support, but otherwise it is pretty much self-contained.
Getting back to the simple case, in the above example:
curry(mapl_one, double, ll(1, 2, 3))
the callable mapl_one
takes one argument, which is a function. It returns another function, let us call it g
. We are left with:
curry(g, ll(1, 2, 3))
The remaining argument is then passed into g
; we obtain a result, and reduction is complete.
A curried function is also a curry context:
add2 = lambda x, y: x + y
a2 = curry(add2)
a2(a, b, c) # same as curry(add2, a, b, c); reduces to (a + b, c)
so on the last line, we do not need to say
curry(a2, a, b, c)
because a2
is already curried. Doing so does no harm, though; curry
automatically prevents stacking curried
wrappers:
curry(a2) is a2 # --> True
If we wish to modify precedence, parentheses are needed, which takes us out of the curry context, unless we explicitly curry
the subexpression. This works:
curry(f, a, curry(g, x, y), b, c)
but this does not:
curry(f, a, (g, x, y), b, c)
because (g, x, y)
is just a tuple of g
, x
and y
. This is by design; as with all things Python, explicit is better than implicit.
Note: to code in curried style, a contract system or a type checker can be useful. Also, be careful with variadic functions, because any allowable arity will trigger the call.
(The map
function in the standard library is a particular offender here, since it requires at least one iterable to actually do anything but raise TypeError
, but its call signature suggests it can be called without any iterables. Hence, for curry-friendliness we provide a wrapper unpythonic.map
that requires at least one iterable.)
-
Contract systems for Python include icontract and PyContracts.
-
For static type checking, consider mypy.
-
For run-time type checking, consider
@typed
or@generic
right here inunpythonic
. -
You can also just use Python's type annotations;
unpythonic
'scurry
type-checks the arguments before accepting the curried function. The annotations work if the stdlib functiontyping.get_type_hints
can find them.
The name fix
comes from the least fixed point with respect to the definedness relation, which is related to Haskell's fix
function. However, this fix
is not that function. Our fix
breaks recursion cycles in strict functions - thus causing some non-terminating strict functions to return. (Here strict means that the arguments are evaluated eagerly.)
CAUTION: Worded differently, this function solves a small subset of the halting problem. This should be hint enough that it will only work for the advertised class of special cases - i.e., a specific kind of recursion cycles.
If you need fix
for code that uses TCO, use fixtco
. The implementations of recursion cycle breaking and TCO must interact in a very particular way to work properly; this is done by fixtco
.
For examples, see the unit tests.
Usage:
from unpythonic import fix, identity
@fix()
def f(...):
...
result = f(23, 42) # start a computation with some args
@fix(bottom=identity)
def f(...):
...
result = f(23, 42)
If no recursion cycle occurs, f
returns normally. If a cycle occurs, the call to f
is aborted (dynamically, when the cycle is detected), and:
-
In the first example, the default special value
typing.NoReturn
is returned. -
In the latter example, the name
"f"
and the offending args are returned.
A cycle is detected when f
is called again with a set of args that have already been previously seen in the current call chain. Infinite mutual recursion is detected too, at the point where any @fix
-instrumented function is entered again with a set of args already seen during the current call chain.
CAUTION: The infinitely recursive call sequence f(0) → f(1) → ... → f(k+1) → ...
contains no cycles in the sense detected by fix
. The fix
function will not catch all cases of infinite recursion, but only those where a previously seen set of arguments is seen again. If f
is pure, the same arguments appearing again during recursion implies the call will not return, so we can terminate it.
CAUTION: If we have a function g(a, b)
, the argument lists of the invocations g(1, 2)
and g(a=1, b=2)
are in principle different. However, we bind arguments like Python itself does, and consider the resulting bindings only. It does not matter how the arguments were passed.
We can use fix
to find the (arithmetic) fixed point of cos
:
from math import cos
from unpythonic import fix, identity
# let's use this as the `bottom` callable:
def justargs(funcname, *args): # bottom doesn't need to accept kwargs if f doesn't.
return identity(*args) # identity unpacks if just one
@fix(justargs)
def cosser(x):
return cosser(cos(x))
c = cosser(1) # provide starting value
assert c == cos(c) # 0.7390851332151607
This works because the fixed point of cos
is attractive (see the Banach fixed point theorem). The general pattern to find an attractive fixed point with this strategy is:
from functools import partial
from unpythonic import fix, identity
def justargs(funcname, *args):
return identity(*args)
# setting `bottom=justargs` discards the name "iterate1_rec" from the bottom return value
@fix(justargs)
def iterate1_rec(f, x):
return iterate1_rec(f, f(x))
def fixpoint(f, x0):
effer = partial(iterate1_rec, f)
# f ends up in the return value because it's in the args of iterate1_rec.
_, c = effer(x0)
return c
from math import cos
c = fixpoint(cos, x0=1)
assert c == cos(c)
NOTE: See unpythonic.fixpoint
, which is meant specifically for finding arithmetic fixed points, and unpythonic.iterate1
, which produces a generator that iterates f
without needing recursion.
Notes:
-
Our
fix
is a parametric decorator with the signaturedef fix(bottom=typing.NoReturn, memo=True):
. -
f
must be pure for this to make sense. -
All args of
f
must be hashable, for technical reasons. -
The
bottom
parameter (named after the empty type ⊥) specifies the final return value to be returned when a recursion cycle is detected in a call tof
.The default is the special value
typing.NoReturn
, which represents ⊥ in Python. If you just want to detect that a cycle occurred, this is usually fine.When bottom is returned, it means the collected evidence shows that if we were to let
f
continue forever, the call would not return. -
bottom
can be any non-callable value, in which case it is simply returned upon detection of a cycle. -
bottom
can be a callable, in which case the function name and args at the point where the cycle was detected are passed to it, and its return value becomes the final return value. This is useful e.g. for debug logging.The function name is provided, because we catch also infinite mutual recursion; so it can be a useful piece of information which function it was that was first called with already-seen arguments.
-
The
memo
flag controls whether to memoize intermediate results. It adds some additional function call layers between function entries from recursive calls; if that is a problem (due to causing Python's call stack to blow up faster), usememo=False
. You can still memoize the final result if you want; just put@memoize
on the outside.
This kind of fix
is sometimes helpful in recursive pattern-matching definitions for parsers. When the pattern matcher gets stuck in an infinite left-recursion, it can return a customizable special value instead of not terminating. Being able to not care about non-termination may simplify definitions.
This fix
can also be used to find arithmetic fixed points of functions, as in the above examples.
The idea comes from Matthew Might's article on parsing with (Brzozowski's) derivatives, where it was a utility implemented in Racket as the define/fix
form. It was originally ported to Python by Per Vognsen (linked from the article). The fix
in unpythonic
is a redesign with kwargs support, thread safety, and TCO support.
In Haskell, the function named fix
computes the least fixed point with respect to the definedness ordering. For any strict f
, we have fix f = ⊥
. Why? If f
is strict, f(⊥) = ⊥
(does not terminate), so ⊥
is a fixed point. On the other hand, ⊥
means also undefined
, describing a value about which nothing is known. So it is the least fixed point in this sense.
Haskell's fix
is related to the Y combinator; it is essentially the idea of recursion packaged into a higher-order function. The name in unpythonic
for the Y combinator idea is withself
, allowing a lambda to refer to itself by passing in the self-reference from the outside.
A simple way to explain Haskell's fix
is:
fix f = let x = f x in x
so anywhere the argument is referred to in the definition of f
, it is replaced by another application of f
, recursively. This obviously yields a notation useful for corecursively defining infinite lazy lists.
For more, see [1] [2] [3] [4] [5].
unpack
: lazily unpack an iterable. Suitable for infinite inputs.- Return the first
n
items and thek
th tail, in a tuple. Default isk = n
. - Use
k > n
to fast-forward, consuming the skipped items. Works bydrop
. - Use
k < n
to peek without permanently extracting an item. Works by teeing; plan accordingly.
- Return the first
- fold, scan, unfold:
foldl
,foldr
with support for multiple input iterables, like in Racket.- Like in Racket,
op(elt, acc)
; general caseop(e1, e2, ..., en, acc)
. Note Python's ownfunctools.reduce
uses the orderingop(acc, elt)
instead. - No sane default for multi-input case, so the initial value for
acc
must be given. - One-input versions with optional init are provided as
reducel
,reducer
, with semantics similar to Python'sfunctools.reduce
, but with the rackety orderingop(elt, acc)
. - By default, multi-input folds terminate on the shortest input. To instead terminate on the longest input, use the
longest
andfillvalue
kwargs. - For multiple inputs with different lengths,
foldr
syncs the left ends. rfoldl
,rreducel
reverse each input and then left-fold. This syncs the right ends.
- Like in Racket,
scanl
,scanr
: scan (a.k.a. accumulate, partial fold); a lazy fold that returns a generator yielding intermediate results.scanl
is suitable for infinite inputs.- Iteration stops after the final result.
- For
scanl
, this is whatfoldl
would have returned (if the fold terminates at all, i.e. if the shortest input is finite). - For
scanr
, ordering of output is different from Haskell: we yield the results in the order they are computed (via a linear process).
- For
- Multiple input iterables and shortest/longest termination supported; same semantics as in
foldl
,foldr
. - One-input versions with optional init are provided as
scanl1
,scanr1
. Note ordering of arguments to matchfunctools.reduce
, but op is still the racketyop(elt, acc)
. rscanl
,rscanl1
reverse each input and then left-scan. This syncs the right ends.
unfold1
,unfold
: generate a sequence corecursively. The counterpart offoldl
.unfold1
is for 1-in-2-out functions. The input isstate
, the return value must be(value, newstate)
orNone
.unfold
is for n-in-(1+n)-out functions.- Changed in v0.15.0. The initial args/kwargs are unpacked to the args/kwargs of the user function. The function must return a
Values
object, where the first positional return value is the value to yield, and anything else is unpacked to the args/kwargs of the user function at the next iteration.
- Changed in v0.15.0. The initial args/kwargs are unpacked to the args/kwargs of the user function. The function must return a
- Unfold returns a generator yielding the collected values. The output can be finite or infinite; to signify that a finite sequence ends, the user function must return
None
. (Beside aValues
object, a bareNone
is the only other allowed return value from the user function.)
- mapping and zipping:
map_longest
: the final missing battery formap
.- Essentially
starmap(func, zip_longest(*iterables))
, so it's spanned byitertools
, but it's convenient to have a named shorthand to do that.
- Essentially
rmap
,rzip
,rmap_longest
,rzip_longest
: reverse each input, then map/zip. For multiple inputs, syncs the right ends.mapr
,zipr
,mapr_longest
,zipr_longest
: map/zip, then reverse the result. For multiple inputs, syncs the left ends.map
: curry-friendly wrapper for the builtin, making it mandatory to specify at least one iterable. Added in v0.14.2.
- windowing, chunking, and similar:
window
: sliding length-n window iterator for general iterables. Acts like the well-known n-gram zip trick, but the input can be any iterable. Changed in v0.15.0. Parameter ordering is nowwindow(n, iterable)
, to make it curry-friendly.chunked
: split an iterable into constant-length chunks. Added in v0.14.2.pad
: extend an iterable to length at leastn
with afillvalue
. Added in v0.14.2.interleave
: interleave items from several iterables:interleave(a, b, c)
→a0, b0, c0, a1, b1, c1, ...
until the next item does not exist. Added in v0.14.2.- This differs from
zip
in that the output is flattened, and the termination condition is checked after each item. So e.g.interleave(['a', 'b', 'c'], ['+', '*'])
→['a', '+', 'b', '*', 'c']
(the actual return value is a generator, not a list).
- This differs from
- flattening:
flatmap
: map a function, that returns a list or tuple, over an iterable and then flatten by one level, concatenating the results into a single tuple.- Essentially,
composel(map(...), flatten1)
; the same thing the bind operator of the List monad does.
- Essentially,
flatten1
,flatten
,flatten_in
: remove nested list structure.flatten1
: outermost level only.flatten
: recursive, with an optional predicate that controls whether to flatten a given sublist.flatten_in
: recursive, with an optional predicate; but recurse also into items which don't match the predicate.
- extracting items, subsequences:
take
,drop
,split_at
: based onitertools
recipes.- Especially useful for testing generators.
islice
is maybe more pythonic thantake
anddrop
; it enables slice syntax for any iterable.
tail
: return the tail of an iterable. Same asdrop(1, iterable)
; common use case.butlast
,butlastn
: return a generator that yields from iterable, dropping the lastn
items if the iterable is finite. Inspired by a similar utility in PG's On Lisp.- Works by using intermediate storage. Do not use the original iterator after a call to
butlast
orbutlastn
.
- Works by using intermediate storage. Do not use the original iterator after a call to
lastn
: yield the lastn
items from an iterable. Works by intermediate storage. Will not terminate for infinite iterables. Added in v0.14.2.first
,second
,nth
,last
: return the specified item from an iterable. Any preceding items are consumed at C speed.partition
fromitertools
recipes.find
: return the first item that matches a predicate. Convenience function; if you need them all, just usefilter
or a comprehension. Added in v0.14.2.running_minmax
,minmax
: Extract both min and max in one pass over an iterable. Therunning_
variant is a scan and returns a generator; the just-give-me-the-final-result variant is a fold. Added in v0.14.2.
- math-related:
within
: yield items from iterable until successive iterates are close enough. Useful with Cauchy sequences. Added in v0.14.2.prod
: like the builtinsum
, but compute the product. Oddly missing from the standard library.iterate1
,iterate
: return an infinite generator that yieldsx
,f(x)
,f(f(x))
, ...iterate1
is for 1-to-1 functions.iterate
is for n-to-n, unpacking the return value to the args/kwargs of the next call.- Changed in v0.15.0. In the n-to-n version, now the user function must return a
Values
object in the same shape as it accepts args and kwargs. ThisValues
object is thex
that is yielded at each iteration.
- Changed in v0.15.0. In the n-to-n version, now the user function must return a
- miscellaneous:
uniqify
,uniq
: remove duplicates (either all or consecutive only, respectively), preserving the original ordering of the items.rev
is a convenience function that triesreversed
, and if the input was not a sequence, converts it to a tuple and reverses that. The return value is areversed
object.scons
: prepend one element to the start of an iterable, return new iterable.scons(x, iterable)
is lispy shorthand foritertools.chain((x,), iterable)
, allowing to omit the one-item tuple wrapper. The name is an abbreviation ofstream-cons
.inn
: contains-check (x in iterable
) with automatic termination for monotonic divergent infinite iterables.- Only applicable to monotonic divergent inputs (such as
primes
). Increasing/decreasing is auto-detected from the first non-zero diff, but the function may fail to terminate if the input is actually not monotonic, or has an upper/lower bound.
- Only applicable to monotonic divergent inputs (such as
iindex
: likelist.index
, but for a general iterable. Consumes the iterable, so only makes sense for memoized inputs.CountingIterator
: useCountingIterator(iterable)
instead ofiter(iterable)
to produce an iterator that, as a side effect, counts how many items have been yielded. The count is stored in the.count
attribute. Added in v0.14.2.slurp
: extract all items from aqueue.Queue
(until it is empty) to a list, returning that list. Added in v0.14.2.subset
: test whether an iterable is a subset of another. Added in v0.14.3.powerset
: yield the power set (set of all subsets) of an iterable. Works also for potentially infinite iterables, if only a finite prefix is ever requested. (But beware, both runtime and memory usage are exponential in the input size.) Added in v0.14.2.allsame
: test whether all elements of an iterable are the same. Sometimes useful in writing testing code. Added in v0.14.3.
Examples:
from functools import partial
from itertools import count, takewhile
from operator import add, mul
from unpythonic import (scanl, scanr, foldl, foldr,
mapr, zipr, rmap, rzip, identity,
uniqify, uniq,
flatten1, flatten, flatten_in, flatmap,
take, drop,
unfold, unfold1,
unpack,
cons, nil, ll, curry,
imemoize, gmemoize,
s, inn, iindex, find,
partition, partition_int,
window,
subset, powerset,
allsame,
Values)
assert tuple(scanl(add, 0, range(1, 5))) == (0, 1, 3, 6, 10)
assert tuple(scanr(add, 0, range(1, 5))) == (0, 4, 7, 9, 10)
assert tuple(scanl(mul, 1, range(2, 6))) == (1, 2, 6, 24, 120)
assert tuple(scanr(mul, 1, range(2, 6))) == (1, 5, 20, 60, 120)
assert tuple(scanl(cons, nil, ll(1, 2, 3))) == (nil, ll(1), ll(2, 1), ll(3, 2, 1))
assert tuple(scanr(cons, nil, ll(1, 2, 3))) == (nil, ll(3), ll(2, 3), ll(1, 2, 3))
def step2(k): # x0, x0 + 2, x0 + 4, ...
return (k, k + 2) # value, newstate
assert tuple(take(10, unfold1(step2, 10))) == (10, 12, 14, 16, 18, 20, 22, 24, 26, 28)
def nextfibo(a, b):
# First positional return value is the value to yield.
# Everything else is newstate, to be unpacked to `nextfibo`'s
# args/kwargs at the next iteration.
return Values(a, a=b, b=a + b)
assert tuple(take(10, unfold(nextfibo, 1, 1))) == (1, 1, 2, 3, 5, 8, 13, 21, 34, 55)
def fibos():
a, b = 1, 1
while True:
yield a
a, b = b, a + b
a1, a2, a3, tl = unpack(3, fibos())
a4, a5, tl = unpack(2, tl)
print(a1, a2, a3, a4, a5, tl) # --> 1 1 2 3 5 <generator object fibos at 0x7fe65fb9f798>
# inn: contains-check with automatic termination for monotonic iterables (infinites ok)
evens = imemoize(s(2, 4, ...))
assert inn(42, evens())
assert not inn(41, evens())
@gmemoize
def primes(): # FP sieve of Eratosthenes
yield 2
for n in count(start=3, step=2):
if not any(n % p == 0 for p in takewhile(lambda x: x*x <= n, primes())):
yield n
assert inn(31337, primes())
assert not inn(1337, primes())
# partition: split an iterable according to a predicate
iseven = lambda x: x % 2 == 0
assert [tuple(it) for it in partition(iseven, range(10))] == [(1, 3, 5, 7, 9), (0, 2, 4, 6, 8)]
# CAUTION: not to be confused with:
# partition_int: split a small positive integer, in all possible ways, into smaller integers that sum to it
assert tuple(partition_int(4)) == ((4,), (3, 1), (2, 2), (2, 1, 1), (1, 3), (1, 2, 1), (1, 1, 2), (1, 1, 1, 1))
assert all(sum(terms) == 10 for terms in partition_int(10))
# iindex: find index of item in iterable (mostly only makes sense for memoized input)
assert iindex(2, (1, 2, 3)) == 1
assert iindex(31337, primes()) == 3378
# find: return first item matching predicate (convenience function)
gen = (x for x in range(5))
assert find(lambda x: x >= 3, gen) == 3
assert find(lambda x: x >= 3, gen) == 4 # if consumable, consumed as usual
# window: length-n sliding window iterator for general iterables
lst = (x for x in range(5))
out = []
for a, b, c in window(3, lst):
out.append((a, b, c))
assert out == [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
# subset
assert subset([1, 2, 3], [1, 2, 3, 4, 5])
assert subset({"cat"}, {"cat", "lynx"})
# power set (set of all subsets) of an iterable
assert tuple(powerset(range(3))) == ((0,), (1,), (0, 1), (2,), (0, 2), (1, 2), (0, 1, 2))
r = range(10)
assert all(subset(s, r) for s in powerset(r))
# test whether all elements of a finite iterable are the same
assert allsame(())
assert allsame((1,))
assert allsame((8, 8, 8, 8, 8))
assert not allsame((1, 2, 3))
# flatmap
def msqrt(x): # multivalued sqrt
if x == 0.:
return (0.,)
else:
s = x**0.5
return (s, -s)
assert tuple(flatmap(msqrt, (0, 1, 4, 9))) == (0., 1., -1., 2., -2., 3., -3.)
# **CAUTION**: zip and reverse do NOT commute for inputs with different lengths:
assert tuple(zipr((1, 2, 3), (4, 5, 6), (7, 8))) == ((2, 5, 8), (1, 4, 7)) # zip first
assert tuple(rzip((1, 2, 3), (4, 5, 6), (7, 8))) == ((3, 6, 8), (2, 5, 7)) # reverse first
# zipr syncs *left* ends, then iterates *from the right*.
assert tuple(zipr((1, 2, 3), (4, 5, 6), (7, 8))) == ((2, 5, 8), (1, 4, 7))
# so does mapr.
zipr2 = partial(mapr, identity)
assert tuple(zipr2((1, 2, 3), (4, 5, 6), (7, 8))) == (Values(2, 5, 8), Values(1, 4, 7))
# rzip syncs *right* ends, then iterates from the right.
assert tuple(rzip((1, 2, 3), (4, 5, 6), (7, 8))) == ((3, 6, 8), (2, 5, 7))
# so does rmap.
rzip2 = partial(rmap, identity)
assert tuple(rzip2((1, 2, 3), (4, 5, 6), (7, 8))) == (Values(3, 6, 8), Values(2, 5, 7))
# foldr syncs *left* ends, then collects results from the right:
def zipper(*args):
*rest, acc = args
return acc + (tuple(rest),)
zipr1 = curry(foldr, zipper, ())
assert zipr1((1, 2, 3), (4, 5, 6), (7, 8)) == ((2, 5, 8), (1, 4, 7))
# so the result is tuple(rev(zip(...))), whereas rzip gives tuple(zip(*(rev(s) for s in ...)))
assert tuple(uniqify((1, 1, 2, 2, 2, 1, 2, 2, 4, 3, 4, 3, 3))) == (1, 2, 4, 3) # all
assert tuple(uniq((1, 1, 2, 2, 2, 1, 2, 2, 4, 3, 4, 3, 3))) == (1, 2, 1, 2, 4, 3, 4, 3) # consecutive
assert tuple(flatten1(((1, 2), (3, (4, 5), 6), (7, 8, 9)))) == (1, 2, 3, (4, 5), 6, 7, 8, 9)
assert tuple(flatten(((1, 2), (3, (4, 5), 6), (7, 8, 9)))) == (1, 2, 3, 4, 5, 6, 7, 8, 9)
is_nested = lambda sublist: all(isinstance(x, (list, tuple)) for x in sublist)
assert tuple(flatten((((1, 2), (3, 4)), (5, 6)), is_nested)) == ((1, 2), (3, 4), (5, 6))
data = (((1, 2), ((3, 4), (5, 6)), 7), ((8, 9), (10, 11)))
assert tuple(flatten(data, is_nested)) == (((1, 2), ((3, 4), (5, 6)), 7), (8, 9), (10, 11))
assert tuple(flatten_in(data, is_nested)) == (((1, 2), (3, 4), (5, 6), 7), (8, 9), (10, 11))
Added in v0.14.2.
While all other pure-Python features of unpythonic
live in the main unpythonic
package, the network-related features are placed in the subpackage unpythonic.net
. This subpackage also contains the REPL server and client for hot-patching live processes.
unpythonic.net.msg
: A simplistic message protocol for sending message data over a stream-based transport, such as TCP.unpythonic.net.ptyproxy
: Proxy between a Linux PTY and a network socket. Useful for serving terminal utilities over the network. The selling point is this does not usepty.spawn
, so it can be used for proxying also Python libraries that expect to run in a terminal.unpythonic.net.util
: Miscellaneous small utilities.
For a usage example of unpythonic.net.ptyproxy
, see the source code of unpythonic.net.server
.
More details can be found in the docstrings.
The problem with stream-based transports, such as network sockets, is that they have no concept of a message boundary [1] [2] [3]. This is where a message protocol comes in. We provide a sans-io implementation of a minimalistic message protocol that adds rudimentary message framing and stream re-synchronization. Example:
from io import BytesIO, SEEK_SET
from unpythonic.net.msg import encodemsg, MessageDecoder
from unpythonic.net.util import bytessource, streamsource # have also socketsource
# Encode a message:
rawdata = b"hello world"
message = encodemsg(rawdata)
# Decode a message:
decoder = MessageDecoder(bytessource(message))
assert decoder.decode() == b"the quick brown fox jumps over the lazy dog"
assert decoder.decode() is None # The message is consumed by the first decode.
bio = BytesIO()
bio.write(encodemsg(b"hello world"))
bio.write(encodemsg(b"hello again"))
bio.seek(0, SEEK_SET)
# A streamsource accepts any byte stream, such as BytesIO,
# and files opened with open().
decoder = MessageDecoder(streamsource(bio))
assert decoder.decode() == b"hello world"
assert decoder.decode() == b"hello again"
assert decoder.decode() is None
# If junk arrives between messages, the protocol automatically
# re-synchronizes when retrieving the next message.
bio = BytesIO()
bio.write(encodemsg(b"cat"))
bio.write(b"junk junk junk") # not a message!
bio.write(encodemsg(b"mew"))
bio.seek(0, SEEK_SET)
decoder = MessageDecoder(streamsource(bio))
assert decoder.decode() == b"cat"
assert decoder.decode() == b"mew"
assert decoder.decode() is None
Changed in v0.14.2. Added support for negative start
and stop
.
Slice any iterable, using the regular slicing syntax:
from unpythonic import islice, primes, s
p = primes()
assert tuple(islice(p)[10:15]) == (31, 37, 41, 43, 47)
assert tuple(islice(primes())[10:15]) == (31, 37, 41, 43, 47)
p = primes()
assert islice(p)[10] == 31
odds = islice(s(1, 2, ...))[::2]
assert tuple(islice(odds)[:5]) == (1, 3, 5, 7, 9)
assert tuple(islice(odds)[:5]) == (11, 13, 15, 17, 19) # five more
As a convenience feature: a single index is interpreted as a length-1 islice
starting at that index. The slice is then immediately evaluated and the item is returned.
The slicing variant calls itertools.islice
with the corresponding slicing parameters, after possibly converting negative start
and stop
to the appropriate positive values.
CAUTION: When using negative start
and/or stop
, the whole iterable is consumed to determine where it ends, if at all. Obviously, this will not terminate for infinite iterables. The desired elements are then held in an internal buffer until they are yielded by iterating over the islice
.
CAUTION: Keep in mind that negative step
is not supported, and that the slicing process consumes elements from the iterable.
Like fup
, our islice
is essentially a manually curried function with unusual syntax; the initial call to islice
passes in the iterable to be sliced. The object returned by the call accepts a subscript to specify the slice or index. Once the slice or index is provided, the call to itertools.islice
triggers.
Inspired by Python itself.
Changed in v0.15.0. The generator instances created by the gfuncs returned by gmemoize
, imemoize
, and fimemoize
, now support the __len__
and __getitem__
methods to access the already-yielded, memoized part. Asking for the len
returns the current length of the memo. For subscripting, both a single int
index and a slice are accepted. Note that memoized generators do not support all of the collections.abc.Sequence
API, because e.g. __contains__
and __reversed__
are missing, on purpose.
Make generator functions (gfunc, i.e. a generator definition) which create memoized generators, similar to how streams behave in Racket.
Memoize iterables; like itertools.tee
, but no need to know in advance how many copies of the iterator will be made. Provided for both iterables and for factory functions that make iterables.
gmemoize
is a decorator for a gfunc, which makes it memoize the instantiated generators.- If the gfunc takes arguments, they must be hashable. A separate memoized sequence is created for each unique set of argument values seen.
- For simplicity, the generator itself may use
yield
for output only;send
is not supported. - Any exceptions raised by the generator (except StopIteration) are also memoized, like in
memoize
. - Thread-safe. Calls to
next
on the memoized generator from different threads are serialized via a lock. Each memoized sequence has its own lock. This usesthreading.RLock
, so re-entering from the same thread (e.g. in recursively defined mathematical sequences) is fine. - The whole history is kept indefinitely. For infinite iterables, use this only if you can guarantee that only a reasonable number of terms will ever be evaluated (w.r.t. available RAM).
- Typically,
gmemoize
should be the outermost decorator if several are used on the same gfunc.
imemoize
: memoize an iterable. Likeitertools.tee
, but keeps the whole history, so more copies can be teed off later.- Same limitation: do not use the original iterator after it is memoized. The danger is that if anything other than the memoization mechanism advances the original iterator, some values will be lost before they can reach the memo.
- Returns a gfunc with no parameters which, when called, returns a generator that yields items from the memoized iterable. The original iterable is used to retrieve more terms when needed.
- Calling the gfunc essentially tees off a new instance, which begins from the first memoized item.
fimemoize
: convert a factory function, that returns an iterable, into the corresponding gfunc, andgmemoize
that. Return the memoized gfunc.- Especially convenient with short lambdas, where
(yield from ...)
instead of...
is just too much text. See example below.
- Especially convenient with short lambdas, where
from itertools import count, takewhile
from unpythonic import gmemoize, imemoize, fimemoize, take, nth
@gmemoize
def primes(): # FP sieve of Eratosthenes
yield 2
for n in count(start=3, step=2):
if not any(n % p == 0 for p in takewhile(lambda x: x*x <= n, primes())):
yield n
assert tuple(take(10, primes())) == (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)
assert nth(3378, primes()) == 31337 # with memo, linear process; no crash
# but be careful:
31337 in primes() # --> True
1337 in takewhile(lambda p: p <= 1337, primes()) # not prime, need takewhile() to stop
# or use unpythonic.inn, which auto-terminates on monotonic iterables:
from unpythonic import inn
inn(31337, primes()) # --> True
inn(1337, primes()) # --> False
Memoizing only a part of an iterable. This is where imemoize
and fimemoize
can be useful. The basic idea is to make a chain of generators, and only memoize the last one:
from unpythonic import gmemoize, drop, last
def evens(): # the input iterable
yield from (x for x in range(100) if x % 2 == 0)
@gmemoize
def some_evens(n): # we want to memoize the result without the n first terms
yield from drop(n, evens())
assert last(some_evens(25)) == last(some_evens(25)) # iterating twice!
Using a lambda, we can also write some_evens
as:
se = gmemoize(lambda n: (yield from drop(n, evens())))
assert last(se(25)) == last(se(25))
Using fimemoize
, we can omit the yield from
, shortening this to:
se = fimemoize(lambda n: drop(n, evens()))
assert last(se(25)) == last(se(25))
If we don't need to take an argument, we can memoize the iterable directly, using imemoize
:
se = imemoize(drop(25, evens()))
assert last(se()) == last(se()) # se is a gfunc, so call it to get a generator instance
Finally, compare the fimemoize
example, rewritten using def
, to the original gmemoize
example:
@fimemoize
def some_evens(n):
return drop(n, evens())
@gmemoize
def some_evens(n):
yield from drop(n, evens())
The only differences are the name of the decorator and return
vs. yield from
. The point of fimemoize
is that in simple cases like this, it allows us to use a regular factory function that makes an iterable, instead of a gfunc. Of course, the gfunc could have several yield
expressions before it finishes, whereas the factory function terminates at the return
.
Changed in v0.15.0. Bug fixed: Now an infinite replacement sequence to pull items from is actually ok, as the documentation has always claimed.
We provide three layers, in increasing order of the level of abstraction: ShadowedSequence
, fupdate
, and fup
.
The class ShadowedSequence
is a bit like collections.ChainMap
, but for sequences, and only two levels (but it's a sequence; instances can be chained). It supports slicing (read-only), equality comparison, str
and repr
. Out-of-range read access to a single item emits a meaningful error, like in list
. We will not discuss ShadowedSequence
in more detail here, as it is a low-level tool; see its docstring for details.
The function fupdate
functionally updates sequences and mappings. Whereas ShadowedSequence
reads directly from the original sequences at access time, fupdate
makes a shallow copy, of the same type as the given input sequence, when it finalizes its output.
Finally, the function fup
provides a high-level API to functionally update a sequence, with nice syntax.
The preferred way to use fupdate
on sequences is through the fup
utility function, which specializes fupdate
to sequences, and adds support for Python's standard slicing syntax:
from unpythonic import fup
from itertools import repeat
tup = (1, 2, 3, 4, 5)
assert fup(tup)[3] << 42 == (1, 2, 3, 42, 5)
assert fup(tup)[0::2] << tuple(repeat(10, 3)) == (10, 2, 10, 4, 10)
assert fup(tup)[0::2] << repeat(10) == (10, 2, 10, 4, 10) # infinite replacement
Currently only one update specification is supported in a single fup()
. The low-level fupdate
function supports more; see below.
An update specification is a combination of where to update, and what to put there. The where part can be a single index or a slice. When it is a single index, the what is a single item; and when a slice, the what is a sequence or an iterable, which must contain at least as many items as are required to perform the update. For details, see fupdate
below.
The fup
function is essentially curried. It takes in the sequence to be functionally updated. The object returned by the call accepts a subscript to specify the index or indices. This then returns another object that accepts a left-shift to specify the values. Once the values are provided, the underlying call to fupdate
triggers, and the result is returned.
The notation follows the unpythonic
convention that <<
denotes an assignment of some sort. Here it denotes a functional update, which returns a modified copy, leaving the original untouched.
The fupdate
function itself, which is the next lower abstraction level, works as follows:
from unpythonic import fupdate
lst = [1, 2, 3]
out = fupdate(lst, 1, 42)
assert lst == [1, 2, 3] # the original remains untouched
assert out == [1, 42, 3]
lst = [1, 2, 3]
out = fupdate(lst, -1, 42) # negative indices are also supported
assert lst == [1, 2, 3]
assert out == [1, 2, 42]
Because the update is functional - i.e. the result is a new object, without mutating the original - immutable update target sequences are allowed. For example, we can replace a slice of a tuple by a sequence:
from itertools import repeat
tup = (1, 2, 3, 4, 5)
assert fupdate(tup, slice(0, None, 2), tuple(repeat(10, 3))) == (10, 2, 10, 4, 10)
assert fupdate(tup, slice(1, None, 2), tuple(repeat(10, 2))) == (1, 10, 3, 10, 5)
assert fupdate(tup, slice(None, None, 2), tuple(repeat(10, 3))) == (10, 2, 10, 4, 10)
assert fupdate(tup, slice(None, None, -1), range(5)) == (4, 3, 2, 1, 0)
Slicing supports negative indices and steps, and default starts, stops and steps, as usual in Python. Just remember a[start:stop:step]
actually means a[slice(start, stop, step)]
(with None
replacing omitted start
, stop
and step
), and everything should follow. Multidimensional arrays are not supported.
When fupdate
constructs its output, the replacement occurs by walking the input sequence left-to-right, and pulling an item from the replacement sequence when the given replacement specification so requires. Hence the replacement sequence is not necessarily accessed left-to-right. In the last example above, the range(5)
was read in the order 4, 3, 2, 1, 0
. This is because when slice(None, None, -1)
is applied to the input sequence, the first item of the input sequence is index 4
in the slice. So when replacing the first item, fupdate
looked up index 4
in the replacement sequence. Because the replacement was just range(5)
, the value at index 4
was also 4
.
The replacement sequence must have at least as many items as the slice requires, when the slice is applied to the original input sequence. Any extra items in the replacement sequence are simply ignored, but if the replacement is too short, IndexError
is raised.
The replacement must have __len__
and __getitem__
methods if the slice (when treated as explained above) requires reading the replacement backwards, and/or if you plan to iterate over the ShadowedSequence
multiple times. If the replacement only needs to be read forwards, AND you only plan to iterate over the ShadowedSequence
just once (e.g., as part of a fup
/fupdate
operation), then it is sufficient for the replacement to implement the collections.abc.Iterator
API only (i.e. just __iter__
and __next__
).
An infinite replacement causes fupdate
(and fup
) to pull as many items as are needed:
from itertools import repeat, count
from unpythonic import fup
tup = (1, 2, 3, 4, 5)
assert fup(tup)[::] << repeat(42) == (42, 42, 42, 42, 42)
assert fup(tup)[::] << count(start=10) == (10, 11, 12, 13, 14)
The rest of the infinite replacement is considered as extra items, and is ignored.
CAUTION: If converting existing code, be careful not to accidentally tuple(...)
an infinite replacement. Python will happily fill all available RAM and essentially crash your machine trying to exhaust the infinite generator.
If you need to reverse-walk the start of an infinite replacement: use imemoize(...)
on the original iterable, instantiate the generator, and use that generator instance as the replacement:
from itertools import count
from unpythonic import fup, imemoize
tup = (1, 2, 3, 4, 5)
assert fup(tup)[::-1] << imemoize(count(start=10))() == (14, 13, 12, 11, 10)
Just like above, due to the slice [::-1]
, fup
calculates that - when walking the input sequence left-to-right - it first needs to take the item at index 4
of the replacement. The fup
succeeds, because when it retrieves this fifth item, all of the first five items are stored in the memo (which is internally a sequence). So fup
can retrieve the fifth item, then the fourth, and so on - even though from the viewpoint of the original underlying iterable, the earlier items have already been consumed when the fifth item is accessed.
ShadowedSequence
(and thus also fupdate
and fup
) internally uses __getitem__
to retrieve the actual previous items from the memo, so even the memoized generator is only iterated over once. This functionality supports any generator instance created by the gfuncs returned by imemoize
, fimemoize
, or gmemoize
.
In fupdate
, it is also possible to replace multiple individual items:
tup = (1, 2, 3, 4, 5)
out = fupdate(tup, (1, 2, 3), (17, 23, 42)) # target, (*where), (*what)
assert tup == (1, 2, 3, 4, 5)
assert out == (1, 17, 23, 42, 5)
These are treated as separate specifications, applied left to right. This means later updates shadow earlier ones, if updating at the same index:
Multiple specifications can be used with slices and sequences as well:
tup = tuple(range(10))
out = fupdate(tup, (slice(0, 10, 2), slice(1, 10, 2)),
(tuple(repeat(2, 5)), tuple(repeat(3, 5))))
assert tup == tuple(range(10))
assert out == (2, 3, 2, 3, 2, 3, 2, 3, 2, 3)
Strictly speaking, each specification can be either a slice/sequence pair or an index/item pair:
tup = tuple(range(10))
out = fupdate(tup, (slice(0, 10, 2), slice(1, 10, 2), 6),
(tuple(repeat(2, 5)), tuple(repeat(3, 5)), 42))
assert tup == tuple(range(10))
assert out == (2, 3, 2, 3, 2, 3, 42, 3, 2, 3)
Mappings can be functionally updated, too:
d1 = {'foo': 'bar', 'fruit': 'apple'}
d2 = fupdate(d1, foo='tavern')
assert sorted(d1.items()) == [('foo', 'bar'), ('fruit', 'apple')]
assert sorted(d2.items()) == [('foo', 'tavern'), ('fruit', 'apple')]
For immutable mappings, fupdate
supports frozendict
(see below). Any other mapping is assumed mutable, and fupdate
essentially just performs copy.copy()
and then .update()
.
Named tuples can be functionally updated, too:
from collections import namedtuple
A = namedtuple("A", "p q")
a = A(17, 23)
out = fupdate(a, 0, 42)
assert a == A(17, 23)
assert out == A(42, 23)
Named tuples export only a sequence interface, so they cannot be treated as mappings, even though their elements have names.
Support for namedtuple
uses an extra feature of fupdate
, which is available for custom classes, too. When constructing the output sequence, fupdate
first checks whether the type of the input sequence has a ._make()
method, and if so, hands the iterable containing the final data to that to construct the output. Otherwise the regular constructor is called (and it must accept a single iterable).
A writable view into a sequence, with slicing, so you can take a slice of a slice (of a slice ...), and it reflects the original both ways:
from unpythonic import view
lst = list(range(10))
v = view(lst)[::2]
assert v == [0, 2, 4, 6, 8]
v2 = v[1:-1]
assert v2 == [2, 4, 6]
v2[1:] = (10, 20)
assert lst == [0, 1, 2, 3, 10, 5, 20, 7, 8, 9]
lst[2] = 42
assert v == [0, 42, 10, 20, 8]
assert v2 == [42, 10, 20]
lst = list(range(5))
v = view(lst)[2:4]
v[:] = 42 # scalar broadcast
assert lst == [0, 1, 42, 42, 4]
While fupdate
lets you be more functional than Python otherwise allows, view
lets you be more imperative than Python otherwise allows.
We store slice specs, not actual indices, so this works also if the underlying sequence undergoes length changes.
Slicing a view returns a new view. Slicing anything else will usually shallow-copy, because the object being sliced does, before we get control. To slice lazily, first view the sequence itself and then slice that. The initial no-op view is optimized away, so it won't slow down accesses. Alternatively, pass a slice
object into the view
constructor.
The view can be efficiently iterated over. As usual, iteration assumes that no inserts/deletes in the underlying sequence occur during the iteration.
Getting/setting an item (subscripting) checks whether the index cache needs updating during each access, so it can be a bit slow. Setting a slice checks just once, and then updates the underlying iterable directly. Setting a slice to a scalar value broadcasts the scalar à la NumPy.
Beside view
itself, the unpythonic.collections
module provides also some other related abstractions.
There is the read-only sister of view, roview
, which is like view
, except it has no __setitem__
or reverse
. This can be useful for providing explicit read-only access to a sequence, when it is undesirable to have clients write into it.
The constructor of the writable view
checks that the input is not read-only (roview
, or a Sequence
that is not also a MutableSequence
) before allowing creation of the writable view.
Finally, there are the SequenceView
and MutableSequenceView
abstract base classes. The concrete view
and roview
are instances of them.
NOTE: A writable view supports also the read-only API, so isinstance(MutableSequenceView, SequenceView) is True
; as well as isinstance(view, roview) is True
. Keep in mind the Liskov substitution principle.
Changed in v0.14.3. mogrify
now skips nil
, actually making it useful for processing ll
linked lists.
Recurse on a given container, apply a function to each atom. If the container is mutable, then update in-place; if not, then construct a new copy like map
does.
If the container is a mapping, the function is applied to the values; keys are left untouched.
Unlike map
and its cousins, mogrify
only supports a single input container. Supporting multiple containers as input would require enforcing some compatibility constraints on their type and shape, because mogrify
is not limited to sequences.
from unpythonic import mogrify
lst1 = [1, 2, 3]
lst2 = mogrify(lst1, lambda x: x**2)
assert lst2 == [2, 4, 6]
assert lst2 is lst1
Containers are detected by checking for instances of collections.abc
superclasses (also virtuals are ok). Supported abcs are MutableMapping
, MutableSequence
, MutableSet
, Mapping
, Sequence
and Set
. Any value that does not match any of these is treated as an atom. Containers can be nested, with an arbitrary combination of the types supported.
For convenience, we support some special cases:
-
Any classes created by
collections.namedtuple
; they do not conform to the standard constructor API for aSequence
.Thus, to support also named tuples: for any immutable
Sequence
, we first check for the presence of a._make()
method, and if found, use it as the constructor. Otherwise we use the regular constructor. -
str
is treated as an atom, although technically aSequence
.It does not conform to the exact same API (its constructor does not take an iterable), and often one does not want to treat strings as containers anyway.
If you want to process strings, implement it in your function that is called by
mogrify
. You can e.g.tuple(thestring)
and then callmogrify
on that. -
The
box
,ThreadLocalBox
andSome
containers from the moduleunpythonic.collections
. Although the first two are mutable, their update is not conveniently expressible by thecollections.abc
APIs. -
The
cons
container from the moduleunpythonic.llist
, including linked lists created usingll
orllist
. This is treated with the general tree strategy, so nested linked lists will be flattened, and the finalnil
is also processed.Note that since
cons
is immutable, anyway, if you know you have a long linked list where you need to update the values, just iterate over it and produce a new copy - that will work as intended.
Changed in v0.15.0. The deprecated names have been removed.
Changed in v0.14.3. To improve descriptiveness, and for consistency with names of other abstractions in unpythonic
, m
has been renamed imathify
and mg
has been renamed gmathify
. This is a one-time change; it is not likely that these names will be changed ever again. The old names are now deprecated.
Changed in v0.14.3. Added convenience mode to generate cyclic infinite sequences.
We provide a compact syntax to create lazy constant, cyclic, arithmetic, geometric and power sequences: s(...)
. Numeric (int
, float
, mpmath
) and symbolic (SymPy) formats are supported. We avoid accumulating roundoff error when used with floating-point formats.
We also provide arithmetic operation support for iterables (termwise). To make any iterable infix math aware, use imathify(iterable)
. The arithmetic is lazy; it just plans computations, returning a new lazy mathematical sequence. To extract values, iterate over the result. (Note this implies that expressions consisting of thousands of operations will overflow Python's call stack. In practice this shouldn't be a problem.)
The function versions of the arithmetic operations (also provided, à la the operator
module) have an s prefix (short for mathematical sequence), because in Python the i prefix (which could stand for iterable) is already used to denote the in-place operators.
We provide the Cauchy product, and its generalization, the diagonal combination-reduction, for two (possibly infinite) iterables. Note cauchyprod
does not sum the series; given the input sequences a
and b
, the call cauchyprod(a, b)
computes the elements of the output sequence c
.
We also provide gmathify
, a decorator to mathify a gfunc, so that it will imathify()
the generator instances it makes. Combo with imemoize
for great justice, e.g. a = gmathify(imemoize(myiterable))
, and then a()
to instantiate a memoized-and-mathified copy.
Finally, we provide ready-made generators that yield some common sequences (currently, the Fibonacci numbers, the triangular numbers, and the prime numbers). The prime generator is an FP-ized sieve of Eratosthenes.
from unpythonic import s, imathify, cauchyprod, take, last, fibonacci, triangular, primes
assert tuple(take(10, s(1, ...))) == (1,)*10
assert tuple(take(10, s(1, 2, ...))) == tuple(range(1, 11))
assert tuple(take(10, s(1, 2, 4, ...))) == (1, 2, 4, 8, 16, 32, 64, 128, 256, 512)
assert tuple(take(5, s(2, 4, 16, ...))) == (2, 4, 16, 256, 65536) # 2, 2**2, (2**2)**2, ...
assert tuple(take(10, s([8], ...))) == (8,) * 10
assert tuple(take(10, s(1, [8], ...))) == (1,) + (8,) * 9
assert tuple(take(10, s([1, 2], ...))) == (1, 2) * 5
assert tuple(take(10, s(1, 2, [3, 4], ...))) == (1, 2) + (3, 4) * 4
assert tuple(s(1, 2, ..., 10)) == tuple(range(1, 11))
assert tuple(s(1, 2, 4, ..., 512)) == (1, 2, 4, 8, 16, 32, 64, 128, 256, 512)
assert tuple(take(5, s(1, -1, 1, ...))) == (1, -1, 1, -1, 1)
assert tuple(take(5, s(1, 3, 5, ...) + s(2, 4, 6, ...))) == (3, 7, 11, 15, 19)
assert tuple(take(5, s(1, 3, ...) * s(2, 4, ...))) == (2, 12, 30, 56, 90)
assert tuple(take(5, s(1, 3, ...)**s(2, 4, ...))) == (1, 3**4, 5**6, 7**8, 9**10)
assert tuple(take(5, s(1, 3, ...)**2)) == (1, 3**2, 5**2, 7**2, 9**2)
assert tuple(take(5, 2**s(1, 3, ...))) == (2**1, 2**3, 2**5, 2**7, 2**9)
assert tuple(take(3, cauchyprod(s(1, 3, 5, ...), s(2, 4, 6, ...)))) == (2, 10, 28)
assert tuple(take(10, primes())) == (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)
assert tuple(take(10, fibonacci())) == (1, 1, 2, 3, 5, 8, 13, 21, 34, 55)
assert tuple(take(10, triangular())) == (1, 3, 6, 10, 15, 21, 28, 36, 45, 55)
A math iterable (i.e. one that has infix math support) is an instance of the class imathify
:
a = s(1, 3, ...)
b = s(2, 4, ...)
c = a + b
assert isinstance(a, imathify)
assert isinstance(b, imathify)
assert isinstance(c, imathify)
assert tuple(take(5, c)) == (3, 7, 11, 15, 19)
d = 1 / (a + b)
assert isinstance(d, imathify)
Applying an operation meant for regular (non-math) iterables will drop the arithmetic support, but it can be restored by imathify
-ing manually:
e = take(5, c)
assert not isinstance(e, imathify)
f = imathify(take(5, c))
assert isinstance(f, imathify)
Symbolic expression support with SymPy:
from unpythonic import s
from sympy import symbols
x0 = symbols("x0", real=True)
k = symbols("k", positive=True)
assert tuple(take(4, s(x0, ...))) == (x0, x0, x0, x0)
assert tuple(take(4, s(x0, x0 + k, ...))) == (x0, x0 + k, x0 + 2*k, x0 + 3*k)
assert tuple(take(4, s(x0, x0*k, x0*k**2, ...))) == (x0, x0*k, x0*k**2, x0*k**3)
assert tuple(s(x0, x0 + k, ..., x0 + 3*k)) == (x0, x0 + k, x0 + 2*k, x0 + 3*k)
assert tuple(s(x0, x0*k, x0*k**2, ..., x0*k**5)) == (x0, x0*k, x0*k**2, x0*k**3, x0*k**4, x0*k**5)
x0, k = symbols("x0, k", positive=True)
assert tuple(s(x0, x0**k, x0**(k**2), ..., x0**(k**4))) == (x0, x0**k, x0**(k**2), x0**(k**3), x0**(k**4))
x = symbols("x", real=True)
px = lambda stream: stream * s(1, x, x**2, ...) # powers of x
s1 = px(s(1, 3, 5, ...)) # 1, 3*x, 5*x**2, ...
s2 = px(s(2, 4, 6, ...)) # 2, 4*x, 6*x**2, ...
assert tuple(take(3, cauchyprod(s1, s2))) == (2, 10*x, 28*x**2)
CAUTION: Symbolic sequence detection is sensitive to the assumptions on the symbols, because very pythonically, SymPy
only simplifies when the result is guaranteed to hold in the most general case under the given assumptions.
Inspired by Haskell.
Added in v0.14.2.
We provide lispy symbols, an uninterned symbol generator, and a pythonic singleton abstraction. These are all pickle-aware and thread-safe.
In plain English, a symbol is a lightweight, human-readable, process-wide unique marker, that can be quickly compared to another such marker by comparing object identity. For example:
from unpythonic import sym
cat = sym("cat")
assert cat is sym("cat")
assert cat is not sym("dog")
The constructor sym
produces an interned symbol. Whenever, in the same process, the same name is passed to the sym
constructor, it gives the same object instance. Even unpickling a symbol that has the same name produces the same sym
object instance as any other sym
with that name.
Thus a sym
behaves like a Lisp symbol. Technically speaking, it's like a zen-minimalistic Scheme/Racket symbol, since Common Lisp stuffs all sorts of additional cruft in symbols. If you insist on emulating that, note a sym
is just a Python object you could customize in the usual ways, even though its instantiation logic plays by somewhat unusual rules.
The function gensym
, which is an abbreviation for generate symbol, creates an uninterned symbol, also known as a gensym. The label given in the call to gensym
is a short human-readable description, like the name of a named symbol, but it has no relation to object identity. Object identity is tracked by an UUID, which is automatically assigned when gensym
creates the value. Even if gensym
is called with the same label, the return value is a new unique symbol each time.
A gensym never conflicts with any named symbol; not even if one takes the UUID from a gensym and creates a named symbol using that as the name.
The return value of gensym
is the only time you will see that particular uninterned symbol object; take good care of it!
For example:
from unpythonic import gensym
tabby = gensym("cat")
scottishfold = gensym("cat")
assert tabby is not scottishfold
print(tabby) # gensym:cat:81a44c53-fe2d-4e65-b2de-329076cc7755
print(scottishfold) # gensym:cat:94287f75-02b5-4138-9174-1e422e618d59
Uninterned symbols are useful as guaranteed-unique sentinel or nonce (sense 2, adapted to programming) values, like the pythonic idiom nonce = object()
, but they come with a human-readable label.
They also have a superpower: with the help of the UUID automatically assigned by gensym
, they survive a pickle roundtrip with object identity intact. Unpickling the same gensym value multiple times in the same process will produce just one object instance. If the original return value from gensym is still alive, it is that same object instance.
The UUID is generated with the pseudo-random algorithm uuid.uuid4
. Due to rollover of the time field, it is possible for collisions with current UUIDs (as of the early 21st century) to occur with those generated after (approximately) the year 3400. See RFC 4122.
Our sym
is like a Lisp/Scheme/Racket symbol, which is essentially an interned string, which in the Lisp family is a data type distinct from a regular string. In our implementation, we do not use sys.intern
; the interning mechanism of our symbol types is completely separate and independent of Python's string interning mechanism.
Our gensym
is like the Lisp gensym
, and the JavaScript Symbol
.
If you're familiar with mcpyrate
's gensym
or MacroPy's gen_sym
, those mean something different. Their purpose is to create, in a macro, a lexical identifier that is not already in use in the source code being compiled, whereas our gensym
creates an uninterned symbol object for run-time use. Lisp macros use symbols to represent identifiers, hence the potential for confusion in Python, where that is not the case. The symbols of unpythonic
are a purely run-time abstraction.
If your background is in C++ or Java, you may notice the symbol abstraction is a kind of a parametric singleton; each symbol with the same name is a singleton, as is any gensym with the same UUID.
A singleton is an object of which only one instance can exist at any given time (in the same process). We provide a base class, Singleton
, which can also be used as a mixin. A basic example:
import pickle
from unpythonic import Singleton
class SingleXHolder(Singleton):
def __init__(self, x=42):
self.x = x
h = SingleXHolder(17)
s = pickle.dumps(h)
h2 = pickle.loads(s)
assert h2 is h # the same instance!
Often the singleton pattern is discussed in the context of classic relatively low-level, static languages such as C++ or Java. In Python, some of the classical issues, such as singletons being forced to use a clunky, nonstandard object construction syntax, are moot, because the language itself offers customization hooks that can be used to smooth away such irregularities.
Thus, rather than blindly copying the pattern from C++ or Java, the questions to ask first are, what is a singleton? What is the service it provides? What responsibilities should it have? Does it even make sense to have a singleton abstraction for Python?
As the result of answering these questions, unpythonic
's idea of a singleton slightly differs from the textbook pattern. In C++ or Java, one uses an instance accessor method (which is a static method) to retrieve the singleton instance, instead of calling a constructor normally - which is actually made private, so nothing except the accessor method can call it. Calling the accessor method either instantiates the singleton (if not created yet), or silently returns the existing instance (if already created).
However, Python can easily retrieve a singleton instance with syntax that looks like regular object construction, by customizing __new__
. Hence no static accessor method is needed. This in turn raises the question, what should we do with constructor arguments, as we surely would like to (in general) to allow those, and they can obviously differ between call sites. Since there is only one object instance to load state into, we could either silently update the state, or silently ignore the new proposed arguments. Good luck tracking down bugs either way. But upon closer inspection, that question depends on an unfounded assumption. What we should be asking instead is, what should happen if the constructor of a singleton is called again, while an instance already exists?
We believe in the principles of separation of concerns and fail-fast. The textbook singleton pattern conflates two concerns, possibly due to language limitations: the management of object instances, and the enforcement of the at-most-one-instance-only guarantee. If we wish to uncouple these responsibilities, then the obvious pythonic answer is that attempting to construct the singleton again while it already exists should be considered a run-time error. Since a singleton type does not support that operation, this situation should raise a TypeError
. This makes the error explicit as early as possible, thus adhering to the fail-fast principle, hence making it difficult for bugs to hide. Constructor arguments will either take effect, or the constructor call will explicitly fail.
Another question arises due to Python having builtin support for object persistence, namely pickle
. What should happen when a singleton is unpickled, while an instance of that singleton already exists? Arguably, by default, it should load the state from the pickle file into the existing instance, overwriting its current state.
This design is based on considering the following scenario. Consider a program that uses the singleton abstraction. During its second and later runs, the program first initializes, which causes the singleton instance to be created, just like during the first run of the program. Then the program loads state from a pickle file, containing (among other data) the state the singleton instance was in when the program previously shut down. Considering the singleton, the data in the file is more relevant than the defaults the program initialization step feeds in. Hence, the default should be to replace the state of the existing singleton instance with the data from the pickle file.
Our Singleton
abstraction is the result of these pythonifications applied to the classic pattern. For more documentation and examples, see the unit tests in unpythonic/tests/test_singleton.py
.
NOTE: A related pattern is the Borg, a.k.a. Monostate. After considering the matter, it was felt that in the context of Python, it offers no advantages over the singleton abstraction, while eliminating a useful feature: the singleton abstraction allows using the object identity check (is
) to check if a name refers to the singleton instance. For this reason, unpythonic
provides Singleton
, but no Borg
. If you feel this is unjust, please let me know - this decision can be revisited, if a situation in which a Borg
is more appropriate than a Singleton
comes up.
CAUTION: Singleton
introduces a custom metaclass to guard constructor calls. Hence it cannot be trivially combined with a class that uses another custom metaclass for some other purpose.
Most often, don't. Singleton
is provided for the very rare occasion where it's the appropriate abstraction. There exist at least three categories of use cases where singleton-like instantiation semantics are desirable:
- A process-wide unique marker value, which has no functionality other than being quickly and uniquely identifiable as that marker.
sym
andgensym
are the specific tools that cover this use case, depending on whether the intent is to allow that value to be independently "constructed" in several places yet always obtaining the same instance (sym
), or if the implementation just happens to internally need a guaranteed-unique value that no value passed in from the outside could possibly clash with (gensym
). For the latter case, sometimes the simple (and much faster) pythonic idiomnonce = object()
will do just as well, if you don't need a human-readable label, andpickle
support.- If you need the singleton object to have extra functionality (e.g. our
nil
supports the iterator protocol), it's possible to subclasssym
orgsym
, but subclassingSingleton
is also a possible solution.
- An empty immutable collection.
- An immutable collection instance cannot have elements added to it after construction, so there is no point in creating more than one instance of an empty immutable collection of any particular type.
- Unfortunately, a class cannot easily be partly
Singleton
(i.e., only when the instance is empty). So this use case is better coded manually, likefrozendict
does. Also, for this use case silently returning the existing instance is the right thing to do.
- A service that may have at most one instance per process.
- But only if it is certain that there can't arise a situation where multiple simultaneous instances of the service are needed.
- The dynamic assignment controller
dyn
is an example, and it is indeed aSingleton
.
Cases 1 and 2 have no meaningful instance data. Case 3 may or may not, depending on the specifics. If your object does, and if you want it to support pickle
, you may want to customize __getnewargs__
(called at pickling time), __setstate__
, and sometimes maybe also __getstate__
. Note that unpickling skips __init__
, and calls just __new__
(with the "newargs") and then __setstate__
.
I am not completely sure if it is meaningful to provide a generic Singleton
abstraction for Python, except for teaching purposes. Practical use cases may differ so much, and some of the implementation details of the specific singleton object (especially related to pickling) may depend so closely on the implementation details of the singleton abstraction, that it may be easier to just roll your own singleton code when needed. If you are new to customizing this part of Python, the code we have here should at least demonstrate how to do that.
Tools related to control flow.
See also the with tco
macro, which applies tail call optimization automatically.
Tail call optimization is a technique to treat tail calls in such a way that they do not grow the call stack. It sometimes allows expressing algorithms very elegantly. Some functional programming patterns such as functional loops are based on tail calls.
The factorial function is a classic example of tail recursion:
from unpythonic import trampolined, jump
@trampolined
def fact(n, acc=1):
if n == 0:
return acc
return jump(fact, n - 1, n * acc)
print(fact(4)) # 24
fact(5000) # no crash
Functions that use TCO must be @trampolined
. The decorator wraps the original function with a trampoline. Calling a trampolined function normally starts the trampoline.
Inside a trampolined function, a normal call f(a, ..., kw=v, ...)
remains a normal call.
A tail call with target f
is denoted return jump(f, a, ..., kw=v, ...)
. This explicitly marks that it is indeed a tail call, due to the explicit return
. Note that jump
is a noun, not a verb. The jump(f, ...)
part just evaluates to a jump
instance, which on its own does nothing. Returning the jump
instance to the trampoline actually performs the tail call.
If the jump target has a trampoline, the trampoline implementation will automatically strip it and jump into the actual entry point.
To return a final result, just return
it normally. Returning anything but a jump
shuts down the trampoline, and returns the given value from the initial call (to the @trampolined
function) that originally started that trampoline.
CAUTION: Trying to jump(...)
without the return
does nothing useful, and will usually print an unclaimed jump warning. It does this by checking a flag in the __del__
method of jump
; any correctly used jump instance should have been claimed by a trampoline before it gets garbage-collected. It can only print a warning, not raise an exception or halt the program, due to the limitations of __del__
.
Some unclaimed jump warnings may appear also if the process is terminated by Ctrl+C (KeyboardInterrupt
). This is normal; it just means that the termination occurred after a jump object was instantiated but before it was claimed by a trampoline.
For comparison, with the macro API, the example becomes:
from unpythonic.syntax import macros, tco
with tco:
def fact(n, acc=1):
if n == 0:
return acc
return fact(n - 1, n * acc)
print(fact(4)) # 24
fact(5000) # no crash
The with tco
macro implicitly inserts the @trampolined
decorator, and converts any regular call that appears in tail position into a jump
. It also transforms lambdas in a similar way.
To make a tail-recursive anonymous function, use trampolined
together with withself
. The self
argument is declared explicitly, but passed implicitly, just like the self
argument of a method:
from unpythonic import trampolined, jump, withself
t = trampolined(withself(lambda self, n, acc=1:
acc if n == 0 else jump(self, n - 1, n * acc)))
print(t(4)) # 24
Here the jump is just jump
instead of return jump
, because lambda
does not use the return
syntax.
For comparison, with the macro API, this becomes:
from unpythonic.syntax import macros, tco
from unpythonic import withself
with tco:
t = withself(lambda self, n, acc=1:
acc if n == 0 else self(n - 1, n * acc))
print(t(4)) # 24
Mutual recursion is also supported. Just ask the trampoline to jump
into the desired function:
from unpythonic import trampolines,jump
@trampolined
def even(n):
if n == 0:
return True
return jump(odd, n - 1)
@trampolined
def odd(n):
if n == 0:
return False
return jump(even, n - 1)
assert even(42) is True
assert odd(4) is False
assert even(10000) is True # no crash
from unpythonic import letrec, trampolined, jump
letrec(evenp=lambda e:
trampolined(lambda x:
(x == 0) or jump(e.oddp, x - 1)),
oddp=lambda e:
trampolined(lambda x:
(x != 0) and jump(e.evenp, x - 1)),
body=lambda e:
e.evenp(10000))
For comparison, with the macro API of letrec
, this becomes:
from unpythonic.syntax import macros, letrec
from unpythonic import trampolined, jump
letrec[[evenp << trampolined(lambda x:
(x == 0) or jump(oddp, x - 1)),
oddp << trampolined(lambda x:
(x != 0) and jump(evenp, x - 1))] in
evenp(10000)]
TCO from another viewpoint:
@trampolined
def foo():
return jump(bar)
@trampolined
def bar():
return jump(baz)
@trampolined
def baz():
print("How's the spaghetti today?")
foo()
Each function in the TCO call chain tells the trampoline where to go next (and with what arguments). All hail lambda, the ultimate GOTO!
Each TCO call chain brings its own trampoline, so they nest as expected:
@trampolined
def foo():
return jump(bar)
@trampolined
def bar():
t = even(42) # start another trampoline for even/odd
return jump(baz, t)
@trampolined
def baz(result):
print(result)
foo() # start trampoline
Clojure has (trampoline ...)
, which works almost exactly like our trampolined
.
The return jump(...)
solution is essentially the same there (the syntax is #(...)
), but in Clojure, the trampoline must be explicitly enabled at the call site, instead of baking it into the function definition, as our decorator does.
Clojure's trampoline system is thus more explicit and simple than ours (the trampoline does not need to detect and strip the tail-call target's trampoline, if it has one - because with Clojure's solution, it never does), at some cost to convenience at each use site. We have chosen to emphasize use-site convenience.
In functional programming, looping can be represented as recursion. The loop body is written as a recursive function. To loop, the function tail-calls itself, possibly with new argument values. Both for
and while
loops can be expressed in this way.
As a practical detail, tail-call optimization is important, to avoid growing the call stack at each iteration of the loop.
Here is a functional loop using unpythonic
, with automatic tail call optimization - no macros needed:
from unpythonic import looped
@looped
def s(loop, acc=0, i=0):
if i == 10:
return acc
return loop(acc + i, i + 1)
print(s) # 45
Compare the sweet-exp Racket:
define s
let loop ([acc 0] [i 0])
cond
{i = 10}
acc
else
loop {acc + i} {i + 1}
displayln s ; 45
In @looped
, the function name of the loop body is the name of the final result, like in @call
. To terminate the loop, just return
the final result normally. This shuts down the loop and replaces the loop body definition (in the example, s
) with the final result value.
The first parameter of the loop body is the magic parameter loop
. It is self-ish, representing a jump back to the loop body itself, starting a new iteration. Just like Python's self
, loop
can have any name; it is passed positionally.
Note that loop
is a noun, not a verb. This is because the expression loop(...)
is essentially the same as jump(...)
to the loop body itself. However, it also arranges things so that the trampolined call inserts the magic parameter loop
, which can only be set up via this mechanism.
Additional arguments can be given to loop(...)
. When the loop body is called, any additional positional arguments are appended to the implicit ones, and can be anything. Additional arguments can also be passed by name. The initial values of any additional arguments must be declared as defaults in the formal parameter list of the loop body. The loop is automatically started by @looped
, by calling the body with the magic loop
as the only argument.
Any loop variables such as i
in the above example are in scope only in the loop body; there is no i
in the surrounding scope. Moreover, it is a fresh i
at each iteration; nothing is mutated by the looping mechanism.
Be careful if you use a mutable object instance as a loop variable: the loop body is just a function call like any other, so the usual rules apply.
For another example of functional looping, here is a typical while True
loop in FP style:
from unpythonic import looped
@looped
def _(loop):
print("Enter your name (or 'q' to quit): ", end='')
s = input()
if s.lower() == 'q':
return # ...the implicit None. In a "while True:", "break" here.
else:
print(f"Hello, {s}!")
return loop()
Functional loops do not have to be pure. Here is a functional loop with a side effect:
from unpythonic import looped
out = []
@looped
def _(loop, i=0):
if i <= 3:
out.append(i) # cheeky side effect
return loop(i + 1)
# the implicit "return None" terminates the loop.
assert out == [0, 1, 2, 3]
CAUTION: This pure-Python FP looping mechanism is slow, so it may make sense to use it only when "the FP-ness" (no mutation, scoping) is important.
The @looped
decorator is essentially sugar. If you read the section further above on TCO, you may have guessed how it is implemented: the loop
function is actually a jump record in disguise, and @looped
installs a trampoline.
Indeed, the following code is behaviorally equivalent to the first example:
from unpythonic import trampolined, jump
@trampolined
def s(acc=0, i=0):
if i == 10:
return acc
return jump(s, acc + i, i + 1)
s = s()
print(s) # 45
However, the actual implementation of @looped
slightly differs from what would be implied by this straightforward translation, because the feature uses no macros.
In Python, loops often run directly over the elements of an iterable, which markedly improves readability compared to dealing with indices.
For this use case, we provide @looped_over
:
from unpythonic import looped_over
@looped_over(range(10), acc=0)
def s(loop, x, acc):
return loop(acc + x)
assert s == 45
The @looped_over
decorator is essentially sugar. Behaviorally equivalent code:
from unpythonic import call, looped
@call
def s(iterable=range(10)):
it = iter(iterable)
@looped
def tmp(loop, acc=0):
try:
x = next(it)
return loop(acc + x) # <-- the loop body
except StopIteration:
return acc
return tmp
assert s == 45
In @looped_over
, the loop body takes three magic positional parameters. The first parameter loop
is similar to that in @looped
. The second parameter x
is the current element. The third parameter acc
is initialized to the acc
value given to @looped_over
, and then (functionally) updated at each iteration.
The new value of acc
is the first positional argument given to loop(...)
, if any positional arguments were given. Otherwise acc
retains its last value.
If acc
is a mutable object, mutating it is allowed. For example, if acc
is a list, it is perfectly fine to acc.append(...)
and then just loop()
with no arguments, allowing acc
to retain its last value. To be exact, keeping the last value means the binding of the name acc
does not change, so when the next iteration starts, the name acc
still points to the same object that was mutated. This strategy can be used to pythonically construct a list in an FP loop.
Additional arguments can be given to loop(...)
. The same notes as above apply. For example, here we have the additional parameters fruit
and number
. The first one is passed positionally, and the second one by name:
from unpythonic import looped_over
@looped_over(range(10), acc=0)
def s(loop, x, acc, fruit="pear", number=23):
print(fruit, number)
newfruit = "orange" if fruit == "apple" else "apple"
newnumber = number + 1
return loop(acc + x, newfruit, number=newnumber)
assert s == 45
The loop body is called once for each element in the iterable. When the iterable runs out of elements, the final value of acc
becomes the return value of the loop. If the iterable is empty, the body never runs; then the return value of the loop is the initial value of acc
.
To terminate the loop early, just return
your final result normally, like in @looped
. It can be anything, it does not need to be acc
.
Multiple input iterables work somewhat like in Python's for
, except any sequence unpacking must be performed inside the body:
from unpythonic import looped_over
@looped_over(zip((1, 2, 3), ('a', 'b', 'c')), acc=())
def p(loop, item, acc):
numb, lett = item
return loop(acc + (f"{numb:d}{lett}",))
assert p == ('1a', '2b', '3c')
@looped_over(enumerate(zip((1, 2, 3), ('a', 'b', 'c'))), acc=())
def q(loop, item, acc):
idx, (numb, lett) = item
return loop(acc + (f"Item {idx:d}: {numb:d}{lett}",))
assert q == ('Item 0: 1a', 'Item 1: 2b', 'Item 2: 3c')
This is because while tuple parameter unpacking was supported in Python 2.x, it was removed starting from Python 3.0, in PEP 3113. The original implementation of the feature caused certain technical issues (detailed in the PEP), and it was not widely used. It is somewhat curious that re-engineering the implementation to overcome those issues was not even suggested in the PEP. For what it's worth, JavaScript (technically, ECMAScript) does support this feature, so the rationale for removal seems specific to the Python community.
FP loops can be nested (also those over iterables):
from unpythonic import looped_over
@looped_over(range(1, 4), acc=())
def outer_result(outer_loop, y, outer_acc):
@looped_over(range(1, 3), acc=())
def inner_result(inner_loop, x, inner_acc):
return inner_loop(inner_acc + (y*x,))
return outer_loop(outer_acc + (inner_result,))
assert outer_result == ((1, 2), (2, 4), (3, 6))
If you feel the trailing commas ruin the aesthetics, see unpythonic.pack
.
As the reference warns (note 6), repeated concatenation of tuples has an O(n²) runtime cost, because each concatenation creates a new tuple, which needs to copy all of the already existing elements. To keep the runtime O(n), there are two options:
- Pythonic solution: Destructively modify a mutable sequence. Particularly,
list
is a dynamic array that has a low amortized cost for concatenation (most often O(1), with the occasional O(n) when the allocated storage grows). - Unpythonic solution:
cons
a linked list, and reverse it at the end. Cons cells are immutable; consing a new element to the front costs O(1). Reversing the list costs O(n).
Mutable sequence (Python list
):
from unpythonic import looped_over
@looped_over(zip((1, 2, 3), ('a', 'b', 'c')), acc=[])
def p(loop, item, acc):
numb, lett = item
newelt = f"{numb:d}{lett}"
acc.append(newelt)
return loop()
assert p == ['1a', '2b', '3c']
Linked list:
from unpythonic import looped_over, cons, nil, ll, lreverse
@lreverse
@looped_over(zip((1, 2, 3), ('a', 'b', 'c')), acc=nil)
def p(loop, item, acc):
numb, lett = item
newelt = f"{numb:d}{lett}"
return loop(cons(newelt, acc))
assert p == ll('1a', '2b', '3c')
Note the unpythonic use of the lreverse
function as a decorator. @looped_over
overwrites the def'd name by the return value of the loop; then lreverse
takes that as input, and overwrites once more. Thus p
becomes the final list.
To get the output as a tuple, we can add tuple
to the decorator chain:
from unpythonic import looped_over, cons, nil, ll, lreverse
@tuple
@lreverse
@looped_over(zip((1, 2, 3), ('a', 'b', 'c')), acc=nil)
def p(loop, item, acc):
numb, lett = item
newelt = f"{numb:d}{lett}"
return loop(cons(newelt, acc))
assert p == ('1a', '2b', '3c')
This works in both solutions. The cost is an additional O(n) step.
The main way to exit an FP loop (also early) is, at any time, to just return
the final result normally.
If you want to exit the function containing the loop from inside the loop, see escape continuations below.
The main way to continue an FP loop is, at any time, to loop(...)
with the appropriate arguments that will make the loop proceed to the next iteration. Or package the appropriate loop(...)
expression into your own function cont
, and then use cont(...)
:
from unpythonic import looped
@looped
def s(loop, acc=0, i=0):
cont = lambda newacc=acc: loop(newacc, i + 1) # always increase i; by default keep current value of acc
if i <= 4:
return cont() # essentially "continue" for this FP loop
elif i == 10:
return acc
else:
return cont(acc + i)
print(s) # 35
This approach separates the computations of the new values for the iteration counter and the accumulator.
See @breakably_looped
(offering brk
) and @breakably_looped_over
(offering brk
and cnt
).
The point of brk(value)
over just return value
is that brk
is first-class, so it can be passed on to functions called by the loop body - so that those functions then have the power to directly terminate the loop.
In @looped
, a library-provided cnt
would not make sense, since all parameters except loop
are user-defined. The client code itself defines what it means to proceed to the "next" iteration. Really the only way in a construct with this degree of flexibility is for the client code to fill in all the arguments itself.
Because @looped_over
is a more specific abstraction, there the concept of continue is much more clear-cut. We define cnt
to mean proceed to take the next element from the iterable, keeping the current value of acc
. Essentially cnt
is a partially applied loop(...)
with the first positional argument set to the current value of acc
.
Just call the looped()
decorator manually:
from unpythonic import looped
s = looped(lambda loop, acc=0, i=0:
loop(acc + i, i + 1) if i < 10 else acc)
print(s)
It's not just a decorator; in the Scheme family of Lisps, a construct like this would likely be named call/looped
.
We can also use let
to make local definitions:
from unpythonic import looped, let
s = looped(lambda loop, acc=0, i=0:
let(cont=lambda newacc=acc:
loop(newacc, i + 1),
body=lambda e:
e.cont(acc + i) if i < 10 else acc)
print(s)
The looped_over()
decorator also works, if we just keep in mind that parameterized decorators in Python are actually decorator factories:
from unpythonic import looped_over
r10 = looped_over(range(10), acc=0)
s = r10(lambda loop, x, acc:
loop(acc + x))
assert s == 45
If you really need to make that into an expression, bind r10
using let
(if you use letrec
, keeping in mind it is a callable), or to make your code unreadable, just inline it.
With curry
, using its passthrough feature, this is also a possible solution:
from unpythonic import curry, looped_over
s = curry(looped_over, range(10), 0,
lambda loop, x, acc:
loop(acc + x))
assert s == 45
As of v0.15.0, curry
handles also named arguments, so we can make explicit what the 0
means:
from unpythonic import curry, looped_over
s = curry(looped_over, range(10), acc=0,
body=(lambda loop, x, acc:
loop(acc + x)))
assert s == 45
but because, due to syntactic limitations of Python, no positional arguments can be given after a named argument, you then have to know - in order to be able to provide the loop body - that the decorator returned by the factory looped_over
calls it body
.
You can of course obtain such information by inspection (here shown in IPython running Python 3.8):
In [2]: looped_over(range(10), acc=0)
Out[2]: <function unpythonic.fploop.looped_over.<locals>.run(body)>
or by looking at the source code.
In unpythonic
, a generator can tail-chain into another generator. This is like invoking itertools.chain
, but as a tail call from inside the generator - so that the generator itself can choose the next iterable in the chain. If the next iterable is a generator, it can again tail-chain into something else. If it is not a generator, it becomes the last iterable in the TCO chain.
Python provides a convenient hook to build things like this, in the guise of return
:
from unpythonic import gtco, take, last
def march():
yield 1
yield 2
return march() # tail-chain to a new instance of itself
assert tuple(take(6, gtco(march()))) == (1, 2, 1, 2, 1, 2)
last(take(10000, gtco(march()))) # no crash
Note the calls to gtco
at the use sites. For convenience, we provide @gtrampolined
, which automates that:
from unpythonic import gtrampolined, take, last
@gtrampolined
def ones():
yield 1
return ones()
assert tuple(take(10, ones())) == (1,) * 10
last(take(10000, ones())) # no crash
It is safe to tail-chain into a @gtrampolined
generator; the system strips the TCO target's trampoline if it has one.
Like all tail calls, this works for any iterative process. In contrast, this does not work:
from operator import add
from unpythonic import gtrampolined, scanl, take
@gtrampolined
def fibos(): # see numerics.py
yield 1
return scanl(add, 1, fibos())
print(tuple(take(10, fibos()))) # --> (1, 1, 2), only 3 terms?!
This sequence (technically iterable, but in the mathematical sense) is recursively defined, and the return
shuts down the generator before it can yield more terms into scanl
. With yield from
instead of return
the second example works - but since it is recursive, it eventually blows the call stack.
This particular example can be converted into a linear process with a different higher-order function, no TCO needed:
from unpythonic import unfold, take, last, Values
def fibos():
def nextfibo(a, b):
return Values(a, a=b, b=a + b)
return unfold(nextfibo, 1, 1)
assert tuple(take(10, fibos())) == (1, 1, 2, 3, 5, 8, 13, 21, 34, 55)
last(take(10000, fibos())) # no crash
Changed in v0.15.0. The deprecated names have been removed.
Changed in v0.14.2. These constructs were previously named setescape
, escape
. The names have been changed to match the standard naming for this feature in several Lisps. The old names are now deprecated.
In a nutshell, an escape continuation, often abbreviated ec, transfers control outward on the call stack. Escape continuations are a generalization of continue
, break
and return
. Those three constructs are essentially second-class ecs with a hard-coded escape point (respectively: end of iteration of loop; end of loop; end of function). A general escape continuation mechanism allows setting an escape point explicitly.
For example, escape continuations can be used as a multi-return:
from unpythonic import catch, throw
@catch() # note the parentheses
def f():
def g():
throw("hello from g") # the argument becomes the return value of f()
print("not reached")
g()
print("not reached either")
assert f() == "hello from g"
In Lisp terms, @catch
essentially captures the escape continuation (ec) of the function decorated with it. The nearest (dynamically) surrounding ec can then be invoked by throw(value)
. When the throw
is performed, the function decorated with @catch
immediately terminates, returning value
.
In Python terms, a throw (in the escape continuation sense) means just raising a specific type of exception; the usual rules concerning try/except/else/finally
and with
blocks apply. The throw
is a function call, so it works also in lambdas.
For another example, here we return from the function surrounding an FP loop, from inside the loop:
@catch()
def f():
@looped
def s(loop, acc=0, i=0):
if i > 5:
throw(acc)
return loop(acc + i, i + 1)
print("never reached")
f() # --> 15
For more control, both @catch
points and throw
instances can be tagged:
@catch(tags="foo") # catch point tags can be single value or tuple (tuples OR'd, like isinstance())
def foo():
@call
@catch(tags="bar")
def bar():
@looped
def s(loop, acc=0, i=0):
if i > 5:
throw(acc, tag="foo") # throw instance tag must be a single value
return loop(acc + i, i + 1)
print("never reached")
return False
print("never reached either")
return False
assert foo() == 15
For details on tagging, especially how untagged and tagged throw and catch points interact, and how to make one-to-one connections, see the docstring for @catch
. See also call_ec
(below), which is a compact syntax to make a one-to-one connection.
CAUTION: The implementation is based on exceptions, so catch-all except:
statements will intercept also throws, breaking the escape mechanism. As you already know, be specific in which exception types you catch in an except
clause!
Etymology
This feature is known as catch
/throw
in several Lisps, e.g. in Emacs Lisp and in Common Lisp (as well as some of its ancestors). This terminology is independent of the use of throw
/catch
in C++/Java for the exception handling mechanism.
Common Lisp also provides a lexically scoped variant (BLOCK
/RETURN-FROM
) that is more idiomatic (according to Seibel), but we currently provide only this dynamic variant.
We provide the function call/ec
(a.k.a. call-with-escape-continuation
), in Python spelled as call_ec
. It's a decorator that, like @call
, immediately runs the function and replaces the def'd name with the return value. The twist is that it internally sets up a catch point, and hands a first-class escape continuation to the callee.
The function to be decorated must take one positional argument, the ec instance. The parameter is conventionally named ec
.
The ec instance itself is another function, which takes one positional argument: the value to send to the catch point. That value can also be a Values
object if you want to escape with multiple-return-values or named return values; the ec will send any argument given to it.
The ec instance and the catch point are connected one-to-one. No other @catch
point will catch the ec instance, and the catch point catches only the ec instances created by this invocation of call_ec
, and nothing else.
Any particular ec instance is only valid inside the dynamic extent of the call_ec
invocation that created it. Attempting to call the ec later raises RuntimeError
.
from unpythonic import call_ec
@call_ec
def result(ec): # effectively, just a code block, capturing the ec as an argument
answer = 42
ec(answer) # here this has the same effect as "return answer"...
print("never reached")
answer = 23
return answer
assert result == 42
@call_ec
def result(ec):
answer = 42
def inner():
ec(answer) # ...but here this directly escapes from the outer def
print("never reached")
return 23
answer = inner()
print("never reached either")
return answer
assert result == 42
The ec does not have to be called from the lexical scope of the call_ec
'd function, as long as the call occurs within the dynamic extent of the call_ec
. It's essentially a return from me for the original function:
def f(ec):
print("hi from f")
ec(42)
@call_ec
def result(ec):
f(ec) # pass on the ec - it's a first-class value
print("never reached")
assert result == 42
This also works with lambdas, by using call_ec()
directly. No need for a trampoline:
result = call_ec(lambda ec:
begin(print("hi from lambda"),
ec(42),
print("never reached")))
assert result == 42
Normally begin()
would return the last value, but the ec overrides that; it is effectively a return
for multi-expression lambdas!
But wait, doesn't Python evaluate all the arguments of begin(...)
before the begin
itself has a chance to run? Why doesn't the example print also never reached? This is because escapes are implemented using exceptions. Evaluating the ec call raises an exception, preventing any further elements from being evaluated.
This usage is valid with named functions, too, so strictly speaking, call_ec
is not only a decorator:
def f(ec):
print("hi from f")
ec(42)
print("never reached")
# ...possibly somewhere else, possibly much later...
result = call_ec(f)
assert result == 42
If you use the macro API of unpythonic
, be aware that the macros cannot analyze this last example properly, because there is no lexical clue that f
will actually be called using call_ec
. To be safe in situations like this, name your ec parameter ec
; then it will be recognized as an escape continuation. Also brk
(defined by @looped_over
) and throw
are recognized by name.
CAUTION: The call_ec
mechanism builds on @catch
and throw
, so the caution about catch-all except:
statements applies here, too.
We provide a simple variant of nondeterministic evaluation. This is essentially a toy that has no more power than list comprehensions or nested for loops. See also the easy-to-use macro version with natural syntax and a clean implementation.
An important feature of McCarthy's amb
operator is its nonlocality - being able to jump back to a choice point, even after the dynamic extent of the function where that choice point resides. If that sounds a lot like call/cc
, that is because that's how amb
is usually implemented. See examples in Ruby and in Racket.
Python cannot do that, short of transforming the whole program into CPS, while applying TCO everywhere to prevent stack overflow. If that is what you want, see continuations
in the macros.
This forall
is essentially a tuple comprehension that:
- Can have multiple body expressions (side effects also welcome!), by simply listing them in sequence.
- Allows filters to be placed at any level of the nested looping.
- Presents the source code in the same order as it actually runs.
The module unpythonic.amb
defines four operators:
forall
is the control structure, which marks a section that uses nondeterministic evaluation.choice
binds a name:choice(x=range(3))
essentially meansfor e.x in range(3):
.insist
is a filter, which allows the remaining lines to run if the condition evaluates to truthy.deny
isinsist not
; it allows the remaining lines to run if the condition evaluates to falsey.
Choice variables live in the environment, which is accessed via a lambda e: ...
, just like in letrec
. Lexical scoping is emulated. In the environment, each line only sees variables defined above it; trying to access a variable defined later raises AttributeError
.
The last line in a forall
describes one item of the output. The output items are collected into a tuple, which becomes the return value of the forall
expression.
out = forall(choice(y=range(3)),
choice(x=range(3)),
lambda e: insist(e.x % 2 == 0),
lambda e: (e.x, e.y))
assert out == ((0, 0), (2, 0), (0, 1), (2, 1), (0, 2), (2, 2))
out = forall(choice(y=range(3)),
choice(x=range(3)),
lambda e: deny(e.x % 2 == 0),
lambda e: (e.x, e.y))
assert out == ((1, 0), (1, 1), (1, 2))
Pythagorean triples:
pt = forall(choice(z=range(1, 21)), # hypotenuse
choice(x=lambda e: range(1, e.z+1)), # shorter leg
choice(y=lambda e: range(e.x, e.z+1)), # longer leg
lambda e: insist(e.x*e.x + e.y*e.y == e.z*e.z),
lambda e: (e.x, e.y, e.z))
assert tuple(sorted(pt)) == ((3, 4, 5), (5, 12, 13), (6, 8, 10),
(8, 15, 17), (9, 12, 15), (12, 16, 20))
Beware:
out = forall(range(2), # evaluate remaining items twice!
choice(x=range(1, 4)),
lambda e: e.x)
assert out == (1, 2, 3, 1, 2, 3)
The initial range(2)
causes the remaining items to run twice - because it yields two output values - regardless of whether we bind the result to a variable or not. In effect, each line, if it returns more than one output, introduces a new nested loop at that point.
For more, see the docstring of forall
.
The implementation is based on the List monad, and a bastardized variant of do-notation. Quick vocabulary:
forall(...)
=do ...
(for a List monad)choice(x=foo)
=x <- foo
, wherefoo
is an iterableinsist x
=guard x
deny x
=guard (not x)
- Last line = implicit
return ...
Changed in v0.15.0. Functions resignal_in
and resignal
added; these perform the same job for conditions as reraise_in
and reraise
do for exceptions, that is, they allow you to map library exception types to semantically appropriate application exception types, with minimum boilerplate.
Upon an unhandled signal, signal
now returns the canonized input condition
, with a nice traceback attached. This feature is intended for implementing custom error protocols on top of signal
; error
already uses it to produce a nice-looking error report.
The error-handling protocol that was used to send a signal is now available for inspection in the __protocol__
attribute of the condition instance. It is the callable that sent the signal, such as signal
, error
, cerror
or warn
. It is the responsibility of each error-handling protocol (except the fundamental signal
itself) to pass its own function to signal
as the protocol
argument; if not given, protocol
defaults to signal
. The protocol information is used by the resignal
mechanism.
Changed in v0.14.3. Conditions can now inherit from BaseException
, not only from Exception.
Just like the except
statement, with handlers
catches also derived types, e.g. a handler for Exception
now catches a signaled ValueError
.
When an unhandled error
or cerror
occurs, the original unhandled error is now available in the __cause__
attribute of the ControlError
exception that is raised in this situation.
Signaling a class, as in signal(SomeExceptionClass)
, now implicitly creates an instance with no arguments, just like the raise
statement does. On Python 3.7+, signal
now automatically equips the condition instance with a traceback, just like the raise
statement does for an exception.
Added in v0.14.2.
One of the killer features of Common Lisp are conditions, which are essentially resumable exceptions.
Following Peter Seibel (Practical Common Lisp, chapter 19), we define errors as the consequences of Murphy's Law, i.e. situations where circumstances cause interaction between the program and the outside world to fail. An error is not a bug, but failing to handle an error certainly is.
An exception system splits error-recovery responsibilities into two parts. In Python terms, we speak of raising and then handling an exception. In comparison, a condition system splits error-recovery responsibilities into three parts: signaling, handling and restarting.
The result is improved modularity and better separation of mechanism and policy. The actual error-recovery code (the mechanism) lives in restarts, at the inner level (of the call stack) - which has access to all the low-level technical details that are needed to actually perform an error recovery. It is possible to provide several different canned recovery strategies, which implement any appropriate ways to recover, in the context of each low- or middle-level function. The decision of which strategy to use (the policy) in any particular situation is deferred to an outer level (of the call stack). The outer level knows the big picture - why the inner levels are running in this particular case, i.e., what we are trying to accomplish and how. Hence, it is the appropriate place to choose which error-recovery strategy should be used in its high-level context.
Seibel's Practical Common Lisp explains conditions in the context of a log file parser. In contrast, let us explain them with some Theoretical Python:
from unpythonic import restarts, handlers, signal, invoke, unbox
class TellMeHowToRecover(Exception):
pass
def low():
with restarts(resume_low=(lambda x: x)) as result: # result is a box
signal(TellMeHowToRecover())
result << "low level completed" # value for normal exit from the `with restarts` block
return unbox(result) + " > normal exit from low level"
def mid():
with restarts(resume_mid=(lambda x: x)) as result:
result << low()
return unbox(result) + " > normal exit from mid level"
# Trivial use case where we want to just ignore the condition.
# An uncaught signal() is just a no-op; see warn(), error(), cerror() for other standard options.
def high1():
assert mid() == "low level completed > normal exit from low level > normal exit from mid level"
high1()
# Use case where we want to resume at the low level. In a real-world application, repairing the error,
# and letting the rest of the low-level code (after the `with restarts` block) continue processing
# with the repaired data.
# Note we need new code only at the high level; the mid and low levels remain as-is.
def high2():
with handlers((TellMeHowToRecover, lambda c: invoke("resume_low", "resumed at low level"))):
assert mid() == "resumed at low level > normal exit from low level > normal exit from mid level"
high2()
# Use case where we want to resume at the mid level. In a real-world application, skipping a failed part.
def high3():
with handlers((TellMeHowToRecover, lambda c: invoke("resume_mid", "resumed at mid level"))):
assert mid() == "resumed at mid level > normal exit from mid level"
high3()
Generally a conditions-and-restarts system operates as follows. A signal is sent, outward on the call stack, from the actual location where an error was detected. A handler at any outer level (of the call stack) may then respond to it, and execution resumes from the restart that is invoked by the handler.
The sequence of catching a signal and invoking a restart is termed handling the signal. Handlers are searched in order from innermost to outermost on the call stack. Strictly speaking, though, the handlers live on a separate stack; we consider those handler bindings whose dynamic extent the point of execution is in, at the point of time when the signal is sent.
In general, it is allowed for a handler to fall through (return normally); then the next outer handler for the same signal type gets control. This allows the programmer to chain handlers to obtain their side effects, such as logging. This is referred to as canceling, since as a result, the signal remains unhandled.
Viewed with respect to the call stack, the restarts live between the (outer) level of the handler, and the (inner) level where the signal was sent from. The main difference to the exception model is that unlike raising an exception, sending a signal does not unwind the call stack. (Let that sink in for a moment.)
Although the handlers live further out on the call stack, the stack does not unwind that far. The handlers are just consulted for what to do. The call stack unwinds only when a restart is invoked. Then, only the part of the call stack between the location that sent the signal, and the invoked restart, is unwound.
Restarts, despite the name, are a mildly behaved, structured control construct. The block of code that encountered the error is actually not arbitrarily resumed; instead, the code of the invoked restart runs instead of the rest of the block, and the return value of the restart replaces the normal return value. (But see cerror
.)
Restarts are set up using the with restarts
context manager (Common Lisp: RESTART-CASE
). Restarts are defined by passing named arguments to the restarts
form; the argument name sets the restart name. The restart name is distinct from the name (if any) of the function that is used as the restart. A restart can only be invoked from within the dynamic extent of its with restarts
(the same rule is effect also in Common Lisp). A restart may take any args and kwargs; any that it expects must be provided when it is invoked.
Note difference to the API of python-cl-conditions, which requires functions used as restarts to be named, and uses the function name as the restart name.
A common use case is a use_value=(lambda x: x)
restart, which is just passed the value that should be returned when the restart is invoked. There is a predefined function of the same name, use_value
(Common Lisp: USE-VALUE
), which expects one argument, and immediately invokes the use_value
restart currently in scope, sending it that argument. This allows using the shorthand lambda c: use_value(...)
as a handler instead of the spelled-out lambda c: invoke("use_value", ...)
.
Signals are sent using signal
(Common Lisp: SIGNAL
). Any exception or warning instance (both builtin or custom) can be signaled. If you need to send data to your handler, place it in attributes of the exception object (just like you would do when programming with the exception model).
Handlers are established using the with handlers
context manager (Common Lisp: HANDLER-BIND
). Handlers are bound to exception types, or tuples of types, just like regular exception handlers in Python. The handlers
form takes as its arguments any number of (exc_spec, handler)
pairs. Here exc_spec
specifies the exception types to catch (when sent via signal
), and handler
is a callable. When catching a signal, in case of multiple matches in the same with handlers
form, the handler that appears earlier in the argument list wins.
A handler catches signals of the types it is bound to, and their subtypes. The code in the handler may invoke a restart by calling invoke
(Common Lisp: INVOKE-RESTART
), with the desired restart name as a string. In case of duplicate names, the most recently established restart (that is still in scope) with the given name wins. Any extra args and kwargs are passed through to the restart. The invoke
function always transfers control, it never returns normally.
A handler may take one optional positional argument, the exception instance being signaled. Roughly, API-wise signal handlers are similar to exception handlers (except
clauses). A handler that accepts an argument is like an except ... as ...
, whereas one that does not is like except ...
. The main difference to an exception handler is that a signal handler should not try to recover from the error by itself; instead, it should just choose which strategy the lower-level code should use to recover from the error. Usually, the only thing a signal handler needs to do is to invoke a particular restart.
To create a simple handler that does not take an argument, and just invokes a pre-specified restart, see invoker
. If you instead want to create a function that you can call from a handler, in order to invoke a particular restart immediately (so to define a shorthand notation similar to use_value
), use functools.partial(invoke, "my_restart_name")
.
Following Common Lisp terminology, a named function that invokes a specific restart - whether it is intended to act as a handler or to be called from one - is termed a restart function. This is somewhat confusing, as a restart function is not a function that implements a restart, but a function that invokes a specific one. The use_value
function mentioned above is an example.
For a detailed API reference, see the module unpythonic.conditions
.
We actually provide four signaling protocols: signal
(i.e. the fundamental protocol), and three that build additional behavior on top of it: error
, cerror
and warn
. Each of the three is modeled after its Common Lisp equivalent.
If no handler handles the signal, the signal(...)
protocol just returns normally. In effect, with respect to control flow, unhandled signals are ignored by this protocol. However, any side effects of handlers that caught the signal but did not invoke a restart, still take place.
The error(...)
protocol first delegates to signal
, and if the signal was not handled by any handler, then raises ControlError
as a regular exception. (Note the Common Lisp ERROR
function would at this point drop you into the debugger.) The implementation of error
itself is the only place in the condition system that raises an exception for the end user; everything else (including any error situations) uses the signaling mechanism.
The cerror(...)
protocol likewise makes handling the signal mandatory, but allows the handler to optionally ignore the error (sort of like ON ERROR RESUME NEXT
in some 1980s BASIC variants). To do this, invoke the proceed
restart in your handler (or use the pre-defined proceed
function as the handler); this makes the cerror(...)
call return normally. If no handler handles the cerror
, it then behaves like error
.
Finally, there is the warn(...)
protocol, which is just a lispy interface to Python's warnings.warn
. It comes with a muffle
restart that can be invoked by a handler to skip emitting a particular warning. Muffling a warning prevents its emission altogether, before it even hits Python's warnings filter.
The combination of warn
and muffle
(as well as cerror
when a handler invokes its proceed
restart) behaves somewhat like contextlib.suppress
, except that execution continues normally from the next statement in the caller of warn
(respectively cerror
) instead of unwinding to the handler.
If the standard protocols do not cover what you need, you can also build your own high-level protocols on top of signal
. See the source code of error
, cerror
and warn
for examples (it's just a few lines in each case).
The name cerror
stands for correctable error, see e.g. CERROR in the CL HyperSpec. What we call proceed
, Common Lisp calls CONTINUE
; the name is different because in Python the function naming convention is lowercase, and continue
is a reserved word.
If you really want to emulate ON ERROR RESUME NEXT
, just use Exception
as the condition type for your handler, and all cerror
calls within the block will return normally, provided that no other handler (that appears in an inner position on the call stack) handles those conditions first.
Using the condition system essentially requires eschewing exceptions, using only restarts and handlers instead. A regular raise
will fly past a with handlers
form uncaught. The form just maintains a stack of functions; it does not establish an exception handler. Similarly, a try
/except
cannot catch a signal, because no exception is raised yet at handler lookup time. Delaying the stack unwind, to achieve the three-way split of responsibilities, is the whole point of the condition system.
Which of the two systems to use is a design decision that must be made consistently on a per-project basis. Even better would be to make it globally on a per-language basis. Python's standard library, as well as all existing libraries, use exceptions instead of conditions, so to obtain a truly seamless conditions-and-restarts user experience, one would have to wrap (or rewrite) at least all of the standard library, plus any other libraries a project needs, to be protected from sudden, unexpected unwinds of the call stack. (The nature of both conditions and exceptions is that, in principle, they may be triggered anywhere.)
Be aware that error-recovery code in a Lisp-style signal handler is of a very different nature compared to error-recovery code in an exception handler. A signal handler usually only chooses a restart and invokes it; as was explained above, the code that actually performs the error recovery (i.e. the restart) lives further in on the call stack, and still has available (in its local variables) the state that is needed to perform the recovery. An exception handler, on the other hand, must respond by directly performing error recovery right where it is, without any help from inner levels - because the stack has already unwound when the exception handler gets control.
Hence, the two systems are intentionally kept separate. The language discontinuity is unfortunate, but inevitable when conditions are added to a language where an error recovery culture based on the exception model (of the regular non-resumable kind) already exists.
CAUTION: Make sure to never catch the internal InvokeRestart
exception (with an exception handler), as the condition system uses it to perform restarts. Again, do not use catch-all except
clauses!
If a handler attempts to invoke a nonexistent restart (or one that is not in the current dynamic extent), ControlError
is signaled using error(...)
. The error message in the exception instance will have the details.
If this ControlError
signal is not handled, a ControlError
will then be raised (as a last-resort measure) as a regular exception, as per the error
protocol. It is allowed to catch ControlError
with an exception handler.
Conditions are one of the killer features of Common Lisp, so if you are new to conditions, Peter Seibel: Practical Common Lisp, chapter 19 is a good place to learn about them. There is also a relevant discussion on Lambda the Ultimate.
For Python, conditions were first implemented in python-cl-conditions by Alexander Artemenko (2016).
What we provide here is essentially a rewrite, based on studying that implementation. The main reasons for the rewrite are to give the condition system an API consistent with the style of unpythonic
, to drop any and all historical baggage without needing to consider backward compatibility, and to allow interaction with (and customization taking into account) the other parts of unpythonic
.
The core idea can be expressed in fewer than 100 lines of Python; ours is (as of v0.15.0) 199 lines, not counting docstrings, comments, or blank lines. The main reason our module is over 900 lines are the docstrings.
Changed in v0.15.0. The dispatch
and typecheck
modules providing this functionality are now considered stable (no longer experimental). Starting with this release, they receive the same semantic-versioning guarantees as the rest of unpythonic
.
Added the @augment
parametric decorator that can register a new multimethod on an existing generic function originally defined in another lexical scope.
Added the function methods
, which displays a list of multimethods of a generic function. This is especially useful in the REPL.
Docstrings of the multimethods are now automatically concatenated to make up the docstring of the generic function, so you can document each multimethod separately.
curry
now supports @generic
. In the case where the number of positional arguments supplied so far matches at least one multimethod, but there is no match for the given combination of argument types, curry
waits for more arguments (returning the curried function). See the manual section on curry
for details.
It is now possible to dispatch also on a homogeneous type of contents collected by a **kwargs
parameter. In the type signature, use typing.Dict[str, mytype]
. Note that in this use, the key type is always str
.
Changed in v0.14.3. The multiple-dispatch decorator @generic
no longer takes a master definition. Multimethods are registered directly with @generic
; the first multimethod definition implicitly creates the generic function.
The @generic
and @typed
decorators can now decorate also instance methods, class methods and static methods (beside regular functions, as previously in 0.14.2).
Added in v0.14.2.
The generic
decorator allows creating multiple-dispatch generic functions with type annotation syntax. We also provide some friendly utilities: augment
adds a new multimethod to an existing generic function, typed
creates a single-method generic with the same syntax (i.e. provides a compact notation for writing dynamic type-checking code), and isoftype
(which powers the first three) is the big sister of isinstance
, with support for many (but unfortunately not all) features of the typing
standard library module.
This is a purely run-time implementation, so it does not give performance benefits, but it can make code more readable, and makes it modular to add support for new input types (or different call signatures) to an existing function later.
The terminology is:
- The function that supports multiple call signatures is a generic function.
- Each of its individual implementations is a multimethod.
The term multimethod distinguishes them from the OOP sense of method, already established in Python, as well as reminds that multiple arguments participate in dispatching.
CAUTION: Code using the with lazify
macro cannot usefully use @generic
or @typed
, because all arguments of each function call will be wrapped in a promise (unpythonic.Lazy
) that carries no type information on its contents.
The generic
decorator essentially allows replacing the if
/elif
dynamic type checking boilerplate of polymorphic functions with type annotations on the function parameters, with support for features from the typing
stdlib module. This not only kills boilerplate, but makes the dispatch extensible, since the dispatcher is separate from the actual function definition, and has a mechanism to register new multimethods.
If several multimethods of the same generic function match the arguments given, the most recently registered multimethod wins.
To see what multimethods are registered on a given generic function f
, call methods(f)
. It will print a human-readable description to stdout.
CAUTION: The winning multimethod is chosen differently from Julia, where the most specific multimethod wins. Doing that requires a more careful type analysis than what we have here.
The details are best explained by example:
import typing
from unpythonic import generic
@generic # The first definition creates the generic function, and registers the first multimethod.
def zorblify(x: int, y: int):
return "int, int"
@generic # noqa: F811, registered as a multimethod of the same generic function.
def zorblify(x: str, y: int):
return "str, int"
@generic # noqa: F811
def zorblify(x: str, y: float):
return "str, float"
# Then we just call our function as usual.
# Note all arguments participate in dispatching (i.e. in choosing which multimethod gets called).
assert zorblify(2, 3) == "int, int"
assert zorblify("cat", 3) == "str, int"
assert zorblify("cat", 3.14) == "str, float"
# Let's emulate the argument handling of Python's `range` builtin, just for fun.
# Note the meaning of the argument in each position depends on the number of arguments.
@generic
def r(stop: int):
return _r_impl(0, 1, stop)
@generic # noqa: F811
def r(start: int, stop: int):
return _r_impl(start, 1, stop)
@generic # noqa: F811
def r(start: int, step: int, stop: int):
return _r_impl(start, step, stop)
# With this arrangement, the actual implementation always gets the args in the same format,
# so we can use meaningful names for its parameters.
def _r_impl(start, step, stop):
return start, step, stop
# Arity participates in dispatching, too.
assert r(10) == (0, 1, 10)
assert r(2, 10) == (2, 1, 10)
assert r(2, 3, 10) == (2, 3, 10)
# varargs are supported via `typing.Tuple`
@generic
def gargle(*args: typing.Tuple[int, ...]): # any number of ints
return "int"
@generic # noqa: F811
def gargle(*args: typing.Tuple[float, ...]): # any number of floats
return "float"
@generic # noqa: F811
def gargle(*args: typing.Tuple[int, float, str]): # exactly three args, matching the specified types
return "int, float, str"
assert gargle(1, 2, 3, 4, 5) == "int"
assert gargle(2.71828, 3.14159) == "float"
assert gargle(42, 6.022e23, "hello") == "int, float, str"
assert gargle(1, 2, 3) == "int" # as many as in the [int, float, str] case. Still resolves correctly.
# v0.15.0: dispatching on a homogeneous type inside **kwargs is also supported, via `typing.Dict`
@generic
def kittify(**kwargs: typing.Dict[str, int]): # all kwargs are ints
return "int"
@generic
def kittify(**kwargs: typing.Dict[str, float]): # all kwargs are floats # noqa: F811
return "float"
assert kittify(x=1, y=2) == "int"
assert kittify(x=1.0, y=2.0) == "float"
See the unit tests for more. For which features of the typing
stdlib module are supported, see isoftype
below.
Beginning with v0.14.3, @generic
and @typed
can decorate instance methods, class methods and static methods (beside regular functions as in v0.14.2).
When using both @generic
or @typed
and OOP, important things to know are:
-
In case of
@generic
, consider first if that is what you really want.- The method access syntax already hides a single-dispatch mechanism behind the dot-access syntax: the syntax
x.op(...)
picks the definition ofop
based on the type ofx
. This behaves exactly like a single-dispatch function where the first argument isx
, i.e., we could as well writeop(x, ...)
. - So the question to ask is, is the use case best served by two overlapping dispatch mechanisms?
- If not, what are the alternative strategies? Would it be better, for example, to represent the operations as top-level
@generic
functions, and perform the dispatch there, dispatching to OOP methods as appropriate? @typed
is fine to use with OOP, because semantically, it is not really a dispatch mechanism, but a run-time type-checking mechanism, even though it is implemented in terms of the multiple-dispatch machinery.
- The method access syntax already hides a single-dispatch mechanism behind the dot-access syntax: the syntax
-
self
andcls
parameters.-
The
self
andcls
parameters do not participate in dispatching, and need no type annotation. -
Beside appearing as the first positional-or-keyword parameter, the self-like parameter must be named one of
self
,this
,cls
, orklass
to be detected by the ignore mechanism.This limitation is due to implementation reasons; while a class body is being evaluated, the context needed to distinguish a method (in the OOP sense) from a regular function is not yet present. In Python, OOP method binding is performed by the descriptor that triggers when the method attribute is read on an instance.
If curious, try this (tested in Python 3.8):
class Thing: def f(self): pass print(type(Thing.f)) # --> "function", i.e. the same type as a bare function assert Thing.f is Thing.f # it's always the same function object thing = Thing() print(type(thing.f)) # --> "method", i.e. a bound method of Thing instance at 0x... assert thing.f is not thing.f # each read produces a **new** bound method object lst = [1, 2, 3] print(type(lst.append)) # --> "builtin_function_or_method" assert lst.append is not lst.append # this happens even for builtins
-
-
OOP inheritance.
-
When
@generic
is installed on a method (instance method, or@classmethod
), then at call time, classes are tried in MRO order. All multimethods of the method defined in the class currently being looked up are tested for matches first, before moving on to the next class in the MRO. This has subtle consequences, related to in which class in the hierarchy the various multimethods for a particular method are defined. -
To work with OOP inheritance,
@generic
must be the outermost decorator (except@classmethod
or@staticmethod
, which are essentially compiler annotations). -
However, when installed on a
@staticmethod
, the@generic
decorator does not support MRO lookup, because that would make no sense. A static method is just a bare function that happens to be stored in a class namespace. See discussions on the interaction between@staticmethod
andsuper
in Python: [1] [2]. -
When inspecting an instance method that is
@generic
, be sure to call themethods
function on an instance:class Thing: @generic def f(self, x: int): pass @classmethod @generic def g(cls, x: int): pass thing = Thing() methods(thing.f) methods(Thing.g)
This allows seeing registered multimethods also from linked dispatchers in the MRO.
If we instead call it as
methods(Thing.f)
, theself
argument is not bound yet (becauseThing.f
is just a bare function), so the dispatch machinery cannot get a reference to the MRO. This is obviously not an issue when actually usingf
, since an instance method is pretty much always invoked on an instance.For class methods,
methods(Thing.g)
sees the MRO, becausecls
is already bound.
-
For usage examples of @generic
with OOP, see the unit tests.
The @augment
decorator adds a new multimethod to an existing generic function. With this system, it is possible to implement holy traits:
import typing
from unpythonic import generic, augment
class FunninessTrait:
pass
class IsFunny(FunninessTrait):
pass
class IsNotFunny(FunninessTrait):
pass
@generic
def funny(x: typing.Any): # default
raise NotImplementedError(f"`funny` trait not registered for anything matching {type(x)}")
@augment(funny)
def funny(x: str): # noqa: F811
return IsFunny()
@augment(funny)
def funny(x: int): # noqa: F811
return IsNotFunny()
@generic
def laugh(x: typing.Any):
return laugh(funny(x), x)
@augment(laugh)
def laugh(traitvalue: IsFunny, x: typing.Any):
return f"Ha ha ha, {x} is funny!"
@augment(laugh)
def laugh(traitvalue: IsNotFunny, x: typing.Any):
return f"{x} is not funny."
assert laugh("that") == "Ha ha ha, that is funny!"
assert laugh(42) == "42 is not funny."
CAUTION: @augment
can be dangerous to the readability of your codebase. Keep in mind that the multiple-dispatch table is global state. If you add a new multimethod for a generic function defined elsewhere, for types defined elsewhere, this may lead to spooky action at a distance (as in action at a distance), because it may change the meaning of existing code. In the Julia community, this is known as type piracy.
As Alexis King points out, no type piracy occurs if at least one of the following conditions holds:
-
At least one of the types in the call signature of the new multimethod is defined by you.
-
The generic function you are augmenting is defined by you.
Given this:
# thirdparty.py
def op(x):
if isinstance(x, int):
return 2 * x
elif isinstance(x, float):
return 2.0 * x
raise TypeError(f"unsupported argument: {type(x)} with value {repr(x)}")
you do not have to change that code, but you will have to know which argument types the existing function supports (because that information is not available in an inspectable form at its interface), and then overwrite the original binding, with something like this:
# ours.py
import thirdparty
original_op = thirdparty.op
# Multimethod implementations for the types supported by the original `op`.
# We just re-dispatch here.
@generic
def op(x: int):
return original_op(x)
@generic
def op(x: float):
return original_op(x)
thirdparty.op = op # unavoidable bit of monkey-patching
Then it can be augmented as usual:
@augment(op)
def op(x: str): # "ha" -> "ha, ha"
return ", ".join(x for _ in range(2))
while preserving the meaning of all existing code that uses thirdparty.op
.
The typed
decorator creates a one-multimethod pony, which automatically enforces its argument types. Just like with generic
, the type specification may use features from the typing
stdlib module.
import typing
from unpythonic import typed
@typed
def blubnify(x: int, y: float):
return x * y
@typed
def jack(x: typing.Union[int, str]):
return x
assert blubnify(2, 21.0) == 42
blubnify(2, 3) # TypeError
assert not hasattr(blubnify, "register") # no more multimethods can be registered on this function
assert jack(42) == 42
assert jack("foo") == "foo"
jack(3.14) # TypeError
For which features of the typing
stdlib module are supported, see isoftype
below.
Type check object instances against type specifications at run time. This is the machinery that powers generic
and typed
. This goes beyond isinstance
in that many (but unfortunately not all) features of the typing
standard library module are supported.
Any checks on the type arguments of the meta-utilities defined in the typing
stdlib module are performed recursively using isoftype
itself, in order to allow compound abstract specifications.
Some examples:
import typing
from unpythonic import isoftype
# concrete types - uninteresting, we just delegate to `isinstance`
assert isoftype(17, int)
assert isoftype(lambda: ..., typing.Callable)
# typing.newType
UserId = typing.NewType("UserId", int)
assert isoftype(UserId(42), UserId)
# Note limitation: since NewType types discard their type information at
# run time, any instance of the underlying actual run-time type will match.
assert isoftype(42, UserId)
# typing.Any
assert isoftype(5, typing.Any)
assert isoftype("something", typing.Any)
# TypeVar, bare; a named type, but behaves like Any.
X = typing.TypeVar("X")
assert isoftype(3.14, X)
assert isoftype("anything", X)
assert isoftype(lambda: ..., X)
# TypeVar, with arguments; matches only the specs in the constraints.
Number = typing.TypeVar("Number", int, float, complex)
assert isoftype(31337, Number)
assert isoftype(3.14159, Number)
assert isoftype(1 + 2j, Number)
# typing.Optional
assert isoftype(None, typing.Optional[int])
assert isoftype(1337, typing.Optional[int])
# typing.Tuple
assert isoftype((1, 2, 3), typing.Tuple)
assert isoftype((1, 2, 3), typing.Tuple[int, ...])
assert isoftype((1, 2.1, "footgun"), typing.Tuple[int, float, str])
# typing.List
assert isoftype([1, 2, 3], typing.List[int])
U = typing.Union[int, str]
assert isoftype(42, U)
assert isoftype("hello yet again", U)
# abstract container types
assert isoftype({1: "foo", 2: "bar"}, typing.MutableMapping[int, str])
assert isoftype((1, 2, 3), typing.Sequence[int])
assert isoftype({1, 2, 3}, typing.AbstractSet[int])
# one-trick ponies
assert isoftype(3.14, typing.SupportsRound)
assert isoftype([1, 2, 3], typing.Sized)
See the unit tests for more.
CAUTION: Callables are just checked for being callable; no further analysis is done. Type-checking callables properly requires a much more complex type checker.
CAUTION: The isoftype
function is one big hack. In Python 3.6 through 3.9, there is no consistent way to handle a type specification at run time. We must access some private attributes of the typing
meta-utilities, because that seems to be the only way to get what we need to do this.
The multiple-dispatch subsystem of unpythonic
was inspired by the multi-methods of CLOS (the Common Lisp Object System), and the generic functions of Julia.
In both CLOS and in Julia, function is the generic entity, while method refers to its specialization to a particular combination of argument types. Note that no object instance or class is needed. Contrast with the classical OOP sense of method, i.e. a function that is associated with an object instance or class, with single dispatch based on the class (or in exotic cases, such as monkey-patched instances, on the instance).
Based on my own initial experiments with this feature in Python, the machinery itself works well enough, but to really shine - just like conditions and restarts - multiple dispatch needs to be used everywhere, throughout the language's ecosystem. Julia is impressive here. Python obviously does not do that.
Our machinery is missing some advanced features, such as matching the most specific multimethod candidate instead of the most recently defined one; an issubclass
equivalent that understands typing
type specifications; and a mechanism to remove previously declared multimethods.
If you need multiple dispatch, but not the other features of unpythonic
, see the multipledispatch library, which likely runs faster.
If you need a run-time type checker, but not the other features of unpythonic
, see the typeguard
library. If you are fine with a separate static type checker (which is the step where type checking arguably belongs), just use Mypy
.
Utilities for dealing with exceptions.
Changed in v0.15.0. Deprecated parameters for raisef
removed.
Changed in v0.14.3. Now we have also tryf
.
Changed in v0.14.2. The parameters of raisef
now more closely match what would be passed to raise
. See examples below. Old-style parameters are now deprecated.
The raisef
function allows to raise an exception from an expression position:
from unpythonic import raisef
# plain `raise ...`
f = lambda x: raisef(RuntimeError("I'm in ur lambda raising exceptions"))
# `raise ... from ...`
exc = TypeError("oof")
g = lambda x: raisef(RuntimeError("I'm in ur lambda raising exceptions"), cause=exc)
The tryf
function is a try
/except
/else
/finally
construct for an expression position:
from unpythonic import raisef, tryf
raise_instance = lambda: raisef(ValueError("all ok"))
test[tryf(lambda: raise_instance(),
(ValueError, lambda err: f"got a ValueError: '{err.args[0]}'")) == "got a ValueError: 'all ok'"]
The exception handler is a function. It may optionally accept one argument, the exception instance. Just like in an except
clause, the exception specification can be either an exception type, or a tuple
of exception types.
Functions can also be specified to represent the else
and finally
blocks; the keyword parameters to do this are elsef
and finallyf
. Each of them is a thunk (a 0-argument function). See the docstring of unpythonic.tryf
for details.
Examples can be found in the unit tests.
Added in v0.14.3.
In Python 3.7 and later, the equip_with_traceback
function equips a manually created exception instance with a traceback. This is useful mainly in special cases, where raise
cannot be used for some reason. (The signal
function in the conditions-and-restarts system uses this.)
e = SomeException(...)
e = equip_with_traceback(e)
The traceback is automatically extracted from the call stack of the calling thread.
Optionally, you can cull a number of the topmost frames by passing the optional argument stacklevel=...
. Typically, for direct use of this function stacklevel
should be the default 1
(so it excludes equip_with_traceback
itself, but shows all stack levels from your code), and for use in a utility function that itself is called from your code, it should be 2
(so it excludes the utility function, too). If the utility function itself calls a separate low-level utility, 3
can be useful (see the source code of the conditions-and-restarts system for an example).
Added in v0.14.2.
CAUTION: Currently this is supported by CPython only, because as of June 2021, PyPy3 does not expose the required functionality to the Python level, nor there seem to be any plans to do so.
Usually injecting an exception into an unsuspecting thread makes absolutely no sense. But there are special cases, notably KeyboardInterrupt
. Especially, a REPL server may need to send a KeyboardInterrupt
into a REPL session thread that is happily stuck waiting for input inside InteractiveConsole.interact
- while the client that receives the actual Ctrl+C
is running in a separate process, possibly even on a different machine. This and similar awkward situations in network programming are pretty much the only use case for this feature.
The function is named async_raise
, because it injects an asynchronous exception. This has nothing to do with async
/await
. Synchronous vs. asynchronous exceptions mean something different.
In a nutshell, a synchronous exception (which is the usual kind of exception) has an explicit raise
somewhere in the code that the thread that encountered the exception is running. In contrast, an asynchronous exception does not, it just suddenly magically materializes from the outside. As such, it can in principle happen anywhere, with absolutely no hint about it in any obvious place in the code.
Obviously, this can be very confusing, so this feature should be used sparingly, if at all. We only provide it because the REPL server needs it, and it would be silly to have such a feature but not make it public.
Here is an example:
from unpythonic import async_raise, box
out = box()
def worker():
try:
for j in range(10):
sleep(0.1)
except KeyboardInterrupt: # normally, KeyboardInterrupt is only raised in the main thread
pass
out << j
t = threading.Thread(target=worker)
t.start()
sleep(0.1) # make sure the worker has entered the loop
async_raise(t, KeyboardInterrupt) # CPython only! This will gracefully error out on PyPy.
t.join()
assert unbox(out) < 9 # thread terminated early due to the injected KeyboardInterrupt
No, it is not. The way KeyboardInterrupt
usually works is, the OS sends a SIGINT, which is then trapped by an OS signal handler that runs in the main thread.
Note that it is an OS signal, in the *nix sense; which is unrelated to the Lisp/unpythonic
sense, as in conditions-and-restarts.
At that point the magic has already happened: the control of the main thread is now inside the signal handler, as if the signal handler was called from the otherwise currently innermost point on the call stack. All the handler needs to do is to perform a regular raise
, and the exception will propagate correctly.
Original detective work by Federico Ficarelli and LIU Wei.
Raising async exceptions is a documented feature of Python's public C API, but it was never meant to be invoked from within pure Python code. But then the CPython devs gave us ctypes.pythonapi, which allows access to CPython's C API from within Python. Combining the two gives async_raise
without the need to compile a C extension.
(If you think ctypes.pythonapi
is too quirky, the pycapi PyPI package smooths over the rough edges.)
Unfortunately PyPy does not currently (June 2021) implement this function in its CPython C API emulation layer, cpyext
. See unpythonic
issue #58.
Added in v0.15.0.
Sometimes it is useful to semantically convert exception types from one problem domain to another, particularly across the different levels of abstraction in an application. We provide reraise_in
and reraise
to do this with minimum boilerplate:
from unpythonic import reraise_in, reraise, raisef
class LibraryException(Exception):
pass
class MoreSophisticatedLibraryException(LibraryException):
pass
class UnrelatedException(Exception):
pass
class ApplicationException(Exception):
pass
# reraise_in: expr form
# The mapping is {in0: out0, ...}
try:
# reraise_in(thunk, mapping)
reraise_in(lambda: raisef(LibraryException),
{LibraryException: ApplicationException})
except ApplicationException: # note the type!
print("all ok!")
try:
# subclasses are converted, too
reraise_in(lambda: raisef(MoreSophisticatedLibraryException),
{LibraryException: ApplicationException})
except ApplicationException:
print("all ok!")
try:
# tuples of types are accepted, like in `except` clauses
reraise_in(lambda: raisef(UnrelatedException),
{(LibraryException, UnrelatedException):
ApplicationException})
except ApplicationException:
print("all ok!")
# reraise: block form
# The mapping is {in0: out0, ...}
try:
with reraise({LibraryException: ApplicationException}):
raise LibraryException
except ApplicationException:
print("all ok!")
try:
with reraise({LibraryException: ApplicationException}):
raise MoreSophisticatedLibraryException
except ApplicationException:
print("all ok!")
try:
with reraise({(LibraryException, UnrelatedException):
ApplicationException}):
raise LibraryException
except ApplicationException:
print("all ok!")
If that does not seem much shorter than a hand-written try
/except
/raise from
, consider that you can create the mapping once and then use it from a variable - this shortens it to just with reraise(my_mapping)
.
Any exceptions that do not match anything in the mapping are passed through. When no exception occurs, reraise_in
passes the return value of thunk
through, and reraise
does nothing.
Full details in docstrings.
If you use the conditions-and-restarts system, see also resignal_in
, resignal
, which perform the same job for conditions. The new signal is sent using the same error handling protocol as the original signal, so e.g. an error
will remain an error
even if re-signaling changes its type.
Examples can be found in the unit tests.
Fuel for different thinking. Compare call-with-something
in Lisps - but without parameters, so just call
. A def
is really just a new lexical scope to hold code to run later... or as @call
does, right now!
At the top level of a module, this is seldom useful, but keep in mind that Python allows nested function definitions. Used with an inner def
, this becomes a versatile tool.
Note that beside use as a decorator, call
can also be used as a normal function: call(f, *a, **kw)
is the same as f(*a, **kw)
. This is occasionally useful.
Let us consider some example use cases of @call
.
from unpythonic import call
@call
def x():
a = 2 # many temporaries that help readability...
b = 3 # ...of this calculation, but would just pollute locals...
c = 5 # ...after the block exits
return a * b * c
print(x) # 30
As was noted in the section on escape continuations, continue
, break
and return
are really just second-class ecs. So use a def
to make return
escape to exactly where you want:
from unpythonic import call
@call
def result():
for x in range(10):
for y in range(10):
if x * y == 42:
return (x, y)
print(result) # (6, 7)
But if you need a multi-return, see @catch
, throw
, and call_ec
.
Compare the sweet-exp Racket:
define result
let/ec return ; name the (first-class) ec to break out of this let/ec block
for ([x in-range(10)])
for ([y in-range(10)])
cond
{{x * y} = 42}
return (list x y)
displayln result ; (6 7)
Noting what let/ec
does, using call_ec
we can make the Python even closer to the Racket:
from unpythonic import call_ec
@call_ec
def result(rtn):
for x in range(10):
for y in range(10):
if x * y == 42:
rtn((x, y))
print(result) # (6, 7)
from unpythonic import call
@call
def result(x=1, y=2, z=3):
return x * y * z
print(result) # 6
If you want an env
instance, see blet
and bletrec
.
from unpythonic import call
@call
def t():
def evenp(x): return x == 0 or oddp(x - 1)
def oddp(x): return x != 0 and evenp(x - 1)
return evenp(42)
print(t) # True
Essentially the implementation is just def call(thunk): return thunk()
. The point of this seemingly trivial construct is to:
-
Make it explicit right at the definition site that this block is going to be called now, in contrast to an explicit call and assignment after the definition. This centralizes the related information, and aligns the presentation order with the thought process.
-
Help eliminate errors, in the same way as the habit of typing parentheses only in pairs (or using a tool like Emacs's
smartparens-mode
to enforce that). With@call
, there is no risk of forgetting to call the block after writing the definition. -
Document that the block is going to be used only once. Tell your readers there is no need to remember this definition.
Note the grammar requires a newline after a decorator.
If you need to pass arguments when using @call
as a decorator, use its sister @callwith
:
from unpythonic import callwith
@callwith(3)
def result(x):
return x**2
assert result == 9
Like call
, beside use as a decorator, callwith
can also be called normally. It is essentially an argument freezer:
from unpythonic import callwith
def myadd(a, b):
return a + b
def mymul(a, b):
return a * b
apply23 = callwith(2, 3)
assert apply23(myadd) == 5
assert apply23(mymul) == 6
When callwith
is called normally, the two-step application is mandatory. The first step stores the given arguments. It then returns a function f(callable)
. When f
is called, it calls its callable
argument, passing in the arguments stored in the first step.
In other words, callwith
is similar to functools.partial
, but without specializing to any particular function. The function to be called is given later, in the second step.
Hence, callwith(2, 3)(myadd)
means make a function that passes in two positional arguments, with values 2
and 3
. Then call this function for the callable myadd
. But if we instead write callwith(2, 3, myadd)
, it means make a function that passes in three positional arguments, with values 2
, 3
and myadd
- not what we want in the above example.
If you want to specialize some arguments now and some later, combine callwith
with partial
:
from functools import partial
from unpythonic import callwith
p1 = partial(callwith, 2)
p2 = partial(p1, 3)
p3 = partial(p2, 4)
apply234 = p3() # actually call callwith, get the function
def add3(a, b, c):
return a + b + c
def mul3(a, b, c):
return a * b * c
assert apply234(add3) == 9
assert apply234(mul3) == 24
If the code above feels weird, it should. Arguments are gathered first, and the function to which they will be passed is chosen in the last step.
Another use case of callwith
is map
, if we want to vary the function instead of the data:
from unpythonic import callwith
m = map(callwith(3), [lambda x: 2*x, lambda x: x**2, lambda x: x**(1/2)])
assert tuple(m) == (6, 9, 3**(1/2))
If you use the quick lambda macro fn[]
(underscore notation for Python), these features combine nicely:
from unpythonic.syntax import macros, fn
from unpythonic import callwith
m = map(callwith(3), [fn[2 * _], fn[_**2], fn[_**(1/2)]])
assert tuple(m) == (6, 9, 3**(1/2))
A pythonic alternative to the above examples is:
a = [2, 3]
def myadd(a, b):
return a + b
def mymul(a, b):
return a * b
assert myadd(*a) == 5
assert mymul(*a) == 6
a = [2]
a += [3]
a += [4]
def add3(a, b, c):
return a + b + c
def mul3(a, b, c):
return a * b * c
assert add3(*a) == 9
assert mul3(*a) == 24
m = (f(3) for f in [lambda x: 2*x, lambda x: x**2, lambda x: x**(1/2)])
assert tuple(m) == (6, 9, 3**(1/2))
Inspired by Function application with $ in LYAH: Higher Order Functions.
Added in v0.15.0.
Values
is a structured multiple-return-values type.
With Values
, you can return multiple values positionally, and return values by name. This completes the symmetry between passing function arguments and returning values from a function. Python itself allows passing arguments by name, but has no concept of returning values by name. This class adds that concept.
Having a Values
type separate from tuple
helps with semantic accuracy. In unpythonic
0.15.0 and later, a tuple
return value means just that - one value that is a tuple
. It is distinct from a Values
that contains several positional return values (that are meant to be treated separately e.g. by a function composition utility).
Inspired by the values
form of Racket.
Most of the time, returning a tuple to denote multiple-return-values and unpacking it is just fine, and that is exactly what unpythonic
does internally in many places.
But the distinction is critically important in function composition, so that positional return values can be automatically mapped into positional arguments to the next function in the chain, and named return values into named arguments.
Accordingly, various parts of unpythonic
that deal with function composition use the Values
abstraction; particularly curry
, unfold
, iterate
, the compose
and pipe
families, and the with continuations
macro.
Values
is a duck-type with some features of both sequences and mappings, but not the full collections.abc
API of either.
If there are no named return values in a Values
object, it can be unpacked like a tuple. This covers the common use case of multiple positional return values with a minimum of fuss.
Each operation that obviously and without ambiguity makes sense only for the positional or named part, accesses that part.
The only exception is __getitem__
(subscripting), which makes sense for both parts, unambiguously, because the key types differ. If the index expression is an int
or a slice
, it is an index/slice for the positional part. If it is an str
, it is a key for the named part.
If you need to explicitly access either part (and its full API), use the rets
and kwrets
attributes. The names are in analogy with args
and kwargs
.
rets
is a tuple
, and kwrets
is an unpythonic.frozendict
.
Values
objects can be compared for equality. Two Values
objects are equal if both their rets
and kwrets
(respectively) are.
See the docstrings, the source code, and the unit tests for full details.
Examples:
from unpythonic import Values
def f():
return Values(1, 2, 3)
result = f()
assert isinstance(result, Values)
assert result.rets == (1, 2, 3)
assert not result.kwrets
assert result[0] == 1
assert result[:-1] == (1, 2)
a, b, c = result # if no kwrets, can be unpacked like a tuple
a, b, c = f()
def g():
return Values(x=3) # named return value
result = g()
assert isinstance(result, Values)
assert not result.rets
assert result.kwrets == {"x": 3} # actually a `frozendict`
assert "x" in result # `in` looks in the named part
assert result["x"] == 3
assert result.get("x", None) == 3
assert result.get("y", None) is None
assert tuple(result.keys()) == ("x",) # also `values()`, `items()`
def h():
return Values(1, 2, x=3)
result = h()
assert isinstance(result, Values)
assert result.rets == (1, 2)
assert result.kwrets == {"x": 3}
a, b = result.rets # positionals can always be unpacked explicitly
assert result[0] == 1
assert "x" in result
assert result["x"] == 3
def silly_but_legal():
return Values(42)
result = silly_but_legal()
assert result.rets[0] == 42
assert result.ret == 42 # shorthand for single-value case
The last example is silly, but legal, because it is preferable to just omit the Values
if it is known that there is only one return value. This also applies when that value is a tuple
, when the intent is to return it as a single tuple
, in contexts where this distinction matters.
The valuify
decorator converts the pythonic tuple-as-multiple-return-values idiom into Values
, to easily use existing code with our function composition utilities.
It converts a tuple
return value, exactly; no subclasses.
Demonstrating only the conversion:
from unpythonic import valuify, Values
@valuify
def f(x, y, z):
return x, y, z
assert isinstance(f(1, 2, 3), Values)
assert f(1, 2, 3) == Values(1, 2, 3)
We briefly introduce the functions below. More details and examples can be found in the docstrings and in the unit tests.
CAUTION for anyone new to numerics:
When working with floating-point numbers, keep in mind that they are, very roughly speaking, a finite-precision logarithmic representation of ℝ. They are, necessarily, actually a subset of ℚ, that is not even dense. The spacing between adjacent floats depends on where you are on the real line; see ulp
below.
For finer points concerning the behavior of floating-point numbers, see David Goldberg (1991): What every computer scientist should know about floating-point arithmetic, or for a tl;dr version, the floating point guide.
Or you could look at my lecture slides from 2018; particularly, lecture 7 covers the floating-point representation. It collects the most important details in a few slides, and contains some more links to further reading.
Test floating-point numbers for near-equality. Beside the built-in float
, we support also the arbitrary-precision software-implemented floating-point type mpf
from SymPy
's mpmath
package.
Anything else, for example SymPy
expressions, strings, and containers (regardless of content), is tested for exact equality.
For mpmath.mpf
, we just delegate to mpmath.almosteq
, with the given tolerance.
For float
, we use the strategy suggested in the floating point guide, because naive absolute and relative comparisons against a tolerance fail in commonly encountered situations.
Added in v0.14.2.
Not to be confused with the logical fixed point with respect to the definedness ordering, which is what Haskell's fix
function relates to.
Compute the (arithmetic) fixed point of a function, starting from a given initial guess. The fixed point must be attractive for this to work. See the Banach fixed point theorem.
If the fixed point is attractive, and the values are represented in floating point (hence finite precision), the computation should eventually converge down to the last bit (barring roundoff or catastrophic cancellation in the final few steps). Hence the default tolerance is zero; but any desired tolerance can be passed as an argument.
CAUTION: an arbitrary function from ℝ to ℝ does not necessarily have a fixed point. Limit cycles and chaotic behavior of the function will cause non-termination. Keep in mind the classic example, the logistic map.
Examples:
from math import cos, sqrt
from unpythonic import fixpoint, ulp
c = fixpoint(cos, x0=1)
# Actually "Newton's" algorithm for the square root was already known to the
# ancient Babylonians, ca. 2000 BCE. (Carl Boyer: History of mathematics)
# Concerning naming, see also https://en.wikipedia.org/wiki/Stigler's_law_of_eponymy
def sqrt_newton(n):
def sqrt_iter(x): # has an attractive fixed point at sqrt(n)
return (x + n / x) / 2
return fixpoint(sqrt_iter, x0=n / 2)
assert abs(sqrt_newton(2) - sqrt(2)) <= ulp(1.414)
Changed in v0.15.0. Added partition_int_triangular
and partition_int_custom
.
Added in v0.14.2.
Not to be confused with unpythonic.partition
, which partitions an iterable based on a predicate.
The partition_int
function partitions a small positive integer, i.e., splits it in all possible ways, into smaller integers that sum to it. This is useful e.g. to determine the number of letters to allocate for each component of an anagram that may consist of several words.
The partition_int_triangular
function is like partition_int
, but accepts only triangular numbers (1, 3, 6, 10, ...) as components of the partition. This function answers a timeless question: if I have n
stackable plushies, what are the possible stack configurations?
The partition_int_custom
function is like partition_int
, but lets you specify which numbers are acceptable as components of the partition.
Examples:
from itertools import count, takewhile
from unpythonic import partition_int, partition_int_triangular, rev
assert tuple(partition_int(4)) == ((4,), (3, 1), (2, 2), (2, 1, 1), (1, 3), (1, 2, 1), (1, 1, 2), (1, 1, 1, 1))
assert tuple(partition_int(5, lower=2)) == ((5,), (3, 2), (2, 3))
assert tuple(partition_int(5, lower=2, upper=3)) == ((3, 2), (2, 3))
assert (frozenset(tuple(sorted(c)) for c in partition_int_triangular(78, lower=10)) ==
frozenset({(10, 10, 10, 10, 10, 28),
(10, 10, 15, 15, 28),
(15, 21, 21, 21),
(21, 21, 36),
(78,)}))
evens_upto_n = lambda n: takewhile(lambda m: m <= n, count(start=2, step=2))
assert tuple(partition_int_custom(6, rev(evens_upto_n(6)))) == ((6,), (4, 2), (2, 4), (2, 2, 2))
As the first example demonstrates, most of the splits are a ravioli consisting mostly of ones. It is much faster to not generate such splits than to filter them out from the result. Use the lower
parameter to set the smallest acceptable value for one component of the split; the default value lower=1
generates all splits. Similarly, the upper
parameter sets the largest acceptable value for one component of the split. The default upper=None
sets no upper limit, so in effect the upper limit becomes n
.
In partition_int_triangular
, the lower
and upper
parameters work exactly the same. The only difference to partition_int
is that each component of the split must be a triangular number.
In partition_int_custom
, the components are given as an iterable, which is immediately forced (so if it is consumable, it will be completely consumed; and if it is infinite, the function will use up all available RAM and not terminate). Each component x
must be an integer that satisfies 1 <= x <= n
.
CAUTION: The number of possible partitions grows very quickly with n
, so in practice these functions are only useful for small numbers, or when the smallest allowed component is not too much smaller than n / 2
.
Added in v0.14.2.
Given a floating point number x
, return the value of the unit in the last place (the "least significant bit"). This is the local size of a "tick", i.e. the difference between x
and the next larger float. At x = 1.0
, this is the machine epsilon, by definition of the machine epsilon.
The float format is IEEE-754, i.e. standard Python float
.
This is just a small convenience function that is for some reason missing from the math
standard library.
from unpythonic import ulp
# in IEEE-754, exponent changes at integer powers of two
print([ulp(x) for x in (0.25, 0.5, 1.0, 2.0, 4.0)])
# --> [5.551115123125783e-17,
# 1.1102230246251565e-16,
# 2.220446049250313e-16, # x = 1.0, so this is sys.float_info.epsilon
# 4.440892098500626e-16,
# 8.881784197001252e-16]
print(ulp(1e10))
# --> 1.9073486328125e-06
print(ulp(1e100))
# --> 1.942668892225729e+84
print(ulp(2**52))
# --> 1.0 # yes, exactly 1
When x
is a round number in base-10, the ULP is not, because the usual kind of floats use base-2.
Stuff that didn't fit elsewhere.
Changed in v0.15.0. This utility now ignores unpythonic
's call helpers, and gives the filename from the deepest stack frame that does not match one of our helpers. This allows the testing framework report the source code filename correctly when testing code using macros that make use of these helpers (e.g. autocurry
, lazify
).
Added in v0.14.3.
Return the filename from which this function is being called. Useful as a building block for debug utilities and similar.
Added in v0.14.3.
Convenience function. Like issubclass(cls)
, but if cls
is not a class, swallow the TypeError
and return False
.
The default tuple
constructor accepts a single iterable. But sometimes one needs to pass in the elements separately. Most often a literal tuple such as (1, 2, 3)
is then the right solution, but there are situations that do not admit a literal tuple.
In such cases it is possible to use pack
:
from unpythonic import pack
myzip = lambda lol: map(pack, *lol)
lol = ((1, 2), (3, 4), (5, 6))
assert tuple(myzip(lol)) == ((1, 3, 5), (2, 4, 6))
Rename any function object, even a lambda. The return value of namelambda
is a modified copy; the original function object is not mutated. The input can be any function object (isinstance(f, (types.LambdaType, types.FunctionType))
). It will be renamed even if it already has a name.
This is mainly useful in those situations where you return a lambda as a closure, call it much later, and it happens to crash - so you can tell from the stack trace which of the N lambdas in your codebase it is.
namelambda
conforms to the parametric decorator API. Usage:
from unpythonic import namelambda
square = namelambda("square")(lambda x: x**2)
assert square.__name__ == "square"
kaboom = namelambda("kaboom")(lambda: some_typoed_name)
kaboom() # --> stack trace, showing the function name "kaboom"
The first call returns a foo-renamer, which takes a function object and returns a copy that has its name changed to foo.
Technically, this updates __name__
(the obvious place), __qualname__
(used by repr()
), and __code__.co_name
(used by stack traces).
CAUTION: There is one pitfall:
from unpythonic import namelambda, withself
nested = namelambda("outer")(lambda: namelambda("inner")(withself(lambda self: self)))
print(nested.__qualname__) # "outer"
print(nested().__qualname__) # "<lambda>.<locals>.inner"
The inner lambda does not see the outer's new name; the parent scope names are baked into a function's __qualname__
too early for the outer rename to be in effect at that time.
This is a small convenience utility, used as follows:
from unpythonic import timer
with timer() as tim:
for _ in range(int(1e6)):
pass
print(tim.dt) # elapsed time in seconds (float)
with timer(p=True): # if p, auto-print result
for _ in range(int(1e6)):
pass
The auto-print mode is a convenience feature to minimize bureaucracy if you just want to see the Δt. To instead access the Δt programmatically, name the timer instance using the with ... as ...
syntax. After the context exits, the Δt is available in its dt
attribute. The timer instance itself stays alive due to Python's scoping rules.
Added in v0.15.1.
Convert a duration from seconds (float
or int
) to a human-readable string of days, hours, minutes and seconds.
assert format_human_time(30) == "30 seconds"
assert format_human_time(90) == "01:30" # mm:ss
assert format_human_time(3690) == "01:01:30" # hh:mm:ss
assert format_human_time(86400 + 3690) == "1 day 01:01:30"
assert format_human_time(2 * 86400 + 3690) == "2 days 01:01:30"
Added in v0.15.1.
Simple but useful:
n = 1000
est = ETAEstimator(total=n, keep_last=10)
for k in range(n):
print(f"Processing item {k + 1} out of {n}, {est.formatted_eta}")
... # do something
est.tick()
The ETA estimate is automatically formatted using format_human_time
(see above) to maximize readability.
from unpythonic import getattrrec, setattrrec
class Wrapper:
def __init__(self, x):
self.x = x
w = Wrapper(Wrapper(42))
assert type(getattr(w, "x")) == Wrapper
assert type(getattrrec(w, "x")) == int
assert getattrrec(w, "x") == 42
setattrrec(w, "x", 23)
assert type(getattr(w, "x")) == Wrapper
assert type(getattrrec(w, "x")) == int
assert getattrrec(w, "x") == 23
Changed in v0.15.0. Now resolve_bindings
is a thin wrapper on top of inspect.Signature.bind
, which was added in Python 3.5. In unpythonic
0.14.2 and 0.14.3, we used to have our own implementation of the parameter binding algorithm (that ran also on Python 3.4), but it is no longer needed, since now we support only Python 3.6 and later. Now resolve_bindings
returns an inspect.BoundArguments
object.
Now tuplify_bindings
accepts an inspect.BoundArguments
object instead of its previous input format. The function is only ever intended to be used to postprocess the output of resolve_bindings
, so this change shouldn't affect your own code.
Added in v0.14.2: resolve_bindings
. Get the parameter bindings a given callable would establish if it was called with the given args and kwargs. This is mainly of interest for implementing memoizers, since this allows them to see (e.g.) f(1)
and f(a=1)
as the same thing for def f(a): pass
. Thanks to Graham Dumpleton, the author of the wrapt
library, for noticing and documenting this gotcha.
These are convenience functions providing an easy-to-use API for inspecting a function's signature. The heavy lifting is done by inspect
.
Methods on objects and classes are treated specially, so that the reported arity matches what the programmer actually needs to supply when calling the method (i.e., implicit self
and cls
are ignored).
from unpythonic import (arities, arity_includes, UnknownArity,
kwargs, required_kwargs, optional_kwargs,
resolve_bindings)
f = lambda a, b: None
assert arities(f) == (2, 2) # min, max positional arity
f = lambda a, b=23: None
assert arities(f) == (1, 2)
assert arity_includes(f, 2) is True
assert arity_includes(f, 3) is False
f = lambda a, *args: None
assert arities(f) == (1, float("+inf"))
f = lambda *, a, b, c=42: None
assert arities(f) == (0, 0)
assert required_kwargs(f) == set(('a', 'b'))
assert optional_kwargs(f) == set(('c'))
assert kwargs(f) == (set(('a', 'b')), set(('c')))
class A:
def __init__(self):
pass
def meth(self, x):
pass
@classmethod
def classmeth(cls, x):
pass
@staticmethod
def staticmeth(x):
pass
assert arities(A) == (0, 0) # constructor of "A" takes no args beside the implicit self
# methods on the class
assert arities(A.meth) == (2, 2)
assert arities(A.classmeth) == (1, 1)
assert arities(A.staticmeth) == (1, 1)
# methods on an instance
a = A()
assert arities(a.meth) == (1, 1) # self is implicit, so just one
assert arities(a.classmeth) == (1, 1) # cls is implicit
assert arities(a.staticmeth) == (1, 1)
def f(a, b, c):
pass
assert tuple(resolve_bindings(f, 1, 2, 3).items()) == (("a", 1), ("b", 2), ("c", 3))
assert tuple(resolve_bindings(f, a=1, b=2, c=3).items()) == (("a", 1), ("b", 2), ("c", 3))
assert tuple(resolve_bindings(f, 1, 2, c=3).items()) == (("a", 1), ("b", 2), ("c", 3))
assert tuple(resolve_bindings(f, 1, c=3, b=2).items()) == (("a", 1), ("b", 2), ("c", 3))
assert tuple(resolve_bindings(f, c=3, b=2, a=1).items()) == (("a", 1), ("b", 2), ("c", 3))
We special-case the builtin functions that either fail to return any arity (are uninspectable) or report incorrect arity information, so that also their arities are reported correctly. Note we do not special-case the methods of any builtin classes, so e.g. list.append
remains uninspectable. This limitation might or might not be lifted in a future version.
If the arity cannot be inspected, and the function is not one of the special-cased builtins, the UnknownArity
exception is raised.
Up to v0.14.3, various places in unpythonic
used to internally use arities
; particularly curry
, fix
, and @generic
. As of v0.15.0, we have started to prefer resolve_bindings
, because often what matters are the parameter bindings established, and performing the binding covers all possible ways to pass arguments. The let
and FP looping constructs still use arities
to emit a meaningful error message if the signature of user-provided function does not match what is expected.
Inspired by various Racket functions such as (arity-includes?)
and (procedure-keywords)
.
Consider this highly artificial example:
from collections import deque
inp = deque(range(5))
out = []
while inp:
x = inp.pop(0)
out.append(x)
assert inp == deque([])
assert out == list(range(5))
Popper
condenses the while
and pop
into a for
, while allowing the loop body to mutate the input iterable in arbitrary ways (we never actually iter()
it):
from collections import deque
from unpythonic import Popper
inp = deque(range(5))
out = []
for x in Popper(inp):
out.append(x)
assert inp == deque([])
assert out == list(range(5))
inp = deque(range(3))
out = []
for x in Popper(inp):
out.append(x)
if x < 10:
inp.appendleft(x + 10)
assert inp == deque([])
assert out == [0, 10, 1, 11, 2, 12]
Popper
comboes with other iterable utilities, such as window
:
from collections import deque
from unpythonic import Popper, window
inp = deque(range(3))
out = []
for a, b in window(2, Popper(inp)):
out.append((a, b))
if a < 10:
inp.append(a + 10)
assert inp == deque([])
assert out == [(0, 1), (1, 2), (2, 10), (10, 11), (11, 12)]
Although window
invokes iter()
on the Popper
instance, this works because the Popper
never invokes iter()
on the underlying container. Any mutations to the input container performed by the loop body will be understood by Popper
and thus also seen by the window
. The first n
elements, though, are read before the loop body gets control, because the window needs them to initialize itself.
One possible real use case for Popper
is to split sequences of items, stored as lists in a deque, into shorter sequences where some condition is contiguously True
or False
. When the condition changes state, just commit the current subsequence, and push the rest of that input sequence (still requiring analysis) back to the input deque, to be dealt with later.
The argument to Popper
contains the remaining items. Each iteration pops an element from the left. The loop terminates when, at the start of an iteration, there are no more items remaining.
The input container must support either popleft()
or pop(0)
. This is fully duck-typed. At least collections.deque
and any collections.abc.MutableSequence
(including list
) are fine.
Per-iteration efficiency is O(1) for collections.deque
, and O(n) for a list
.
Named after Karl Popper.