Reflections on prysm, part 03: inferior composability

This is the third in a series of posts that take a look at my python library, prysm. See the first post for more introduction.

This one will focus on composability of the library, which is how the pieces fit together.

prysm was written in a heavily factored way – the library is spread across nearly 30 python source files, and is divided into reusable blocks. That design is, I think, very good. Functions that deal with geometry should not be adjacent to ones for optical propagations, as they are completely unrelated. However, multiple imports in python are not particularly ergonomic, perhaps because the tooling/environments do not do automatic imports well. There is a lot more appeal in

from prysm import NollZernike

pu = NollZernike(...)
pu.rms

than

from prysm.zernike import NollZernike
from prysm.util import rms

pu = NollZernike(...)
rms(pu.phase)

I will ignore star imports. The actual code differs very little in a practical sense, only requiring the user to know that the phase attribute exists.

This results in every class you may ever want to know the RMS of having an rms property. This design was intentional and, to the person who made it (who is also writing this post), obvious.
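The mechanism behind this is ordinary inheritance of a property; a minimal sketch of the idea, with hypothetical class names that are not prysm's real hierarchy:

```python
import math

class RMSMixin:
    # one-liner property shared via inheritance; assumes subclasses
    # expose their numerical data as a `phase` sequence
    @property
    def rms(self):
        return math.sqrt(sum(v * v for v in self.phase) / len(self.phase))

class Pupil(RMSMixin):
    def __init__(self, phase):
        self.phase = list(phase)

p = Pupil([1.0, -1.0, 1.0, -1.0])
p.rms  # 1.0
```

Every subclass gets .rms for free, but nothing at the call site reveals where the property came from.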

Yet users are continuously surprised that they can just type “.rms” on just about anything and get the number they want.

This tells me that the mountain of inheritance in prysm is confusing, and users often do not know what is or is not implemented.

RMS and other properties are largely implemented as one-liner functions and do not contribute much code through their inheritance. Other functions do. For example, we can compare the functions available on two types that are predominantly the same:

from prysm import Interferogram, FringeZernike

set(dir(Interferogram)) - set(dir(FringeZernike))
{'bandlimited_rms',
 'crop',
 'dropout_percentage',
 'fill',
 'filter',
 'fit_zernikes',
 'from_zygo_dat',
 'latcal',
 'mask',
 'pad',
 'psd',
 'pvr',
 'recenter',
 'remove_piston',
 'remove_piston_tiptilt',
 'remove_piston_tiptilt_power',
 'remove_power',
 'remove_tiptilt',
 'render_from_psd',
 'save_zygo_ascii',
 'spike_clip',
 'strip_latcal',
 'total_integrated_scatter'}

set(dir(FringeZernike)) - set(dir(Interferogram))
{'__add__',
 '__sub__',
 '_cache',
 '_name',
 'barplot',
 'barplot_magnitudes',
 'barplot_topn',
 'build',
 'fcn',
 'from_interferogram',
 'magnitudes',
 'names',
 'strehl',
 'top_n',
 'truncate',
 'truncate_topn'}

We’ve got 23 methods in one direction and 16 in the other which are not common. If most of the library was written in free functions instead of class methods, it would be more data oriented. I think z = FringeZernike(...); rms(z.phase) is less ambiguous than z.rms, too.

Discoverability

Composability and discoverability are two sides of the same coin. Code is composable when you can use A with B. Code is discoverable when you know you can use A with B.

The system used in prysm has led to inferior discoverability. Go and Julia have somewhat opposite approaches to this problem. In Go, you can write code like this:

type Foo interface {
    Bar(string)
}

type Baz struct {
    a string // content unimportant
}

func (b *Baz) Bar(inp string) { b.a = inp }

func Set(iface Foo, arg string) {
    iface.Bar(arg)
}

Set(&Baz{}, "abcd")

The struct Baz has a method Bar with a pointer receiver. A pointer receiver is used on a reference to a struct, which “enables” mutation by making it visible to the caller. If the receiver were not a pointer, the code would work but the caller would not see the changes since Go copies function arguments.

Baz (technically, *Baz) implements Foo, so we can pass it to Set. The compiler checks this for us, and we do not need to make explicit that Baz implements Foo with any symbols (like Baz : IFoo in C#).

I think this works well in Go because the compiler tells you about invalid usage, and the language tooling enables that feedback in realtime within your editor. You could not call Set(Baz{}, "abcd"), for example. Three interfaces from the io package in stdlib are ubiquitous:

type Reader interface {
    Read(b []byte) (int, error)
}

type Writer interface {
    Write(b []byte) (int, error)
}

type Closer interface {
    Close() error
}
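As a python aside (a comparison of my own, not anything prysm does): typing.Protocol gives a similar implicit, structural relationship, checked by static tooling rather than a compiler. A sketch with hypothetical names:

```python
from typing import Protocol

class Foo(Protocol):
    def bar(self, inp: str) -> None: ...

class Baz:
    def __init__(self):
        self.a = ""

    def bar(self, inp: str) -> None:
        self.a = inp

def set_value(iface: Foo, arg: str) -> None:
    # mypy/pyright verify that the argument structurally satisfies Foo;
    # no explicit "Baz : IFoo"-style declaration is needed
    iface.bar(arg)

b = Baz()
set_value(b, "abcd")
b.a  # "abcd"
```

The difference from Go is that nothing complains at runtime if the checker is never run.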

Julia takes the opposite approach. It uses multiple dispatch, which is ~= method (and operator) overloading in python. This results in the same fundamental behavior as Go, except the syntax is inverted. Where you might do

conn, err := net.Dial(...)
// ...
conn.Close()

in Julia, you would do

conn = Dial(...)
close(conn)

I think the inversion is less discoverable. I think the association of Close to conn is easier to wrap the mind around than “the close function has been overloaded to understand conn.”
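Python's closest built-in analogue to that overloading, for comparison (again my own illustration, not prysm's API), is functools.singledispatch:

```python
from functools import singledispatch

@singledispatch
def close(obj):
    raise NotImplementedError(f"close is not overloaded for {type(obj).__name__}")

class Conn:
    def __init__(self):
        self.open = True

@close.register
def _(conn: Conn):
    # the close function has been "overloaded to understand" Conn
    conn.open = False

c = Conn()
close(c)
c.open  # False
```

Reading close(c), nothing tells you which overload fires, or that one exists at all; that is the discoverability cost of the inverted syntax.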

Coming back to prysm, in Go it might make sense to define N one-line wrappers around some stats.RMS function, or in Julia to overload rms N times. I think those are all less clear than rms(foo.data). The API is smaller, too, which means less code to write or read. In the case of Julia it would also mean faster compile times due to less overloading.

There is less “sex appeal” in doing things that way. I think, probably, the less sexy way is better. Or at least easier to learn.

There may be a natural tension between fluent APIs, which are highly usable and efficient for their expert users, and “explicit” ones which are faster to learn. It is not difficult to learn how Go’s typing works, but I think that has to do with quality of tooling. Dynamic languages like python, home of prysm, or Julia, simply do not have tooling of the same caliber.

Technically, I could write def rms(anything): return _rms(anything.data), but I think it is unclear what that works for, and users seem to not explore naturally.
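A runnable sketch of that duck-typed free function, with hypothetical names that are not prysm's actual API:

```python
import math

def _rms(arr):
    # core routine: plain sequence in, scalar out
    return math.sqrt(sum(v * v for v in arr) / len(arr))

def rms(anything):
    # works for any object exposing a `.data` attribute, but the
    # signature gives no hint about which types qualify
    return _rms(anything.data)

class Interferogram:
    def __init__(self, data):
        self.data = data

rms(Interferogram([3.0, -3.0]))  # 3.0
```

The function itself is trivial; the problem is that nothing advertises the set of types it accepts.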

Brandon Dube
Optical Engineer