So, You Want To Write a Raytracer?
Introduction
I have written quite a large amount about writing diffraction propagation and other physical optics software. I had some down time during an explosive COVID wave this winter, and decided I would spend some of it writing a ray tracer.
A major weakness of diffraction modeling is that it cannot account for what really happens when optics are misaligned, and is usually pre-fed by linear sensitivity matricies which have unknown domains of validity and leave out the fun of things like changes in pupil aberrations. It seems natural that the highest fidelity of modeling is a hybrid approach which utilizes raytracing to figure out what really happened, then uses wave propagation to accurately capture it.
Before anyone gets too excited, I’m not out to eat optical design software’s lunch. This is intended to be an analysis tool. It’s actually fairly zippy for something with a core banged out in a 1 FTE x 1 day or so. It’s not actually very slow – 2.5 million ray-surfaces per second on a laptop CPU, with full GPU compatability for > 1000x faster speeds at scale. Not bad for pure python.
I have long found two resources to be seminal on ray-tracing: the Finite Raytracing chapters in Welford’s Aberrations of Optical Systems, and Spencer & Murty’s 1961 JOSA paper, General Ray-Trace Procedure. The Spencer & Murty paper has a more modern or approachable syntax, despite being 20 years older. And it labors less over “special” types of surfaces, which dominate the pages of Welford’s chapters. Just be mindful of Horwitz’ 1991 correction for the fourth step. This post will be something of a tutorial on implementing their paper, so turn away now or break out your highlighters and notebooks. I’ll also suggest the source code of Mike Hayford’s ray-optics and Gavin Niendorf’s TracePy, if you want to look over things written by folks whose brains work a bit different to mine.
Before going further, a ray is described by its position and direction cosines. The latter essentially describe the direction in which it will travel in the future, or the “orientation” of the ray.
- (optional) Translate and rotate the ray from a global coordinate system to the surface’s local coordinate system
- Intersect the ray with the surface
- Reflect or refract the ray
- (optional) return to the global coordinate system
Global to Local Coordinates
To perform the first step is easy,
Ray-Surface intersection
Ray intersection relies on some special tricks S&M have come up with. First, their peculiar definition of an optical surface,
Computing these derivatives is laborious for more complex surface geometries. S&M give an answer for conics, but their definition of a conic is different to the colloquial one, so I don’t suggest using theirs. The chain and product rule, done by hand, are how I did it for every polynomial in prysm you’d care about (including Zernikes and all three kinds of Q polynomial).
Given those two capabilities, begin from
At the end of these iterations, the point which intersects the surface is
Reflection and Refraction
Reflection is simple,
No iteration is required, and the surface intersection function will give us the final value of
S&M solve refraction by using iteration. It’s not immediately obvious to me why they chose to do that, since it’s fairly expensive to do so. Instead here I’ll just show the final answer after working through the vectorial form of Snell’s law to the final answer,
The final step contains an error in S&M (Horwitz 1991) and should be written as the first step in reverse,
With those things done, you have completed an implementation of a raytracer. And, like me, you will find it takes about 50 microseconds per ray-surface – 20,000 ray-surfaces per second. But didn’t I say 2.5 million?
Performance
The fantastic thing about numpy is that it allows you to write code that very clearly implements numerical algorithms that operate on arrays. And, if those arrays are big enough, it’s at least as fast as a totally expert numerics programmer in C++ or Fortran. After all, a hundred of those people have been optimizing Numpy for 20 years. Just try to using Eigen or hand rolled code, You will be amazed how hard it is to even hit par. Or, like me you can write an experiment in a better compiled language than C++, and find that you can do twice as fast as numpy in exchange for having no access to plotting libraries or an easy way to call the code from an environment anyone wants to use it from. Twice as fast is still 500x slower than the GPU using Cupy, which requires no source code changes to the python. So you can imagine which I’m more interested in.
Anyway. If you have to do many tiny calculations – and the 3 element vectors here are tiny – you can actually batch them with numpy, paying the python tax only once for all of them. For example, if you want to do the matrix-vector product between np.einsum('ij,ij->i', S, R)
. The syntax is cryptic.
Inside the intersection, you simply need to keep a mask around of which rays have/have not converged and only iterate those rays which have not converged. This is implemented in the source code of prysm. I won’t explain it here, since it’t just bookkeeping. This is the biggest tax paid, since for the first time it invents work done purely for the parallel processing, which wouldn’t exist in a simple for loop in a compiled language.
The API is the same for batch raytraces and for single raytraces, simply give P and S to the raytrace function with a prescription. If P has one dimension it’s a scalar ray trace, and if it has two it’s a batch raytrace. Internally a scalar raytrace is expanded to an N=1 batch raytrace so the code can be the same.
I did a deep dive I wont’ blabber about here, and found that even with multithreading I can only go 2-3x faster than the python code single threaded by parallelizing it, which wasn’t very complicated. So to scale to lots of cores, boy oh boy do you need lots of memory bandwidth (hello, DDR5?) At least the Nvidia A100 has checks notes 2TB/s of memory bandwidth, 40x that of my laptop.
Creature Comforts
No raytrace is any good without the ability to make ray fans or grids, aim rays, draw raytrace plots, or do basic solves like for the paraxial image location.
A PIM solve can be implemented by tracing rays near the paraxial regime along the x=0 or y=0 spines and finding where they intersect each other. This is robust in ways that tallying up element powers and object locations is not. Simply trace one pair of rays in Y and one in X and intersect the two pairs. The average of those two locations is the paraxial image location.
Matplotlib is happy to draw line groups by giving it column vectors, so we just unpack the final axis of the history of ax.plot(phist[..., 2], phist[..., 1])
for a Z-Y plot. It’s even pretty fast (1.3 sec for 4096 rays, 25 ms for 32 rays). Generating ray grids is just a little front-end API to making 1D arrays of X,Y,Z and stacking them with mirrored direction cosines (which begin as
All and all, this is about a thousand lines of code for a ray tracer with asymptotic performance similar to the state of the art, with most of the needed creature comforts. Supporting new surface geometries is less than ten lines of code each by using the rich polynomials module of prysm and its coordinate centric design (as opposed to npix).
All in all, pretty nice.
I’ll close this with a piece of art, made by tracing a few thousand rays and plotting them with some transparency to show the caustics. It’s simply a concave mirror with a very large conic constant (-50). The caustic makes the appearance of Cassegrain family telescope with only one mirror.