OpenSCAD 3D rendering just got an order of magnitude faster. Here's how and what may come next.

TL;DR: OpenSCAD users: download a nightly build & enable fast-csg in settings for 10x faster render (YMMV). Make your models more ambitious and report issues / successes here!

Note: opinions expressed here are my own.

OpenSCAD = 3D CAD for developers
CGAL Nef Polyhedra + GMP = so precise it hurts (performance)
Just concatenating meshes, huh?
CGAL’s other hidden gem: corefinement functions
Great results already
What about multithreading? Or skipping operations altogether? 🙃
Rewriting trees to increase laziness and parallelizability
What’s next?
Thanks!

OpenSCAD = 3D CAD for developers

OpenSCAD is a popular open-source design tool for 3D printing afficionados (and others). It’s essentially a CAD software for programmers with a minimalist UI. A simple declarative programming language defines Constructive Solid Geometry (CSG) operations like unions, intersections, differences, which can be parameterized with loops and variables.

Last year I grew fond of it to design ever more complex models, but started running into limitations: while interactive rendering was fine, the final rendering (creating the STL files to give to the 3D printing slicer) was horrendously slow. Slow enough to prevent me from doing what I wanted, which was to generate large, varying scalemail patterns:

a very slow model to render

But something was off: my slicer software (Cura) was able to handle 10x10 grids of my scalemail pattern without breaking a sweat, so why was OpenSCAD so slow? So after investigating alternatives (wasn’t keen to get locked in a certain popular but proprietary CAD software with huge a learning curve and high prices after its 1 year hobbyist licenses), I dove into OpenSCAD’s codebase and tried to understand what it was up to.

CGAL Nef Polyhedra + GMP = so precise it hurts (performance)

As it turns out, OpenSCAD is using the Computational Geometry Algorithms Library (CGAL) for its CSG operations. That amazing library allows for completely generic numeric types, and the type OpenSCAD is using is some voodoo magic “exact” numeric type based on the GNU Multiple Precision Arithmetic Library (GMP).

More specifically, OpenSCAD is using 3D boolean operations on Nef Polyhedra, which provides extrely robust and accurate results, two welcome properties in the 3D printing hobby, but also is extremely complex and slow to run (especially with those exact underlying number abstractions).

Just concatenating meshes, huh?

But surely I thought, if we just need to assemble a grid of my interlocking patterns without any intersection, it’s just a matter of concatenating the meshes, no special operations needed.

Even for unions, many geometrical operations are required when overlaps / intersections require creating new points, new edges, so as to maintain the topological soundness (manifoldness) of solids so they can be 3D-printed later, etc.

So I started exploring the “fast-union” route (openscad#3636), in which I’d literally concatenated meshes if I could determine that they can’t actually intersect with each other based on their bounding boxes. This worked great for some models, but ran into weird performance for others. Searching for good operands to union together is important for many models, e.g. in a grid of overlapping tiles you can skip a tile to do a fast-union but two consecutive tiles would need a proper union. That search neededs to be bounded (otherwise would be in quadratic time of # of operands; randomizing the operands would kind of work well enough), the bounding box operations need to be lighter than the geometry itself (e.g. not true when unioning a swarm of cubes), many other things needed to be taken into account.

I’ve paused this track for now but might explore it again.

CGAL’s other hidden gem: corefinement functions

As some of the extremely friendly and helpful OpenSCAD project maintainers ❤️ pointed out on IRC, there was another option in stock: faster operations using the Polygon Mesh Processing CGAL package (authored by sloriot@) and its “corefinement” functions. For that I had to introduce Surface_mesh (after trying w/ less efficient Polyhedron_3) into OpenSCAD.

Corefinement works great in some cases, crashes (sometimes in unrecoverable ways) or politely fails in others (for instance it dislikes non-manifold inputs, or inputs that share edges / vertices). So I ended up creating a hybrid of Surface_mesh corefinement and Nef operations, reverting to the slow safety of Nefs when we detect it’s likely unsafe to use corefinement (openscad#4087). My goals was to provide the maximum possible backwards compatibility with all the existing model files out there.

Great results already

The result: 10x faster or more for my models 🎉🎉🎉

And up to 100x if you remove the aforementioned safety checks with the fast-csg-trust-corefinement feature (openscad#4101.

You should test it yourself (grab a nightly build; also hopefully we get some official benchmark suite soon, see discussion in openscad#3931).

What about multithreading? Or skipping operations altogether? 🙃

So I was getting there, performance-wise, but my models were still too slow to really scale up in complexity.

It was quite obvious from the start that OpenSCAD’s rendering is mono-threaded, so I tried multithreading it (as others had done here and there, which I didn’t know yet) but found it hard to get good speedups because of the tree structure of lots of models. Dependencies limit the amount of parallel operations that can be performed.

I then discovered that new experimental lazy-union feature (openscad#350, in nightly builds since a year) which essentially leaves the final top-level union up to the slicer to do. This can be a big deal if your model is just a top-level concatenation of very complex models, as it literally takes no time at all to (not) do that top-level union.

BUT, as soon as modules are involved, or for loops, or transforms, etc, lazy-union’s benefits are out of reach and OpenSCAD does perform actual unions, which may be 100x faster with fast-csg, but still not fast enough to scale my scalemail to 100x100 grids :-D

Rewriting trees to increase laziness and parallelizability

So the other promising approach is rewriting the CSG tree to something that provides more lazy-unioniable top-level operands, and also increases the arity of each individual CGS operation, especially those that can be parallelized (e.g. union and intersection).

union() {
  a();
  translate([0, 0, 1])
    union() {
      color("red") square(1);
      color("green") sphere(1);
      translate([1, 0, 0]) sphere(1);
    }
}

The result of flattening this tree is to have a top-level union that can be skipped w/ lazy-unions:

a();
color("red") translate([0, 0, 1]) square(1);
color("green") translate([0, 0, 1]) sphere(1);
translate([1, 0, 1]) sphere(1);

So there you go: rewrite-tree (incubated in this branch) completely drops the amount of time it takes to render this from 1min to 400ms (rewrite-tree + lazy-union). It would have taken 2sec with fast-csg. You can even go to N=10 (100 spheres) in 4 seconds, which wouldn’t complete in reasonable time without the new options (you’d then typically reduce $fn=50 or use other tricks).

$fn=50;
for (i=[0:N-1]) translate([i,0,0])
  for (j=[0:N-1]) translate([0,j,0])
    sphere(d=1.1);

What’s next?

Please download a nightly build and (stress) test these features (enable in the settings, or in CLI with --enable=fast-csg), file bugs (or simply report any success on the umbrella PR). That’s how open-source software gets better 🙏.

Roadmap to making OpenSCAD lightning fast:

Introduce corefinement CSG operations instead of Nef polyhedra (fast-csg feature in Nightly builds, openscad#4087; initial discussions in openscad#3641)
Trust corefinement to handle more cases (fast-cst-trust-corefinement feature, openscad#4101). Likely to get more crashes here.
Investigate non-lazy CGAL kernels. Currently need fast-csg-exact feature to avoid some extreme cases.
Experiment with CGAL’s upcoming simplification routines (cgal#5461). Corefinement currently produces more triangles than Nefs, which can make some models larger and even slower.
Enable fast-csg (and maybe fast-csg-trust-corefinement) by default in the next OpenSCAD release unless blockers are found.
Introduce rewrite-tree feature to vastly expand cases covered by lazy-union.
Ensure unions and intersections are parallelized to some level, maybe using one of the existing experiments.

Thanks!

Huge thanks to the dedicated team of OpenSCAD maintainers for their endless patience & support (special shoutouts to t-paul, thehans, redlizard, rcolyer and MichaelAtOz) and CGAL corefinement guru sloriot for providing the magic (and prompt support) to make the magic happen.

Oh and thank you for reading!

Please add any comments to this gist (there are also comment threads on reddit, hacker news and on the openscad mailing-list).

Follow ochafik@ on Twitter to follow my next experiments!