Parallel CSG engine

Dzhus

Short description: This project aims to deliver fast parallel Constructive Solid Geometry engine which uses accelerate/repa libraries for vectorised computations on GPU. Applications which require ray-body intersection detection will benefit from performance boost provided by usage of this engine.

Project goal and applications



Constructive Solid Geometry is the common approach to define complex
bodies in engineering applications, ray tracing engines and physical
simulators. CSG body is a recursive composition of primitive objects
or other bodies, with basic set operations (union, intersection and
complement) used to compose bodies. For example, we may define a body
as a half-space intersected with complement of a cylinder to produce a
hole in the body:

body = intersection [plane (1, 1, 1) 0,
                     complement $ cylinder (1, 1, 1) (0, 0, 0) 5]

To keep the model as generic as possible, primitives used are
infinite: plane defines a halfspace, cylinder is not capped etc.
Primitive objects are defined using their natural parameters like
position and radius for sphere; axis vector, point on axis and radius
for cylinder.

In different applications like particle simulations of gas flows
around complex bodies, or raytracing, it's desirable to calculate
points at which a ray intersects the CSG body, and get the normal
vector of surface at «hit point». This may be used to detect
collisions between gas particles and surface of object and calculate
post-collision velocity of a particle (using specular reflection model
or Maxwellian distribution).

A particle posessing a position in space and velocity may be mapped to
ray starting at its position, directed along the velocity vector (we
will further treat particles and rays for its trajectories as
interchangeable notions). By calculating ray intersection with body
and using normal vector, we may take further ray reflections into
account. By casting rays from viewport plane, we may raytrace the
whole scene, producing graphics from body definition.

CNC 3D printers are becoming more and more popular today. CSG may be
used to define the structure of printed body. Then, by casting rays
from plane above (or at the sides of) the object and calculating
distance to the first intersection with body surface, a height map of
body surface with desired resolution may be computed, allowing for
further translation into G-code, used by CNC engines to perform body
cutting or printing.

Proposed implementation design

Traces


There's some existing codebase available at
https://github.com/dzhus/dsmc/blob/master/src/DSMC/Traceables.hs,
which is being written for my thesis.

The primary notion used by the module is a trace of ray on body, which
is a list of segments of time values during which particle on the ray
is considered to be inside the object:

                    -----------
              ----/              \----
            -/                          \-
           /                               \        t
 ---*--[=====trace=====]--------->
           \                               /
            -\                          /-
              ----\              /----
                    ------------

(Better ASCII art is available at the beginning of Traceables module
in comments.)

`trace` function takes a particle (which, as shown above, may be
mapped to ray), a body and produces a trace.

trace :: Particle -> Body -> Trace

When body is a primitive object, analytical solution for ray-object
intersection is used (a lot of which are described in Graphics Gems
series), like finding two point of intersection of ray and a sphere,
or cylinder. Normal vectors of primitive are stored in thunks attached
to respective hit points. Segments may have one or two ends in
infinity, which means that ray never intersects the body, or that it
gets inside the primitive and never gets out (which may be the case
with halfspace primitive).

When body is a complex object composed using set operation, we derive
our approach from observation that trace for composition is a
composition of traces. For example, if trace for bodies B1 and B2 were
calculated as [(0, 5)] and [(3, 8)], respectively, then trace for
union of B1 and B2 is [(0, 8)] and for intersection it is [(3, 5)]. In
case of complement, the trace is complemented to [(-\infty, +\infty)]
and normals in hit points are flipped.

By applying these two rules of processing primitives and compositions
on every level of CSG tree, we and up having a single trace for given
ray on a top-level object.

This trace now may be intersected with [(0, dt)] to check if particle
collided with the body during previous small time step. For
ray-tracing, we check the right end of the top-level trace: if it's in
infinity, the ray never actually hit the body (assuming that all rays
start outside). Normal vectors attached at hit points are used to
calculate pixel color in raytracing, or post-collisional velocity in
particle simulations.


Simpler `inside` function is also a part of API, mapping particle and
body to Boolean value depending on whether the particle is inside the
body or not. Implementation of `inside` for compositions is trivial.
`inside` allows for implementing of marching cubes method atop of our
CSG, which produces polygonal model of complex body and may find use
in certain applications as well.

Traces approach is derived from rendering methods used for
visualizing CSG objects with GPU.

Vectorization


The project aims to vectorize tracing to use many particles at once:

trace :: [Particle] -> Body -> [Trace]

Intersecting of particle trajectory with geometrical primitive
requires a number of 3-vector operations: dot products, cross
products, subtractions, normalizing.

My goal would be to perform these operations in parallel for many
particles at once. If possible, this will be done with accelerate
library on GPU using CUDA or OpenCL.

In what ways will this project benefit the wider Haskell community?


Fast and versatile CSG library will bring performance boost to any of
abovementioned applications, as well as new application of parallel
libraries in Haskell will be provided, perhaps helping its authors to
test performance as implementations develop.

Approximate schedule


1. By the end of community bonding period, I plan to get deeper
understanding of strictness analysis and how to make fusion work for
my vectorised code, and get myself familiar with accelerate better, to
understand if I can use it (recursive nature of CSG may limit the
applicability). I will decide at which levels of engine I will use
accelerate and repa, and which backend of accelerate to use. CUDA is
provided by upstream, but I have no CUDA-compliant hardware. There's
experimental OpenCL backend for accelerate using hopencl library,
written by Martin Dybdal. If OpenCL backend is comparable in
performance to CUDA backend, I'll stick to it since OpenCL is more
portable. If not and the performance is way behind, or certain
features of accelerate are not supported, then I'll use CUDA and get
compliant GPU.

2. I plan to approach mid-term evaluation having started vectorizing
my tracing code. It should be clear how to properly implement
vectorized tracing for primitives and trace-composing operations, and
if composition steps are bottleneck, implement a better solution.
Laziness must be fought off by this time as well, with a better
solution for pushing normal vector information up the tree of bodies.
Certain additions to CSG primitive and operation set will arrive, like
cone primitive and difference operation (which may be defined as
intersection with complement, but a custom implementation may be more
efficient). Trace API is currently packed into Body record, which is
nice but not efficient enough when speaking of vectorized code, and
this will be resolved by this time to by switching to different
implementation for trace.

3. By the end of the project vectorized code must already provide
performance boost compared to original implementation. I will try to
use library in some project, like Gloam raytracer or the raycaster I
wrote for my thesis (see below). I think extra optimization may be
added to intersection detection code, like not considering rays which
do not belong certain part of space which encloses the primitive. CSG
operations will make this more difficult. The set of operations may be
extended as well, adding rotate or translate operators. Concurrent marching
cubes method using `inside` will be added if time allows.

I plan to track progress on every minor goal using public issue tracker,
like on Github.

My experience


I study applied mathematics at Bauman Moscow Technical University and
I'm currently writing my masters thesis on rarefied gas dynamics
simulation, I use Haskell for implementation. This project arouse from
the work I've been doing for my thesis, which is using Haskell for
implementing Direct Simulation Monte Carlo method. I use CSG to define
a complex body and boundary conditions must be considered using the
method proposed in this project. At some point I realized that the
same code may be used to raycast the body (and check that the
definition produces body which looks like what is expected).

I'm going to pursue PhD degree in Computer Science after graduating
and this project will give me much experience with bleeding-edge
developments in field of parallelism.

I have part-time employment experience in Haskell, writing the server
storage backend for customer relationship management system built atop
Snap web framework. I already contributed the code I wrote for this job to
Hackage: snaplet-redis, snaplet-redson, snap-auth-cli. The source is
available at GitHub as well.

I have long-standing commitment to open source community, using
GNU/Linux since 2006. In 2009, I've participated in Google Summer of
Code with Emacs/GDB integration project, which involved hacking on
Lisp code used in Emacs:
http://google-melange.appspot.com/gsoc/project/google/gsoc2009/dzhus/4001.
The codebase was improved and extended during the project and included
in Emacs 24.

I also try helping Gentoo Linux project occasionally. I enjoy free
software because it allows me to learn a lot, communicate with
interesting people. More of random code by me is available at http://dzhus.org/hg/

Other prior art

There're several other relevant projects in the field:

- Ben Lippmeier provided a parallel real-time ray-tracer using REPA
  library: http://www.youtube.com/watch?v=jBd9c1gAqWs, which uses
  gloss-field vectorization of pixel color calculation. My aim would
  be to consider complex CSG objects, use GPU and support
  non-raytracing applications.
                  
- Hpysics parallel physical engine, also done as GSoC project by Roman
  Cheplyaka, aimed at polyhedra collision detection with DPH. The
  geometry model used by this project is different, with emphasis to
  using a compositions of basic geometry primitives.

- Implicit library designed specifically for CAD applications with CSG
  approach as well. To export the defined CSG bodies, this library
  uses marching cubes methods for building isosurface of scalar field
  describing the CSG shape. The aim of proposed object is to use
  analytical ray-primitive intersection calculations instead.

- Glome is a ray-tracer written in Haskell. Currently Glome utilizes
  its own ray-body intersection code, which is not parallel. The
  project would allow integrating Glome with a faster engine to speed
  up scene rendering considerably.

Community interaction


I am quite comfortable with mailing lists and prefer public
discussions. I will send status reports to my mentor and ask questions
on related mailing lists, as well stay on #haskell. I asked for ideas
at haskell-cafe@ and #haskell channel, I also asked Ben Lippmeier for
tips about my current implementation of CSG engine and it was him who
prompted me to apply for GSoC with this project. Discussing the design
of library will be important before coding starts. I'm open to
suggestions and I'm inclined to learn from this project and get
feedback about my code from more experienced hackers.

I graduate in June and this means that roughly till mid-term
evaluation I will be able to dedicate slightly less time than after
the mid-term (30-35 hrs per week). After that I will be able to work on
this project full 40 hours a week.

In development I value agility, small commits and incremental changes.
I plan to roll fresh code to public as soon as it becomes available to encourage
wider review, to get more feedback and possibly testing.

I can be contacted through mail dima@dzhus.org. My timezone is UTC+4.