Two and a half years after the 1.0 release, 2.0 is finally out. Although I have not implemented each and every feature detailed in the 2.0 roadmap, the 2.0 milestone is 95% complete. This project is above all a learning experience for me and, during these 30 months (almost an era in the world of computer graphics), I have learned that some of XRT's original design ideas are now obsolete and must be reviewed. Because I feel it is more efficient to implement the missing 5% on a stronger code base, it is high time to move to greener pastures. I'll detail the 3.0 roadmap in a future post.
The image of the day is a 2-million-particle system generated with a Python RenderMan procedural from a "strange attractor" equation. The particle hues are defined by the motion speed along the attractor curve (the faster, the warmer). This example (now bundled with the XRT examples archive) is derived from work done by a student of Prof. Malcolm Kesson at the Savannah College of Art and Design.
For the mathematically inclined, and just for the pleasure of writing a few LaTeX formulas, the attractor is a Polynomial type A whose equation is given in (1).
You will find more attractors at Chaoscope.org.
Posted: 29 Nov 2012 22:21
Tags: atmosphere gallery shader
This post is a sort of follow-up to the previous one. Although "Production Volume Rendering" only deals with voxel buffers, reading it has inspired me to improve XRT's volume rendering. For now, XRT's capabilities are based on RenderMan 3.2 atmosphere shaders1.
How atmosphere shaders work
This part is lifted from Pixar Application Note #20, available here.
Atmosphere shaders are bound to surfaces, just like surface or displacement shaders. So you must have an object in the background for the atmosphere shader to run. In other words, pixels with no geometry at all "behind" them will not run any atmosphere shaders.
The general idea behind the smoke effects is to ray march along the incident ray I, sampling illumination and accounting for atmospheric extinction. Typically, this is done with the following algorithm:
Choose an appropriate step size for marching along the ray.

    total_len = length(I)
    current_pos = P
    while total_len > 0 do:
        sample the smoke density and light at current_pos
        adjust Ci/Oi to add new light and extinguish due to smoke opacity
        current_pos += stepsize * normalize(-I)
        total_len -= stepsize
    endwhile
Volume shaders of this type can be very expensive. The computational expense is proportional to the number of iterations of the while loop, which is determined by the step size and the length of I. Therefore, it is important to choose your step size carefully: too large a step size will result in banding and quantization artifacts, while too small a step size results in very long render times.
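The loop above can be sketched in Python with a scalar standing in for the color and hypothetical density_at/light_at callbacks standing in for the shadeops; this is an illustration of the algorithm, not XRT's implementation:

```python
import math

def raymarch(P, I, stepsize, density_at, light_at):
    """March from the surface point P back toward the eye along -I,
    accumulating in-scattered light (Ci) and opacity (Oi)."""
    total_len = math.sqrt(sum(c * c for c in I))
    d = tuple(-c / total_len for c in I)   # normalize(-I)
    Ci, Oi = 0.0, 0.0
    current_pos = P
    while total_len > 0:
        rho = density_at(current_pos)                   # smoke density sample
        tau = 1.0 - math.exp(-rho * stepsize)           # opacity of this slice
        Ci += (1.0 - Oi) * tau * light_at(current_pos)  # add light not yet occluded
        Oi += (1.0 - Oi) * tau                          # extinguish
        current_pos = tuple(p + stepsize * c for p, c in zip(current_pos, d))
        total_len -= stepsize
    return Ci, Oi
```

With a zero-density medium the result is fully transparent, and opacity saturates toward 1 as density or path length grows, which is exactly the extinction behaviour the Application Note describes.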
A smarter shader
This example, borrowed from the Gelato example set, features spinning gears in fog lit by a spotlight. Because there are holes and dents in the gears, parts of the fog are either obscured or lit. This is the famous "god rays" effect.
On this kind of scene, the smoke shader example that comes with Application Note #20 performs badly. The scene is quite large and there are many small details that require a small step size to be properly captured. However, you will get a huge speed boost if you realize that the space outside the spot shape is not lit and does not need to be raymarched. If the shader is passed information about the spot shape (a cone here) and its orientation, it can compute much tighter bounds for the raymarching algorithm and avoid useless steps in the dark void. First, the volume ray origin and direction are transformed into the spot canonical space; then, the new volume ray is intersected against the canonical cone shape (a mere second-degree equation to solve).
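Here is a Python sketch of that canonical-space intersection, assuming a cone with its apex at the origin and axis along +z (the conventions are illustrative; a production version would also clip each root against z >= 0 and the cone height):

```python
import math

def cone_march_bounds(o, d, half_angle):
    """Return the (t_near, t_far) range where the ray o + t*d crosses
    the canonical cone x^2 + y^2 = (z * tan(half_angle))^2, or None if
    the lit volume is missed entirely."""
    k2 = math.tan(half_angle) ** 2
    # Substitute the ray into the cone equation: a*t^2 + b*t + c = 0.
    a = d[0] ** 2 + d[1] ** 2 - k2 * d[2] ** 2
    b = 2.0 * (o[0] * d[0] + o[1] * d[1] - k2 * o[2] * d[2])
    c = o[0] ** 2 + o[1] ** 2 - k2 * o[2] ** 2
    disc = b * b - 4.0 * a * c
    if abs(a) < 1e-12 or disc < 0.0:
        return None  # ray parallel to the cone surface, or no real roots
    s = math.sqrt(disc)
    t0, t1 = (-b - s) / (2.0 * a), (-b + s) / (2.0 * a)
    t0, t1 = min(t0, t1), max(t0, t1)
    if t1 < 0.0:
        return None  # cone entirely behind the ray origin
    # Only march the [t0, t1] span instead of the whole ray length.
    return max(t0, 0.0), t1
```

Everything outside the returned span is known to be unlit, so the raymarcher simply skips it.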
You will get a better grasp of the "god rays" effect in the following animation:
You will also find this example and the companion video in the Gelato gallery.
Posted: 13 Nov 2012 19:54
Tags: book pvr review
Finding your way through the computer graphics literature jungle is hard: for most subjects, you will find a plethora of papers, all claiming to bring forward a decisive breakthrough. Most of the time, you will find that the brand new technique does not fit into existing rendering architectures, gobbles gigabytes of memory, or addresses only a subset of your problem. Building a consistent rendering system is a difficult task.
"Production Volume Rendering: Design and Implementation", written by Magnus Wrenninge, a technical director at Sony Pictures Imageworks, is an attempt to clear the mess for a specific domain: volumetric rendering techniques. It does not try to describe all the rendering techniques available in the literature; instead, it focuses only on techniques used in the visual effects industry by people who need to deliver on budget and on time. Following the path of "Physically Based Rendering", it provides a complete rendering system (PVR) and centers the book around the source code.
The architecture Wrenninge advocates is deceptively simple: modeling tools fill voxel buffers, which renderers then raymarch. Of course, the book gives a lot more detail but, in the end, it does not get much more complicated than that.
The book is divided into three main parts: basics, volume modeling, and volume rendering. It may seem strange to discuss modeling in a rendering book but, while creating a polygon soup is quite obvious, building convincing volumes is not for the casual user. After all, one must fill these damn voxel buffers!
I have mixed feelings about this book. In all three parts, the technical content is really excellent and interesting (I particularly liked the chapters on the theoretical foundations of raymarching and on phase functions). The design choices are well explained and the example images (all computed with PVR) are a good proof of their validity.
However, the comments on the companion source code frequently run long when they address topics that have more to do with programming than rendering. Even worse, they are sometimes redundant (for instance, the various discussions on attributes)1. The modeling part, even to my naïve eyes, is oversimplified: there is no mention of fluid dynamics, for example.
Finally, this book lacks a concluding chapter: clues for the reader to extend and improve the system, hints for creating animations, something to give the reader a compelling need to go beyond.
To summarize, the book could have been better, but you will really get valuable information from it.
- http://magnuswrenninge.com the author's website.
- ProductionVolumeRenderingFundamentals2011.pdf from the Siggraph 2011 courses: this document will give you an accurate (though abridged) idea of the book contents.
- ProductionVolumeRenderingSystems2011.pdf, a companion document written by various studio people that describes the tools and techniques they use.
- https://github.com/pvrbook/pvr contains the source code for PVR, the rendering system described in the book and the source for the example images.
- http://code.google.com/p/smoke3d features a smoke simulator and renderer. Quite interestingly, it follows PVR design guidelines although it was released long before the book.
Posted: 19 Sep 2012 07:46
This example from the Advanced RenderMan book has long been a problematic render for XRT. I am happy to say that I finally addressed the remaining issues. This scene is now included into the XRT example scenes archive. There are only a few primitives but the shaders are quite complex and challenging. Aside from the flashy lensflare effect, look at the subtle blue atmosphere surrounding the planet.
As it was the last remaining item from the book examples, it was time to put a fresh coat of paint on the Advanced RenderMan gallery. The layout has been improved and the number of pictures has been greatly expanded. For good measure, I have even tried to recreate some of the book pictures for which the RIB files were not provided. Hope you will enjoy them!
|Catmull-Clark hydra from Sitex Graphics's Air examples|
Catmull-Clark subdivision surfaces …
The major feature in XRT 1.5.0 is a new geometric primitive: Catmull-Clark subdivision surfaces. Except for texturing, everything you would expect from it is supported: creases and corners, whether they are sharp or smooth, holes and boundaries. For a definition of these terms, please refer to my previous posts on the subject (basics, corners and creases, holes and boundaries).
You have certainly noticed that these posts are more than one year old and, indeed, the current implementation has been sitting on my hard drive since then. I could pretend that I have been distracted by the implementation of other appealing features, but the real reason is that I am not happy with the result: it's kinda slow (maybe my expectations were too high …), it suffers from tessellation artifacts, and it's all my fault!
The intersection algorithm recursively refines the surface until the resulting patches are flat or small enough. The problem lies in the stopping criterion. It should be computed using derivative information, which XRT's current design is not able to provide1. So, the recursion is stopped at an arbitrary subdivision level.
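A one-dimensional Python analogue of the dilemma, with a hypothetical eval_curve callback: the chord-deviation test plays the role of the flatness criterion, and max_depth is the arbitrary cap the current implementation falls back on:

```python
def subdivide(eval_curve, t0, t1, flatness, depth, max_depth):
    """Recursively split [t0, t1]; stop when the chord approximates the
    curve within `flatness`, or when the arbitrary depth cap is hit."""
    p0, p1 = eval_curve(t0), eval_curve(t1)
    tm = 0.5 * (t0 + t1)
    pm = eval_curve(tm)                                 # true midpoint
    mid = tuple(0.5 * (a + b) for a, b in zip(p0, p1))  # chord midpoint
    err = max(abs(a - b) for a, b in zip(pm, mid))      # chord deviation
    if err <= flatness or depth >= max_depth:
        return [(t0, t1)]
    return (subdivide(eval_curve, t0, tm, flatness, depth + 1, max_depth)
            + subdivide(eval_curve, tm, t1, flatness, depth + 1, max_depth))
```

With a screen-space flatness tolerance the segment count adapts to the zoom factor; with only the depth cap, the same surface is over-subdivided at some scales and under-subdivided at others.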
The result is that, depending on the zoom factor, a surface may be over-subdivided (which is bad for performance and sometimes leads to precision problems) or under-subdivided. The next picture is the horrifying result of a bad refinement.
It was bad one year ago and it still is. So, why release it now? For one thing, it produces decent pictures most of the time and I believe it cannot be improved without changes that go far beyond what I originally planned for XRT 2.0. Therefore, it will have to wait for work on the next major version to start. I will go into further details in future posts.
… and the rest
- the XRT frontend has been completely rewritten: improved help messages, versioning information, more control over threading, debugging output, and statistics.
- shadow bias is now supported.
- a myriad of bugs has been fixed.
- the complete list of changes is available in the ChangeLog.
This version and the updated documentation are available in the Downloads page.
Posted: 16 Jun 2012 22:14
Tags: downloads examples gallery
Reaching new heights
XRT 1.4.1 is out. Compared to the previous release, this one dramatically improves performance:
- on a single core, it is 30% faster in most cases and nearly 100% faster with scenes that do volume rendering
- with multiple cores, rendering times now scale more than 90% linearly with the number of cores in all test cases (i.e. on a quad core, the speedup exceeds 3.6) whereas, with the previous release, rendering times were nearly the same whether you had a dual or a quad core.
Of course, both acceleration factors combine for a much much faster renderer.
There were two major sources of slowdown which illustrate quite well the pitfalls of multithreaded programming.
The first was an incorrect usage of OpenImageIO ustrings (which stands for "unique strings"): I was repeatedly calling ustring constructors instead of reusing the resulting objects. This was the major limiting factor in single-threaded rendering mode. To make matters worse, the ustring constructor accesses a table protected by a mutex, creating a bottleneck when multithreading is on.
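The pattern, reduced to a Python sketch (XRT and OIIO do this in C++; the table and lock below merely mimic ustring's internal behaviour):

```python
import threading

_intern_table = {}
_intern_lock = threading.Lock()

def ustring(s):
    """Return the canonical copy of s. Every call takes the lock, which
    is why constructing ustrings in a hot loop serializes the threads."""
    with _intern_lock:
        return _intern_table.setdefault(s, s)

# The fix: hoist the construction out of the loop so the lock is paid once.
DIFFUSE = ustring("diffuse")
```

Once interned, two ustrings can be compared by pointer, which is the whole point of the scheme; the mistake was rebuilding them on every shading call.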
The second problem is a bit more subtle and deals with shared pointers. To preserve memory, XRT shares shaders and transformations between primitives using reference counting and a copy-on-write policy. This is a very effective way to manage memory and to avoid memory leaks, but it is not without constraints. In a multi-threaded environment, shared pointers must rely on counters implemented with atomic primitives. Amongst the many solutions available (for an accurate discussion, see Implementing Scalable Atomic Locks for Multi-Core Intel® EM64T and IA32 Architectures), XRT uses vanilla atomic operations (compare/exchange and the like).
So far, so good, but on Intel processors, atomic operations lock the cache. While rendering, shared pointers to transformation matrices are accessed zillions of times. The result is that the cache is very often locked by one core while the others are waiting.
Fortunately, if shared pointers are really handy for managing resources while parsing scene files, they are useless while rendering. There is never any need to keep pointing to a particular resource of the scene once a pixel has been rendered, and therefore there is no need to count references and dereferences: accessing the raw pointer directly is both safe and fast.
For now, as a kind of brute-force solution, I have replaced reference counting on transformations with a global transformation cache. It is very efficient, but I know there are some corner cases left which may leak memory. I'll improve on that later.
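In spirit (and in Python rather than C++), the replacement looks like this: transformations are deduplicated once at parse time, and rendering threads then hold plain, uncounted references:

```python
class TransformCache:
    """Global cache mapping matrix values to a single shared instance.
    Rendering threads read the returned objects without any reference
    counting, so no atomic operation is hit per access. Insertion is
    assumed to happen during (single-threaded) scene parsing."""
    def __init__(self):
        self._table = {}

    def lookup(self, matrix):
        key = tuple(matrix)  # matrices are immutable once parsing is done
        return self._table.setdefault(key, key)

transforms = TransformCache()
```

The corner cases mentioned above come from entries that are never evicted: the cache trades a possible leak for the removal of per-access atomic traffic.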
Other than that, this release restores the gamma correction feature lost with the OpenImageIO package integration.
One more eye candy
Today's picture is a new procedural sample added to the XRT examples archive. It generates a million points organized to build a well-known 3D fractal: the Sierpinski gasket.
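One classic way to build such a point cloud is the "chaos game"; this 2D Python sketch conveys the idea (the bundled procedural works in 3D, from a tetrahedron rather than a triangle):

```python
import random

def sierpinski_points(n, seed=0):
    """Chaos game: repeatedly jump halfway toward a randomly chosen
    vertex; the points settle onto the Sierpinski gasket."""
    random.seed(seed)
    vertices = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.866)]
    x, y = 0.5, 0.25  # any starting point inside the triangle will do
    pts = []
    for _ in range(n):
        vx, vy = random.choice(vertices)
        x, y = 0.5 * (x + vx), 0.5 * (y + vy)
        pts.append((x, y))
    return pts
```

A RenderMan procedural would emit each point through the Points primitive instead of collecting a list, but the generation loop is the same.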
Posted: 27 May 2012 09:35
This release is my first attempt at multithreaded rendering. Building upon work done for version 1.2, XRT now fires rays in a parallel fashion. Really, the algorithm is a no-brainer: the whole image is divided into small tiles stored in a work queue. While the queue is not empty, each rendering thread picks up a new tile and computes its pixels.
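In Python (XRT itself is C++, and render_tile below stands in for the actual pixel loop), the scheme boils down to:

```python
import queue
import threading

def render_image(width, height, tilesize, render_tile, nthreads=4):
    """Cut the image into tiles, queue them, and let every worker
    thread pull tiles until the queue runs dry."""
    tiles = queue.Queue()
    for y in range(0, height, tilesize):
        for x in range(0, width, tilesize):
            tiles.put((x, y,
                       min(tilesize, width - x), min(tilesize, height - y)))

    def worker():
        while True:
            try:
                job = tiles.get_nowait()
            except queue.Empty:
                return  # no tiles left: this thread is done
            render_tile(*job)

    threads = [threading.Thread(target=worker) for _ in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

The queue itself is the only shared structure the scheduler needs; everything hard about the port lies in what render_tile touches.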
The tricky part is to make sure that the code is thread-safe. Namely, global resources are evil things. Each thread must be granted exclusive write access while the others wait; otherwise, bad things happen.
There are only two solutions:
- protect the global resources against shared access using atomics, mutexes, … Just be aware that synchronisation primitives have an intrinsic run-time cost and that the more threads wait, the less efficient the program becomes.
- make sure that each thread has its own copy of the data (re-entrancy is the buzzword here).
So, do not (yet) expect wonders. Synchronisation between threads in XRT is taking its toll and I have noticed that rendering times do not scale well with the number of processors. With 4 threads, I get only a 2.5x speed increase. I am looking into it.
As a side note, OIIO has also been upgraded to version 1.0.4. Except for a slight modification for XP, this is the genuine version.
The list of changes is fully detailed in the change log.
This version and the updated documentation are available in the Downloads page.
Posted: 10 Mar 2012 19:04
The major change in this release is the upgrade to OIIO 1.0. Be aware that the version bundled with XRT differs slightly from the genuine 1.0 version. It fixes a problem with the maketx utility (to be committed soon to the GitHub master) and restores compatibility with XP (yes, I still have an XP box!).
Environment mapping is now working as demonstrated on the right. Previously, a call to environment() in a shader would always fire rays. Now, depending on the argument, it will also sample a texture. The RIB file for this picture is included in the XRT examples archive. I have also updated a few examples to account for the change of behaviour and restore raytracing where needed.
I have made some changes to the XRT C++ API. In the original Gelato specification, Input only accepts parameters through a single string which has to be parsed. This is not very flexible and akin to reinventing the wheel. Actually, using "user attributes" to pass parameters is much easier: the Generator calls GetAttribute to get values set with Attribute before the Input call. The only issue is that attributes are persistent and can possibly interfere with other Generators. You are safe if you use PushAttributes/PopAttributes to keep the "user attributes" stack clean, but there has to be a better way.
Things get much simpler if Input behaves like Camera, Output, Shader, Light, or any geometric primitive: all calls to Parameter are saved into a “pending parameter” list. This list is passed to the Generator constructor and is cleared by XRT afterwards. This allows for a greater flexibility and a greater consistency.
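A Python sketch of the pattern (the method names follow the Gelato-style API described above; the Generator construction is reduced to returning a tuple for illustration):

```python
class Renderer:
    """Parameter() calls are buffered in a "pending parameter" list that
    the next Input() call consumes, just as Camera or Shader do."""
    def __init__(self):
        self._pending = []

    def Parameter(self, name, value):
        self._pending.append((name, value))

    def Input(self, filename):
        # Hand the pending parameters to the Generator, then clear them
        # so they cannot leak into the next call.
        params, self._pending = dict(self._pending), []
        return self._make_generator(filename, params)

    def _make_generator(self, filename, params):
        # Stand-in for constructing the real Generator object.
        return (filename, params)
```

Because the list is cleared after each Input, there is no persistent state to push and pop, unlike the "user attributes" workaround.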
Finally, a very annoying bug has been fixed: sometimes, rendering was freezing on startup.
As usual, this version and the updated documentation are available in the Downloads page.
For a complete list of changes, see the change log.
Posted: 03 Feb 2012 14:10
Some additions to the LuxRays gallery
Posted: 03 Feb 2012 13:55
OpenImageIO or OIIO (an open source project managed by Larry Gritz, a living legend of the CGI industry) is a library for reading and writing images that is format agnostic — that is, a "client app" doesn't need to know the details about any particular image file formats. Specific formats are managed by DLL/DSO plugins. The list of plugins is already rather impressive and growing (TIFF, JPEG/JFIF, OpenEXR, PNG, HDR/RGBE, Targa, JPEG-2000, DPX, Cineon, FITS, BMP, ICO, RMan Zfile, Softimage PIC, DDS, SGI, PNM/PPM/PGM/PBM, Field3d, WebP). Additionally, a TextureSystem class provides filtered MIP-map texture lookups. Texture reads are handled through a cache mechanism which performs access to vast amounts of image data using only a tiny amount (tens of megabytes at most) of runtime memory.
These features make OIIO a very attractive component to include in any renderer, including mine. Actually, because its design is an improved version of the Gelato specification, it has been quite straightforward to replace XRT image I/O plugins and texturing system. The result is available with this new version in the Downloads page.
Frankly speaking, aside from much wider access to image file formats, it does not improve XRT a lot for now. Further work is needed to fully take advantage of all the enhancements OIIO provides. It has also been an opportunity to remove some dust from the interfaces, to clean up, and to complete parts of the implementation. Once again, this looks like "invisible" work for the end user. Visible improvements are left for future releases.
For a complete list of changes, see the change log.