Thursday, July 22, 2010

CUDA 3.1 out

Just read that the CUDA 3.1 toolkit is available (and has been for some time, in fact).

As you can see from the release notes, CUDA 3.1 also gives 16-way kernel concurrency, allowing for up to 16 different kernels to run at the same time on Fermi GPUs. Banks said a bunch of needed C++ features were added, such as support for function pointers and recursion to allow for more C++ apps to run on GPUs as well as a unified Visual Profiler that supports CUDA C/C++ as well as OpenCL. The math libraries in the CUDA 3.1 SDK were also goosed, with some having up to 25 per cent performance improvements, according to Banks.

The support for recursion and concurrent kernels should be great for CUDA path tracers running on Fermi and I'm curious to see the performance gains. Maybe the initial claims that Fermi will have 4x the path tracing performance of GT200 class GPUs could become true after all.

Thursday, July 15, 2010

OnLive + Mova vs OTOY + LightStage

I've just read Joystiq's review of OnLive, which is very positive regarding the lag issue: there is none...

As it stands right now, the service is -- perhaps shockingly -- running as intended. OnLive still requires a faster than normal connection (regardless of what the folks from OnLive might tell you), and it requires a wired one at that, but it absolutely, unbelievably works. Notice I haven't mentioned issues with button lag? That's because I never encountered them. Not during a single game (even UE3).

A related recent Joystiq article about OnLive mentions Mova, a sister company of OnLive developing Contour, a motion capture technology using a curved wall of cameras, very reminiscent of OTOY's LightStage (although the LightStage dome is bigger and can capture the actor from 360 degrees at once). The photorealistic CG characters and objects that it produces are the real advantage of cloud gaming (as was being hinted at when the Lightstaged Ruby from the latest Ruby demo was presented at the Radeon HD 5800 launch):

What he stressed most, though, was Perlman's other company, Mova, working in tandem with OnLive to create impressive new visual experiences in games. "This face here," Bentley began, as he motioned toward a life-like image that had been projected on a screen before us, "is computer generated -- 100,000 polygons. It's the same thing we used in Benjamin Button to capture Brad Pitt's face. Right here, this is an actress. You can't render this in real time on a standard console. So this is the reason OnLive really exists." Bentley claims that Mova is a big part of the reason that a lot of folks originally got involved with OnLive. "We were mind-boggled," he exclaimed. And mind-boggling can be a tremendous motivator, it would seem -- spurring Bentley to leave a successful startup for a still nascent, unknown company working on the fringes of the game industry.

In fairness, what we saw of Mova was terrifyingly impressive, seemingly crossing the uncanny valley into "Holy crap! Are those human beings or computer games?" territory. Luckily for us, someone, somewhere is working with Mova for games. Though Bentley couldn't say much, when we pushed him on the subject, he laughed and responded, "Uhhhh ... ummm ... there's some people working on it." And though we may not see those games for quite some time, when we do, we'll be seeing the future.

Just like OTOY, I bet that OnLive is developing some voxel ray tracing tech as well, which is a perfect fit for server-side rendering due to its massive memory requirements. Now let's see what OTOY and OnLive, with their respective cloud servers and capturing technologies, will come up with :-)

Wednesday, July 14, 2010

Real-time Energy Redistribution Path Tracing in Brigade!

There have been a lot of posts about Brigade lately, but that's because development is moving at breakneck speed and the intermediate updates are very exciting. Jacco Bikker and Dietger van Antwerpen, the coding brains behind the Brigade path tracer, seem unstoppable. Their latest contribution is an implementation of ERPT, or Energy Redistribution Path Tracing. ERPT was presented at Siggraph 2005 and is an unbiased extension of regular path tracing that combines Monte Carlo path tracing with Metropolis Light Transport path mutation to obtain lower-frequency noise and, in general, faster convergence. Caustics benefit greatly, as do scenes that are predominantly lit by indirect lighting. The original ERPT paper offers a very in-depth and understandable insight into the technique. A practical implementation of ERPT is described in the paper "Implementing Energy Redistribution Path Tracing".

The algorithm seems to be superior to (bidirectional) path tracing and MLT in most cases, while retaining its unbiased character. And they made it work on the GPU! You could say that, algorithm-wise, the addition of ERPT makes Brigade currently more advanced than the other GPU renderers (Octane, Arion, LuxRays, OptiX, V-Ray RT, iray, SHOT, Indigo GPU, ...), which rely on "plain" path tracing.

The following video compares path tracing to ERPT in Brigade at a resolution of 1280x720(!) on a GTX 470:

This image directly compares path tracing on the left with ERPT on the right (the smeary pixel artefacts in the ERPT image are mostly due to the YouTube video + JPEG screengrab compression, but I presume that there are also some noise filters applied as described in the ERPT paper):
ERPT seems to be a little bit darker than regular path tracing in this image, which is reportedly a by-product of the noise filters.

On a side note, the Sponza scene in the video renders very fast for the given resolution and hardware. Comparing this with the video of Sponza rendering in the first version of SmallLuxGPU on an HD 4870 (which I thought looked amazing at the time), it's clear that GPU rendering has made enormous advances in just a few months thanks to more powerful GPUs and optimizations of the path tracing code. I can hardly contain my excitement to see what Brigade is going to bring next! Maybe Population Monte Carlo energy redistribution for even faster convergence? ;)

Monday, July 12, 2010

Sparse voxel octree and path tracing: a perfect combination?

I have been wondering for some time whether SVOs and path tracing would be a perfect solution for real-time GI in games. Cyril Crassin has shown in his paper "Beyond triangles: Gigavoxels effects in videogames" that secondary ray effects such as shadows can be computed very inexpensively by tracing a coarse-resolution voxel mipmap instead of tracing all the way down to the finest voxel resolution. This is a magic intrinsic property of SVOs and, to my knowledge, is not possible when tracing triangles (unless there are multiple LOD levels), where every ray has to be traced down to the final leaf containing the triangle, which is of course very expensive. A great example of different SVO resolution levels can be seen in the video "Efficient sparse voxel octrees" by Samuli Laine and Tero Karras (a demo and source code for CUDA cards are also available).

I think that this LOD "trick" could work with all kinds of secondary ray effects, not just shadows. Ambient occlusion and indirect lighting in particular could be calculated efficiently in this manner, even on glossy materials. One limitation could be high-frequency content, because every voxel is an average of the eight smaller voxels it is made up of, but in such a case the voxels could be adaptively subdivided to a higher resolution (the same goes for perfectly specular reflections). For diffuse materials, the cost of computing indirect lighting could be drastically reduced.
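To make the LOD idea a bit more concrete, here is a minimal sketch in Python. It is not taken from GigaVoxels or any other SVO implementation: it assumes a small dense opacity grid instead of a sparse octree, an axis-aligned shadow ray, and made-up helper names (`downsample`, `build_mips`, `shadow_ray_x`). It only illustrates that a coarser level needs fewer steps and returns a softer, averaged answer.

```python
# Toy voxel mip pyramid. All names are illustrative, not from any SVO library.

def downsample(grid):
    """Build the next coarser level: each voxel is the mean opacity of its 8 children."""
    n = len(grid) // 2
    return [[[sum(grid[2 * x + dx][2 * y + dy][2 * z + dz]
                  for dx in (0, 1) for dy in (0, 1) for dz in (0, 1)) / 8.0
              for z in range(n)]
             for y in range(n)]
            for x in range(n)]


def build_mips(grid):
    """Return [finest, ..., 1x1x1] for a cubic grid whose side is a power of two."""
    mips = [grid]
    while len(mips[-1]) > 1:
        mips.append(downsample(mips[-1]))
    return mips


def shadow_ray_x(mips, level, y, z):
    """March a shadow ray along +x at the given LOD.

    y and z are finest-level coordinates. Returns (steps, transmittance), where
    transmittance is the product of (1 - opacity) over the voxels visited.
    """
    grid = mips[level]
    cy, cz = y >> level, z >> level
    transmittance = 1.0
    for cx in range(len(grid)):
        transmittance *= 1.0 - grid[cx][cy][cz]
    return len(grid), transmittance


# A 4x4x4 scene that is empty except for one fully opaque voxel.
scene = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
scene[2][1][1] = 1.0
mips = build_mips(scene)

for level in range(len(mips)):
    print(level, shadow_ray_x(mips, level, 1, 1))
```

At the finest level the ray is fully blocked after four steps; at the coarser levels it takes two steps or one and is only partially attenuated, which is exactly the averaging that helps soft effects and hurts high-frequency detail.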

Another idea is to compute primary rays and first bounce secondary rays at full resolution, and all subsequent bounces at lower voxel LODs with some edge-adaptive sampling, since the 2nd, 3rd, 4th, ... indirect bounces contribute relatively little to the final pixel color compared to direct + 1st indirect bounce. Not sure if this idea is possible or how to make it work.
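As a rough back-of-the-envelope check on how little the later bounces contribute, the sketch below assumes a very crude model: every bounce scales a path's energy by the same constant albedo, ignoring geometry and visibility entirely. Under that assumption the share of energy arriving after the first n indirect bounces is about albedo^(n+1). The function names are made up for illustration.

```python
def bounce_weights(albedo, max_bounce):
    """Relative energy of bounce 0 (direct) .. max_bounce, assuming each bounce
    scales the path's energy by the same constant albedo."""
    return [albedo ** k for k in range(max_bounce + 1)]


def tail_fraction(albedo, full_res_bounces, max_bounce=64):
    """Share of the total energy arriving after `full_res_bounces` indirect bounces."""
    weights = bounce_weights(albedo, max_bounce)
    return sum(weights[full_res_bounces + 1:]) / sum(weights)


for albedo in (0.3, 0.5, 0.8):
    print(albedo, round(tail_fraction(albedo, 1), 3))
```

With a mid-grey albedo of 0.5, everything beyond direct light plus the first indirect bounce adds up to about a quarter of the total in this model; darker scenes give less and very bright scenes give more, so how aggressively the later bounces can be coarsened would depend on the scene.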

Voxelstein3d has already implemented the idea of path tracing SVOs for global illumination, with some nice results. Once a demo is released, it's going to be interesting to see whether the above holds true and doesn't break down with non-diffuse materials in the scene.

UPDATE: VoxLOD actually has done this for diffuse global illumination and it seems to work nicely:

Thursday, July 8, 2010

Tokaspt, an excellent real-time path tracing app

Just stumbled upon this very impressive CUDA-based path tracer; both the exe and the source are available (the app itself has been available since January 2009).

Although the scenes are quite simple (spheres and planes only), it's extremely fast and converges to a high-quality image literally in a matter of milliseconds. Navigation is as close to real-time as it gets. There are 4 different scenes to choose from (load a scene with F9) and they can be modified at will: the parameters are sphere size, color, emitting properties, 3 material BRDFs (diffuse (matte), specular (mirror) and refractive (glass)) and sphere position. Path trace depth and spppp (samples per pixel per pass) can also be altered on the fly thanks to the very convenient GUI with sliders. If ghosting artefacts appear while moving around, press the "reset acc" button to clear the accumulation buffer and get a clean image. Definitely worth messing around with!
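For readers wondering what the accumulation buffer and the "reset acc" button do, here is a minimal sketch in Python. It is not Tokaspt's actual code (which is CUDA); the class and method names are hypothetical. The point is that each pass is added to a running sum, the displayed image is that sum divided by the pass count, and stale passes from an old camera position cause ghosting until the buffer is cleared.

```python
class AccumulationBuffer:
    """Running sum of per-pass pixel values (hypothetical helper, not Tokaspt's code)."""

    def __init__(self, num_pixels):
        self.sums = [0.0] * num_pixels
        self.passes = 0

    def add_pass(self, pixels):
        """Accumulate one noisy pass (one value per pixel)."""
        for i, value in enumerate(pixels):
            self.sums[i] += value
        self.passes += 1

    def resolve(self):
        """Return the averaged image shown on screen."""
        if self.passes == 0:
            return list(self.sums)
        return [s / self.passes for s in self.sums]

    def reset(self):
        """Discard old passes, e.g. after the camera or the scene has changed."""
        self.sums = [0.0] * len(self.sums)
        self.passes = 0


buf = AccumulationBuffer(2)
buf.add_pass([1.0, 0.0])
buf.add_pass([0.0, 1.0])
print(buf.resolve())
```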

Wednesday, July 7, 2010

New version of Brigade path tracer

Follow the link in this post to download. There are some new features plus a performance increase. Rename cudart.dll to cudart32_31_9.dll to make it work.

The next image demonstrates some of the exceptional strengths of path tracing:

- indirect lighting with color bleeding: notice that every surface facing down (yellow arrows) picks up a slightly greenish glow from the floor plane, due to indirect light bouncing off (this picture uses path trace depth 6)
- soft shadows
- indirect shadows
- contact shadows (ambient occlusion)
- superb anti-aliasing
- depth of field
- natural looking light with gradual changes

All of these contribute to the photorealistic look, and it's all interactive (on a high-end CPU + GPU)!

Saturday, July 3, 2010

Friday, July 2, 2010

Brigade path tracer comparison

The following screenshots are taken from the Brigade real-time path tracer demo.
Rendered on the CPU only, at a resolution of 832x512.
The images with 100 and 800 spp were taken without frame averaging (only 1 iteration).
The images with 2, 8, 16 and 32 spp were taken with frame averaging (averaging the samples of several frames).
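To give a feel for why frame averaging matters at low spp, here is a toy Python experiment. It has nothing to do with Brigade's code: the "pixel" is simply the mean of uniform random samples whose true value is 0.5, and all function names are made up. It illustrates the usual Monte Carlo behaviour that noise falls off with the square root of the total sample count, whether those samples come from one frame or from many averaged frames.

```python
import math
import random


def render_pixel(spp, rng):
    """One frame of a toy pixel: the mean of `spp` uniform samples (true value 0.5)."""
    return sum(rng.random() for _ in range(spp)) / spp


def averaged_pixel(frames, spp, rng):
    """Frame averaging: the mean of several independently rendered frames."""
    return sum(render_pixel(spp, rng) for _ in range(frames)) / frames


def rms_error(estimator, trials=200):
    """Root-mean-square error of an estimator against the true value 0.5."""
    return math.sqrt(sum((estimator() - 0.5) ** 2 for _ in range(trials)) / trials)


rng = random.Random(1)
err_8 = rms_error(lambda: render_pixel(8, rng))
err_800 = rms_error(lambda: render_pixel(800, rng))
err_avg = rms_error(lambda: averaged_pixel(100, 8, rng))
print(err_8, err_800, err_avg)
```

In this toy setup a single 8 spp frame is roughly ten times noisier than an 800 spp frame, while 100 averaged frames of 8 spp land at about the same error as the single 800 spp frame.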

2 spp

8 spp

16 spp

32 spp

100 spp

800 spp

To top it off, one big image comparing 800, 8, 16 and 32 spp. It amazes me that the quality of just 8 samples is already great and with some filtering it could rival the quality of the 800 spp image:

Thursday, July 1, 2010

Gaikai's cloud business model: play games for free in your browser

Interview with Gaikai's Dave Perry on Joystiq:

Gaikai focuses on delivering game demos, not complete games: you see an advertisement for a game on a website (could be Gamespot, Eurogamer) or read a game review, and with just one click you can play a demo of that game in your browser without paying a cent. The game publisher pays for your playing time, 1 cent per minute per user.
This approach is economically safer, more practical and more retail/publisher/gamer friendly than what OnLive is doing. With the current network infrastructure of the internet and its bandwidth limitations, this is probably the route most likely to succeed for cloud gaming. Gaikai has already signed EA, so I think cloud gaming is gonna get big pretty soon.
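To put the 1 cent per minute per user figure in perspective, here is a trivial Python calculation. Only the rate comes from the interview; the player count and demo length are made-up numbers for illustration.

```python
CENTS_PER_MINUTE = 1  # per user, the rate quoted above


def publisher_cost_cents(users, minutes_per_user, rate=CENTS_PER_MINUTE):
    """Total cost in cents for `users` players each playing `minutes_per_user` minutes."""
    return users * minutes_per_user * rate


# Hypothetical campaign: 10,000 players each trying a 15-minute demo.
cost = publisher_cost_cents(10_000, 15)
print(f"${cost / 100:,.2f}")
```

Under those assumed numbers the publisher would pay 1,500 dollars in total, or 15 cents per player reached.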

UPDATE: another video of OnLive, showing mouse latency in F.E.A.R. 2 behind a router: