I recovered an old pair of RG (red/green) glasses and I'd like to give this a try too with my recent game BackBack.
First, this is all just my opinion and I may have some of it wrong, but I've spent a bit of time on it and it makes sense to me, even though some of the information I've found online doesn't really agree. In the demo I posted I get quite a good impression of the cubes coming right out of the screen and floating somewhere between me and the screen, which is something I often have difficulty with in movies, especially when there's even slight ghosting, so I reckon I'm on the right track. My opinion is that any system just using two cameras, real or virtual, will not be quite right.
Most of the information I found related to using two cameras with some separation, either parallel or slightly toed in, but when I tried that there were always problems, and it was difficult to set up the camera offsets and the perspective calculations.
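For contrast, here's a minimal sketch of that conventional two-camera approach: each eye gets its own virtual camera, offset horizontally, and each does an ordinary perspective divide. The separation and focal distance values are illustrative assumptions, not numbers from my setup.

```python
# Conventional stereo: two parallel virtual cameras, one per eye, each
# doing a standard perspective projection. Cameras sit at (+/- SEP/2, 0, 0)
# looking along +z; the projection plane is at distance FOCAL.

def camera_project(vertex, cam_x, focal_len):
    """Perspective-project `vertex` for a camera at (cam_x, 0, 0) looking along +z."""
    vx, vy, vz = vertex
    return (focal_len * (vx - cam_x) / vz, focal_len * vy / vz)

SEP = 65.0      # camera separation in mm, roughly eye spacing (assumed value)
FOCAL = 600.0   # distance to projection plane in mm (assumed value)

v = (0.0, 0.0, 1200.0)
left  = camera_project(v, -SEP / 2, FOCAL)   # (16.25, 0.0)
right = camera_project(v, +SEP / 2, FOCAL)   # (-16.25, 0.0)

# With parallel cameras every finite vertex ends up with the same sign of
# disparity: points at infinity land on the screen plane and everything
# else appears in front of it, so the two images need an extra horizontal
# shift to set the convergence point - one of the offset headaches above.
```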
So what I'm doing is the transforms to screen space as usual, with the virtual dimensions matching (or close to) the real ones. Then, instead of the usual perspective projection, I trace a vector from every vertex to each eye and find the point where it intersects the screen plane; I draw the image for each eye to its own buffer using those points, and finally combine the buffers. Also, I'm not clipping on the z plane, as objects have to be visible on both sides of the screen, but I will have to put clipping back in in some form (clipping to the eyes instead of the screen) for more complex scenes.
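The per-eye projection step above can be sketched as a line-plane intersection: put the screen at z = 0, parameterise the line from the vertex to the eye, and solve for where it crosses z = 0. The eye positions and distances below are illustrative assumptions, not my actual calibration.

```python
# Project a vertex onto the screen plane (z = 0) along the line through one
# eye, as described in the post. Works for vertices on either side of the
# screen, which is why no z-plane clipping is done at this stage.

def project_to_screen(vertex, eye):
    """Intersect the line through `vertex` and `eye` with the plane z = 0.

    Returns the (x, y) screen coordinates of the intersection. Undefined
    if the vertex is at the same depth as the eye.
    """
    vx, vy, vz = vertex
    ex, ey, ez = eye
    # Parametric point P(t) = V + t * (E - V); solve P(t).z == 0.
    t = vz / (vz - ez)
    return (vx + t * (ex - vx), vy + t * (ey - vy))

# Illustrative setup: viewer 600 mm from the screen, 65 mm eye separation.
IPD = 65.0
VIEW_DIST = 600.0
left_eye  = (-IPD / 2, 0.0, VIEW_DIST)
right_eye = (+IPD / 2, 0.0, VIEW_DIST)

# A vertex 200 mm in front of the screen projects to different x positions
# for each eye (crossed disparity), which is what makes it pop out.
v = (0.0, 0.0, 200.0)
print(project_to_screen(v, left_eye))    # (16.25, 0.0)
print(project_to_screen(v, right_eye))   # (-16.25, 0.0)
```

Each eye's points then go to their own buffer before the buffers are combined, exactly as the paragraph above describes.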
Because it uses the vertex-to-eye vectors, I think it would work perfectly with head tracking too, as long as the actual head tracking was accurate enough.