Author Topic: z fighting  (Read 17791 times)

Offline Stonemonkey

  • Pentium
  • *****
  • Posts: 1315
  • Karma: 96
    • View Profile
z fighting
« on: March 06, 2012 »
In my current experiment with some sort of software shaders I'm trying to render just the depth first (after a z sort of the objects). In that pass I'm also working out new clipping for each triangle so that during the shader pass I'm not doing a lot of useless interpolation. The problem is that because I'm jumping straight to the clipped coords, the z value is very slightly different from when the whole triangle is interpolated, and I end up with z fighting.

I've got this as my z test in the inner loop atm:
Code: [Select]
dz=*p_start-scan.z 'subtract interpolated 1/z from 1/z in buffer
*cast(integer ptr,@dz) and=&h7fffffff  'get absolute
if dz<0.0000001 then  'do the z test
but is there a better way to deal with this?

EDIT: just done some tests and ABS isn't as bad as I thought it was so this looks better

if abs(*p_start-scan.z)<0.0000001 then

but I still think that might cause problems in some cases, maybe with large tris running away from the screen which would give a bigger z error.
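
For comparison, a minimal C++ sketch of the same two tests, assuming 32-bit IEEE-754 floats. `absBits` and `depthMatches` are made-up names for illustration; the epsilon is the one from the post above.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes 32-bit floats");

// Absolute value by clearing the IEEE-754 sign bit, like the "and=&h7fffffff" line.
// memcpy is used instead of a pointer cast to stay within defined behaviour.
inline float absBits(float v) {
    std::uint32_t bits;
    std::memcpy(&bits, &v, sizeof bits);
    bits &= 0x7fffffffu;
    std::memcpy(&v, &bits, sizeof v);
    return v;
}

// True when the stored 1/z and the interpolated 1/z agree to within eps.
inline bool depthMatches(float stored, float interpolated, float eps = 0.0000001f) {
    return std::fabs(stored - interpolated) < eps;
}
```

The fixed epsilon is an absolute tolerance, so it is only as good as the largest interpolation error the scene produces.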
« Last Edit: March 06, 2012 by Stonemonkey »

Offline hellfire

  • Sponsor
  • Pentium
  • *******
  • Posts: 1294
  • Karma: 466
    • View Profile
    • my stuff
Re: z fighting
« Reply #1 on: March 06, 2012 »
Quote
I'm also working out new clipping for each triangle so that during the shader pass I'm not doing a lot of useless interpolation.
How does this "clipping" work?
I guess you're not creating new vertices (otherwise you'd work on the same data) but just remember horizontal offsets for each scanline?
In that case the interpolated values are slightly different due to floating point rounding errors:
Code: [Select]
z += deltaZ * offset;
is numerically different from this:
Code: [Select]
for (int x=0;x<offset;x++) z+= deltaZ;
The solution to this problem is simple:
Use integers.
« Last Edit: March 07, 2012 by hellfire »

Offline Stonemonkey

Re: z fighting
« Reply #2 on: March 09, 2012 »
Yep, what you've described there is the problem, it's the same with the edge interpolation too.

I've got this working to some extent using my hack above though not got it fully set up yet but my thinking is this, after the z sort (front to back) of the objects, do a z fill pass and any tri with at least 1 fragment that passes is added to a list, while doing the z fill triangle I also find out the max/min of x and y for the fragments that pass and will store that along with the tri so it's a bounding box around the visible part of each tri.
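
A rough C++ sketch of that bookkeeping for one scanline, assuming the depth buffer stores 1/z (larger means nearer). `VisibleBox` and `zFillScanline` are hypothetical names, not the actual code.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Bounding box around the fragments of one triangle that passed the z test.
struct VisibleBox {
    int minX = INT_MAX, minY = INT_MAX, maxX = INT_MIN, maxY = INT_MIN;
    bool empty() const { return minX > maxX; }  // no fragment passed
    void add(int x, int y) {
        minX = std::min(minX, x); maxX = std::max(maxX, x);
        minY = std::min(minY, y); maxY = std::max(maxY, y);
    }
};

// Z-fill one scanline [left, right) and grow the box for every fragment that passes.
inline void zFillScanline(std::vector<float>& depthRow, int y, int left, int right,
                          float invZ, float dInvZ, VisibleBox& box) {
    for (int x = left; x < right; ++x, invZ += dInvZ) {
        if (invZ > depthRow[x]) {
            depthRow[x] = invZ;
            box.add(x, y);
        }
    }
}
```

A triangle whose box is still `empty()` after the z fill never makes it onto the list.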

A slightly different idea I've had is to still get the bounding box but also write a triangle ID number into the colour buffer during the z fill, then I wouldn't need to touch the z buffer in the texture filling.
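
Sketched in C++ it might look like this. To keep the example unambiguous it uses a dedicated ID row instead of reusing the colour buffer, and it reserves ID 0 for "no triangle". Both are assumptions of this sketch, as are the function names.

```cpp
#include <cstdint>
#include <vector>

// Z-fill pass: store depth (1/z, larger = nearer) and the ID of the winning triangle.
inline void zFillWithId(std::vector<float>& depthRow, std::vector<std::uint32_t>& idRow,
                        int left, int right, float invZ, float dInvZ, std::uint32_t triId) {
    for (int x = left; x < right; ++x, invZ += dInvZ) {
        if (invZ > depthRow[x]) {
            depthRow[x] = invZ;
            idRow[x] = triId;
        }
    }
}

// Texture pass: only the ID is compared, the depth row is never read.
// Returns the number of pixels written.
inline int shadeOwned(const std::vector<std::uint32_t>& idRow,
                      std::vector<std::uint32_t>& colourRow,
                      int left, int right, std::uint32_t triId, std::uint32_t colour) {
    int shaded = 0;
    for (int x = left; x < right; ++x) {
        if (idRow[x] == triId) {
            colourRow[x] = colour;
            ++shaded;
        }
    }
    return shaded;
}
```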

Another thing that might be useful from using the lists is that up until now I've not been sorting anything by texture, only by object. Using lists I think I'll be able to make separate lists for each texture, possibly even by mipmap.

By use integers, do you mean use fixed point for everything? or just the z buffer?

Offline hellfire

Re: z fighting
« Reply #3 on: March 09, 2012 »
Quote
By use integers, do you mean use fixed point for everything? or just the z buffer?
Back in the day I converted all floating-point vertex data to fixed point when doing the perspective projection (after clipping all polygons against the 3D view frustum).
For the screen coordinates you don't need much more than 4 bits of fraction, which allows you to use lookup tables for "1.0 / (v2.y - v1.y)" when interpolating along the edges of a polygon (later the PlayStation 2 used exactly the same numerical range for screen coordinates).
Using integers you can also interpolate each pair of vertex attributes (x,z,u,v) in a single MMX register, which saves half of the work and removes most of the shifts and memory accesses.
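
A minimal C++ sketch of the lookup-table idea, assuming screen coordinates with 4 fractional bits and 16.16 fixed point for the reciprocals and edge steps. The names and the table layout are illustrative assumptions, not the original implementation.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

constexpr int kSubBits = 4;              // fractional bits of a screen coordinate
constexpr int kSubOne  = 1 << kSubBits;  // 16 sub-pixel steps per pixel

// Float screen coordinate -> integer with 4 fractional bits.
inline std::int32_t toSubpixel(float v) {
    return static_cast<std::int32_t>(std::lround(v * kSubOne));
}

// table[d] = 1.0 / (d / 16 pixels) in 16.16 fixed point, for d = 1 .. maxPixels * 16.
inline std::vector<std::int32_t> buildReciprocalTable(int maxPixels) {
    std::vector<std::int32_t> table(static_cast<std::size_t>(maxPixels) * kSubOne + 1, 0);
    for (std::size_t d = 1; d < table.size(); ++d)
        table[d] = static_cast<std::int32_t>(
            (static_cast<std::int64_t>(kSubOne) << 16) / static_cast<std::int64_t>(d));
    return table;
}

// dx/dy per scanline in 16.16, replacing the divide with a table lookup.
// Requires y2 > y1 and (y2 - y1) within the table.
inline std::int32_t edgeStep16_16(std::int32_t x1, std::int32_t x2,
                                  std::int32_t y1, std::int32_t y2,
                                  const std::vector<std::int32_t>& recip) {
    const std::int64_t product = static_cast<std::int64_t>(x2 - x1) * recip[y2 - y1];
    return static_cast<std::int32_t>(product / kSubOne);  // drop the sub-pixel scale of x
}
```

Because the height of an edge only takes `maxPixels * 16` distinct values, the table stays small enough to index directly.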

Quote
after the z sort (front to back) of the objects, do a z fill pass and any tri with at least 1 fragment that passes is added to a list, while doing the z fill triangle I also find out the max/min of x and y for the fragments that pass and will store that along with the tri so it's a bounding box around the visible part of each tri.
Have you profiled how many fragments you actually save with this approach?
After all you're doing the z-compare twice (and add a bit of book-keeping overhead for each polygon) but most of the overdraw is already removed by presorting your polys.

Quote
A slightly different idea I've had is to still get the bounding box but also write a triangle ID number into the colour buffer during the z fill, then I wouldn't need to touch the z buffer in the texture filling.
It doesn't really matter which buffer you test, you just wouldn't have to interpolate z but you need that anyway for perspective correction.

When we did this I experimented a lot with span- and occlusion-buffers because the memory was really slow and zbuffering had quite a speed-impact.
The span-buffer clips individual scan-lines of the polygons against each other (so you don't have to test every single pixel).
This makes the final rendering very elegant: just draw the span-list from left to right, there are no more intersections.
But the performance depends very much on the structure of the 3d-scene (it's only fast if you can discard many scanlines completely).

Quote
Another thing that might be useful from using the lists is that up until now I've not been sorting anything by texture, only by object. Using lists I think I'll be able to make separate lists for each texture, possibly even by mipmap.
Internally I've already split "objects" into poly-lists which share the same material and those are sorted front to back. So when I'm drawing a set of polys they already use the same texture(s).
Sorting by mipmap should be tricky because with perspective correction you might use a different mipmap whenever deltaU/deltaV changes.
No idea if you can figure out a "main mipmap level" in advance...

Offline Stonemonkey

Re: z fighting
« Reply #4 on: March 09, 2012 »
Quote
Quote
By use integers, do you mean use fixed point for everything? or just the z buffer?
Back in the day I converted all floating-point vertex data to fixed point when doing the perspective projection (after clipping all polygons against the 3D view frustum).
For the screen coordinates you don't need much more than 4 bits of fraction, which allows you to use lookup tables for "1.0 / (v2.y - v1.y)" when interpolating along the edges of a polygon (later the PlayStation 2 used exactly the same numerical range for screen coordinates).
Using integers you can also interpolate each pair of vertex attributes (x,z,u,v) in a single MMX register, which saves half of the work and removes most of the shifts and memory accesses.
Not sure what I'm doing then, I've been having a look at fixed point and found I need 8 bit fraction for the edge interpolation and just getting away with 10 bits for u and v

Quote
Quote
after the z sort (front to back) of the objects, do a z fill pass and any tri with at least 1 fragment that passes is added to a list, while doing the z fill triangle I also find out the max/min of x and y for the fragments that pass and will store that along with the tri so it's a bounding box around the visible part of each tri.
Have you profiled how many fragments you actually save with this approach?
After all you're doing the z-compare twice (and add a bit of book-keeping overhead for each polygon) but most of the overdraw is already removed by presorting your polys.
This isn't to reduce overdraw, if only a small part of a triangle passes in the z pass then the bounding box surrounds the part that passed so that when the textured tri is drawn most of the hidden part of the tri isn't scanned, if a tri in the scene is 50 pixels high but only the top 10 rows of pixels are visible then only the top 10 rows are scanned.

Quote
Quote
A slightly different idea I've had is to still get the bounding box but also write a triangle ID number into the colour buffer during the z fill, then I wouldn't need to touch the z buffer in the texture filling.
It doesn't really matter which buffer you test, you just wouldn't have to interpolate z but you need that anyway for perspective correction.
Yep, I was thinking I could either use the value from the buffer or an interpolated value but using an interpolated value would mean I don't need the z buffer in the cache while doing the texturing.

Quote
When we did this I experimented a lot with span- and occlusion-buffers because the memory was really slow and zbuffering had quite a speed-impact.
The span-buffer clips individual scan-lines of the polygons against each other (so you don't have to test every single pixel).
This makes the final rendering very elegant: just draw the span-list from left to right, there are no more intersections.
But the performance depends very much on the structure of the 3d-scene (it's only fast if you can discard many scanlines completely).
I've not quite got the idea of span buffers yet, do you know any decent links explaining it?

Quote
Quote
Another thing that might be useful from using the lists is that up until now I've not been sorting anything by texture, only by object. Using lists I think I'll be able to make separate lists for each texture, possibly even by mipmap.
Internally I've already split "objects" into poly-lists which share the same material and those are sorted front to back. So when I'm drawing a set of polys they already use the same texture(s).
Sorting by mipmap should be tricky because with perspective correction you might use a different mipmap whenever deltaU/deltaV changes.
No idea if you can figure out a "main mipmap level" in advance...
I'm mipmapping on a triangle basis, there's a bit of popping but it's ok
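
One way such a per-triangle choice could look in C++, assuming each mip level halves the texture in both directions (so it covers a quarter of the texels). The function names and the `maxLevel` parameter are assumptions for this sketch.

```cpp
#include <cmath>

// Twice the screen-space area of the triangle (area of the spanned parallelogram).
inline float screenArea2(float x0, float y0, float x1, float y1, float x2, float y2) {
    return std::fabs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0));
}

// Step down the mip chain while the triangle covers fewer pixels than texels.
// Each step quarters the texel count, which is the same as scaling the
// screen area by 4 relative to the texture.
inline int chooseMipLevel(float screenArea, float texelArea, int maxLevel) {
    int level = 0;
    while (level < maxLevel && screenArea < texelArea) {
        ++level;
        screenArea *= 4.0f;
    }
    return level;
}
```

Since the level is fixed for the whole triangle, neighbouring triangles can land on different levels, which is where the popping comes from.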

Offline hellfire

Re: z fighting
« Reply #5 on: March 09, 2012 »
Quote
Not sure what I'm doing then, I've been having a look at fixed point and found I need 8 bit fraction for the edge interpolation and just getting away with 10 bits for u and v
The "4 bits" are just to convert the floating-point 2d screen coordinates of each vertex to an integer format with a reasonable fractional part (of course you can use as much precision as you like but you won't need more to perform sub-pixel correction with sufficient precision).
The actual interpolation along y and across the scanline requires higher precision of course.

Quote
This isn't to reduce overdraw, if only a small part of a triangle passes in the z pass then the bounding box surrounds the part that passed so that when the textured tri is drawn most of the hidden part of the tri isn't scanned, if a tri in the scene is 50 pixels high but only the top 10 rows of pixels are visible then only the top 10 rows are scanned.
Yes, I already figured that out.
With the initial z-pass each pixel is guaranteed to be drawn only once in the second pass.
So there's basically no overdraw but a bit of overhead for the z-pass.
But even without the z-pass, a rough front-to-back ordering means you'll skip most pixels in the z-test anyway (except for some pixels where the pre-ordering just didn't make it).
So I was wondering how many pixels you actually save in the second pass by doing an initial z-pass.
I guess you can't really figure out if a polygon is completely invisible as you might draw polygons later which obscure parts of polygons which were already drawn.

Quote
I was thinking I could either use the value from the buffer or an interpolated value but using an interpolated value would mean I don't need the z buffer in the cache while doing the texturing.
Oh, of course, z is already in the buffer...
I did a post-processing fogging-pass using the z-buffer here.
That wasn't particularly much faster but saved me a few bytes of file size as I didn't have to implement a separate polyfiller to do texture+fog.

Quote
I'm mipmapping on a triangle basis, there's a bit of popping but it's ok
That doesn't work well with a large ground plane for example.
That's why they use it as the typical showcase for mipmaps :)

Quote
I've not quite got the idea of span buffers yet, do you know any decent links explaining it?
Unfortunately not, there are some articles on the net but none of them shows the gory details.
For every scanline of your screen you have a list of "spans" (which is just one scanline of your polygon).
Each span stores all the attributes which are required to draw and clip it:
Code: [Select]
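struct Span {
  int startX;  // first pixel covered by the span
  int endX;    // end of the span
  int u;       // texture coordinates at startX
  int v;
  int z;       // depth at startX
  int polyId;  // index into the list of per-polygon attributes
};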
The "polyId" references another list with attributes which are constant for the whole polygon so you don't have to store it for each span (like deltaU/V/Z, texture, etc).
When rasterizing your polygon you create a span for each scanline and insert it into the corresponding list.
When inserting the span you have to check several cases:
- an existing span (or a set of contiguous spans) is completely in front of it.
  So you can simply discard the current span
- an existing span is completely behind it.
  So you can discard the existing span
- the current span intersects an existing span.
  So you have to clip the two against each other (and check the rest of your span against the following spans)
- the current span is in front but somewhere in the middle of an existing span
  So you split the existing span into two parts and insert the current span in between

To make the testing a bit easier I initially inserted two dummy spans into each scanline with (startX=0, z=FarClippingDistance) and (startX=ScreenWidth, z=FarClippingDistance).
This way you have no empty parts in a scanline and the end of one span is always the start of the next (so you don't need "endX").
To have quick access to an arbitrary position of the span-list you should organize it as a (self-balancing) binary search tree (otherwise you always have to search your list from the beginning until "startX" is reached).
When rasterizing your polygon top-to-bottom you're always inserting one new span into a different span-list.
This is a bad idea as the data in each list will be scattered all over the memory. So you should pre-allocate contiguous span data for each list in advance.

Once your span-buffer is built up you can draw each scanline left to right.
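
A heavily simplified C++ sketch of the insertion for one scanline. It assumes every span has a constant depth (smaller z is nearer), so two spans never intersect in depth and the clipping case disappears. It also keeps the spans in a plain sorted vector and rebuilds it on every insert instead of using a search tree. `SpanLine` and its members are made-up names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Span {
    int   startX;  // the span runs up to the startX of the next span
    float z;       // constant depth for the whole span (simplification)
    int   polyId;  // -1 marks the background
};

struct SpanLine {
    std::vector<Span> spans;  // sorted by startX, last entry is the end sentinel

    // Two dummy spans so the scanline never has empty parts.
    SpanLine(int width, float farZ) : spans{{0, farZ, -1}, {width, farZ, -1}} {}

    void insert(int x0, int x1, float z, int polyId) {
        std::vector<Span> out;
        auto emit = [&out](int x, float sz, int id) {
            // Merge with the previous span if it belongs to the same surface.
            if (!out.empty() && out.back().polyId == id && out.back().z == sz) return;
            out.push_back({x, sz, id});
        };
        for (std::size_t i = 0; i + 1 < spans.size(); ++i) {
            const Span& s = spans[i];
            const int s0 = s.startX, s1 = spans[i + 1].startX;
            const int o0 = std::max(s0, x0), o1 = std::min(s1, x1);  // overlap [o0, o1)
            if (o0 >= o1 || z >= s.z) {          // no overlap, or existing span is in front
                emit(s0, s.z, s.polyId);
                continue;
            }
            if (s0 < o0) emit(s0, s.z, s.polyId);  // left remainder of the existing span
            emit(o0, z, polyId);                   // visible part of the new span
            if (o1 < s1) emit(o1, s.z, s.polyId);  // right remainder (the split case)
        }
        out.push_back(spans.back());               // keep the end sentinel
        spans = out;
    }
};
```

Drawing then means walking `spans` from left to right and filling from each `startX` to the next one. With interpolated depth per span the insert would additionally have to find the crossing point of two intersecting spans.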
« Last Edit: March 09, 2012 by hellfire »

Offline Stonemonkey

Re: z fighting
« Reply #6 on: March 09, 2012 »
Quote
Yes, I already figured that out.
With the initial z-pass each pixel is guaranteed to be drawn only once in the second pass.
So there's basically no overdraw but a bit of overhead for the z-pass.
But even without the z-pass, a rough front-to-back ordering means you'll skip most pixels in the z-test anyway (except for some pixels where the pre-ordering just didn't make it).
So I was wondering how many pixels you actually save in the second pass by doing an initial z-pass.
I guess you can't really figure out if a polygon is completely invisible as you might draw polygons later which obscure parts of polygons which were already drawn.

I've just tried something out, with a load of random cubes I counted the number of fragments tested (both pass and fail) in the texture stage

With clipping I'm getting 210,000 to 240,000 tests

without clipping (except to screen) 420,000 to 550,000 tests

The actual number that pass and are filled is 170,000-190,000 (same for both methods)

and about 600,000 to 800,000 tests in the z pass
« Last Edit: March 09, 2012 by Stonemonkey »

Offline hellfire

Re: z fighting
« Reply #7 on: March 09, 2012 »
Quote
and about 600,000 to 800,000 tests in the z pass
How many of those pass the z test?

Offline Stonemonkey

Re: z fighting
« Reply #8 on: March 09, 2012 »
Quote
Quote
and about 600,000 to 800,000 tests in the z pass
How many of those pass the z test?
Looks like pretty much the same as the number that pass in the texture stage.

Offline hellfire

Re: z fighting
« Reply #9 on: March 10, 2012 »
Quote
Looks like pretty much the same as the number that pass in the texture stage.
That's bad news.
In the texture pass there should be significantly fewer pixels passing the depth test than in the z-pass.
Otherwise the z-pass is redundant.
In this case that's probably because of the layout of your scene:
Each cube doesn't have self-overdraw and all cubes are perfectly sorted.
The z-pass only makes sense when you can't sort the objects/polys properly.
The result should be significantly different when e.g. drawing back-to-front.

Offline Stonemonkey

Re: z fighting
« Reply #10 on: March 10, 2012 »
Not necessarily bad news, if I'm wanting to interpolate quite a few things, say uv,rgb,normal then that adds quite a lot of work per fragment whether it passes or not.

Without the early z pass I'd have to interpolate those things across the 600,000-800,000 fragments, as well as the setup for those values on the polys and scanlines that are discarded.

Instead, I only interpolate z over the 600,000-800,000 fragments, discard some of those tris and reduce the number of scanlines and fragments on the remaining tris.

Offline hellfire

Re: z fighting
« Reply #11 on: March 10, 2012 »
Quote
Not necessarily bad news, if I'm wanting to interpolate quite a few things, say uv,rgb,normal then that adds quite a lot of work per fragment whether it passes or not.
Okay.
Would be interesting to see how many vertex attributes are required until the z-pass pays off - after all it's "just" an add per pixel.
Especially when interpolating rgb the polys are usually sufficiently small so you can probably interpolate them linearly and with smaller precision (thus some in parallel).
Basically the same applies to normals (nobody will notice if they are half a degree off).

You could also include the z-pass for each scanline into the texture-pass to check if the scanline is actually visible, so you don't have to process your data twice.
When laying out the zbuffer in a hierarchical structure you're even able to check the visibility of multiple fragments (or even a complete polygon).
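
A minimal two-level C++ sketch for a single scanline, assuming depth is stored as 1/z (larger is nearer) and an arbitrary tile width of 8 pixels. `HierZLine` and its members are hypothetical names. Each tile remembers the farthest fragment stored in it, so a span that is behind that value everywhere can be rejected without touching the per-pixel data.

```cpp
#include <algorithm>
#include <vector>

struct HierZLine {
    static constexpr int kTile = 8;  // assumed tile width in pixels
    std::vector<float> fine;         // per-pixel 1/z
    std::vector<float> coarse;       // per-tile minimum of fine (farthest stored fragment)

    explicit HierZLine(int width)
        : fine(width, 0.0f), coarse((width + kTile - 1) / kTile, 0.0f) {}

    // Store a fragment and refresh the minimum of the tile it belongs to.
    void write(int x, float invZ) {
        fine[x] = invZ;
        const int t = x / kTile;
        const int begin = t * kTile;
        const int end = std::min(begin + kTile, static_cast<int>(fine.size()));
        coarse[t] = *std::min_element(fine.begin() + begin, fine.begin() + end);
    }

    // Conservative test: true only if the span [x0, x1) cannot be visible anywhere.
    // nearestInvZ is the largest 1/z that occurs on the span.
    bool spanHidden(int x0, int x1, float nearestInvZ) const {
        if (x0 >= x1) return true;
        for (int t = x0 / kTile; t <= (x1 - 1) / kTile; ++t)
            if (nearestInvZ > coarse[t]) return false;  // might be visible in this tile
        return true;
    }
};
```

The same idea extends to 2D tiles and more levels, which is what allows a whole polygon to be rejected at once.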
« Last Edit: March 10, 2012 by hellfire »

Offline Stonemonkey

Re: z fighting
« Reply #12 on: March 13, 2012 »
Quote
You could also include the z-pass for each scanline into the texture-pass to check if the scanline is actually visible, so you don't have to process your data twice.
Yep, that's what I'm doing. I find the min and max x and y of the fragments that pass the z test on the poly and store those values, which are used for clipping the poly in the texture stage; the rasterizer steps straight into the poly at the clipped coords. If there was something in front of a poly splitting it in two, however, it would still rasterize the complete poly.
Quote
When laying out the zbuffer in a hierarchical structure you're even able to check the visibility of multiple fragments (or even a complete polygon).
Thanks, I'll have a read through that.

Offline hellfire

Re: z fighting
« Reply #13 on: March 13, 2012 »
Quote
Yep, that's what I'm doing. I find the min and max x and y of the fragments that pass the z test on the poly and store those values, which are used for clipping the poly in the texture stage; the rasterizer steps straight into the poly at the clipped coords. If there was something in front of a poly splitting it in two, however, it would still rasterize the complete poly.
Yes, in the first pass you're rasterizing the whole polygon and in the second pass only the visible part.
So I thought you could combine the two by including a z-test prior to each scanline and only rasterizing the scanline (with all attributes interpolated) if it's visible.
Code: [Select]
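// pre-test: interpolate only z to see if any fragment of the scanline is visible
const auto zStart= z;  // remember z at the left edge, the pre-test advances it
bool visible= false;
for (int x=left;x<right && !visible;x++,z+=deltaz)
  if (z > zbuffer[x]) visible= true;  // one visible fragment is enough

if (visible)
{
  z= zStart;  // restart the interpolation for the real pass
  for (int x=left;x<right;x++)
  {
    if (z > zbuffer[x])
    {
      zbuffer[x]= z;  // there is no separate z-pass here, so depth is written too
      screenbuffer[x]= color;
    }
    z+=deltaz;
    u+=deltau;
    v+=deltav;
    r+=deltar;
    g+=deltag;
    b+=deltab;
  }
}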

Offline Stonemonkey

Re: z fighting
« Reply #14 on: March 13, 2012 »
Hmmm, could do. Maybe to take that further

Code: [Select]
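int visible_l=-1,visible_r=-1,*visible=&visible_l;  // -1 = nothing visible yet
for (int x=left;x<right;x++,z+=deltaz)
  if (z > zbuffer[x])
  {
    *visible=x;          // first hit sets visible_l, every later hit overwrites visible_r
    visible=&visible_r;
  }
if (visible_r<0) visible_r=visible_l;  // only one fragment passed
//some code to step into scanline

//draw scanline from visible_l to visible_r (skip it if visible_l is still -1)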

Offline Stonemonkey

Re: z fighting
« Reply #15 on: March 15, 2012 »
Another thought I've had from time to time is this, to have 2 loops inside the inner loop that are selected by the z test. To me it looks like it should work but it doesn't always seem to be the case.

Code: [Select]
sub textured_triangle(byval buffer as gfx_buffer ptr,_
                        byval triangle as triangle ptr)
   
'sort high and low verts
    dim as vertex ptr v0=triangle->v(0),v1=triangle->v(1),v2=triangle->v(2)
    if v1->sy<v0->sy then swap v0,v1
    if v2->sy<v0->sy then swap v0,v2
    if v2->sy<v1->sy then swap v1,v2

    if v2->sy>v0->sy then
       
'calculate offset between pixel and depth buffers
        dim as integer buffer_offset=(cast(integer,buffer->pixels)-_
                                        cast(integer,buffer->depth))shr 2

'types and stuff for rasterizing
        dim as gradient edge,scan
        dim as single ptr x0=@edge.x,x1=@edge.e
        dim as vertex ptr vr=v0
        dim as brush ptr brush=triangle->brush
        dim as gfx_buffer ptr t=brush->texture
   
        scope
           
'mipmapping (still needs some tinkering)
            dim as single area=abs((v1->sx-v0->sx)*(v2->sy-v0->sy)-_
                                (v2->sx-v0->sx)*(v1->sy-v0->sy))_
                                /(t->wwidth*t->height)
            while (area<triangle->tex_area)and(t->mipmap<>0)
                t=t->mipmap
                area*=4.0
            wend
            v0->su=((v0->u+brush->offset_u)*t->wwidthf-.4999)*v0->sz
            v0->sv=((v0->v+brush->offset_v)*t->heightf*t->wwidthf-.4999)*v0->sz
            v1->su=((v1->u+brush->offset_u)*t->wwidthf-.4999)*v1->sz
            v1->sv=((v1->v+brush->offset_v)*t->heightf*t->wwidthf-.4999)*v1->sz
            v2->su=((v2->u+brush->offset_u)*t->wwidthf-.4999)*v2->sz
            v2->sv=((v2->v+brush->offset_v)*t->heightf*t->wwidthf-.4999)*v2->sz

'gradient for short edge
            edge.de=(v1->sx-v0->sx)/(v1->sy-v0->sy)
           
'gradients for long edge
            dim as single dy=1.0/(v2->sy-v0->sy)
            edge.dx=(v2->sx-v0->sx)*dy
            edge.dz=(v2->sz-v0->sz)*dy
            edge.da=(v2->sa-v0->sa)*dy
            edge.dr=(v2->sr-v0->sr)*dy
            edge.dg=(v2->sg-v0->sg)*dy
            edge.db=(v2->sb-v0->sb)*dy
            edge.ds=(v2->ss-v0->ss)*dy
            edge.du=(v2->su-v0->su)*dy
            edge.dv=(v2->sv-v0->sv)*dy
   
'gradients for horizontal scanlines
            dy=v1->sy-v0->sy
            dim as single dx=1.0/(v1->sx-(v0->sx+edge.dx*dy))
            scan.dz=(v1->sz-(v0->sz+edge.dz*dy))*dx
            scan.da=(v1->sa-(v0->sa+edge.da*dy))*dx
            scan.dr=(v1->sr-(v0->sr+edge.dr*dy))*dx
            scan.dg=(v1->sg-(v0->sg+edge.dg*dy))*dx
            scan.db=(v1->sb-(v0->sb+edge.db*dy))*dx
            scan.ds=(v1->ss-(v0->ss+edge.ds*dy))*dx
            scan.du=(v1->su-(v0->su+edge.du*dy))*dx
            scan.dv=(v1->sv-(v0->sv+edge.dv*dy))*dx

'flip left/right edges if required
            if edge.dx>edge.de then swap x0,x1
           
        end scope

'rasterize
        for tri_half as integer=0 to 1
            if v1->sy>v0->sy then
               
                dim as integer y_start=v0->sy+0.4999,y_end=v1->sy-0.4999
                if y_start<0 then y_start=0
                if y_end>=buffer->height then y_end=buffer->height-1
                dim as single dy=y_start-vr->sy
                edge.x=vr->sx+edge.dx*dy
                edge.e=vr->sx+edge.de*dy
                edge.z=vr->sz+edge.dz*dy
               
                while y_start<=y_end
                    dim as integer x_start=*x0+0.4999,x_end=*x1-.4999
                    if x_start<0 then x_start=0
                    if x_end>=buffer->wwidth then x_end=buffer->wwidth-1
                    dim as single dx=x_start-edge.x
                    scan.z=edge.z+scan.dz*dx
                    dim as single ptr p_start=buffer->depth+y_start*buffer->pitch+x_start
                    dim as single ptr p_end=buffer->depth+y_start*buffer->pitch+x_end
                   
                    while p_start<=p_end
                       
                        if *p_start<scan.z then
                           
                            scan.a=vr->sa+edge.da*dy+scan.da*dx
                            scan.r=vr->sr+edge.dr*dy+scan.dr*dx
                            scan.g=vr->sg+edge.dg*dy+scan.dg*dx
                            scan.b=vr->sb+edge.db*dy+scan.db*dx
                            scan.s=vr->ss+edge.ds*dy+scan.ds*dx
                            scan.u=vr->su+edge.du*dy+scan.du*dx
                            scan.v=vr->sv+edge.dv*dy+scan.dv*dx
                           
                            while (p_start<=p_end)and(*p_start<scan.z)

                                *p_start=scan.z
                                dim as single zz=1.0/scan.z
                                dim as uinteger argb=t->pixels[((scan.u*zz) and t->u_mask)or((scan.v*zz) and t->v_mask)]
                                'shade argb
                                cast(uinteger ptr,p_start)[buffer_offset]=argb
                               
                                scan.z+=scan.dz
                                scan.a+=scan.da
                                scan.r+=scan.dr
                                scan.g+=scan.dg
                                scan.b+=scan.db
                                scan.s+=scan.ds
                                scan.u+=scan.du
                                scan.v+=scan.dv
                                dx+=1.0
                                p_start+=1
                            wend
                        else
                            while (p_start<=p_end)and(*p_start>=scan.z)
                                scan.z+=scan.dz
                                dx+=1.0
                                p_start+=1
                            wend
                        end if
                    wend
                    edge.x+=edge.dx
                    edge.e+=edge.de
                    edge.z+=edge.dz
                    dy+=1.0
                    y_start+=1
                wend
            end if
            edge.de=(v2->sx-v1->sx)/(v2->sy-v1->sy):v0=v1:v1=v2:vr=v2
        next
    end if
end sub
« Last Edit: March 15, 2012 by Stonemonkey »

Offline hellfire

Re: z fighting
« Reply #16 on: March 19, 2012 »
Interesting approach.
Have you already tested how much this saves versus a naive ordered zbuffer loop?

Offline Stonemonkey

Re: z fighting
« Reply #17 on: March 20, 2012 »
Something else I'm doing is using a bounding cube around the object and testing that against z so I've tried the comparisons with and without that. And I'm interpolating 7+z values across the tri but only using 2 for the texture lookup

With the bounding cube test (completely hidden objects are rejected) there's 4-5% speedup with using what I posted above.

Without the bounding cube test (all objects are passed for rendering)  there's around 8% speedup.

That's in FB, I'm kind of going between that and C++. Having separate z pass and z fail loops means there's less tri_half setup and if the scanline is completely hidden then there's less scanline setup.

and in C++ the z fail loop is:
Code: [Select]
dx+=1.0f;
while (  ((p_start+=1)<=p_end)  &&  (*p_start>=(scan.z+=scan.dz))  )  dx+=1.0f;
but I've not tested this yet.

Offline hellfire

Re: z fighting
« Reply #18 on: March 21, 2012 »
Slightly different topic but when I was looking at your textured_triangle function, I noticed there's no sub-pixel correction present.
Maybe you left it out for readability but I once wrote a bit about it here.
This might be another reason for the inconsistent interpolation you mentioned in your first post.

edit: Ah, don't mind me. I just wasn't looking close enough  :whack:
« Last Edit: March 21, 2012 by hellfire »

Offline Stonemonkey

Re: z fighting
« Reply #19 on: March 21, 2012 »
The code that does the clipping also deals with sub pixel correction

dim as single dy=y_start-vr->sy

in C++ it's

float dy=(float)y_start-vr->sy;

In the top half of the tri vr is the top vert and y_start=(int)ceil(vr->sy) so dy ends up as the fractional part plus any lines that are clipped off, the same happens for dx.
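
As a self-contained C++ sketch of that prestep (the `Prestep` struct, `prestepEdge` and the `minRow` parameter are names invented for this example):

```cpp
#include <algorithm>
#include <cmath>

struct Prestep {
    int   firstRow;  // first scanline that is actually drawn
    float value;     // interpolated value on that scanline
};

// Move an edge value from the vertex to the first drawn row.
// dy holds the sub-pixel fraction plus any whole rows clipped off at minRow.
inline Prestep prestepEdge(float vertexY, float vertexValue, float gradientPerRow,
                           int minRow) {
    const int firstRow = std::max(static_cast<int>(std::ceil(vertexY)), minRow);
    const float dy = static_cast<float>(firstRow) - vertexY;
    return {firstRow, vertexValue + gradientPerRow * dy};
}
```

With a vertex at y=2.25, a value of 10 and a gradient of 2 per row, the first row is 3 and the value there is 10 + 2 * 0.75 = 11.5.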

EDIT: ah, someone on phone while i was writing that so didn't see you worked it out.