Topic: float to int using slightly different magic number (seems unstable)

Stonemonkey · « **on:** June 03, 2007 »

I've been looking into some other ways of doing this and came up with the idea of adding the magic number before interpolation, but it appears to be a bit unstable. I think the divide by z for perspective correction may be making it worse. Any ideas on this, anyone? Am I doing anything wrong, and could it be made to work? It seems to work fine for this example but has problems when it's actually used in my rendering, although something else might be causing those. I'll keep at it anyway.

Code: [Select]

option explicit

dim as integer i
dim as double x0,x1
dim as short pointer xi
xi=cast(short ptr,@x0)+2

'start and end values
x0=-10066.00
x1=1767.000


dim as single dx,dy

'number of steps
dy=10.0

'add the magic numbers
x0+=1572864.0
x1+=1572864.0

'interpolate
dx=(x1-x0)/dy
for i=0 to 10
    print *xi
    x0+=dx
next

sleep
end
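For comparison, here is a minimal C sketch of the same extraction, assuming IEEE 754 doubles and a platform where doubles and 64-bit integers share byte order. The helper name `magic_to_short` is made up for illustration. Adding 1572864.0 (1.5 x 2^20) pushes the fraction into the low 32 mantissa bits, so bits 32..47 (the third short in the FB listing above) hold the low 16 bits of the integer part.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 1.5 * 2^20: after adding this, the double's unit in the last place is 2^-32. */
#define MAGIC_16 1572864.0

/* Hypothetical helper: returns floor(x) for x in [-32768, 32768). */
int16_t magic_to_short(double x)
{
    assert(x >= -32768.0 && x < 32768.0);

    double biased = x + MAGIC_16;
    uint64_t bits;
    memcpy(&bits, &biased, sizeof bits);    /* well-defined way to read the bit pattern */

    /* Bits 32..47 are the low 16 bits of the integer part. */
    return (int16_t)(uint16_t)(bits >> 32);
}
```

Note that this variant rounds towards minus infinity, so -0.5 comes out as -1, and values outside the signed short range wrap around.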

Stonemonkey · « **Reply #1 on:** June 03, 2007 »

Actually, it seems to work pretty well linearly but just has problems with the perspective correction.

ninogenio · « **Reply #2 on:** June 03, 2007 »

Please do keep at it, fryer!

It's all very interesting!

Jim · « **Reply #3 on:** June 04, 2007 »

I don't think that will work. When you do an operation on a pair of floats, the scale will change depending on the result. So you need to add the correction in afterwards.
It might be informative to print out the hex version of the double to see what's changing: the top bit is the sign, the next 11 bits are the exponent, and the remaining 52 are the mantissa.
Maybe:
print hex(*cast(ulongint ptr,@x0))

Jim
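A rough C sketch of that kind of inspection, assuming 64-bit IEEE 754 doubles. The struct and function names here are invented for the example. It splits a double into the sign, exponent and mantissa fields described above so they can be printed before and after each operation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    unsigned sign;       /* 1 bit */
    unsigned exponent;   /* 11 bits, biased by 1023 */
    uint64_t mantissa;   /* 52 stored bits, the leading 1 is implicit */
} double_fields;

/* Hypothetical helper: pull the three fields out of a double. */
double_fields split_double(double x)
{
    assert(sizeof(double) == sizeof(uint64_t));

    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);

    double_fields f;
    f.sign = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7FF);
    f.mantissa = bits & 0xFFFFFFFFFFFFFULL;
    return f;
}

/* Print the fields in hex so a change of exponent is easy to spot. */
void print_double_fields(double x)
{
    double_fields f = split_double(x);
    printf("sign=%u exponent=0x%03X mantissa=0x%013llX\n",
           f.sign, f.exponent, (unsigned long long)f.mantissa);
}
```

Calling `print_double_fields` on the interpolated value at each step would show whether the exponent field ever moves away from the one set by the magic number.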

Stonemonkey · « **Reply #4 on:** June 04, 2007 »

The example works as long as the values are within the range of signed or unsigned shorts .... as I was just typing 'not too much of a problem with shading or uv coords' I just realised that I shift the v coord up before the interpolation and that's probably going out of that range.

EDIT: Pretty sure that's the problem, it's just happening with the tiled textures.

Stonemonkey · « **Reply #5 on:** June 04, 2007 »

Bah, wrong again. Too late for this atm anyway.

Stonemonkey · « **Reply #6 on:** June 04, 2007 »

Another example, now with perspective correction and it still seems to be working:

Code: [Select]

option explicit

dim as integer i
dim as double u0,u1,z0,z1,du,dy,dz,result
dim as short ptr ui=cast(short ptr,@result)+2

'start and end values
u0=-1234.0
z0=1000.0

u1=10255.0
z1=2000.0

dy=15.0

'add the magic numbers
u0+=1572864.0
u1+=1572864.0

z0=1.0/z0
z1=1.0/z1
u0*=z0
u1*=z1

'interpolate
du=(u1-u0)/dy
dz=(z1-z0)/dy

for i=0 to 15
    result=u0/z0
    print "my_short=";*ui;"    double=";result-1572864.0
    u0+=du
    z0+=dz
next

sleep
end
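One way to check how stable this is would be to run the biased and unbiased interpolation side by side and record the largest difference. The C sketch below does that under the same setup as the listing above. The function name `max_bias_error` and its parameters are assumptions made for the example, not part of the original code.

```c
#include <assert.h>

/* Hypothetical helper: largest difference between perspective-correct
   interpolation of u with the magic number added up front and the same
   interpolation without it. */
double max_bias_error(double u0, double z0, double u1, double z1,
                      int steps, double magic)
{
    assert(steps > 0 && z0 > 0.0 && z1 > 0.0);

    double iz = 1.0 / z0;
    double iz_end = 1.0 / z1;

    double plain = u0 * iz;                  /* u/z without the magic number */
    double plain_end = u1 * iz_end;
    double biased = (u0 + magic) * iz;       /* (u + magic)/z */
    double biased_end = (u1 + magic) * iz_end;

    double d_plain = (plain_end - plain) / steps;
    double d_biased = (biased_end - biased) / steps;
    double d_iz = (iz_end - iz) / steps;

    double worst = 0.0;
    for (int i = 0; i <= steps; i++) {
        double err = (biased / iz - magic) - (plain / iz);
        if (err < 0.0)
            err = -err;
        if (err > worst)
            worst = err;
        plain += d_plain;
        biased += d_biased;
        iz += d_iz;
    }
    return worst;
}
```

With the 1572864.0 magic number the difference stays far below one texel for the values used here. With a much larger magic number such as 6755399441055744.0 the rounding step is a whole unit, so the same test would be expected to show errors of about a texel.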

Stonemonkey · « **Reply #7 on:** June 05, 2007 »

On a slightly different note, while searching for anything related to this I found what is supposed to be Quake 3's approximate invsqrt, the code in Java:

Code: [Select]

public static float invSqrt(float x)
{
float xhalf = 0.5f * x;
int i = Float.floatToRawIntBits(x);
i = 0x5f3759df - (i >> 1);
x = Float.intBitsToFloat(i);
x = x * (1.5f - xhalf*x*x);
return x;
}

and rewritten for FB:

Code: [Select]

function invSqrt(byval x as single)as single
    dim as single xhalf = 0.5 * x
    dim i as integer ptr=cast(integer ptr,@x)
    *i=&h5f3759df-(*i shr 1)
    x=x*(1.5-xhalf*x*x)
    return x
end function

The FB function is only very slightly faster than using 1/sqr(x), but improves a little if it's written inline rather than called as a function.

I've not tried it in C++ yet to do any comparisons, but I'm not sure why it uses 'i' when it could use *(int*)&x
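For reference, a C version might look like the sketch below. It assumes 32-bit IEEE 754 floats, and it uses `memcpy` instead of the `*(int*)&x` cast because the cast breaks C's aliasing rules, even though it usually works in practice. The function name `inv_sqrt` is just a label for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x) for positive x: bit-level initial guess
   followed by one Newton-Raphson step. */
float inv_sqrt(float x)
{
    assert(x > 0.0f);

    float xhalf = 0.5f * x;
    uint32_t i;
    memcpy(&i, &x, sizeof i);           /* reinterpret the float's bits */
    i = 0x5f3759dfu - (i >> 1);         /* initial guess */
    memcpy(&x, &i, sizeof x);

    return x * (1.5f - xhalf * x * x);  /* one refinement step */
}
```

Compilers generally turn the two `memcpy` calls into plain register moves, so this should cost no more than the pointer cast.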

Stonemonkey · « **Reply #8 on:** June 07, 2007 »

Does anyone know of a method similar to the invsqrt above for either division or reciprocals? I can use invsqrt for a reciprocal by squaring the input first, and it still looks at least as fast as just doing the divide. There is a slight error in the results, but it's good enough for texture mapping.
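The squaring approach can be sketched in C as follows, assuming 32-bit IEEE 754 floats and positive inputs only, since 1/sqrt(x*x) is 1/|x| and loses the sign. The names `inv_sqrt` and `fast_rcp` are placeholders for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x) for positive x, as in the routine quoted earlier. */
float inv_sqrt(float x)
{
    assert(x > 0.0f);

    float xhalf = 0.5f * x;
    uint32_t i;
    memcpy(&i, &x, sizeof i);
    i = 0x5f3759dfu - (i >> 1);
    memcpy(&x, &i, sizeof x);

    return x * (1.5f - xhalf * x * x);
}

/* Hypothetical helper: approximate 1/x for positive x via 1/sqrt(x*x). */
float fast_rcp(float x)
{
    assert(x > 0.0f);
    return inv_sqrt(x * x);
}
```

The error is the same as that of the inverse square root itself, a fraction of a percent with a single refinement step. Squaring also halves the usable exponent range before overflow or underflow.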

Stonemonkey · « **Reply #9 on:** June 07, 2007 »

Another interesting trick for divides:

http://www.stereopsis.com/2div.html

taj · « **Reply #10 on:** June 07, 2007 »

Fantastic link. Karma++

Jim · « **Reply #11 on:** June 07, 2007 »

The fast routine there uses one iteration of Newton-Raphson to get a reasonable approximation for 1/sqrt. I don't think this technique is applicable to division directly.
http://en.wikipedia.org/wiki/Newton's_method
http://betterexplained.com/articles/understanding-quakes-fast-inverse-square-root/

Jim

Stonemonkey · « **Reply #12 on:** June 07, 2007 »

Cool stuff, most of the links I found on it didn't even try to explain it. Still, it's not like I understand it now.

Stonemonkey · « **Reply #13 on:** July 27, 2007 »

Not only can you use the magic number to convert from float to int, it works the other way too (or at least seems to so far, provided the high int of the double has been set first). Probably not much use though.

Code: [Select]

dim as double test

dim as integer pointer itest
itest=cast(integer pointer,@test)


'convert float to int
test=1659.6555
test+=6755399441055744.0
print *itest

'convert int to float
*itest=98765
test-=6755399441055744.0
print test

sleep
end

Had another thought too: when using pointers, the program has to look up the address of the pointer, read the address stored there, and then look up that address. Would using unions be a better idea as it wouldn't need to look up an address to get an address?

EDIT: my idea of int to float doesn't work for negative numbers.
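A C sketch of both directions, assuming IEEE 754 doubles and the default round-to-nearest mode. The function names are invented for the example. Writing a negative value into only the low 32 bits fails because the borrow never reaches the high word. One possible workaround, shown here as an assumption rather than the method used above, is to add the integer to the magic number's full 64-bit pattern.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAGIC_32 6755399441055744.0     /* 2^52 + 2^51 */

/* Double to int: rounds to nearest rather than truncating. */
int32_t double_to_int(double x)
{
    assert(x >= -2147483648.0 && x <= 2147483647.0);

    double biased = x + MAGIC_32;
    uint64_t bits;
    memcpy(&bits, &biased, sizeof bits);
    return (int32_t)(uint32_t)bits;     /* low 32 bits hold the integer */
}

/* Int to double: a 64-bit add lets the borrow from a negative value
   carry into the high word. */
double int_to_double(int32_t i)
{
    double magic = MAGIC_32;
    uint64_t bits;
    memcpy(&bits, &magic, sizeof bits);

    bits += (uint64_t)(int64_t)i;       /* wraps modulo 2^64 for negatives */

    double biased;
    memcpy(&biased, &bits, sizeof biased);
    return biased - MAGIC_32;
}
```

As in the FB listing, 1659.6555 converts to 1660 because the addition rounds to the nearest integer.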

Jim · « **Reply #14 on:** July 28, 2007 »

Quote

Would using unions be a better idea as it wouldn't need to look up an address to get an address?

The only way to tell is to try it and disassemble the results. If you use the value more than once, keeping pointers hanging round is going to win, I would guess.

Jim

Stonemonkey · « **Reply #15 on:** July 28, 2007 »

This is the sort of thing I was meaning:

Code: [Select]

union float_to_int
{
  double in;
  int out;
};



image_struct *texture=triangle->texture;
int u_mask=texture->wwidth-1,v_mask=(texture->height-1)*texture->wwidth;
double magic=6755399441055744.0;
union float_to_int convert_u,convert_v,convert_s;




//z_buffer, tiled pc texture lookup and shading

if(z>*z_buffer)
{

  *z_buffer=z;
  zr=1.0f/z;

  convert_u.in=u*zr+magic;
  convert_v.in=v*zr+magic;
  convert_s.in=s*zr+magic;

  argb=*(texture->argb+((convert_u.out&u_mask)|(convert_v.out&v_mask)));

  *(z_buffer+argb_offset)=((((argb&0x00ff00ff)*convert_s.out)&0xff00ff00)|(((argb&0x0000ff00)*convert_s.out)&0x00ff0000))>>8;

}

And I'm still convinced I can take the magic number addition out of the loop.
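A cut-down C sketch of the union conversion and the masked tiled lookup. It assumes a little-endian machine, so that `out` overlays the low 32 bits of `in`, and a hypothetical 256x256 texture. C allows reading a union member other than the one last written, whereas C++ formally does not, even though compilers usually accept it. The names `convert` and `texel_index` are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define MAGIC 6755399441055744.0

typedef union {
    double in;
    int32_t out;    /* low 32 bits of the double on little-endian machines */
} float_to_int;

/* Hypothetical helper: round x to the nearest int through the union. */
int32_t convert(double x)
{
    assert(sizeof(float_to_int) == sizeof(double));

    float_to_int c;
    c.in = x + MAGIC;
    return c.out;
}

/* Texel index for an assumed 256x256 texture. v is scaled by the row
   width before conversion so that the mask picks out whole rows. */
int32_t texel_index(double u, double v)
{
    const int32_t u_mask = 256 - 1;
    const int32_t v_mask = (256 - 1) * 256;
    return (convert(u) & u_mask) | (convert(v * 256.0) & v_mask);
}
```

Because the masks work on two's complement integers, coordinates outside the texture wrap around, which is what makes the texture tile.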

Stonemonkey · « **Reply #16 on:** August 01, 2007 »

Sorry, that was from my C++ code, here's how it could be done in FB:

Code: [Select]

option explicit

union float_to_int_conversion
    _in as double
    _out as integer
end union

const magic=6755399441055744.0

dim shared as float_to_int_conversion float_to_int



sub main()
    
    dim as single my_float=123.456

    float_to_int._in=my_float+magic

    print float_to_int._out

    sleep
    
end sub

main()
end

And in this case it does use 1 less memory lookup, whatever difference that might make.

asm output doing the conversion using unions:

Code: [Select]

fld qword ptr [_Lt_0006]
fadd dword ptr [ebp-4]
fstp qword ptr [_FLOAT_TO_INT]
push 1
push dword ptr [_FLOAT_TO_INT]
push 0
call _fb_PrintInt@12

asm output doing the conversion using pointers:

Code: [Select]

fld qword ptr [_Lt_0006]
fadd dword ptr [ebp-4]
fstp qword ptr [_DOUBLE_IN]
push 1
mov eax, dword ptr [_INT_OUT]
push dword ptr [eax]
push 0
call _fb_PrintInt@12

Jim · « **Reply #17 on:** August 01, 2007 »

Are you sure? The 3 pushes at the end of each are the code to put the parameters on the stack for the call to PrintInt. It's not clear in this example whether you've gained anything.

I've been thinking about your technique of adding the conversion factor before the interpolation.
I'm pretty sure that as long as your converted number never changes its power-of-2 (that is, it doesn't double or halve during the interpolation), then you will always be OK. I can't see how that could ever happen in your case since the conversion factor is huge in comparison to your interpolates.

The only other possible flaw is that you will lose precision every time you do an add, because most of the bits are being taken up by the conversion factor.
i.e. your mantissa used to look like this (for a value of, say, 5):
before
010000...00000000000000000000
 ^ this is the 5
after
100000...00000000000000001000
^ this is the correction factor
                         ^ this is the 5
(not accurate bit patterns, just an example)
So instead of having 53 bits of precision you may only have 4 or 5.
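The size of the loss can be measured by looking at the gap between one double and the next. The C sketch below assumes IEEE 754 doubles, and `ulp_above` is a name chosen for the example. It returns that gap for a positive value, so it can be compared before and after the magic number is added.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: distance from x to the next representable
   double above it, for positive finite x. */
double ulp_above(double x)
{
    assert(x > 0.0);

    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits += 1;                          /* next bit pattern up */

    double next;
    memcpy(&next, &bits, sizeof next);
    return next - x;
}
```

For 5.0 on its own the gap is 2^-50. After adding 1572864.0 it grows to 2^-32, and after adding 6755399441055744.0 it is exactly 1, so nothing below a whole unit survives.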

Jim

Stonemonkey · « **Reply #18 on:** August 01, 2007 »

The line:

push dword ptr [eax]

in the pointers version pushes the contents of address [eax] onto the stack, so it has to load the address into eax, retrieve the data from that address, and then write it to the stack, whereas the union version reads from an absolute address and writes that to the stack.

I think the problem with my other method might be inaccuracy caused by the division. I was adding the magic number to the U/V values before the division by z, and because each value then has a very large part (the magic number) and a very small part (the actual U or V), some of the very small part might be getting lost in the division. It's not a huge error, but the textures don't line up on adjacent polys and tend to jump/slide about a little bit.

It might still be possible but could involve some shifting or masking which would probably not give much advantage.

Jim · « **Reply #19 on:** August 02, 2007 »

Your magic is 2^52 + 1/2 x 2^52 (= 2^52 + 2^51 = 6755399441055744), which shunts all the remainder of the precise bits down into the lower 32. If your range of numbers is much smaller than 2^(53-32) (+-2Meg) then you're going to be losing bits compared with what you had when you were working in ints. Every operation you do will reduce the precision, especially division.

I guess to get more precision, you could use (say) 2^33+2^32, but then as you say, you'd have to scale the resulting integers after you'd converted them.
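A C sketch of that idea, assuming IEEE 754 doubles and using the 2^33 + 2^32 value suggested above. The names are invented for the example. With this magic number the low 32 bits come out as a fixed-point number with 19 fractional bits, so the result has to be shifted down to get the integer part, and the usable range shrinks to +-4096.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAGIC_FIXED 12884901888.0   /* 2^33 + 2^32 */
#define FRAC_BITS   19              /* 52 mantissa bits minus 33 */

/* Hypothetical helper: convert x to 13.19 fixed point. */
int32_t to_fixed(double x)
{
    assert(x >= -4096.0 && x < 4096.0);

    double biased = x + MAGIC_FIXED;
    uint64_t bits;
    memcpy(&bits, &biased, sizeof bits);
    return (int32_t)(uint32_t)bits;     /* low 32 bits are x * 2^19 */
}

/* Scale back to an integer. The shift is arithmetic on mainstream
   compilers, which makes this a floor for negative values. */
int32_t fixed_to_int(int32_t f)
{
    return f >> FRAC_BITS;
}
```

The extra shift is the cost mentioned above. Whether it is worth paying depends on how much of the fraction the interpolation needs to keep.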

Jim