Dark Bit Factory & Gravity
PROGRAMMING => Freebasic => Topic started by: Xalthorn on June 16, 2008
-
Hi folks,
I'm going crazy, I've spent a couple of days banging my head against what should be a simple piece of asm code :(
HELP!
In Freebasic, I prepare the following array:
DIM SHARED PSINES(512) AS INTEGER
DIM I AS INTEGER
DIM RAD AS DOUBLE
FOR I=0 TO 512
RAD=(I*0.703125)*0.0174532
PSINES(I)=SIN(RAD)*511
NEXT
Now, unless I'm going nuts, this array now holds a sine table as signed integers.
I now want to grab four of those values and add them together in asm. So I prepare the following variables outside of my subroutine
DIM SHARED TPOS1 AS UINTEGER=0
DIM SHARED TPOS2 AS UINTEGER=0
DIM SHARED TPOS3 AS UINTEGER=0
DIM SHARED TPOS4 AS UINTEGER=0
These variables will hold values from 0-511.
I then prepare the following pointer in my routine
DIM PTRSINE AS INTEGER PTR
The appropriate piece of the asm code to add these numbers together is:
mov edi,[PTRSINE]
mov esi,[tpos1]
mov eax,[edi+esi]
mov ebx,eax
mov edi,[PTRSINE]
mov esi,[tpos2]
mov eax,[edi+esi]
add ebx,eax
mov edi,[PTRSINE]
mov esi,[tpos3]
mov eax,[edi+esi]
add ebx,eax
mov edi,[PTRSINE]
mov esi,[tpos4]
mov eax,[edi+esi]
add ebx,eax
I'm fairly sure that I'm doing something silly. I'm wondering if two's complement is killing me. I don't know how INTEGERS are stored and am assuming that it's the grabbing and adding of these numbers that is messing me around.
Then again, it could be the logic of my additions (I've gone through assorted variations, that's just the current version). I'm hoping it isn't down to particular registers being used for addition functions by default and I'm oblivious to registers that I'm using being overwritten by the calculations.
I'm going to keep at it, but any input/advice is really really welcome.
-
I can't see what you've done wrong, but I know that if you do what you are trying to do directly in FB it will be simpler and more likely to be correct. You should use that as a sanity check. I think the -s option for fbc writes out the asm. Or is it -S?
Jim
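A minimal sketch of the kind of plain FreeBasic sanity check suggested above, to compare against the asm result. The table is built as in the first post; the four sample positions are made-up values for illustration only, and PTRSINE is assumed to point at the first table entry.

```freebasic
' Sketch: plain FreeBasic reference for the four-lookup sum.
DIM SHARED PSINES(512) AS INTEGER
DIM SHARED TPOS1 AS UINTEGER = 0      ' sample positions, chosen arbitrarily
DIM SHARED TPOS2 AS UINTEGER = 128
DIM SHARED TPOS3 AS UINTEGER = 256
DIM SHARED TPOS4 AS UINTEGER = 384

' Build the sine table exactly as in the original post.
DIM I AS INTEGER
DIM RAD AS DOUBLE
FOR I = 0 TO 512
    RAD = (I * 0.703125) * 0.0174532
    PSINES(I) = SIN(RAD) * 511
NEXT

' Pointer indexing counts in elements, so no manual scaling is needed here.
DIM PTRSINE AS INTEGER PTR = @PSINES(0)
DIM TOTAL AS INTEGER
TOTAL = PTRSINE[TPOS1] + PTRSINE[TPOS2] + PTRSINE[TPOS3] + PTRSINE[TPOS4]
PRINT TOTAL
```

If the asm version prints a different total for the same four positions, the fault is in how the asm reads the table rather than in the table itself.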
-
Hi Jim,
I'm not sure what you mean (brain is becoming increasingly frazzled at the moment). I wrote the routine first of all in plain FreeBasic code, and I'm running the two versions alternately to see what isn't working in my asm version.
For example, trying the 'kerplunk' method of programming, I've changed both versions to do very little but throw a static value to the screen, and then I've started adding little changes as I go to both versions to see if they output the same result.
I'm curious about the -s option. Does this mean that I can write something in FreeBasic and see what the compiled version generates?
Am I being really silly, do I actually need to drop into asm for speed? Or is that only for optimising where I think that the default compiled version could be improved?
For example, as there is no way (at least I can't find it) in FreeBasic to throw huge chunks of screen data around, is that a perfect situation for a fast asm copy loop?
In other words, if I'm just messing around with individual pixels, will I actually gain any benefit from writing it in asm?
If I'm chasing my tail on a pointless exercise, I'll laugh myself silly. Although I'd still like to learn why my code is falling over.
-
I have no clue about Basic, but in C++ integers have a size of 4 bytes, so you should take that into account when addressing:
mov edi,[PTRSINE]
mov esi,[tpos1]
mov eax,[edi+esi*4]
-
LOL
I knew it would be something amusingly, embarrassingly, annoyingly, stupidly, simple :D
Yep, there's me, wondering why I can't simply pull a particular sine value out of the array when I'm incrementing by byte rather than word.
A quick twiddle of the code later and it works as it should do.
Thanks ever so much, now if I had only posted this Friday night rather than wrestling with a myriad of alternate variations only to hit the same result, I might have actually finished this code by now.
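For anyone hitting the same thing, the byte-versus-element mix-up can be reproduced in plain FreeBasic. This is only a sketch: the names TABLE, P, B and IDX are made up for the example, and SIZEOF(INTEGER) is used instead of a hard-coded 4 because the element size depends on the compiler target.

```freebasic
' Sketch: element index vs. byte offset when reading through a pointer.
DIM TABLE(0 TO 3) AS INTEGER = {10, 20, 30, 40}
DIM P AS INTEGER PTR = @TABLE(0)
DIM IDX AS INTEGER = 2

' Pointer indexing scales by the element size automatically.
PRINT P[IDX]

' The same read with an explicit byte offset, which is what the asm must do:
' address = base + index * element size.
DIM B AS UBYTE PTR = CAST(UBYTE PTR, P)
PRINT *CAST(INTEGER PTR, B + IDX * SIZEOF(INTEGER))

' Leaving out the multiply would read from IDX bytes into the table,
' i.e. from the middle of the first element.
PRINT "element size in bytes: "; SIZEOF(INTEGER)
```

The first two PRINT lines read the same element; the scale factor in [edi+esi*4] plays the role of the SIZEOF multiply.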
-
Are you really using asm to add some sine tables together?
I am thinking this is probably unnecessary optimisation, but I am glad you have it
working now. :)
I tend to only use asm for things like rasterising triangles and copying large chunks of data.
For nearly everything else I stick to the usual command set (using pointers wherever I can).
-
Nope, I'm using asm to do an effect which involves sine tables and stuff. As that's only one part of the demo, I figured I'd convert it to asm for optimisation.
Whether it has gained me any 'slack' or not has yet to be tested, but it's never a bad idea to try ;)
Either way, with that annoying thing out of the way (I figured it would take me half an hour tops to convert, rather than a whole weekend of faffing about) I can get on with the other parts of the demo.
-
edi doesn't change in that code, so it is safe to move PTRSINE into edi only once. You should use the accumulator for all arithmetic. I would personally use ebx in place of edi because that is a perfect role for the base register. This would free edi, and you would be using ebx for its intended purpose (its special ability). Whenever you make use of a register's intended use, the resulting code is faster and often smaller. I would rewrite this as
mov ebx,[PTRSINE] ; <- only need be done once (no values get stored in PTRSINE during the code block)
mov esi,[tpos1]
mov eax,[ebx+esi*4] ; <- Move value into accumulator
mov esi,[tpos2]
add eax,[ebx+esi*4] ; <- using accumulator for arithmetic
mov esi,[tpos3]
add eax,[ebx+esi*4] ; <- more accumulator arithmetic
mov esi,[tpos4]
add eax,[ebx+esi*4] ; <- saves having to use the base pointer
-
After a couple of days banging my head against this, I started to make every calculation long and drawn out.
When I got it working, I stripped the extra edi loads, but didn't know I could add directly into the accumulator.
Thanks for that :)
-
Hi Xalthorn, FreeBasic should be pretty fast and can do a fair bit of pixel work, but inline asm can help a lot in some cases. If you're using arrays for screen/image buffers, avoid 2D arrays and stick to 1D arrays, although I would recommend using pointers like Shockwave said.
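A rough sketch of what that can look like, assuming a made-up 320x200 buffer of UINTEGER pixels (W, H, BUFFER and the pixel pattern are placeholders for illustration, not anything from the demo):

```freebasic
' Sketch: a 1D buffer addressed as Y * W + X, filled through a pointer.
CONST W = 320
CONST H = 200
DIM SHARED BUFFER(0 TO W * H - 1) AS UINTEGER

' Sequential fill: walk the whole buffer with one pointer.
DIM P AS UINTEGER PTR = @BUFFER(0)
DIM AS INTEGER X, Y
FOR Y = 0 TO H - 1
    FOR X = 0 TO W - 1
        *P = (X XOR Y) AND 255   ' placeholder pixel value
        P += 1                   ' advances by one element, not one byte
    NEXT
NEXT

' Random access into the same buffer: row offset first, then column.
PRINT BUFFER(10 * W + 5)
```

Walking a pointer avoids recalculating Y * W + X for every pixel, which is the main saving over indexing a 2D array in the inner loop.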
If you add -r to the compiler command line the asm output of the compiler is saved in the same folder where the .bas file is.
-
Just read over at the FB site that in the latest versions of FB, while the -r compiler option saves the asm output, it no longer goes on to produce the .exe file.
-
Hi Xalthorn, how's the scroller coming along?
-
Umm.... slowly
I did a basic sine scroller, that was easy. But I'm playing with another idea now and I've had a few occasions where I've banged my head against a brick wall to find that the eventual solution was simple.
I'm suffering from feature creep though.... because Freebasic can cope with so much, I keep adding ideas to it. So I'm trimming it down to something a little simpler so that I can at least call the little project finished and move on to something else, taking what I've learned with me.
I think rather than try to cram loads of ideas into one demo so early on, I should do a few little ones and then bring them together later.