Heh! It works now!!!
Well, thanks! Hmm, so it's just a 2D version of the dot product, applied to the screen points, NOT the projected ones (that was my main problem). I think I get it now. It really works well on its own now, with the Z-buffer disabled, for two of my demo objects like the plasma cube or the led blur energy drink cylinder.

For the rest of the objects I should also code a little polygon sorting routine. Mmm... I can take the average Z of the three points of each triangle and then use a quicksort I already have here for my vectorballs.
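
For anyone following along, here is a rough sketch of that Z-average sort. It is only an illustration, not the routine from the demo: the `Vertex` and `Triangle` types and the function names are made up for this example, the standard `qsort` stands in for the vectorball quicksort, and it assumes a larger Z means farther from the camera, so triangles end up ordered far to near for drawing back to front.

```c
#include <stdlib.h>

/* Hypothetical types for this sketch; integer coordinates, as you would
   likely use on a machine without an FPU. */
typedef struct { int x, y, z; } Vertex;
typedef struct { int a, b, c; int depth; } Triangle; /* a, b, c index into the vertex array */

/* qsort comparator: larger depth (farther away) sorts first.
   Written with comparisons rather than a subtraction to avoid overflow. */
static int compare_far_to_near(const void *lhs, const void *rhs)
{
    const Triangle *l = lhs;
    const Triangle *r = rhs;
    return (l->depth < r->depth) - (l->depth > r->depth);
}

/* Assumes larger z = farther from the camera. */
void sort_triangles_by_depth(Triangle *tris, size_t count, const Vertex *verts)
{
    for (size_t i = 0; i < count; i++) {
        /* The sum of the three Z values orders triangles exactly like
           their average, so the divide by 3 can be skipped. */
        tris[i].depth = verts[tris[i].a].z
                      + verts[tris[i].b].z
                      + verts[tris[i].c].z;
    }
    qsort(tris, count, sizeof tris[0], compare_far_to_near);
}
```

Keep in mind that sorting by average Z is only an approximation: large or intersecting triangles can still end up drawn in the wrong order.
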
My original purpose was to remove the Z-buffer from my GP32 demo. Yesterday I accidentally disabled the Z-buffer for one object in my demo and was puzzled at first about why it was so much faster (too many memory reads and writes with the Z-buffer here, and the GP32 has only a 16 KB data cache ;PPP). Then I thought that objects like the cube or the cylinder in particular could do without the Z-buffer, and even without Z-sorting, if I had the test in question. I already had the other one, the 3D dot product after projection, but that one would still display polygons that are close to a 90-degree angle and should be hidden because of the projection, so even in these objects you could see some glitches! Now I get double the speed for flat and Gouraud objects without the Z-buffer, but just a 25% increase for textured objects, because the texture reads still kill the cache. My plasma cube got almost double the speed too, though, because I calculate the plasma in real time by feeding U,V into a plasma function directly in my rasterizer routine.
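
A minimal sketch of the screen-space test discussed above, for reference. It is not the exact code from the demo: the `ScreenPoint` type and the function names are invented here, and the test is written in its usual form, as the Z component of the cross product of two triangle edges in screen coordinates (twice the signed area). The sketch assumes Y grows downward on screen and that front faces are wound clockwise as seen on screen; flip the comparison if your winding is the other way round.

```c
/* Hypothetical type: vertex position in screen pixels. */
typedef struct { int x, y; } ScreenPoint;

/* Twice the signed area of the triangle on screen, i.e. the Z component
   of the cross product of edges (p1 - p0) and (p2 - p0). */
static int signed_area2(ScreenPoint p0, ScreenPoint p1, ScreenPoint p2)
{
    return (p1.x - p0.x) * (p2.y - p0.y)
         - (p1.y - p0.y) * (p2.x - p0.x);
}

/* Returns 1 if the triangle should be drawn, 0 if it can be culled.
   Assumption: Y points down and front faces are clockwise on screen,
   which gives a positive value here. Zero-area (edge-on) triangles
   are culled too. */
int is_front_facing(ScreenPoint p0, ScreenPoint p1, ScreenPoint p2)
{
    return signed_area2(p0, p1, p2) > 0;
}
```

Because the test runs on the final screen coordinates, it sees exactly what the rasterizer will draw, which is why it avoids the glitches near edge-on polygons that a 3D test ignoring the perspective can produce.
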

Yey! Some parts of my demo had started crawling on my GP32, but now I can save them. I am still impressed by how many more optimizations I could do on this machine, but I won't do them now because I have to release the demo, lol. Maybe for the new GP2X I'll optimize the engine much more. There is hidden power, I guess, if it's coded correctly.
