Topic: Coding 1k Intro

va!n · « **on:** March 20, 2008 »

hi guys,

I will try to find someone at BP who can hopefully show me the basics of working with Visual Studio. Since I have a 1k framework (thanks rbraz!) and a small, simple effect (two loops that just plot a color), I will try to finish a 1k intro at the party for the compo, or for an upcoming party.

The question is the following, since I will try to code in C++ (Visual Studio 2008 installed) using DX9: what is the best (smallest) way to do screen plotting (like a plasma or any other effect where you have to plot each pixel)?

- Software (just drawing on a bitmap and resize this to the screen as texture)?
- Drawing directly to the DX/D3D ScreenBuffer?
- Using PixelShader?

PS: If I manage to get the fx working with VC and in 1k, I will try to add the 1k music (if possible), else I will release it without music this time ^^

Jim · « **Reply #1 on:** March 21, 2008 »

Hard to say which would be the smaller without trying them all. I'd say OpenGL Pixel Shaders, followed by locking the Direct3D backbuffer, followed by doing a GDI blit, but I could easily be wrong. You will almost certainly need to code in C and not C++. C++ code relies too much on the libraries, but again, ymmv.

I've not been able to get crinkler working with VC 2008. I wonder if that's just me?

Jim

Rbz · « **Reply #2 on:** March 21, 2008 »

Since you are going to make a pixel based effect, you can try Iq's framework, it's really easy to use and tiny.

http://in4k.untergrund.net/index.php?title=Drawing_Pixels

va!n · « **Reply #3 on:** March 21, 2008 »

@rbz:
thanks... still have problems...

Jim · « **Reply #4 on:** March 21, 2008 »

Neat little GDI framework there rbz! Possibly just missing an ExitProcess(0) at the end to make it exit properly.
What problems Va!n?
Jim

va!n · « **Reply #5 on:** March 21, 2008 »

@jim:
i will write a pm in a hurry

va!n · « **Reply #6 on:** April 17, 2008 »

First I have to thank all the guys here for helping me to start coding 1k intros! thanks!

I am trying to optimize and fix my first 1k intro from the BP party for a 100% final version. I managed to reduce its size a lot by trying (learning by doing) what happens when changing parts of the code or rewriting complete things, with great success.

Atm i am still using CodeBlocks with GNU GCC compiler and i will try to move over to VisualStudio2008 and its compiler very soon. Now i have some general C/CPP and optimizing questions:

Quote

The first line works fine, and temp is declared static. It adds 1 to temp on every loop and resets it to 0 when it reaches 2048 (using mod). I have tried to reduce the file size by changing the line... this produces a smaller exe, but the variable temp is no longer reset to 0. So is there any other way to get a smaller file size, e.g. doing the % with inline asm?
Code: [Select]
temp = (temp+1)%2048;   // static, this works fine, packed exe == 1091 bytes
(temp ++) % 2048;       // static, does not work, packed exe == 1083 bytes
temp++;                 // static, does not work, packed exe == 1083 bytes
temp++;                 // no static, does not work, packed exe == 1076 bytes
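
The second variant computes `(temp++) % 2048` and throws the result away, so only the increment survives; that is why the counter never wraps. Because 2048 is a power of two, a bit mask gives the same wrap as `%` for non-negative values. A minimal sketch in C, assuming temp is a non-negative int; the function names are made up for illustration, and whether the masked form actually packs smaller has not been tested here.

```c
#include <assert.h>

/* Hypothetical helper names, for illustration only. */

/* The variant that works in the post: the modulo result is stored back. */
static int wrap_mod(int temp)
{
    assert(temp >= 0);
    return (temp + 1) % 2048;
}

/* Same wrap using a bit mask; valid because 2048 is a power of two
   and temp is assumed to be non-negative. */
static int wrap_mask(int temp)
{
    assert(temp >= 0);
    return (temp + 1) & 2047;
}

/* The variant that "does not work": the modulo result is discarded,
   so only the increment takes effect. */
static int wrap_discarded(int temp)
{
    (void)((temp++) % 2048);
    return temp;
}
```

Starting from 2047, `wrap_mod` and `wrap_mask` both return 0, while `wrap_discarded` returns 2048.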

Quote

I am using mciSendString, but mciSendCommand is also available; you could possibly save some bytes by writing directly to its structures (MCI_LOAD_PARMS, MCI_PLAY or MCI_VD_PLAY_PARMS, I think). Sadly I have not managed to get the following two lines working with mciSendCommand.
Code: [Select]
mciSendString("open c:\\windows\\ourmovie.mpg type mpegvideo alias v", NULL, 0, NULL);
mciSendString("play v",NULL,0,NULL);

If someone can help here, it would be really nice!

Quote

When exiting the program using ExitProcess(0); ... is it important to write "return 0;" after "ExitProcess(0);", or is there no problem with removing the "return 0;"?
Code: [Select]
ExitProcess(0);
return 0;

Quote

Creating the fx, i am using two loops (x and y)...
Code: [Select]
float t1,t2,t3;

for (x=639; x>0; --x)
{
	t2 = x*0.017453292519943295f;		// x*(2*#PI/360)
//	t1 = Sin(t2);
//	t3 = Cos(t2);
	__asm __volatile__ ("fsincos" : "=t" (t3), "=u" (t1) : "0" (t2));

	for (y=339; y>0; --y)
	{
		buffer[x+ ((y+70)*640) ]= RGB((t1*y+temp),(t3*y-temp),240);
	}
}

Is there any way to put this complete fx into just only one loop? I have tried it but it seems somewhere is something wrong and i am not really sure how to manage it to put all in just one loop. (info: the buffer is 640*480 in size)
Code: [Select]
for (i=640*340-1; i>0; --i)
{
	t2 = i*0.017453292519943295f;		// x*(2*#PI/360)
//	t1 = sin(t2);
	__asm __volatile__ ("fsincos" : "=t" (t3), "=u" (t1) : "0" (t2));

//	for (y=339; y>0; --y)
	buffer[i+44800]= RGB((t1*(i+44800)+temp),(t3*(i+44800)-temp),240);
}

Quote

Atm the intro is compiled with GCC... I think and still hope that the same intro will compile to a smaller exe with Visual Studio 2008 and its compiler!? (I have to change some code parts because of incompatibilities between GCC and VC2008 syntax ^^)

thanks in advance!

rain_storm · « **Reply #7 on:** April 17, 2008 »

I've seen people using ret in assembler instead of ExitProcess NULL, and it seems to work just fine, and it saves you having to import ExitProcess. There may be situations where a simple ret is not enough, though. But when you are using ExitProcess there is no need to tag a ret on there as well; leaving it out will only save you between 1 and 5 bytes, depending on the size of the value you are returning in eax.

va!n · « **Reply #8 on:** April 17, 2008 »

mhhhh.... I thought about just using "return 0" instead of ExitProcess(0) too... but I think ExitProcess does some more things (system internally) than just RET... (so I think it would be a very dirty hack, wouldn't it?)

Jim · « **Reply #9 on:** April 17, 2008 »

You need ExitProcess(0) and no ret, otherwise it will hang at exit on Vista.

Jim

va!n · « **Reply #10 on:** April 17, 2008 »

@jim:
thanks for the info. i still thought that any other way would be a very dirty hack and will make problems.

So i only need:

Code: [Select]

ExitProcess(0);

and not...

Code: [Select]

ExitProcess(0);
return 0;

right?

Any ideas or tips for my other questions?

thx

Jim · « **Reply #11 on:** April 18, 2008 »

You need ExitProcess(0); is all.

Anything you try to do in your inner loops to make it one loop rather than two is futile. The looping construct, while it looks like more text, is actually smaller than trying to decode the 'x' value from the loop value, IMHO.

Jim

hellfire · « **Reply #12 on:** April 18, 2008 »

Quote

Any ideas or tips for my other questions?
Quote
Creating the fx, i am using two loops (x and y)
[...]
Is there any way to put this complete fx into just only one loop?

Why would you want that?
You probably expect your code to get smaller if you can save one loop, but it won't, because:
- Your computation depends on your loop-counters
- You have per-scanline setup to do.
However, you can simply increase your pointer (*buffer++=...) instead of permanently calculating the next pixel's address (buffer[x+ y*640]=...).
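
A minimal sketch of the pointer-increment idea in C. It assumes the loops run row by row (y outer, x inner), which is not the loop order in the original post, and `pixel()` is a hypothetical stand-in for the real colour computation. Both functions fill the buffer identically; the second one just never computes `x + y*640`.

```c
#include <assert.h>
#include <stdint.h>

#define WIDTH  640
#define HEIGHT 480

/* Hypothetical stand-in for the per-pixel colour computation. */
static uint32_t pixel(int x, int y)
{
    return (uint32_t)(x ^ y);
}

/* Indexed version: recomputes the address x + y*WIDTH for every pixel. */
static void fill_indexed(uint32_t *buffer)
{
    int x, y;
    assert(buffer != 0);
    for (y = 0; y < HEIGHT; ++y)
        for (x = 0; x < WIDTH; ++x)
            buffer[x + y * WIDTH] = pixel(x, y);
}

/* Pointer version: walks the buffer linearly, so no address arithmetic
   is needed. It is still two loops, because pixel() needs both x and y. */
static void fill_pointer(uint32_t *buffer)
{
    uint32_t *p = buffer;
    int x, y;
    assert(buffer != 0);
    for (y = 0; y < HEIGHT; ++y)
        for (x = 0; x < WIDTH; ++x)
            *p++ = pixel(x, y);
}
```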

rain_storm · « **Reply #13 on:** April 18, 2008 »

Not really, it is very much possible to discard the outer loop. But first you must define the array to have only one dimension. Then, and this is the most important part, make sure that your array is easy to manipulate to extract the missing dimension. 640*480 is not really suitable for this. You must pick an X resolution that is an exact power of two, let's say 256*192 (that's 0x100*0xC0 in hex), which is 4:3 ratio giving square pixels. Looking at the index into the array, you can see that the index is simply a word with the high byte containing the y displacement (so index = y*256 + x) and the low byte containing the x displacement.

Now its a simple matter of doing :

Code: [Select]

 y = and index, 0xFF00
 y = shr y, 8
 x = and index,0x00FF

of course at this resolution the problem is overly simplified but you can tweak the bit masks and apply the appropriate shifts to the y part.

Edit: If you are really clever about it you can just stuff the index into ax then ah = y al = x no need to and or shift
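
The same decode written out in C, as a sketch under the assumption of a 256-pixel-wide buffer whose index fits in 16 bits (index = y*256 + x). The function names are hypothetical.

```c
#include <assert.h>

/* Assumes a 256-pixel-wide buffer, so index = y*256 + x,
   and an index that fits in 16 bits. */

/* x is the low byte of the index. */
static unsigned decode_x(unsigned index)
{
    assert(index < 0x10000u);
    return index & 0x00FFu;
}

/* y is the high byte of the index, shifted down by 8 bits. */
static unsigned decode_y(unsigned index)
{
    assert(index < 0x10000u);
    return (index & 0xFF00u) >> 8;
}
```

For example, index 3*256 + 17 decodes to x = 17 and y = 3.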

hellfire · « **Reply #14 on:** April 18, 2008 »

Quote

You must pick an X resolution that is an exact power of two

And that's why I didn't even mention it.
The resolutions are:
256 - ridiculously low
512 - not too compatible for fullscreen
1024 - probably too much for software-processing

With all other resolutions you'll most probably spend more space *and* time on extracting your x-coordinate than a second loop would ever cost.
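
For comparison, a sketch of what the single-loop version looks like at a width of 640: both coordinates have to be recovered with an integer division and a remainder on every pixel. `colour()` and the other names are hypothetical stand-ins, and no size measurements were made for this sketch.

```c
#include <assert.h>

#define WIDTH  640
#define HEIGHT 340

/* Hypothetical stand-in for the per-pixel colour computation. */
static unsigned colour(int x, int y)
{
    return (unsigned)(x * 3 + y * 7);
}

/* Single loop: x and y are recovered from the linear index.
   With WIDTH = 640 this needs a remainder and a division per pixel,
   where a power-of-two width would need only a mask and a shift. */
static void fill_one_loop(unsigned *buffer)
{
    int i;
    assert(buffer != 0);
    for (i = 0; i < WIDTH * HEIGHT; ++i) {
        int x = i % WIDTH;
        int y = i / WIDTH;
        buffer[i] = colour(x, y);
    }
}
```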

rain_storm · « **Reply #15 on:** April 18, 2008 »

^all valid points

Code: [Select]

for (y=339; y>0; --y)
{
	buffer[x+ ((y+70)*640) ]= RGB((t1*y+temp),(t3*y-temp),240);
}

thats the inner loop but the only thing being done in the outer loop is

Code: [Select]

for (x=639; x>0; --x)
{
	t2 = x*0.017453292519943295f;		// x*(2*#PI/360)
	(inner loop)
}

It's going to take a lot more bytes to perform the outer loop than it's worth, since it's just a degrees -> radians conversion:

Code: [Select]

fild [x]
fmul [0.017453292519943295]
fstp [t2]

Just three instructions and only two memory accesses. Add in the loop and there are additional memory accesses for the comparison, the increment and then storing x, as well as taking up byte space. So just keep one index and one loop, and replace x*(2*#PI/360) with x = y*[1/640] : t2 = x*[2*#PI/360]. Now it becomes:

Code: [Select]

fild [y] // y is an integer index
fmul [0.0015625] // 1/640
fst [x] // store x but keep it on the FPU stack
fmul [0.017453292519943295] // 2*pi/360
fstp [t2]

thats just five instructions and all done in one loop it may not be as fast but its smaller

Stonemonkey · « **Reply #16 on:** April 18, 2008 »

I didn't know you could do this:

Code: [Select]

fmul [0.017453292519943295]

How does the assembler deal with that, does it write the value into the code or does it store the value elsewhere and address it?

hellfire · « **Reply #17 on:** April 18, 2008 »

Quote

just keep one index and replace x = y*[1/640]

Keep in mind that your calculation is floating point.
So for each scanline "y", x now doesn't start with 0.0 but with y/640.0 - probably unnoticeable, probably undesirable.
And instead of converting deg to rad for every pixel, one would linearly interpolate in radians in the first place.
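
A sketch of that interpolation in C: instead of multiplying x by the degrees-to-radians constant for every column, the angle is stepped by the constant. The array and function names are hypothetical; the constant is the one used in the thread. Repeated float addition accumulates a small rounding error compared with the multiplication, which may or may not matter for the effect.

```c
#include <assert.h>

#define WIDTH 640

/* 2*pi/360: radians per degree, the constant used in the thread. */
static const float RAD_PER_STEP = 0.017453292519943295f;

/* Fills angles[x] with approximately x * RAD_PER_STEP by repeated
   addition, so no per-column multiplication is needed. */
static void angles_stepped(float *angles)
{
    float t2 = 0.0f;
    int x;
    assert(angles != 0);
    for (x = 0; x < WIDTH; ++x) {
        angles[x] = t2;
        t2 += RAD_PER_STEP;   /* step in radians */
    }
}
```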

rain_storm · « **Reply #18 on:** April 18, 2008 »

@SM the value would be at some address my bad

im all outta ideas perhaps you Va!n can strip out an import here and there might get it down to 1024b

va!n · « **Reply #19 on:** April 18, 2008 »

wow i am surprised about all your postings trying to help me... there are so much postings, that i have to read them all carefully, trying to understand and think about..

btw, the buffer I am using is a 1D array... and yes, I think I can save some bytes by changing the 2-loop part to 1 loop. I just removed one For/Next loop (without removing the code inside this loop) and I already saved some bytes when compiled and crinkled (31 bytes difference in the packed version!).

Code: [Select]

static long buffer[307200];  // 640*480

float t1,t2,t3;

for (x=639; x>0; --x)
{
	t2 = x*0.017453292519943295f;		// x*(2*#PI/360)
//	t1 = Sin(t2);
//	t3 = Cos(t2);
	__asm __volatile__ ("fsincos" : "=t" (t3), "=u" (t1) : "0" (t2));

	for (y=339; y>0; --y)
	{
		buffer[x+ ((y+70)*640) ]= RGB((t1*y+temp),(t3*y-temp),240);
	}
}


This version is 31 bytes smaller (packed), just a fast dirty test:


float t1,t2,t3;

for (x=639; x>0; --x)
{
	t2 = x*0.017453292519943295f;		// x*(2*#PI/360)
//	t1 = Sin(t2);
//	t3 = Cos(t2);
	__asm __volatile__ ("fsincos" : "=t" (t3), "=u" (t1) : "0" (t2));

//	for (y=339; y>0; --y)
//	{
		buffer[x+ ((y+70)*640) ]= RGB((t1*y+temp),(t3*y-temp),240);
//	}
}

Optimizing the 2 loops to 1 loop is one idea... other ideas are: replacing the mciSendString stuff with mciSendCommand and using structures to possibly save some bytes, and finding an alternative for the temp = (temp+1)%1024 stuff, which could save bytes in the packed version too.

Finally, I will try to get my 1k working in Visual Studio 2008... because of this, I have to change some parts of the code like the ASM line... afaik I have to change the ASM line (GCC compiler) to something like this:

Code: [Select]

__forceinline void mySinCos(const float x,float &sine,float &cosine)
{
  __asm
  {
    fld x;
    fsincos;
    mov eax,[cosine];
    fstp dword ptr [eax];
    mov eax,[sine];
    fstp dword ptr [eax];
  }
}