Dark Bit Factory & Gravity
PROGRAMMING => C / C++ /C# => Topic started by: XiA on May 07, 2011
-
Hi!
This is my first post; I've just gotten started for REAL doing demo stuff in Windows, and I'm building a little framework for handling "the basics", like opening windows, playing MP3s in the background, syncing audio and visuals, etc., using BASS and DX9.
I have a specific texture that I use for "manual drawing", i.e. reading/writing bytes as if it were a buffer in a DOS demo (I wish).
The actual question: Right now, I'm clearing this buffer with just a for() loop, writing DWORDs with the default values (which are often just 0x00000000, but sometimes I'll default the alpha value to something else, depending on the effect). Surely there must be a way to have the hardware do this via DirectX?
I *am* already using something similar for clearing my backbuffer, but the DX documentation leaves me with more questions than it answers, and as far as I can see, I *might* be able to just set my texture as RenderTarget and do it that way, but device->Clear() also has stencil buffer and z buffer stuff, and it all just gives me a headache.
Is there a simple(r) way?
Thanks in advance,
XiA
-
Hello,
For clearing something, in Dx10 (sorry, I don't have code for Dx9) it's:
const float background_colour[] = { 0, 0, 0.25f, 0 };
_dev->ClearRenderTargetView( _backbuffer_view, background_colour );
I guess for Dx9 it's similar.
-
I have a specific texture that I use for "manual drawing", i.e. reading/writing bytes as if it were a buffer in a DOS demo
If "reading/writing bytes" means that you have not specified the "D3DUSAGE_WRITEONLY" flag when creating your texture, Direct3D will keep an internal copy of your texture in system memory. That's because random access to GPU memory is very slow, as the PCI Express architecture is designed to transfer contiguous chunks of data.
Direct3D will update the whole region when you unlock the texture and you won't be able to use the texture as a rendertarget.
-
Not being able to use the texture as a rendertarget is just fine by me; I have a background image, this texture (where I'm drawing realtime stuff like lines, circles etc) and then D3DX sprites on top of that.
-
For clearing something, in Dx10 (sorry, I don't have code for Dx9) it's:
const float background_colour[] = { 0, 0, 0.25f, 0 };
_dev->ClearRenderTargetView( _backbuffer_view, background_colour );
I guess for Dx9 it's similar.
Thanks for the reply, but that method doesn't seem to be available in DX9. I did find ColorFill() tho, but that works on a surface, and VC++ screams "cannot convert parameter 1 from 'LPDIRECT3DTEXTURE9' to 'IDirect3DSurface9 *'" at me when I try.
Any idea on how to make that work, like a way to create a dummy surface and point it to my texture?
-
pd3dDevice->Clear( 0, NULL, D3DCLEAR_TARGET, iColor, 1.0f, 0 )
That's one way to clear the backbuffer in DX9. Clear is a method of the Direct3D 9 device, and iColor is in my case a 32-bit int holding the XRGB color.
I don't know if there's a way to use that on a texture (DX isn't my area at the moment), but I suspect you could make a surface, read/write your "pixel" information to that, and then blit it to the screen when needed. I think Jim made a PTC-style lib using a surface and a full-screen quad render for Clyde to use; it might be worth taking a look at that.
-
pd3dDevice->Clear( 0, NULL, D3DCLEAR_TARGET, iColor, 1.0f, 0 )
That's one way to clear the backbuffer in DX9. Clear is a method of the Direct3D 9 device, and iColor is in my case a 32-bit int holding the XRGB color.
Hi! Thanks for pitching in, but as far as I can see, device->Clear() only works on buffers. I may very well be wrong, and if so, I'd welcome someone letting me know 8-)
-
So I assume you're using an "IDirect3DTexture9" texture? Then locking it to draw and unlocking it afterwards? If so, then there probably isn't a hardware-accelerated way to do it, since you're only actually using the hardware to render the texture at some point.
But if you're using a constant value across the whole texture, maybe you can fill faster. Perhaps if you post the code you're using to clear the texture, someone might offer a faster way.
-
So I assume you're using an "IDirect3DTexture9" texture? Then locking it to draw and unlocking it afterwards? If so, then there probably isn't a hardware-accelerated way to do it, since you're only actually using the hardware to render the texture at some point.
Yeah, you might be right about that...
But if you're using a constant value across the whole texture, maybe you can fill faster. Perhaps if you post the code you're using to clear the texture, someone might offer a faster way.
Absolutely, good idea! Here's my code (be gentle with me, I'm a noob at both DX and C++ 8-) ):
hr=texturePixelsBlock->LockRect(0,&rectPixelsLocked, NULL, D3DLOCK_NOSYSLOCK);
if(SUCCEEDED(hr)){
    ////////////////////////////////////////////////////////////////
    // The actual drawing
    pPixel32 = (DWORD *)rectPixelsLocked.pBits;
    pPixel8 = (BYTE *)rectPixelsLocked.pBits;
    // Clear texture
    if(1==1){
        for(signed int i=(rectPixelsLocked.Pitch/4);i<((rectPixelsLocked.Pitch/4)+WindowWidth*WindowHeight);i++){
            pPixel32[i]=0x00000000;
        }
    }
    // And all the rest of the drawing follows here
}
It's been a couple of years since I did 386 asm, but I suspect this code is unlikely to compile to something *really* fast, like a strcpy, but then again, I suspect there are tons of new instructions (i.e. since the 80386DX was the hottest CPU on the market, LOL) for this that VC++ might make use of. Maybe?
Also, thanks for taking the time to help out here, I really appreciate it!
-
As you have the memory address for the texture, could you use memset or a little rep stosd assembly loop to clear the buffer?
I usually use these methods when I need to clear out or copy a chunk of memory (admittedly I don't use C++ so am not certain if you can use these techniques here).
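For what it's worth, here is a rough C++ sketch of the library-call route. One thing to watch: memset repeats a single byte value, so it only suits colours whose four bytes are identical (0x00000000 is fine, 0xff000000 is not). ClearPixels is a made-up helper name, and the sketch assumes the pixels are contiguous, which is not guaranteed for a locked texture.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of any library): clear `count` contiguous
// 32-bit pixels starting at `pixels` to `color`.
void ClearPixels(std::uint32_t* pixels, std::size_t count, std::uint32_t color)
{
    if (count == 0) return;
    assert(pixels != nullptr);

    // memset can only be used when all four bytes of the colour are equal.
    const std::uint32_t lowByte = color & 0xFFu;
    const bool allBytesEqual = (color == lowByte * 0x01010101u);

    if (allBytesEqual) {
        std::memset(pixels, static_cast<int>(lowByte), count * sizeof(std::uint32_t));
    } else {
        // Typed fill: writes the full 32-bit value into every pixel.
        std::fill_n(pixels, count, color);
    }
}
```

Calling ClearPixels(p, n, 0x00000000) takes the memset path, while ClearPixels(p, n, 0xff000000) falls back to the typed fill.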
-
Try this code below, not sure if it will perform any better than MSVC for/loop.
//Create a black texture
g_pd3dDevice->CreateTexture(32, 32, 1, D3DUSAGE_DYNAMIC,
    D3DFMT_A8R8G8B8, D3DPOOL_DEFAULT , &blackTex, NULL);
blackTex->LockRect( 0, &lockTex, NULL, D3DLOCK_DISCARD);
_asm{
    mov edi, lockTex.pBits
    mov ecx, (32*32)
    mov eax, 0xff000000
    rep stosd
}
blackTex->UnlockRect(0);
-
Try this code below, not sure if it will perform any better than MSVC for/loop.
Yup, that made a measurable difference! My C++ for() loop averaged roughly 1.9 ms on this machine, your asm code averages just below 1.4 ms! Thanks a lot mate, I'm keeping that one. 8-)
-
_asm{
mov edi, lockTex.pBits
mov ecx, (32*32)
mov eax, 0xff000000
rep stosd
}
In this special case of a managed resource it might be safe to assume no additional pitch, but I'd never count on it.
Actually there can be quite a lot of pitch on GPU resources, which is useless to clear.
Nvidia usually pads to the next power of two, so e.g. for a 32-bit texture of 640 pixels width (2560 bytes) you'd actually be clearing 4096 bytes for each scanline...
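To make the arithmetic concrete, here is a small sketch of how the pitch relates to the visible pixels. The helper names are made up, and the 4096-byte pitch is just the example figure from above; the real value is whatever LockRect reports.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: bytes in each scanline that are padding, not pixels.
std::size_t PaddingBytesPerRow(std::size_t widthPixels,
                               std::size_t bytesPerPixel,
                               std::size_t pitchBytes)
{
    const std::size_t visibleBytes = widthPixels * bytesPerPixel;
    assert(pitchBytes >= visibleBytes); // pitch can never be smaller than a row
    return pitchBytes - visibleBytes;
}

// Hypothetical helper: byte offset of pixel (x, y) from the start of the
// locked bits. Rows are `pitchBytes` apart, not `width * bytesPerPixel`.
std::size_t PixelByteOffset(std::size_t x, std::size_t y,
                            std::size_t bytesPerPixel,
                            std::size_t pitchBytes)
{
    return y * pitchBytes + x * bytesPerPixel;
}
```

With the numbers above, 640 * 4 = 2560 visible bytes, so a 4096-byte pitch would leave 1536 padding bytes per scanline that never need clearing.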
-
In this special case of a managed resource it might be safe to assume no additional pitch, but I'd never count on it.
Actually there can be quite a lot of pitch on GPU resources, which is useless to clear.
Nvidia usually pads to the next power of two, so e.g. for a 32-bit texture of 640 pixels width (2560 bytes) you'd actually be clearing 4096 bytes for each scanline...
Very good point!
Ok, so let's assume I have a DWORD LineWidth in C++, how do I put that in ecx?
-
// width and height of your texture
int height= ...;
int width= ...;
// fill color
unsigned int color= 0xff000000;
// lock texture
texture->LockRect( 0, &rect, NULL, D3DLOCK_DISCARD);
// fill each scanline
unsigned char *dst= (unsigned char*) rect.pBits;
for (int y=0; y<height; y++)
{
    _asm {
        mov edi,dst // destination pointer
        mov ecx,width // number of argb pixels to fill
        mov eax,color // argb color
        rep stosd
    }
    dst += rect.Pitch;
}
texture->UnlockRect( 0 );
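MSVC only accepts _asm blocks in 32-bit x86 builds, so here is a rough sketch of the same per-scanline fill in plain C++. FillRows is a made-up name; `bits` and `pitchBytes` stand in for rect.pBits and rect.Pitch, and the sketch assumes 32-bit pixels and a pitch that is a multiple of 4.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: fill `width` 32-bit pixels on each of `height`
// scanlines, stepping by `pitchBytes` between rows so padding is skipped.
void FillRows(void* bits, int pitchBytes, int width, int height, std::uint32_t color)
{
    assert(bits != nullptr);
    assert(pitchBytes % 4 == 0 && pitchBytes >= width * 4);

    unsigned char* row = static_cast<unsigned char*>(bits);
    for (int y = 0; y < height; ++y) {
        // Same job as the rep stosd above: store `color` `width` times.
        std::fill_n(reinterpret_cast<std::uint32_t*>(row), width, color);
        row += pitchBytes; // advance by the pitch, not by width * 4
    }
}
```

Between LockRect and UnlockRect this would be called as FillRows(rect.pBits, rect.Pitch, width, height, color).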
-
http://msdn.microsoft.com/en-us/library/bb174353(v=vs.85).aspx
There is the ColorFill method for surfaces too.
If you only have a texture you can get the surface by calling texture->GetSurfaceLevel.
You'd want something like
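IDirect3DSurface9 *surface = NULL;
// level 0 is the top mip level; GetSurfaceLevel adds a reference to the surface
if (SUCCEEDED(texture->GetSurfaceLevel(0, &surface)))
{
    device->ColorFill(surface, NULL, 0xff000000); // NULL rect = fill the whole surface
    surface->Release(); // balance the reference taken by GetSurfaceLevel
}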
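That only works with certain surface types (as far as I recall from the docs, a render target, a render-target texture or an off-screen plain surface in D3DPOOL_DEFAULT), and of course you can only fill with a single colour.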
I suspect the asm code isn't going to be much faster (maybe a tiny bit) than using memset or a slightly unrolled for loop.
Jim
-
I suspect the asm code isn't going to be much faster (maybe a tiny bit) than using memset
I suspect memset to be faster than a stosd-loop as it's optimized to clear buffers of a few hundred bytes or more - but I'm too lazy to actually measure it ;)
But clearing a buffer is certainly the least significant optimization option when doing software pixel processing anyway...
-
Agreed, have a look at the source for the VC runtime included in Visual Studio. IIRC it moves 128 bits (4 dwords) at a time using SSE, which will be marginally better than stosd.
Jim
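If anyone wants to actually measure it, a minimal timing harness might look like the sketch below. AverageMs, ClearTimings and CompareClears are made-up names, and the numbers it produces depend heavily on compiler, optimisation flags and machine, so treat it as a starting point only.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: run `fn` `repeats` times, return the mean time in ms.
template <typename Fn>
double AverageMs(Fn&& fn, int repeats)
{
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) fn();
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / repeats;
}

struct ClearTimings { double memsetMs; double fillMs; };

// Clears the same buffer to zero with memset and with std::fill,
// and reports the mean time per clear for each.
ClearTimings CompareClears(std::vector<std::uint32_t>& pixels, int repeats)
{
    assert(!pixels.empty());
    ClearTimings t{};
    t.memsetMs = AverageMs([&] {
        std::memset(pixels.data(), 0, pixels.size() * sizeof(std::uint32_t));
    }, repeats);
    t.fillMs = AverageMs([&] {
        std::fill(pixels.begin(), pixels.end(), 0u);
    }, repeats);
    return t;
}
```

An optimising compiler may turn the std::fill into the same memset call, in which case the two timings would come out close to identical.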
-
(code)
Ah, it's THAT easy! Thanks a bunch!
-
unsigned char *dst= (unsigned char*) rect.pBits;
for (int y=0; y<height; y++)
{
    _asm {
        mov edi,dst // destination pointer
        mov ecx,width // number of argb pixels to fill
        mov eax,color // argb color
        rep stosd
    }
    dst += rect.Pitch;
}
Yup, this is actually SLIGHTLY faster, about 1.3ms on this laptop, thanks! 8-)