Author Topic: using multithreading in freebasic (Read 11147 times)

ninogenio · « **on:** April 02, 2013 »

Hello all, just wondering if anyone has tried multithreading in FreeBASIC, as I'd love to give it a try but don't know where I would even begin. I'm working on a demo that does loads of number crunching: rendering to different buffers, applying filters to these buffers, combining them, etc. However, I'm on a Core i7, so my processor is largely going to waste atm.

When doing multithreading, is there a specific library that can be used, or do certain built-in instructions suffice? Cheers for any help in advance.

ninogenio · « **Reply #1 on:** April 02, 2013 »

Just realized there is a threading example in the FreeBASIC folder. Just by offloading one of my intensive blur filters to a second core, my fps has gone from 45ish to 105-109, so I'm gobsmacked!!

hellfire · « **Reply #2 on:** April 04, 2013 »

Quote from: ninogenio on April 02, 2013

offloading one of my intensive blur filters to a second core, my fps has gone from 45ish to 105-109

The trick with "intensive blur filters" is to come up with a fast downsampling filter to halve or quarter the source image without introducing aliasing.
This way the complex filter has to process only 1/4 or 1/16 of the original number of pixels.
Try to move the rgb-processing inner loops to mmx or sse first. Then go for threads.
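
A minimal C sketch of the halving step described above, under a few assumptions that are not from the thread: pixels are packed 32-bit values with 8 bits per channel, the image width and height are even, and a plain 2x2 box average is good enough. A box filter reduces aliasing rather than eliminating it, so treat this as a starting point. The names `average4` and `downsample_half` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Average four packed pixels channel by channel (8 bits per channel). */
static uint32_t average4(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t sum = ((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)
                     + ((c >> shift) & 0xFF) + ((d >> shift) & 0xFF);
        out |= ((sum + 2) >> 2) << shift;   /* +2 rounds to nearest */
    }
    return out;
}

/* Halve a w*h image in both directions: each destination pixel is the
 * average of one 2x2 block of the source. dst must hold (w/2)*(h/2) pixels.
 * Assumption: w and h are even. */
static void downsample_half(const uint32_t *src, size_t w, size_t h,
                            uint32_t *dst)
{
    assert(w % 2 == 0 && h % 2 == 0);
    size_t dw = w / 2;
    for (size_t y = 0; y < h / 2; ++y) {
        const uint32_t *r0 = src + (2 * y) * w;   /* upper source row */
        const uint32_t *r1 = r0 + w;              /* lower source row */
        for (size_t x = 0; x < dw; ++x)
            dst[y * dw + x] = average4(r0[2 * x], r0[2 * x + 1],
                                       r1[2 * x], r1[2 * x + 1]);
    }
}
```

Calling `downsample_half` twice gives the 1/16 pixel count mentioned above; the expensive blur then runs on the small buffer only.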

ninogenio · « **Reply #3 on:** April 05, 2013 »

cheers hellfire,

I'm currently downsampling to 1/4, blurring, then upsampling. I have never tried any mmx or sse as I'm useless at x86 asm, though I can see the benefit of pixel packing. Would you happen to have a very basic example of mmx or sse I could give a try?

There is a bit more to my multithreading than I first posted. I split my blur up into 4 quads, carefully, to avoid mutex locking, then run each quad on its own thread in parallel. It gives very cheap, massive speed gains. But of course, if packing groups of data together for processing gives good returns too, I'm all for that.
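
The quad split described above can be sketched in C roughly as follows. This is an illustration under stated assumptions, not the poster's actual code: it uses POSIX threads, a single-channel 8-bit image, and a 3-tap horizontal box blur as a stand-in for the real filter. `BlurJob`, `blur_rect` and `blur_in_quads` are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *src;      /* shared by all threads, read-only          */
    uint8_t *dst;            /* shared, but each job writes only its rect */
    size_t w;                /* full image width in pixels                */
    size_t x0, y0, x1, y1;   /* this job's rectangle, end-exclusive       */
} BlurJob;

/* Thread entry point: 3-tap horizontal box blur over one rectangle.
 * Neighbours are clamped at the image edge, not at the rectangle edge. */
static void *blur_rect(void *arg)
{
    const BlurJob *j = arg;
    for (size_t y = j->y0; y < j->y1; ++y) {
        const uint8_t *row = j->src + y * j->w;
        for (size_t x = j->x0; x < j->x1; ++x) {
            size_t xl = x > 0 ? x - 1 : x;
            size_t xr = x + 1 < j->w ? x + 1 : x;
            j->dst[y * j->w + x] =
                (uint8_t)((row[xl] + row[x] + row[xr] + 1) / 3);
        }
    }
    return NULL;
}

/* Split the image into four quadrants and blur each on its own thread. */
static void blur_in_quads(const uint8_t *src, uint8_t *dst,
                          size_t w, size_t h)
{
    pthread_t tid[4];
    BlurJob job[4];
    int created[4];
    size_t mx = w / 2, my = h / 2;

    for (int i = 0; i < 4; ++i) {
        job[i] = (BlurJob){ src, dst, w,
                            (i & 1) ? mx : 0, (i & 2) ? my : 0,
                            (i & 1) ? w : mx, (i & 2) ? h : my };
        created[i] = pthread_create(&tid[i], NULL, blur_rect, &job[i]) == 0;
        if (!created[i])
            blur_rect(&job[i]);   /* fall back to the calling thread */
    }
    for (int i = 0; i < 4; ++i)
        if (created[i])
            pthread_join(tid[i], NULL);
}
```

No mutex is needed because the source buffer is only read and every thread writes a disjoint part of a separate destination buffer. Blurring in place would break this, since a quad's border pixels read neighbours that belong to another quad.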

hellfire · « **Reply #4 on:** April 06, 2013 »

Quote from: ninogenio on April 05, 2013

Would you happen to have a very basic example of mmx or sse I could give a try?

You can think of mmx as an additional operating mode for your fpu: the mmx registers are aliased onto the fpu registers.
When you use an mmx instruction, the fpu switches to mmx mode and whatever was in the fpu registers gets overwritten (nothing is saved for you).
When you're done with mmx operations, a special instruction ("emms" = empty mmx state) marks the fpu registers as empty and brings the fpu back into floating point operation mode (which is somewhat costly). It does not restore the old fpu contents.
If you try to do any fpu operation while you're still in mmx-mode, you will just get garbage, and the compiler might expect some values to still exist in fpu registers, so the surrounding code will fail miserably.
This means you can't use floating-point and mmx operations at the same time and you should make as few mode switches as possible.
So you basically need to remove all floats from your inner loop to avoid permanent mode switching.
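
A small C sketch of that rule, using the compiler's mmx intrinsics from `<mmintrin.h>` instead of inline asm: all the mmx work happens in one loop, `_mm_empty()` (the intrinsic for "emms") is called once afterwards, and floats are only touched after that. The operation (inverting every pixel) is just a placeholder, the `0xAARRGGBB` pixel layout is an assumption, and `invert_pixels` and `average_blue` are hypothetical names. On compilers that do not define `__MMX__` the plain C loop does all the work.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__MMX__)
#include <mmintrin.h>
#endif

/* Invert n packed 32-bit pixels in place, two pixels per mmx register. */
static void invert_pixels(uint32_t *px, size_t n)
{
    size_t i = 0;
#if defined(__MMX__)
    const __m64 ones = _mm_set1_pi8((char)0xFF);
    for (; i + 2 <= n; i += 2) {
        __m64 v;
        memcpy(&v, px + i, sizeof v);    /* load 64 bits = 2 pixels   */
        v = _mm_xor_si64(v, ones);       /* flip every bit            */
        memcpy(px + i, &v, sizeof v);    /* store them back           */
    }
    _mm_empty();   /* emms: leave mmx mode once, after the whole loop */
#endif
    for (; i < n; ++i)                   /* leftover pixel, or no mmx */
        px[i] = ~px[i];
}

/* Floating-point work, only safe to run after mmx mode has been left.
 * Assumption: blue is the low byte of each pixel (0xAARRGGBB). */
static float average_blue(const uint32_t *px, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i)
        sum += (float)(px[i] & 0xFF);
    return n ? sum / (float)n : 0.0f;
}
```

Calling `invert_pixels` and then `average_blue` is fine; mixing float math into the mmx loop itself is what has to be avoided.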

In mmx mode you have eight 64-bit registers called mm0...mm7 (just as many as the fpu has; not a big surprise).
Each register can be interpreted as 8 bytes, 4 shorts (16 bit) or 2 ints (32 bit); each data-type has its own instructions.
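
To make the "same 64 bits, different interpretation" idea concrete, here is a plain C model of one register as a union, with no mmx involved. `Packed64`, `add_bytes` and `add_words` are illustrative names; the two functions mimic what the byte-wise and 16-bit-wise packed add instructions (`paddb` and `paddw`) do, namely adding each lane independently with wraparound.

```c
#include <stdint.h>

/* One 64-bit register viewed as different lane types. */
typedef union {
    uint64_t whole;
    uint8_t  bytes[8];
    uint16_t words[4];
    uint32_t dwords[2];
} Packed64;

/* Add as 8 independent bytes: carries never cross a byte boundary. */
static Packed64 add_bytes(Packed64 a, Packed64 b)
{
    Packed64 r;
    for (int i = 0; i < 8; ++i)
        r.bytes[i] = (uint8_t)(a.bytes[i] + b.bytes[i]);
    return r;
}

/* Add as 4 independent 16-bit words: carries stay inside each word. */
static Packed64 add_words(Packed64 a, Packed64 b)
{
    Packed64 r;
    for (int i = 0; i < 4; ++i)
        r.words[i] = (uint16_t)(a.words[i] + b.words[i]);
    return r;
}
```

With `a.whole = 0x00FF00FF00FF00FF` and `b.whole = 0x0001000100010001`, `add_bytes` wraps every `0xFF + 0x01` to `0x00`, while `add_words` carries into the upper byte of each word and yields `0x0100` per lane. That is why each data-type needs its own instruction.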

The easiest application for mmx is "rgba addition with saturation":
(sorry for the C example but I haven't touched freebasic for years)