Dark Bit Factory & Gravity

PROGRAMMING => General coding questions => Topic started by: Kirl on July 22, 2011

Title: Making sense of sound spectrum analysis
Post by: Kirl on July 22, 2011
The title says it all really. Yesterday I managed to get a sound spectrum analyser into my Flash projects (using a trial version of swift mp3, for those interested) so that I can easily sync animation to music. You can see a simple test here (http://kirl.nl/soundSpectrumAnalyser.swf). Use the up and down keys to start the music and to change tracks (you may have to click the screen to set the focus). The trial version only allows 30 seconds of audio to be analysed, so every track loops after 30 seconds.

So now I'm wondering what kind of information can be extracted from this spectrum analysis. In my example the circle is simply scaled by the sum of all 18 frequencies(?), which works okay-ish, but I actually have no idea what any of those 18 values stand for.

Of course I googled and wikipediated the topic a bit, but I didn't really find what I was looking for. I'd like to be able to recognise high and low tones for example and generally get a better understanding of what I'm looking at and what kind of information this spectrum contains.

Any and all info on this topic would be greatly appreciated!
Title: Re: Making sense of sound spectrum analysis
Post by: hellfire on July 22, 2011
To put it simply, the discrete Fourier transform (http://en.wikipedia.org/wiki/Discrete_Fourier_transform) gives "amplitude over frequency" from "amplitude over time".
All you can see here is the amount of bass in the lower bands.
If you want to get more information from the spectrum, I'd suggest using at least a 256-band analyser so you've got one bin for each possible note.
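
As a rough illustration of turning "amplitude over time" into "amplitude over frequency", here is a minimal sketch of a naive DFT. It is written in Python rather than ActionScript so it can be run anywhere, and it is not claimed to be what swift mp3 or any other analyser does internally. The sample rate, frame size and test tone are made-up values chosen so that the tone lands exactly on one bin.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Naive O(n^2) DFT; returns the magnitude of bins 0 .. n/2 - 1."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc) / n)
    return mags

SAMPLE_RATE = 8000  # Hz, assumed for illustration
FRAME_SIZE = 256    # samples per analysis frame, assumed
TONE_HZ = 1000      # test tone, chosen to fall exactly on a bin

# One frame of "amplitude over time": a pure sine wave.
frame = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE)
         for t in range(FRAME_SIZE)]

# "Amplitude over frequency": one magnitude per bin.
mags = dft_magnitudes(frame)
peak_bin = max(range(len(mags)), key=lambda k: mags[k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE  # bin index -> frequency
```

Each bin is SAMPLE_RATE / FRAME_SIZE wide (31.25 Hz with these numbers), so the loudest bin tells you roughly which frequency dominates the frame.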

Title: Re: Making sense of sound spectrum analysis
Post by: Kirl on July 26, 2011
Thanks, I read a bit about Fourier transforms and I was hoping I wouldn't have to get into that; it looked pretty complicated. Good to have something like that on the shelf for a rainy day though! :)

I came across FlashAmp, which lets me choose the number of bands, but it doesn't go any higher than 16 either. Would a Fourier transform on 16 bands (as opposed to 256) even yield some useful data?
Title: Re: Making sense of sound spectrum analysis
Post by: hellfire on July 26, 2011
Quote
but it doesn't go any higher than 16 either
That's probably because it uses a filterbank (a set of cascaded bandpass filters).
For a small number of bands this is notably faster than applying an FFT.

Quote
Would 16 bands even yield some useful data?
That's useful for simple beat detection (finding repeating peaks in the lower bands), which is usually enough for sync.
But it's not possible to find the characteristics of a certain sound, because each band covers almost a full octave.
At higher resolutions you can also see the harmonic content of a fundamental.
For example see here (http://esp.sr.unh.edu/lynette/sal_data/efield/hf34dn_surf.gif).
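
A minimal sketch of the kind of simple beat detection described above, in Python: flag a frame as a beat when the low-band level jumps well above its recent average. The window length, threshold and sample data are assumptions for illustration, not values taken from any of the tools mentioned in this thread.

```python
def detect_beats(low_band, window=8, threshold=1.5):
    """Return frame indices where the low-band level exceeds
    `threshold` times the average of the previous `window` frames."""
    beats = []
    for i in range(window, len(low_band)):
        recent = low_band[i - window:i]
        average = sum(recent) / window
        is_peak = low_band[i] > threshold * average
        # Skip frames that directly follow a detected beat, so one
        # long kick is not counted several times.
        follows_beat = bool(beats) and beats[-1] == i - 1
        if is_peak and not follows_beat:
            beats.append(i)
    return beats

# Made-up data: quiet bass with a kick every 10 frames.
levels = [0.1] * 40
for kick in (10, 20, 30):
    levels[kick] = 1.0

beats = detect_beats(levels)
```

For sync purposes the list of beat frames can then drive an animation, for example by scaling a shape on each listed frame.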
Title: Re: Making sense of sound spectrum analysis
Post by: Kirl on July 26, 2011
Thanks for shining some light on this! :)
Title: Re: Making sense of sound spectrum analysis
Post by: combatking0 on July 26, 2011
Which version of Flash does this use? I've been trying to make something like this for a while, and this looks great.
Title: Re: Making sense of sound spectrum analysis
Post by: Kirl on July 27, 2011
It's Flash 8. I'd been looking for this for quite some time as well, until I recently stumbled upon swift mp3, which I used for this test.

I now also found FlashAmp, which does the same thing, has more options and is offered free because it is no longer supported, but I have not yet managed to properly read out the needed values from the generated swf.
Title: Re: Making sense of sound spectrum analysis
Post by: combatking0 on July 28, 2011
[jaw drops to the ground]

I've got to do some research :D K+

(edit)
I have found Flash Amp at http://www.marmalademedia.com.au/ and it seems that it generates the spectrum data as an importable file rather than working as an add-on for the SWF to perform the analysis at run time, but this is as close as we're going to get with AS2 :)

It could even be possible to create a Vib-Ribbon clone or another audio-generated game :o
Title: Re: Making sense of sound spectrum analysis
Post by: combatking0 on July 28, 2011
It looks like using the #include directive is the best way to use Flash Amp, unless you prefer LoadVars.

Attached is a simple test, along with the .mp3 file and the .as file I used in the example.

I will attempt to make a proper graphic equalizer in the morning, as I have run out of time tonight - this example uses only the amplitude data.

(edit) I'm not sure if there's a synching problem, or if my PC is running slow again, but the effect seems to last longer than the audio. I'll look into it.

(edit 2) I have removed the attachment - a newer version exists later in the thread.
Title: Re: Making sense of sound spectrum analysis
Post by: combatking0 on July 29, 2011
Even though Flash Amp doesn't analyse more than 16 frequency bands at a time, it is possible to set the width of the bands manually.

With some fiddling - well, lots of fiddling - it may be possible to get 256 bands by analysing the same track 16 times using the 16 band setting - 16 * 16 = 256, so in theory, you should be able to get the information you want.
Each analysis must point at a different part of the spectrum, so for example, the first run would be 20 - 643 Hz, then 644 to 1267 Hz, 1268 to 1891 Hz, 1892 to 2515 Hz, 2516 to 3139 Hz, 3140 to 3763 Hz, 3764 to 4387 Hz, 4388 to 5011 Hz, 5012 to 5635 Hz, 5636 to 6259 Hz, 6260 to 6883 Hz, 6884 to 7507 Hz, 7508 to 8131 Hz, 8132 to 8755 Hz, 8756 to 9379 Hz, and then finally 9380 to 10000 Hz.

You'll then need to rename the previous output file as something else before you create the next one each time (as Flash Amp will overwrite without asking), and then consolidate the "spectrum" arrays into one .as file, giving each a different name. You could even create an array of arrays ;)

It may not be convenient, but it is possible.
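
The sixteen ranges and the "array of arrays" consolidation can be sketched as follows. This is Python rather than ActionScript, and the function names and the stand-in data are hypothetical; real data would come from the sixteen exported files.

```python
def linear_ranges(low=20, high=10000, runs=16, width=624):
    """Split low..high Hz into `runs` fixed-width ranges, one per analysis."""
    ranges = []
    for i in range(runs):
        start = low + i * width
        end = min(start + width - 1, high)  # last range is cut off at `high`
        ranges.append((start, end))
    return ranges

def merge_runs(runs):
    """runs[r][frame][band] -> merged[frame][r * bands + band]"""
    frame_count = len(runs[0])
    return [
        [value for run in runs for value in run[frame]]
        for frame in range(frame_count)
    ]

ranges = linear_ranges()

# Stand-in data: 16 runs, 3 frames each, 16 bands per frame.
# Every value is its own position in the merged 256-band frame.
fake_runs = [[[r * 16 + b for b in range(16)] for _ in range(3)]
             for r in range(16)]
merged = merge_runs(fake_runs)
```

After merging, each frame holds 256 values ordered from the lowest range to the highest, so the rest of the code only has to deal with one array per frame.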

Attached is my latest test, using just 6 bands to adjust the size of 6 circles.
The red circle represents low frequency, going up through yellow, green, cyan, blue and then magenta with the highest frequencies.
Title: Re: Making sense of sound spectrum analysis
Post by: hellfire on July 29, 2011
Quote
Each analysis must point at a different part of the spectrum, so for example, the first run would be 20 - 643 Hz, then 644 to 1267 Hz, 1268 to 1891 Hz, 1892 to 2515 Hz, 2516 to 3139 Hz, 3140 to 3763 Hz, 3764 to 4387 Hz, 4388 to 5011 Hz, 5012 to 5635 Hz, 5636 to 6259 Hz, 6260 to 6883 Hz, 6884 to 7507 Hz, 7508 to 8131 Hz, 8132 to 8755 Hz, 8756 to 9379 Hz, and then finally 9380 to 10000 Hz.
Human pitch perception is not linear in frequency; that's why there's a fixed number of notes per octave (each octave doubles the frequency).
So you should not split the spectrum into bands of fixed width in Hz but, for example, analyse each octave separately.
Music usually doesn't contain much below 40 Hz.
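
To see why fixed-width bands are a poor fit, one can count how many octaves a band spans: since each octave doubles the frequency, that is log2(high / low). A minimal Python sketch using the first and last of the sixteen ranges quoted above:

```python
import math

def octaves_spanned(low_hz, high_hz):
    """Number of octaves between two frequencies (each octave doubles)."""
    return math.log2(high_hz / low_hz)

first = octaves_spanned(20, 643)     # lowest fixed-width range
last = octaves_spanned(9380, 10000)  # highest fixed-width range
```

With these numbers the lowest range covers about five octaves while the highest covers less than a tenth of one, so nearly all of the musically interesting detail ends up in a single run.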
Title: Re: Making sense of sound spectrum analysis
Post by: combatking0 on July 29, 2011
I shall improve my calculations, based on this new information.

Flash Amp's default range is 20 Hz to 10 kHz, so I was using that as a starting point, but I will make an adjustment.

What about:
40 to 80 Hz
80 to 160 Hz
160 to 320 Hz
320 to 640 Hz
640 to 1280 Hz
1280 to 2560 Hz
2560 to 5120 Hz
5120 to 10240 Hz (I'm not sure if Flash Amp can analyse frequencies above 10 kHz, but I'll try it)

This gives us 8 octaves, and when analysed with Flash Amp on the 16 Bands setting, should give us a total of 128 bands. Not quite the full 256 bands, but nearly there. Using the logarithmic setting in Flash Amp should give each band the correct frequency distribution.

We could try half-octaves to get a total of 256 bands, but I will need to adapt the mathematics further...
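
The octave list and the 8 * 16 = 128 count can be generated rather than typed out. A minimal Python sketch follows; it assumes that a "logarithmic" split means every sub-band has the same frequency ratio, which may or may not be how Flash Amp's logarithmic setting actually works.

```python
def octave_edges(start_hz=40.0, octaves=8):
    """Edges of consecutive octaves: 40, 80, 160, ... Hz."""
    return [start_hz * 2 ** i for i in range(octaves + 1)]

def log_subbands(low_hz, high_hz, count=16):
    """Split one range into `count` bands of equal frequency ratio."""
    ratio = (high_hz / low_hz) ** (1.0 / count)
    return [(low_hz * ratio ** i, low_hz * ratio ** (i + 1))
            for i in range(count)]

edges = octave_edges()

# One 16-band analysis per octave, collected lowest band first.
bands = []
for low, high in zip(edges, edges[1:]):
    bands.extend(log_subbands(low, high))
```

The same two functions would cover the half-octave idea: sixteen half-octave ranges of sixteen bands each give 256 bands over the same 40 to 10240 Hz span.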

(edit)
Is this relevant or useful?
http://www.recordingeq.com/EQ/req0400/OctaveEQ.htm
Title: Re: Making sense of sound spectrum analysis
Post by: Kirl on July 29, 2011
Hey, interesting progress here, good stuff ck0! That link is very useful to me, and my friend and housemate will certainly be thankful for it as well, as he's been struggling with mastering his music for a while.

It's an interesting topic I always wanted to explore, good to see you diving in here too! I had some trouble getting a hold of the values FlashAmp generates, so it's great to see you're well past that hurdle already!
Title: Re: Making sense of sound spectrum analysis
Post by: hellfire on July 29, 2011
Quote
What about:
40 to 80 Hz
80 to 160 Hz
160 to 320 Hz
320 to 640 Hz
640 to 1280 Hz
1280 to 2560 Hz
2560 to 5120 Hz
5120 to 10240 Hz (I'm not sure if Flash Amp can analyse frequencies above 10 kHz, but I'll try it)
That should work much better.
In your earlier attempt the lowest band would have contained the notes C-0 to D#5, which is more than half of the musical spectrum.
I'm not sure how this flash amp thing distributes the bands across the given range, but I'd try to match the individual bands with the frequency of a note (see here (http://www.phy.mtu.edu/~suits/notefreqs.html)).
So you'd either use only 12 bands per octave or (if that's not possible) use a range of 16 semitones instead of an octave.
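
Matching bands to notes can be sketched with the usual equal-temperament formula, where each semitone multiplies the frequency by 2^(1/12). This Python sketch assumes A4 = 440 Hz and uses MIDI note numbers (A4 = 69) purely as a convenient index; neither the numbering nor the function names come from Flash Amp.

```python
A4_HZ = 440.0   # assumed tuning reference
A4_MIDI = 69    # MIDI note number of A4, used here only as an index

def note_hz(midi_note):
    """Equal-temperament frequency of a note."""
    return A4_HZ * 2 ** ((midi_note - A4_MIDI) / 12)

def note_band(midi_note):
    """Band reaching a quarter tone below and above the note."""
    quarter_tone = 2 ** (1 / 24)
    centre = note_hz(midi_note)
    return (centre / quarter_tone, centre * quarter_tone)

def semitone_range(start_midi, count):
    """Frequency range covering `count` consecutive note bands."""
    low = note_band(start_midi)[0]
    high = note_band(start_midi + count - 1)[1]
    return (low, high)

octave_low, octave_high = semitone_range(57, 12)  # 12 bands from A3
wide_low, wide_high = semitone_range(57, 16)      # 16 bands from A3
```

Twelve note bands span exactly one octave (a factor of 2), while sixteen span a factor of 2^(16/12), so with 16 bands per run each run would start 16 semitones above the previous one.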
Title: Re: Making sense of sound spectrum analysis
Post by: combatking0 on July 30, 2011
Quote
I had some trouble getting a hold of the values FlashAmp generates, so it's great to see you're well past that hurdle already!

Accessing the values for Amplitude and Spectrum can be achieved as follows, though there may be other ways of doing it.

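Code:
// Overall loudness for the current frame
var current_amplitude:Number = Amplitude[current_frame];
// Per-band levels for the current frame
var current_band:Array = [];
for (var i:Number = 0; i < bands; i++) {
    current_band[i] = Spectrum[current_frame][i];
}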

Where
"current_amplitude" is a numeric variable.
"current_frame" is also numeric, but can be generated either by counting the number of frames since the audio started OR through a better calculation involving the current position of the track compared with its duration and the sample rate of the data (which I haven't tried to make yet, but this should work better than the first option for synching - I'll give it a try tonight).
"bands" is a numeric variable which holds the number of analysed spectrum bands.
"current_band" is an array of numbers.
"Amplitude" and "Spectrum" are imported within the .as file.
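
The "better calculation" for current_frame could look roughly like this. It is a Python sketch of the arithmetic only; in Flash the position would presumably come from Sound.position, which is in milliseconds. The names position_ms, data_fps (the number of analysed frames per second of audio) and frame_count are hypothetical.

```python
def frame_for_position(position_ms, data_fps, frame_count):
    """Map playback position to an index into the analysed data.

    position_ms: current playback position in milliseconds
    data_fps:    analysed frames per second of audio (assumed known)
    frame_count: number of frames in the Amplitude/Spectrum arrays
    """
    frame = int(position_ms / 1000.0 * data_fps)
    # Clamp so a position at or past the end never indexes out of range.
    return max(0, min(frame, frame_count - 1))

example = frame_for_position(1500, 12, 360)  # 1.5 s into a 30 s track
```

Because the index is derived from the audio position rather than from a frame counter, the visuals should stay in step with the music even when the movie drops frames.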

Quote
I'm not sure how this flash amp thing distributes the bands across the given range, but I'd try to match the individual bands with the frequency of a note (see here (http://www.phy.mtu.edu/~suits/notefreqs.html)).
So you'd either use only 12 bands per octave or (if that's not possible) use a range of 16 semitones instead of an octave.

It is possible to run Flash Amp with any number of bands between 2 and 16, so 12 should work. The link to the note frequencies is also very useful. This information could also be used to create a Flash "midi" player of sorts, using the gathered note data to play notes.

It's a shame Flash 8 can only handle a maximum of 8 simultaneous audio tracks, but that should be enough for a proof of concept. For now, I'll try to stay on topic, and put that idea on the back burner.
Title: Re: Making sense of sound spectrum analysis
Post by: Kirl on July 30, 2011
Thanks CK, after I finish up my entry I'll give it a go!

Fixed a misplaced code tag which messed up your post..
Title: Re: Making sense of sound spectrum analysis
Post by: combatking0 on July 30, 2011
Thanks Kirl, I was in a bit of a rush earlier.

I'll keep looking into this too, and see what I can come up with. I'll need a track with a rich, varied range of sounds with which to test the eventual 96-band (8*12) analysis.