FIR Audio Spectrum Analyzer

TouEnMasm · October 15, 2014, 02:56:28 AM

A simpler way to load a real (floating-point) constant:

Quote
FLT4 MACRO value
   local etiquette
   .data
   etiquette REAL4 value
   .code
; EXITM fpc( @CatStr( <REAL4 >,value ) )
   EXITM<etiquette>
ENDM

FLT8 MACRO value
   local etiquette
   .data
   etiquette REAL8 value
   .code
; EXITM fpc( @CatStr( <REAL8 >,value ) )
   EXITM<etiquette>
ENDM

guga · October 20, 2014, 02:45:42 PM

Hi Marinus... I'm trying, but I'm getting a bit confused.

The function Float32_Int16LSBinterleaved is called every time the app starts, even if no music has been loaded. This is because it is called from DSTimerThread without any check on whether WAVdataPTR is filled or not.

Is that normal behaviour?

Also, you said to get the left and right audio channels, but how do I do that? All I found was GetWavAudio, which internally has a pointer to the audio data. (Does WAVdataPTR already refer to both channels?)

And... the variable AudioBufferLeftChannel seems to be useless for this purpose, since I can't see any AudioBufferRightChannel being used.

One question regarding the GetWavAudio function.

Is this part of the code never used? (At least, when debugging, execution never breaks/pauses in this part of the code.)

Code Select


    neg         edx                 ; edx = size of data to get from the start of the wav file to copy to last part of the buffer
    sub         ecx,edx             ; ecx = size of data from the last part of the wav file

    mov         WavDataSizeLast,ecx
    mov         WavDataSizeNew,edx

    mov         WAVdataPos,edx      ; update next audio position in wav file ( wav file loops )

    mov         ecx,WavDataSizeLast
    test        ecx,ecx
    jz          CopyLastWavDataDone

    call        Int16LSBinterleaved_Float32

    mov         eax,WavDataSizeLast
    and         eax,15
    test        eax,eax
    jz          CopyLastWavDataDone ; do we have some samples left?
    mov         edx,16
    sub         edx,eax
    sub         esi,edx             ; adjust pointer
    sub         edi,edx             ; adjust pointer
    mov         ecx,16              ; copy last 1 2 or 3 unaligned samples

    call        Int16LSBinterleaved_Float32

guga · October 20, 2014, 07:53:26 PM

One question: why are you using 65536 for AUDIOBUFFERLENGTH? Isn't that too high a value, considering you are only converting the values in Int16LSBinterleaved_Float32?

I'm trying hard to understand, but I'm getting a bit confused here. I used another wave sample (attached below). It is really small, and FirAnalyser crashes because, when it tries to copy with "movups oword ptr[edi+eax+AUDIOBUFFERLENGTH+16],xmm1", there is not enough buffer for it.

I presume that all Int16LSBinterleaved_Float32 does is copy (and convert) the values from one sample (3 dwords) to the destination buffer, right?

But... why copy it 65536 bytes away? Isn't the (interleaved) audio channel simply a sequence of 3 dwords?

I'm asking because I presume that the total number of chunks must be the same before and after filtering, right? I mean, the total size of the file must not change after applying the filter, if I understood it correctly.

Like this:

Code Select

seg000:00000024 aData          db 'data'
seg000:00000028 ChunkSize      dd 156                  ; 156 = 39 elements *4
seg000:0000002C DataChunk      RigthChunk <4184144228, 4177656065, 4158650335>
seg000:00000038                LeftChunk <4128962074, 4091343836, 4049269082>
seg000:00000044                RigthChunk <4006014662, 3965250648, 3929926205>
seg000:00000050                LeftChunk <3902531739, 3884246916, 3875923717>
seg000:0000005C                RigthChunk <3877562142, 3888244673, 3906857181>
seg000:00000068                LeftChunk <3931630167, 3960335373, 3990285782>
seg000:00000074                RigthChunk <4019122062, 4044026122, 4063162926>
seg000:00000080                LeftChunk <4075418345, 4081185601, 4082430804>
seg000:0000008C                RigthChunk <4082168656, 4084200303, 4092392428>
seg000:00000098                LeftChunk <4109432048, 4136105607, 4171233439>
seg000:000000A4                RigthChunk <4211276546, 4252237171, 4289265576>
seg000:000000B0                LeftChunk <24052079, 45154993, 58721152>
seg000:000000BC                RigthChunk <67568647, 75564161, 85984544>

So, all that is necessary is to copy the 3 dwords (a structure) to floats for the right channel and do the same for the left channel. Right?

Then, once it is copied, how do I filter it?

This is confusing me, because the sample data (right and left chunks) has 13 elements in the array of "Chunk" structures. That means 6 for the left channel, 6 for the right channel, and the extra 1 is for what?

What is the use of the 1st sample?

For better understanding, I tried to make a structure similar to this one (see the picture in orange, marking 00 00 00 00 as Sample1):
https://ccrma.stanford.edu/courses/422/projects/WaveFormat

Siekmanski · October 21, 2014, 06:54:52 AM

Hi guga,

Let me explain how things are done.

The soundcard buffer is split into 2 halves of 4096 16-bit interleaved stereo samples each; one half is playing while the other is filled with audio from the audio buffer.
The audio buffer consists of two separate buffers (left and right) of 4 blocks of 4096 32-bit float samples each.

The multimedia timer checks every 5 milliseconds where the play position of the soundcard is.
If it is still in the same buffer it does nothing, but if it has moved into the next buffer it copies the next block of audio from the audio buffer to the soundcard buffer.
This is also called double-buffering.
After that it reads the next block of wav data directly from the wav file into the audio buffer.

The audio buffer gets the wav data 2 blocks ahead, so that a large FIR filter, let's say 8190 coefficients, doesn't fetch old audio data for its calculations.
So the filter routine never sees wrong, old audio data.
See in the picture how the FIR filter (green colour) is centered over the ScreenSynchronizedAudioPosition and doesn't touch the old audio data.

In the picture you can see the 4 steps of the soundcard buffer swaps and how the audio is handled.

The wav loader is a very simplistic routine; it just reads 4096 stereo samples each time.
The only check is whether it has reached the end of the wav file, in which case it resets to the beginning of the wav file.
Because the routine reads 4 samples at once, it can happen that 1, 2 or 3 samples remain in the wav file.
If so, it adjusts the pointers into the wav file and the audio buffer and copies the last 4 samples of the wav file at once.

If WAVdataPTR is NULL it copies silent 0.0 samples to the audio buffer.

Because it copies blocks of 4096 stereo samples at once, the wav file must have at least 4096+4 = 4100 stereo samples, or else it crashes.
"I should have built a warning into this program"
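The looping read described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the program's actual GetWavAudio routine: the name `read_block_looping` and its arguments are made up for this example, and unlike the original it wraps around as often as needed, so a file shorter than one block does not crash it.

```python
def read_block_looping(samples, pos, block_len):
    """Return (block, new_pos): block_len samples starting at pos,
    wrapping to the start of the data when the end is reached.
    Hypothetical helper, not part of the original program."""
    if len(samples) == 0:
        # No wav data loaded: hand back silence.
        return [0.0] * block_len, 0
    block = []
    while len(block) < block_len:
        # Take what is left before the end of the data, at most what we still need.
        take = min(block_len - len(block), len(samples) - pos)
        block.extend(samples[pos:pos + take])
        pos = (pos + take) % len(samples)   # wrap: the wav file loops
    return block, pos
```

For example, `read_block_looping([1, 2, 3, 4, 5], 3, 4)` takes the last two samples, wraps, and then takes the first two.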

Another thing,
movups oword ptr[edi+eax+16],xmm1 is the pointer to the AudioBufferLeftChannel buffer
movups oword ptr[edi+eax+AUDIOBUFFERLENGTH+16],xmm1 is the pointer to the AudioBufferRightChannel buffer

It loads the address of AudioBufferLeftChannel in edi and adds "AUDIOBUFFERLENGTH+16" to get the address of AudioBufferRightChannel.

I wrote it this way to save a register....

The +16 is the 16 extra bytes reserved for the buffer overflow of the FIR filter routine.

Below is a picture of how the data is handled.

Marinus

Gunther · October 21, 2014, 08:01:44 AM

Good explanation, Marinus. :t

Gunther

guga · October 21, 2014, 06:38:18 PM

Hi Marinus, I think I made it work. I'll review the code tomorrow and post it here so you can see whether I got it right.

I'm not using any new memory. I'm simply using the pointers to the samples themselves to apply the FIR coefficients to them. Each FIR coefficient is applied to a sample, I mean, to the 1st part of the sample (as far as I understood it; I'm assuming each sample is a 3-dword array belonging to a structure). Thus, the FIR applies to each one of the dwords in the sample.

One thing I noticed: if I raise the order to its maximum, 8190 for example, the resulting audio becomes much cleaner.
The only problem I'm facing is that, whatever order I use, it always inserts low energy that needs to be removed. Ex:

Do I need to convert each sample to frequency to remove it? (I presume FFT is the way to do it, but I wonder if there is a better or easier way to convert samples to frequency and vice versa.)

If so, is the Goertzel filter algorithm better than FFT?
http://netwerkt.wordpress.com/2011/08/25/goertzel-filter

Note: one more thing I noticed. Remember the discussion we had about the Hamming constants? (Hamming Window = coefficient * ( 0.53836 - 0.46164 * cos( PI2 * filtercounter / filterorder )))
I tested it with the proper values to cancel the audio, and the result is more accurate if you use these in the CalculateFIRcoefficientsReal4 function:

[Float_Hamming_Alpha: F$ 0.536024817009894983757669554]
[Float_Hamming_Beta: F$ 0.463975182990105016242330446]

Siekmanski · October 21, 2014, 09:22:27 PM

Quote
I'm not using any new memory. I'm simply using the pointers to the samples themselves to apply the FIR coefficients to them. Each FIR coefficient is applied to a sample, I mean, to the 1st part of the sample (as far as I understood it; I'm assuming each sample is a 3-dword array belonging to a structure). Thus, the FIR applies to each one of the dwords in the sample.

I don't know what you mean.

The calculation is very simple:
s = audio samples
f = fir coeffs

For 3 fir coeffs: insert 1 zero before and after your audio data

output(s0) = f0*input(s-1) + f1*input(s0) + f2*input(s1)
output(s1) = f0*input(s0) + f1*input(s1) + f2*input(s2)
etc....

For 5 fir coeffs: insert 2 zeros before and after your audio data

output(s0) = f0*input(s-2) + f1*input(s-1) + f2*input(s0) + f3*input(s1) + f4*input(s2)
output(s1) = f0*input(s-1) + f1*input(s0) + f2*input(s1) + f3*input(s2) + f4*input(s3)
etc....
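The same calculation written as a short Python sketch, for any odd number of coefficients. The function name `fir_filter` is made up for this illustration; it pads with `len(coeffs) // 2` zeros on each side, exactly as in the 3- and 5-coefficient cases above, so the output has the same number of samples as the input.

```python
def fir_filter(samples, coeffs):
    """Apply an FIR filter with an odd number of coefficients.
    Illustrative helper: zero-pads half the filter length on both sides."""
    half = len(coeffs) // 2
    padded = [0.0] * half + list(samples) + [0.0] * half
    output = []
    for n in range(len(samples)):
        # output(s_n) = f0*input(s_(n-half)) + ... + f_last*input(s_(n+half))
        acc = 0.0
        for k, f in enumerate(coeffs):
            acc += f * padded[n + k]
        output.append(acc)
    return output
```

With the coefficients `[0.0, 1.0, 0.0]` the filter passes the audio through unchanged, which is a handy first check.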

Quote
One thing I noticed: if I raise the order to its maximum, 8190 for example, the resulting audio becomes much cleaner.
The only problem I'm facing is that, whatever order I use, it always inserts low energy that needs to be removed. Ex:

There is no maximum for the number of filter coefficients; if you use more, the frequency band roll-off will be steeper and so the band will be narrower.
The unwanted low energy can be removed by using another window function such as Blackman-Harris; the frequency band will be wider but the side lobes will be much lower.
You can watch this behaviour in my FFT analyzer program by loading the sine sweep wav file and then trying the different window functions (turn off FFT smoothing).

This is the Blackman-Harris window: BlackmanHarris = 0.35875 - 0.48829*cos(2.0*PI*step/(nn-1)) + 0.14128*cos(2*2.0*PI*step/(nn-1)) - 0.01168*cos(3*2.0*PI*step/(nn-1))
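As a quick way to check the formula, here is a small Python sketch that evaluates it for `step` = 0 .. `nn`-1. The function name is made up for this example, and `nn` is assumed to be at least 2.

```python
import math

def blackman_harris(nn):
    """Return the nn-point Blackman-Harris window as a list (nn >= 2)."""
    window = []
    for step in range(nn):
        x = 2.0 * math.pi * step / (nn - 1)
        window.append(0.35875
                      - 0.48829 * math.cos(x)
                      + 0.14128 * math.cos(2 * x)
                      - 0.01168 * math.cos(3 * x))
    return window
```

The window is symmetric, reaches 1.0 in the middle for an odd `nn`, and drops to 0.00006 at both ends; each FIR coefficient is multiplied by the window value at the same index.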

Quote
Do i need to convert each sample to Frequency to remove it ?

Yes, just add zeros ( total FIR coeffs / 2 ) at the beginning and the end of your audio data.

Quote
Is the Goertzel filter algorithm better than FFT?

If you want to check one (or a few) specific frequencies, then it is faster.
But to filter a frequency (band) you need a band filter.
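For reference, a minimal Python sketch of the Goertzel algorithm, which measures the energy at a single frequency without computing a full FFT. The function name and parameters are chosen for this illustration only; it rounds the target frequency to the nearest DFT bin and returns the squared magnitude of that bin.

```python
import math

def goertzel_power(samples, sample_rate, target_hz):
    """Squared magnitude of the DFT bin nearest to target_hz (illustrative)."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)   # nearest bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = 0.0   # state one sample back
    s2 = 0.0   # state two samples back
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2 = s1
        s1 = s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2
```

It only detects how much of a frequency is present; it does not remove anything from the audio, which is why a band filter is still needed for filtering.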

Marinus

guga · October 22, 2014, 03:48:16 AM

Quote
Yes, just add zeros ( total FIR coeffs / 2 ) at the beginning and the end of your audio data.

OK, I'll try that and ask later whether it is correct.

But... first I'll start posting the routines I made for your FIR algo; can you help check whether they are correct? This 1st post covers the FIR itself; it is a port of your algo. (The syntax is easy to follow, and I changed very little in your algo. I also kept your comments for guidance.)

The main routine is a variation of your "CalculateAnalyzerFrequencyBands", which is used inside WM_INITDIALOG (or WM_CREATE).

Code Select


    ...Else_If D@Message = &WM_INITDIALOG
        move D$hwnd D@Adressee

        call CalculateAnalyzerFrequencyBands

Code Select

_________________________________________________________________________

    ; a standard 1 octave band analyzer
    ; for frequency band info go to: http://www.sengpielaudio.com/calculator-octave.htm

    ; Just play around with the frequency bands and the order numbers, or even overlap the frequency bands to compensate for the
    ; roll-off factor of the filters.....
    ; If you change this example in a 9 band analyzer by combining band 1 and 2 you need much less calculations and you'll get a much faster analyzer
    ; In this example a 10 band has a total of 16324 orders to calculate each screen refresh
    ; A 9 band analyzer just 7238 orders

    ; Remove band 1 ( 8190 orders ) and 2 ( 4094 orders ) and replace it with 1 band ( 22 - 88 Hz and 3198 orders )
    ; Then this routine will be 2.25 times faster
    ; And then you can do multithreading and make it mega fast on multi core machines.......
    ; And for selected machines you may translate the "CalculateFrequencyBand" routine to AVX (Advanced Vector Extensions) code

    ; Important is you need to "design" a filter to your needs ( frequency response ) in combination with the order numbers and the windowing type

[FIR_ORDER 2] ; the higher the value, the more accurate it will be. It must be even, with a maximum of 8190 (FIR_MaxTaps - 2). Ex: 4096, 8190
[FIR_LowPass 0]
[FIR_HighPass 1]
[FIR_BandReject 2]
[FIR_BandPass 3]

[<16 FIRcoefficientsBand1: F$ ? #FIR_MaxTaps] ; No longer needs to be aligned. But, aligned is faster

Proc CalculateAnalyzerFrequencyBands:

    call CalculateFIRcoefficients FIR_ORDER, 44100, 1000, &NULL, FIR_HighPass, FIRcoefficientsBand1

EndP
_________________________________________________________________________

Code Select


_________________________________________________________________________

[AUDIOBUFFERLENGTH 65536]   ; byte size, always a power of two, and must hold minimal 4 times the samples as the soundcard buffer

[FIR_MaxTaps 8192] ; "Keep this a multiple of 4" If FIR_MaxTaps == 512 then maximum FIR_Order = 511
                   ; If Order = 30, == 31 Taps (coefficients), keep the order number even so we have uneven Taps and thus 1 center coefficient
                   ; Notice that a FIR filter is symmetric

[FIRcoefficientsTemp: F$ ? #FIR_MaxTaps]

; Introduction to Digital Filters: http://www.dspguide.com/ch14.htm

Proc CalculateFIRcoefficients:
    Arguments @order, @samplerate, @cutofffrequency1, @cutofffrequency2, @filtertype, @coefficients
    Uses edi, esi, ecx, eax

    call 'kernel32.RtlZeroMemory' D@coefficients, (FIR_MaxTaps*4) ; length is in bytes: FIR_MaxTaps floats of 4 bytes each. My note: replace this later with a faster routine.
    call CalculateFIRcoefficientsReal4 D@samplerate, D@order, D@cutofffrequency1, D@coefficients ; calculate LowPass
    .If D@filtertype <> FIR_LowPass
        If D@filtertype = FIR_HighPass
            call InvertFIRcoefficients D@order, D@coefficients    ; convert from LowPass to HighPass
            ExitP
        End_If

        ; calculate second set of coefficients for BandPass or BandReject
        call CalculateFIRcoefficientsReal4 D@samplerate, D@order, D@cutofffrequency2, FIRcoefficientsTemp
        call InvertFIRcoefficients D@order, FIRcoefficientsTemp

        ;  calculate BandReject
        lea esi D$FIRcoefficientsTemp
        mov edi D@coefficients
        mov ecx D@order

        @Addcoefficients:
            fld F$esi+ecx*4 | fadd F$edi+ecx*4 | fstp F$edi+ecx*4
            dec ecx | jns @Addcoefficients

        If D@filtertype <> FIR_BandReject
            call InvertFIRcoefficients  D@order, D@coefficients
        End_If

    .End_If

EndP

_________________________________________________________________________

Code Select


_________________________________________________________________________

[Float_Half: F$ (1/2)]
[Float_Zero: F$ 0]

;  Hamming Window = coefficient * ( 0.53836 - 0.46164 * cos( PI2 * filtercounter / filterorder ))
;[Float_Hamming_Alpha: F$ (13459/25000)] ; Try using 0.536024817009894983757669554
[Float_Hamming_Alpha: F$ 0.536024817009894983757669554] ; Try using 0.536024817009894983757669554
; 20*log10((sin((5*pi)/2))/((5*pi)/2))
; log(sin(B))-log(B) = -(A log(5))/2000-(A log(2))/2000
; {FindRoot[-21/50 - (20 Log[Sin[X]/X])/Log[10] == 0, {X, 0.530649, 0.572675}, WorkingPrecision -> 39], FindRoot[-21/50 - (20 Log[Sin[X]/X])/Log[10] == 0, {X, -0.562036, -0.52001}, WorkingPrecision -> 39]}

;[Float_Hamming_Beta: F$ 4.61640000343322754e-1] ; try using 0.463975182990105016242330446
[Float_Hamming_Beta: F$ 0.463975182990105016242330446] ; try using 0.463975182990105016242330446

[Float_Two_PI: F$ 6.2831853071795864769252867665590057683943387987502116]

Proc CalculateFIRcoefficientsReal4:
    Arguments @samplerate, @order, @cutofffrequency, @coefficients
    Local @CutoffRadian, @sum, @filterorder, @filterorder_half, @filtercounter
    Uses edi, ecx, edx, eax

    ; cutoff frequency must have a value between 1 and (samplerate / 2)
    finit
    fclex

    ; CutoffRadian = 2 * PI * cutofffrequency / samplerate
    fld F$Float_Two_PI | fimul F@cutofffrequency | fidiv F@samplerate | fstp F@CutoffRadian

    ;   Sum of all coefficients used for normalisation of the coefficients.
    fldz | fstp F@sum
    fild F@order | fst F@filterorder | fmul F$Float_Half | fstp F@filterorder_half

    mov D@filtercounter 0
    mov edi D@coefficients
    mov ecx D@order

@calculate_coefficients:
    mov edx D@filtercounter
    fild F@filtercounter | fsub F@filterorder_half
    fcomp F$Float_Zero
    fnstsw ax
    sahf | jne @not_zero
    fld F@CutoffRadian  ;   when 0, then coefficient == CutoffRadian
    fstp F$edi+edx*4
    jmp @Addcoefficients

@not_zero:
    ;   sin( CutoffRadian * ( filtercounter - filterorderhalf )) / ( filtercounter - filterorderhalf )
    fild D@filtercounter | fsub F@filterorder_half | fmul F@CutoffRadian | fsin
    fild D@filtercounter | fsub F@filterorder_half | fdivp ST1 ST0 | fstp F$edi+edx*4

    ; if you need another window function insert it here....
    ; info http://en.wikipedia.org/wiki/Window_function

    ; start Hamming Window function

    ;   Hamming Window = coefficient * ( 0.53836 - 0.46164 * cos( PI2 * filtercounter / filterorder ))
    fild D@filtercounter | fmul F$Float_Two_PI | fdiv F@filterorder | fcos
    fmul F$Float_Hamming_Beta | fsubr F$Float_Hamming_Alpha | fmul F$edi+edx*4 | fstp F$edi+edx*4
    ; end of Hamming Window function

@Addcoefficients:
    ;   add coefficient for normalisation
    fld F@sum | fadd F$edi+edx*4 | fstp F@sum
    inc D@filtercounter
    dec ecx | jns @calculate_coefficients
    mov ecx D@order

    ;   normalize all coefficients
@normalize_coefficients:
    fld F$edi+ecx*4 | fdiv F@sum | fstp F$edi+ecx*4
    dec ecx | jns @normalize_coefficients

EndP

_________________________________________________________________________
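For readers who don't know this assembler syntax, the routine above can be followed step by step in this Python sketch: windowed sinc, Hamming window, then normalisation so the coefficients sum to 1. The function name and default constants are chosen for this illustration only, and `order` is assumed to be even and greater than 0. The assembly skips the window for the centre coefficient; that gives the same result here because the window is alpha + beta = 1.0 at the centre.

```python
import math

def lowpass_coefficients(order, sample_rate, cutoff_hz,
                         alpha=0.53836, beta=0.46164):
    """Windowed-sinc low-pass with order+1 taps (illustrative sketch)."""
    cutoff_rad = 2.0 * math.pi * cutoff_hz / sample_rate
    half = order / 2.0
    coeffs = []
    for n in range(order + 1):
        m = n - half
        if m == 0:
            c = cutoff_rad                      # limit of sin(w*m)/m at m = 0
        else:
            c = math.sin(cutoff_rad * m) / m
        # Hamming window
        c *= alpha - beta * math.cos(2.0 * math.pi * n / order)
        coeffs.append(c)
    total = sum(coeffs)                         # normalise to unity gain at DC
    return [c / total for c in coeffs]
```

For example, `lowpass_coefficients(30, 44100, 1000)` returns 31 symmetric coefficients whose sum is 1.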

Code Select



_________________________________________________________________________

Proc InvertFIRcoefficients:
    Arguments @order, @coefficients
    Uses esi, ecx

    mov esi D@coefficients
    mov ecx D@order

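@Inverteer_coefficients:
    fld F$esi+ecx*4 | fchs | fstp F$esi+ecx*4
    dec ecx | jns @Inverteer_coefficients

    ; spectral inversion also needs 1.0 added to the centre coefficient (index order/2);
    ; negating alone only flips the phase of the low-pass filter
    mov ecx D@order | shr ecx 1
    fld1 | fadd F$esi+ecx*4 | fstp F$esi+ecx*4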

EndP
_________________________________________________________________________
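Turning a normalised low-pass into a high-pass is usually done by spectral inversion: negate every coefficient, then add 1 to the centre tap. A sign flip alone leaves the magnitude response unchanged. The Python sketch below shows the idea; the function name is made up for this example and an odd number of coefficients is assumed.

```python
def spectral_invert(coeffs):
    """Return delta - h for a symmetric FIR filter with an odd tap count.
    Illustrative helper: a unity-gain low-pass becomes a high-pass."""
    inverted = [-c for c in coeffs]
    inverted[len(coeffs) // 2] += 1.0   # add the unit impulse at the centre tap
    return inverted
```

A quick sanity check is that the inverted coefficients of a unity-gain low-pass sum to 0, so a constant (0 Hz) input is fully rejected.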

Marinus, if the above routines are OK, I'll post the rest of the code responsible for applying the FIR itself (including Int16LSBinterleaved_Float32 etc.).

Personal note: if everything is OK after all the tests, I'll try using a double (R$) instead of a single float (F$). It may gain more accuracy.

Siekmanski · October 22, 2014, 08:06:36 AM

I'm not familiar with the notation you use, but as far as I understand it, it should be OK.
Just give it a try. :t

Can you hear the difference between single and double floats ?

guga · October 22, 2014, 10:00:36 AM

Quote
"Can you hear the difference between single and double floats ?"

Nope. But maybe we can gain accuracy, especially concerning noise data. I mean, the more accurate the computation, the more finely tuned the audio will be, and that includes tones, voices, noises etc.
From what I'm testing so far, setting the order to 8190 refines the resulting audio a lot. It probably also enhances background noises that were not audible before, but this is not a problem at all, considering that they can be better removed with other software.

Also, let me ask one thing. What are the values of the samples? I mean, which notation do they use? Are they a magnitude in hertz, decibels, or what?

I mean, I'm considering all data samples as a structure formed of 3 dwords, as explained before. From what I saw in the tested audio, it is always formed of:

'data' ; data tag related to audio data itself (A dword "D$")
XXXX ; the size of the subchunk (A dword "D$")
[RightChannel1: D$Sample1, Sample2, Sample3] ; < 1st sample data chunk. Right Channel structure
[LeftChannel2: D$Sample1, Sample2, Sample3]; < 2nd sample data chunk. Left Channel structure
(...)

The total size of the data sample chunk is (3*4)*N structures, since, in stereo, the channels are formed in pairs (left and right). I mean, according to here and here

This seems relevant, especially because the wave structure can be malformed. I mean, we can have one extra DataChannel (Left or Right) at the end of the data, or we can have padding (one single dword) at the end. In both cases, perhaps, they can simply be ignored (changed to zero), or extended to recreate any missing data. For example, let's say we have audio data like this:

Code Select

[RightChannel1: D$Sample1, Sample2, Sample3
 LeftChannel2: D$Sample4, Sample5, Sample6
(...)
 RightChannel50: D$Sample51, Sample52, Sample53
 LeftChannel50: D$Sample54, Sample55, Sample56

 RightChannel51: D$Sample57, Sample58, Sample59 ; < end of the data. It ends on a "isolated" right channel
]

In the above example we can then add the missing LeftChannel, based on the last Right one, so that the channels form pairs again. Resulting in:

Code Select

[RightChannel1: D$Sample1, Sample2, Sample3
 LeftChannel2: D$Sample4, Sample5, Sample6
(...)
 RightChannel50: D$Sample51, Sample52, Sample53
 LeftChannel50: D$Sample54, Sample55, Sample56

 RightChannel51: D$Sample57, Sample58, Sample59
 LeftChannel51: D$Sample60, Sample61, Sample62 ; < the fix is a copy of the right channel above
]

And then change the value in the RIFF subchunk structure (the dword after the "data" tag), and also fix the value of WAVEFORMATEX.cbSize accordingly.

The same thing seems to be valid if we have only an isolated dword (or 2 isolated dwords etc.). Like:

Code Select

[RightChannel1: D$Sample1, Sample2, Sample3
 LeftChannel2: D$Sample4, Sample5, Sample6
(...)
 RightChannel50: D$Sample51, Sample52, Sample53
 LeftChannel50: D$Sample54, Sample55, Sample56

 PaddingData: D$Sample57 ; < end of the data. It ends on an isolated dword
]

will be:

Code Select

[RightChannel1: D$Sample1, Sample2, Sample3
 LeftChannel2: D$Sample4, Sample5, Sample6
(...)
 RightChannel50: D$Sample51, Sample52, Sample53
 LeftChannel50: D$Sample54, Sample55, Sample56

 RightChannel51: D$Sample57, Sample58, Sample59 ;< extended the rest of the data based on Sample57 to form the Right Channel
 LeftChannel51: D$Sample60, Sample61, Sample62 ; < and copy the right channel above to form the left one.
]

Also, if we use 2 channels, why can we have odd counts? I mean, like this:

Code Select

seg000:00000024 aData          db 'data'
seg000:00000028 ChunkSize      dd 156      ; 156 = 39 dwords *4 = 13*3 structures * 4 => 6 Left Channels + 6 Right Channels + 1 Extra Structure ???
seg000:0000002C DataChunk      RigthChunk <4184144228, 4177656065, 4158650335>
seg000:00000038                LeftChunk <4128962074, 4091343836, 4049269082>
seg000:00000044                RigthChunk <4006014662, 3965250648, 3929926205>
seg000:00000050                LeftChunk <3902531739, 3884246916, 3875923717>
seg000:0000005C                RigthChunk <3877562142, 3888244673, 3906857181>
seg000:00000068                LeftChunk <3931630167, 3960335373, 3990285782>
seg000:00000074                RigthChunk <4019122062, 4044026122, 4063162926>
seg000:00000080                LeftChunk <4075418345, 4081185601, 4082430804>
seg000:0000008C                RigthChunk <4082168656, 4084200303, 4092392428>
seg000:00000098                LeftChunk <4109432048, 4136105607, 4171233439>
seg000:000000A4                RigthChunk <4211276546, 4252237171, 4289265576>
seg000:000000B0                LeftChunk <24052079, 45154993, 58721152>
seg000:000000BC                RigthChunk <67568647, 75564161, 85984544>

What is considered the start of the channels? The 1st structure? If so, since stereo works in pairs, we will have an extra "structure" at the end. Why?

My problem is understanding what the "sample" values denote: what they really are, what they measure etc. I mean, here it says the samples can be used to compute a magnitude of some sort.

http://stackoverflow.com/questions/3058236/how-to-extract-frequency-information-from-samples-from-portaudio-using-fftw-in-c

magnitude = sqrt(re^2 + im^2)
So, magnitude = sqrt (sample1^2 + sample2^2 + sample3^2) ???

Presuming this magnitude is converted to decibels in the form of:
magnitude_dB = 20*log10(magnitude)

That implies the magnitude must always be a positive value, right?

But if the sum of all of them must be a positive value, why can it result in a negative value when I add total FIR/2 at the beginning and the end? There are cases where the total sum of FIRs/2 is a negative value. In that situation, what does it mean?

I mean, when you said:

Quote
Yes, just add zeros ( total FIR coeffs / 2 ) at the beginning and the end of your audio data.

You meant to compute it as:
Sum_Of_FIRs/2 + Sample1 + Sample3 = Frequency of sample ????

If so, what happens to sample2? Why is it not computed?

And what happens when the generated value is negative? What does it mean?

And also, the generated result needs to be a value between 0 and 1, right? So how do I compute the frequency in hertz? What do I multiply this value by?

I mean, I tried to determine the frequency of the sample value as you said and created this function. (Despite the name, it now computes the magnitude and checks whether the result is positive or negative. I simply added the sum of FIRs/2 to it to compute the magnitude, which can't be negative because of the log10 in the formula above. So, if it is negative, an error occurred.)

Code Select


[SampleMagnitude: F$ 0]
[Float_One: F$ 1]
[Float_InvertedSquareTwo: F$ 0.7071067811865475244008443621048490392848359376884740]; 1/sqrt(2)

; The SampleMagnitude value must be a positive value betwen 0 and 1.
Proc ConvertSampletoFrequency:
    Arguments @pWaveData, @DataLen, @pAverageFir
    Uses esi, ebx, ecx ; these registers are modified below, so preserve them for the caller

    finit
    xorps  xmm0, xmm0
    mov ecx 0
    If D@DataLen = 0
        xor eax eax
        ExitP
    End_If

    call Int16LSBinterleaved_Float32New D@pWaveData, D@DataLen

    mov ecx D@DataLen
    mov esi D@pWaveData
    mov ebx D@pAverageFir

    While ecx <> 0
        ; magnitude = sqrt(re^2 + im^2) + Sum_Of_FIR/2 ????
        .If_And D$esi <> 0, D$esi+8 <> 0
            fld F$esi | fmul ST0 ST0
            fld F$esi+8 | fmul ST0 ST0
            faddp ST1 ST0 | fsqrt
            fadd F$ebx | fstp F$SampleMagnitude
            Fpu_If F$SampleMagnitude < F$Float_Zero ; Negative value ??? Try to fix, based on the percentage by which it decreased below zero
                ; Try to find the correct value of sample1, sample2 or sample3, whose sum needs to be at least Sum_Of_FIRs/2
                ;sqr (A+B)+Coef > 0
                ; sqr (A+B) =  Coef
                ; (A+B) = Coef^2 . If A = A^2, B=B^2, A=B
                ; 2*A^2 = Coef^2
                ; A = coef/(sqr(2))
                ; AMin = coef/(sqr(2))
                ;fld F$SampleMagnitude | fadd F$Float_One  | fst F$esi | fst F$esi+4 | fstp F$esi+8
                ; calculate the minimum magnitude (original value)
                
                ;fld F$ebx | fmul F$Float_InvertedSquareTwo | fabs | fst F$esi | fst F$esi+4 | fstp F$esi+8
                fld F$ebx | fabs | fadd F$Float_One | fmul F$ebx ; compensate with the percentage decreases. -0.6 = -50% of decrease
                fmul F$Float_InvertedSquareTwo | fabs | fst F$esi | fst F$esi+4 | fstp F$esi+8
                ; OK, now we have all positive values and retrieved the data of sample1 and sample3, but still have created gaps ?????
            Fpu_End_If
        .End_If

        add esi 12
        sub ecx 12
    End_While

    call Float32_Int16LSBinterleavedNew D@pWaveData, D@DataLen

EndP

So that you understand better what I tried to do:

I made the function to determine whether the resulting values of sample1, sample2 and sample3 after applying the FIR have a correct magnitude_dB. (I know it doesn't work as expected... I'm trying to understand whether my assumptions about the link and your code are correct for computing the magnitude in decibels of a sample after the FIR filter is applied.) Since magnitude_dB uses a log10, the result must be positive. If it somehow yields a negative value, something went wrong during the FIR computation.

Siekmanski · October 22, 2014, 12:14:30 PM

The wav samples are 16-bit signed ints with a range from -32768 to 32767.
Those are converted to 32-bit floats with a range from -1.0 to 0.999969482421875.

Just do the FIR calculations as explained in the previous posts and try to understand.
The output of the FIR routine is also samples within a range of -1.0 to 0.999969482421875.

To present the sound data as decibels:

dBFS = 20*log10(absolute SampleValue)

Code Select

	fld      SampleValue  ; range from -1.0 to 0.999969482421875
	fabs                  ; remove the sign bit, values are now between 0.0 and 1.0
	fldlg2
	fxch
	fyl2x					
	fmul    FLT4(20.0)
	fstp    dBFSvalue     ; 0 dBFS is the maximum possible digital level.

SampleValues:

1.0 = 0 dBFS ; loudest sound.
0.5 = -6 dBFS
0.25 = -12 dBFS
0.125 = -18 dBFS
.........
0.0000001 = -140 dBFS ; you can't hear this sound anymore.

dBFS = Decibels relative to full scale.
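The same conversion as a small Python sketch, for anyone who wants to check the table above. The function name is made up for this example, and returning minus infinity for a sample of exactly 0.0 is a choice made here, not something taken from the assembly snippet.

```python
import math

def to_dbfs(sample_value):
    """Convert a float sample (-1.0 .. 1.0) to decibels relative to full scale."""
    magnitude = abs(sample_value)        # the sign carries no level information
    if magnitude == 0.0:
        return float("-inf")             # digital silence has no finite dB value
    return 20.0 * math.log10(magnitude)
```

Note that the decibel value is negative for every sample below full scale; only the linear magnitude has to be positive.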

If you want the RMS value:

RMSmagnitude = sqrt ((sample1^2 + sample2^2 + sample3^2)/3) ; notice: Don't forget to divide by the total number of samples.

Then decide if you want it to display the RMS value as decibels or linear.
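A minimal Python sketch of the RMS formula for any number of samples; the function name is made up for this example, and an empty block is treated as silence.

```python
import math

def rms(samples):
    """Root mean square of a block of float samples."""
    if not samples:
        return 0.0                       # assumption: empty block counts as silence
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

For the three samples 0.5, -0.5 and 0.5 the result is 0.5; the sign of each sample disappears when it is squared.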

Some wav-files have meta-data at the end of the file, so the size of the file can be odd, but the audio inside is always even.
In my wav-loader routine i try to strip the meta-data.....

Marinus

guga · October 22, 2014, 12:17:34 PM

OK, I guess I misunderstood the data samples.

I found an app that displays the data samples at their actual size:
http://www.jensign.com/showsamples

They are not a 3-dword structure as I thought.

It is a single dword that contains the left and right channels, right?

So, a "sample" is in fact:

[Sample:
Sample.Channel.Left: W$ 0
Sample.Channel.Right: W$ 0]

Is that correct?

guga · October 22, 2014, 12:21:42 PM

Oh...ok..you answered at the same time. :)

Quote
In my wav-loader routine i try to strip the meta-data.....

Yeah, I saw it. I already converted it, but I'll have to convert some routines back, because I misunderstood the data samples. I'll restore them to the way you did it.

And how do I convert a sample to hertz?

Siekmanski · October 22, 2014, 12:25:42 PM

Yes, the audio in the wav-file is like this:

leftchannel,rightchannel,leftchannel,rightchannel,leftchannel,rightchannel.................... each channel sample = 16 bit signed int
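To make the layout concrete, here is a Python sketch that splits such interleaved data into two float channels. It is only an illustration of the layout, not the program's Int16LSBinterleaved_Float32 routine: the function name is made up, and it assumes little-endian 16-bit stereo data with any incomplete trailing frame dropped.

```python
import struct

def deinterleave_int16_stereo(data):
    """Split raw little-endian 16-bit stereo bytes into (left, right)
    lists of floats in the range -1.0 .. 0.999969482421875."""
    frame_count = len(data) // 4                      # 2 channels * 2 bytes each
    values = struct.unpack("<%dh" % (frame_count * 2), data[:frame_count * 4])
    left = [v / 32768.0 for v in values[0::2]]        # even positions: left
    right = [v / 32768.0 for v in values[1::2]]       # odd positions: right
    return left, right
```

Dividing by 32768 maps -32768 to -1.0 and 32767 to 0.999969482421875.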

Siekmanski · October 22, 2014, 12:27:11 PM

Quote from: guga on October 22, 2014, 12:21:42 PM
Oh...ok..you answered at the same time. :)

And how to convert sample to hertz ?

What do you mean exactly?
