(http://members.home.nl/siekmanski/FIRanalyzer.png)
Digital Signal Processing Example,
How you can use FIR filters to analyze sound.
How to calculate the FIR coefficients for low-pass, high-pass, band-pass and band-reject filters with a Hamming window.
How you can screen-synchronize the audio with DirectSound and Direct3D9.
I have rewritten everything to make things more flexible.
This example demonstrates a 1 octave 10 band analyzer.
The source code has some text to explain things, and where to get additional information on this subject.
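For readers who want the gist of the coefficient calculation without digging into the assembly, here is a sketch in Python of a windowed-sinc low-pass design with a Hamming window (function and parameter names are illustrative, not taken from the source):

```python
import math

def lowpass_coeffs(cutoff, sample_rate, num_taps):
    """Windowed-sinc low-pass FIR coefficients with a Hamming window.
    cutoff and sample_rate in Hz; num_taps should be odd so the
    filter has an exact center tap."""
    fc = cutoff / sample_rate          # normalized cutoff (0..0.5)
    m = num_taps - 1                   # filter order
    h = []
    for n in range(num_taps):
        k = n - m / 2
        # ideal sinc low-pass impulse response
        if k == 0:
            v = 2.0 * fc
        else:
            v = math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        # Hamming window term
        w = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / m)
        h.append(v * w)
    # normalize for unity gain at DC
    s = sum(h)
    return [c / s for c in h]
```

A high-pass can then be obtained by spectral inversion of these coefficients, and band-pass/band-reject by combining two such designs, which covers the four filter types the example demonstrates.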
edit:
Line 1474 had an error: each bar has 4 vertices, and I was sending 10 vertices instead of 40 to the graphics card. ( see nidud's post )
coinvoke g_pD3DDevice,IDirect3DDevice9,DrawIndexedPrimitive,D3DPT_TRIANGLELIST,0,0,AudioBarCount,0,AudioBarCount*2
must be:
coinvoke g_pD3DDevice,IDirect3DDevice9,DrawIndexedPrimitive,D3DPT_TRIANGLELIST,0,0,AudioBarCount*4,0,AudioBarCount*2
Added a routine ( Reset3DEnvironment ) in case the graphics device has been lost.
It will restore the vertex and index buffers and reset the graphics device.
Corrected 2 bugs that were found:
There was a buffer overrun when calculating the frequency bands. ( corrected this )
And something very stupid: I was only calculating the left audio channel. ( corrected this )
New attachment,
Marinus
very nice Marinus - and small, too :t
Thank you, Marinus. I'll have a look :t
Excellent work Marinus.
One small issue only. Is it only me, or is the app leaking memory?
Well done, Marinus. :t
Gunther
deleted
Thanks guys,
@ guga, are you sure about the memory leakage? This proggy uses file mapping, could it be that?
But if you found a bug please let me know.
@ nidud, thanks for this bug report. I think I have found it...
Can you test this example?
If it works well i will correct the source code and post it in the first post of this thread.
Also learned today that the window width and height of a dialog program are not the same on all PCs :biggrin:
deleted
Thanks, I'll make a new attachment. :t
Hi Marinus.
I'm not sure what caused the leakage. I had the antivirus open when I first ran your app, causing it to "freeze" the directory where it was. Most probably the antivirus was scanning the memory allocated by it.
After a reboot, I saw that it has no problems.
But, since you commented on it... why use file mapping instead of simply using heap-allocated memory?
Also, can you make it editable (and save the results)? I mean, it would be nice to control the different frequencies. For example if I want to hear only the frequencies between 400 Hz and 800 Hz (human voice, I presume). Also, pitch identification would be nice. (Not pitch control, but pitch identification, to make it easier to isolate the human voice, for example)
@ guga, why using filemapping?
I just map the Wav file to the process space, getting a pointer to the beginning of the Wav file data in the process's own memory space.
So I have easy access to the Wav data.
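The Win32 calls behind this are CreateFileMapping / MapViewOfFile. As a rough, portable illustration of the same idea, Python's mmap module gives a byte-addressable view of a file; the minimal WAV header here is hand-built purely for the demo:

```python
import mmap, os, struct, tempfile

# build a minimal 44-byte WAV header plus 8 bytes of sample data
data = struct.pack("<4sI4s4sIHHIIHH4sI", b"RIFF", 36 + 8, b"WAVE",
                   b"fmt ", 16, 1, 2, 44100, 176400, 4, 16,
                   b"data", 8) + bytes(8)
path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with open(path, "wb") as f:
    f.write(data)

# map the file into the process address space; the mapping behaves
# like a pointer to the start of the file
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    assert mm[0:4] == b"RIFF" and mm[8:12] == b"WAVE"
    sample_data = mm[44:]          # direct access to the sample bytes
    mm.close()
```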
By pitch identification, do you mean to have a variable frequency band width with a controllable center frequency?
This is possible with FIR filters but then you have to calculate the coefficients on the fly.
( we can speed up this routine because the impulse response of the filter is symmetric, so we only have to calculate the first half of the coefficients )
Then save the filtered data to disk or send it to the sound card.
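A sketch of that idea in Python: band-pass coefficients computed on the fly as the difference of two sinc responses, calculating only the first half and mirroring it, since a linear-phase FIR's coefficients are symmetric (names are illustrative, not from the source):

```python
import math

def bandpass_coeffs(f_lo, f_hi, sample_rate, num_taps):
    """Band-pass windowed-sinc coefficients (difference of two
    low-pass responses), computing only the first half and
    mirroring it because the impulse response is symmetric."""
    assert num_taps % 2 == 1            # odd length -> exact center tap
    m = num_taps - 1
    a, b = f_lo / sample_rate, f_hi / sample_rate
    half = []
    for n in range(num_taps // 2):      # first half only (k != 0 here)
        k = n - m / 2
        v = (math.sin(2 * math.pi * b * k) -
             math.sin(2 * math.pi * a * k)) / (math.pi * k)
        w = 0.54 - 0.46 * math.cos(2 * math.pi * n / m)
        half.append(v * w)
    center = 2 * (b - a)                # k == 0 limit; Hamming window is 1 at center
    return half + [center] + half[::-1]
```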
Hi marinus
QuoteBy pitch identification, do you mean to have a variable frequency band width with a controllable center frequency?
Yes, but with a bit more control than simply adjusting the frequency or speed. I mean a way to identify the human voice by isolating it from the background music. (The center channel is not always the best way to do it, since an audio track can have mixed background noise or music centered too)
There is a tool that does that, to better exemplify what I'm trying to say:
http://www.celemony.com/en/melodyne/what-is-melodyne
Here is a similar tool too: http://aubio.org
One other thing with pitch identification is that it is possible to replace the voice of one person with another, matching formants. Below is a tool that "mimics" other people's voices almost perfectly.
http://www.voiceconverter.net
See the example of "voice matching - spanish dubbing example". This app uses, in fact, open-source software called Praat. See here (it also includes the source code): http://www.fon.hum.uva.nl/praat
With Praat you can see a tutorial on acoustic analysis, i.e. what the waveform, the spectrogram and the pitch curve tell you about durations, formants, pitches, etc.: http://www.fon.hum.uva.nl/paul/papers/AcousticAnalysis8.pdf
http://www.fon.hum.uva.nl/paul/praat.html
One way to identify the human voice in the rest of the audio is to isolate all frequencies that are not in the human-voice band and later try to retrieve, through noise patterns, what is voice and what is noise. Or without patterns, as Celemony or Voice Converter does.
Human voice frequency bands are described here:
http://en.wikipedia.org/wiki/Voice_frequency
http://en.wikipedia.org/wiki/Vocal_range
http://www.bnoack.com/index.html?http&&&www.bnoack.com/audio/speech-level.html
http://www.cs.cf.ac.uk/Dave/Multimedia/node271.html
http://www.axiomaudio.com/blog/audio-oddities-frequency-ranges-of-male-female-and-children%E2%80%99s-voices/
The identification of human voice can also be achieved by the crest factor.
http://en.wikipedia.org/wiki/Crest_factor
http://www.bnoack.com/index.html?http&&&www.bnoack.com/audio/crestfactor.html
http://www.spectrum-soft.com/news/spring2011/crest.shtm
An algorithm that isolates the frequencies and also computes the crest factor (above 15 dB it is always human voice) can lead to better isolation. Since the peak used to compute the crest factor is an average of all peaks, and since RMS is used to calculate it, a fine tune can be made to isolate the frequencies that result in an RMS lower than 15 dB.
The formula for the crest factor is
C = |a| / RMS
and for a sine wave y = a*sin(2*Pi*Frequency*Time) the RMS is
RMS = a / sqrt(2)
Since it is possible to compute the amplitude, it is also possible to calculate its average and therefore the RMS. Since amplitude is a function of frequency and time, so is the resulting RMS. It is possible to isolate frequencies by forcing the RMS computation to take only values above 15 dB. In this way, the frequencies and amplitudes not related to the human voice will also be removed to fit RMS > 15.
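As a numerical check of these relations (helper names here are illustrative): for a pure sine the peak is a and the RMS is a/sqrt(2), so the crest factor comes out to sqrt(2), about 3 dB, well below the 15 dB speech heuristic mentioned above:

```python
import math

def crest_factor(samples):
    """Crest factor = peak amplitude / RMS of the signal."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return peak / rms

def crest_factor_db(samples):
    return 20.0 * math.log10(crest_factor(samples))

# a pure sine sampled over whole periods: RMS = a / sqrt(2),
# so the crest factor is sqrt(2) (about 3.01 dB)
sine = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(1000)]
```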
basicsynth.com/uploads/AddendumCh6a.pdf
http://www.indiana.edu/~emusic/acoustics/amplitude.htm
http://www.dspguide.com/ch2/2.htm
http://recordingology.com/in-the-studio/distortion/square-wave-calculations/
For peak detection
http://www.wavemetrics.com/products/igorpro/dataanalysis/peakanalysis/peakfinding.htm
http://www.calculatefactory.com/calculate/formula/2611
http://www.ni.com/white-paper/4278/en/
Also, if using patterns, once you know which frequencies are not related to the human voice, you can try finding their patterns. For example, if a given sound wave related to voice is inside another one that is not, it will produce a specific frequency resulting from the two superimposing on each other. The goal is that at a certain point (time), before the interference, we may have non-voice frequencies matching a specific pattern.
So, if we find this pattern we can cancel the non-voice frequency, including the part that is embedded in the human voice. This is done by cancellation of the waveform, as described here:
http://holykaw.alltop.com/this-mathematical-formula-can-cancel-out-all
http://www-personal.umich.edu/~gowtham/bellala_EECS452report.pdf
Noise cancellation is also done by Audacity, iZotope RX and other apps. But I'm not sure if it uses the same technique.
Hi guga,
This is all very interesting stuff but not on my todo-list yet. ( maybe it would be if there were 64 hours in a day :biggrin:)
Although it might be interesting to get the formants out of the speech audio and let a 3D head speak.
Have you ever done this kind of coding?
Are you a musician?
Hi Marinus
QuoteThis is all very interesting stuff but not on my todo-list yet. ( maybe it would if there where 64 hours in a day
I know what you mean. :greensml:
QuoteHave you ever done this kind of coding?
Are you a musician?
I'm not a musician. My interest in audio editing is mainly because I'm a collector of old movies (I have around 10000 films/series/cartoons etc.) and I'm used to editing/restoring the audio of the videos I have (and also the video itself).
For that purpose (the audio), I use tools like Audacity, Magix Audio Cleaner, iZotope RX, Melodyne, Sony Vegas, etc. And the voice recognition apps I mentioned, to try to recover the voices of old narrators or dubbers of the films.
The main problem with old movies (especially those I got from 16mm film) is the bad quality of the audio. Although the restoration process is a bit of fun, sometimes it takes too long to edit them in the "conventional" apps.
About coding for it, I haven't tried to code something like that before. I have focused my scarce free time on video, but audio is something I'm really interested in trying to code eventually.
SIEKMANSKI,
I made a quick scan of your source code,...it looks REALLY interesting,...
However, I am running on Windows Seven Professional and have only DirectX Version 11 installed on my system, and so,...the initialization failed.
I get the: "Unable to create DirectSound Object" MessageBox.
Could you provide us with some more information on which DirectSound version your program is compiled against ??? The source's include files define the IDirectSound8 interfaces. So where can I get the correct DirectSound DLLs to get the application to operate correctly ???
Any version of DirectX 9; it just needs "DSOUND.dll".
The IDirectSound8 interfaces are the latest and are part of DirectX 9.
Hi Siekmanski
On Win 8.1 64 the prog runs but the visuals don't. There is no signal drawing and the frequency indicators are not moving. It seems frozen, but the sound is played correctly and no error/warning is displayed.
Regards, Biterider
I'm also running Win 8.1 64-bit.
So it must be the differences between graphics cards.
I set a minimum of renderstates, which might not be enough for all graphics cards?
I've included another renderstate D3DRS_FILLMODE,D3DFILL_SOLID
Can you test this one,
and the next one, which has a complete initialisation of the graphics card?
both seem to work well under XP SP3, Marinus
Hi again
The problem persists with both versions.
I moved the whole folder to an XP SP3 system and all 3 exes (original included) work seamlessly (Intel onboard video chip). If it helps, the Win 8.1 system has a GeForce 660 video card with an up-to-date driver (version 340.52).
Regards, Biterider
Attached a pic of the running exe.
Thanks Dave and Biterider,
I thought it might work with a minimum of initialisation.
But not all graphics cards will accept that, I think?
@ Biterider,
FIRtest2 does a full initialisation of the graphics card, so it will be hard for me to see what's missing and why it doesn't work on your graphics card.
I'll have a look at it tomorrow.
Marinus
@ Biterider,
Just saw the picture; the sound is running and drawing to the screen happens, but there is no interaction with the music.
That's strange........ :(
I'll have to think about how this is possible, and yet not on every PC ????
Hi
I tried to run the prog in compatibility mode and, after Windows somehow repaired the application, it ran like it did on the XP system.
The compatibility analysis dialog doesn't help too much. The diagnostic message said: "incompatible program repaired". I set the compatibility mode to Win7 and this did the trick.
Regards, Biterider
Hi, SIEKMANSKI, again,
Now, I'm getting an error message box that says: "Entry Point Not Found"
"The procedure entry point OpenVxDHandle could not be found in the dynamic link library KERNEL32.dll."
I checked all the asm files in your project (and the include files),...and could NOT find invoke OpenVxDHandle anywhere. It must be being called from within some other routine. I Googled OpenVxDHandle, and found this: Why OpenVxDHandle Is Not Contained in Kernel32.lib (http://support.microsoft.com/kb/141131)
I did find some information: Function: OpenVxDHandle Applicable Versions: 4.0 (Windows) to 4.90
Apparently, a VxD is a Virtual Device Driver.
I suspect that this doesn't really make sense in terms of your FIR Audio Spectrum Analyzer. I've Googled extensively, and haven't found anything that relates to DirectSound. Weird.
The only thing I found that seems to make sense is this: The procedure entry point OpenVxDHandle could not be located in the dynamic link library KERNEL32.dll, Microsoft Community (http://answers.microsoft.com/en-us/windows/forum/windows_vista-system/the-procedure-entry-point-openvxdhandle-could-not/eb4dc118-9c38-4830-b84a-beb9540c2663)
Thanks Biterider,
Sometimes the strangest things happen.... it runs on my Windows 8.1 64-bit and not on yours. :idea:?
Did you get a message from Windows before you did the compatibility repair?
At first I thought maybe the message pump didn't call the RenderD3d9 routine on your PC........
I can't see anything in my source code that could cause this.
Hi Zen,
You can download "DirectX 9.0c End-User Runtime" from microsoft.com.
That should install all the libraries on your machine.
Hi, SIEKMANSKI,
I opened both your FIR Audio Spectrum Analyzer application and the DirectX version 9 dsound.dll in IDA Pro.
The problem is NOT in your application (I'm fairly certain, at least).
The dsound.dll imports OpenVxDHandle from KERNEL32.dll,...and weirdly, calls it in only ONE location. Without further checking,...I don't think that the routine that invokes it is even relevant to your application.
DAMN !!! (By the way, the dsound.dll File Version is: 4.9.0.904)
...But, I have to wonder why I'm the only one who has this EXTREMELY annoying problem,... :dazzled:
Hi Zen,
Are you sure it's the right DirectX 9 redistributable you downloaded?
If I were you, I'd get the "DirectX 9.0c End-User Runtime" from microsoft.com.
Hi Siekmanski
No, I didn't get any message. The strange thing is that the prog is running and only the animation doesn't work, while you can see that at startup something was drawn (green and orange fat lines). I also noticed that the "screen refresh rate" indication and the progress bar are frozen. Maybe some timer or thread is not working as expected in Win8.
Biterider
Hi Biterider,
The only thread and timer in this application are used for the sound card routines, and they work on your machine.
It looks like the "RenderD3d9" routine is blocked in some way.....
What I don't understand is that we both have Windows 8.1 64-bit running and I have no problems???
It makes me wonder how this is possible and makes me curious what it could be.
Marinus
Hi Marinus
Maybe you can add to the UI some indication or flag to show where the problem is.
Looking into the code, I think that ShowPlayTime is not called on my system. That means that for some reason RenderD3D9 is not reaching the point where the invocation is done. At the beginning of the proc you are doing some checks on the D3D device and at the end on WAVdataPTR. Changing, for example, the caption in each code branch may help to find the problem.
Regards, Biterider
Hi Biterider,
Is this D3D9 height map example working on your Win 8.1 machine? ( press F1 to toggle between windowed and full screen mode )
If so then i have something to compare.
I think maybe it has something to do with the TestCooperativeLevel and Reset Methods.
Marinus
Hi Biterider,
I have made 2 examples with different approaches to the rendering routine.
Could you test these on your win 8.1 PC ?
Marinus
Hi Marinus
I found some app with source code that you may find interesting. It displays a graphic spectrum
https://www.relisoft.com/freeware/freq.html (https://www.relisoft.com/freeware/freq.html)
Hi Marinus
D3D9_HM.exe works in FS and windowed mode. Some strange things happen when switching several times from one mode to the other, but basically it works.
Test versions 3 & 4 of the FIR analyser didn't show any change compared to the previous ones.
Biterider
Thank you very much Biterider for testing. :t
I now have a clue..... some graphics cards act differently in processing the data.
In D3D9_HM.exe I do a sanity check and always check if I receive a message from the graphics card,
and then restore all the vertex and index buffers and reset the graphics card.
QuoteD3D9_HM.exe works in FS and windowed mode. Some strange things happen when switching several times from one mode to the other, but basically it works.
Here is an example where I have hopefully solved that issue. ( press F1 to toggle between windowed and full screen mode )
Marinus
Beautiful App Marinus :t
Thanks guga.
Hi Biterider,
I hope this one finally solves it............
Marinus
Hi Marinus
Good news, it works now :t
What was the problem?
Biterider
Apparently on some graphics cards the device is not immediately operational and therefore rendering is not possible.
The device needs to be reset, and all the vertex and index buffers have to be restored.
I'll post the new source-code in the first post.
One request: did you test the D3D9_3DStars example in Reply #34, to solve the strange fullscreen toggle thingy?
It's not easy to get everything working on all PCs.
I'm very thankful for your patience in getting my code into better working order. :t
SIEKMANSKI,
Sorry about all the negative comments from me,...it's just that the concept of your FIR Audio Spectrum Analyzer is UBER COOL,...I wanted to write something similar once (in C++), and, I know how difficult DSP can be.
...Anyway,...I'll mess around with your code when I get time,...and, try to get it to run on DirectX 11.
By the way,...D3D9_3DStars (above) runs perfectly on my machine (without any DirectX 9 dlls being loaded),...so, I'm fairly certain it's just a version issue here. Don't worry about it.
...And, thanks for the source code.
Here are a number of informative Webpages:
Graphics APIs in Windows, MSDN (http://msdn.microsoft.com/en-us/library/windows/desktop/ee417756(v=vs.85).aspx)
Where is the DirectX SDK?, MSDN Blog (http://blogs.msdn.com/b/chuckw/archive/2012/03/22/where-is-the-directx-sdk.aspx)
Where is the DirectX SDK (2013 Edition)?, MSDN Blog (http://blogs.msdn.com/b/chuckw/archive/2013/07/01/where-is-the-directx-sdk-2013-edition.aspx)
DirectX Installation for Game Developers, MSDN (http://msdn.microsoft.com/en-us/library/windows/desktop/ee416805(v=vs.85).aspx)
The latest upload (which does handle D3DERR_DEVICELOST) works perfectly (Win7, x64), even when switching the monitor (the previous one did not work in such cases).
Thanks qWord,
That's cool to know that monitor switching is working too. :eusa_dance:
Hi Zen,
You can download "DirectX 9.0c End-User Runtime" from microsoft.com.
That should install all the libraries on your machine and solve your DSOUND.dll version problem.
D3D9_3DStars calls d3d9.dll, so there are DirectX 9 dlls on your machine.
Marinus
Hi Marinus
D3D9_3DStars.exe works here without problems. Changed several times from FS to windowed mode and all work seamlessly.
Biterider
Thanks Biterider,
I'm a happy person now. :biggrin: :biggrin: :biggrin:
Marinus
Found 2 bugs in my code.
There was a buffer overrun when calculating the frequency bands. ( corrected this )
And something very stupid: I was only calculating the left audio channel. ( corrected this )
I blame it on my motorcycle accident. :biggrin:
New source code and executable in first post.
thanks, Marinus :t
:t
Gunther
Hi marinus
can you make it export the audio ?
Yes, but the exported audio file would be the same as the imported file.
The sound isn't altered.
Hmm, and how do I make it altered? Export the result without having to play the whole file? (I mean, like when you load a file in Audacity and apply a filter, then export the resulting file)
Hi guga,
You can do that with the filter routines of the FIR analyzer.
But the program has to be completely rewritten.
You have to implement something like a WAV-writer routine to save the filtered audio.
Do you need an example of a WAV-writer routine ?
Hi Dave,
Quotewhat might be fun would be to feed audio into it, and draw a graph of the output :biggrin:
Do you mean a spectrum waterfall graph ?
yah - i meant to post in the other thread on FFT, though
i was thinking spectrum analyzer, of course
with a mixer, it could even be used for frequencies above audio (not an audio mixer)
but, not a lot of bandwidth to work with - it could be stepped to increase the apparent bandwidth
Hi marinus,
If you can, send me an example of a WAV-writer routine. Your filter is excellent for audio I have that needs a cleanup. I was thinking of using it first, before I use other apps to remove clicks, noises etc.
I'm curious what the filter does with a noisy audio file. If it attenuates the noise and enhances the vocals, then another app can better remove the noises completely.
Your routines need to be used directly on the audio data, right?
Or perhaps writing a VST plugin for Audacity would also be a good idea.
Hi guga,
QuoteYour routines needs to be used directly in the audio data, right ?
You can use the filter routines for any type of data ( as long as it is 32-bit float ), in realtime, or read a file and filter that data if you want.
There is a free online DSP course at Coursera (Stanford University).
The title is Audio Signal Processing for Music Applications.
You can sign up for free here: http://coursera.org/course/audio
You can learn a lot, it's a great course.
I'll post the routine today.
Hi guga,
here the wav-writer routine
.data
Wave_Header db "RIFF"
Wave_Length dd 0 ; sample data length + (44 - 8)
db "WAVE"
db "fmt "
dd 16 ; format chunk size
Wave_Encoding dw WAVE_FORMAT_PCM
Wave_Channels dw 0
Wave_Samplerate dd 0
Wave_BPS dd 0
Wave_Blockalign dw 0
Wave_SampleBits dw 0
db "data"
Wave_Datalength dd 0
Filename db "Dance.wav",0
.code
WriteAudio_2_WAV proc
LOCAL hFileOut,dwBytesDone:DWORD
mov hFileOut,NULL ; so the cleanup code can tell whether the file was opened
invoke CreateFile,addr Filename,GENERIC_WRITE,NULL,NULL,CREATE_ALWAYS,FILE_ATTRIBUTE_NORMAL,NULL
inc eax ; INVALID_HANDLE_VALUE (-1) becomes 0
jz close_file_out
dec eax
mov hFileOut,eax
invoke SetFilePointer,hFileOut,44,0,FILE_BEGIN ; skip the header, it is written last
Writeloop:
; get your audio data from the audiobuffer here with your own routine
; data must be 16bit interleaved
invoke WriteFile,hFileOut,AudioBuffer,AudioLength,addr dwBytesDone,0
; jump to Writeloop till done
; Now prepare the WAV header:
; This example is for a 2 channel 16bit 44100 Hz wav-file
mov eax,44100
mov Wave_Samplerate,eax
mov cx,2
mov Wave_Channels,cx
shl eax,cl
mov Wave_BPS,eax
shl ecx,1
mov Wave_Blockalign,cx
mov Wave_SampleBits,16
mov eax,TotalbytesWritten
mov Wave_Datalength,eax
add eax,36
mov Wave_Length,eax ; sample data length + (44 - 8)
; now write the WAV header
invoke SetFilePointer,hFileOut,0,0,FILE_BEGIN
invoke WriteFile,hFileOut,addr Wave_Header,44,addr dwBytesDone,0
close_file_out:
.if ( hFileOut )
invoke CloseHandle,hFileOut
.endif
mov eax,TRUE
ret
WriteAudio_2_WAV endp
Thanks Marinus, I'll give it a try.
A few things. I'm analyzing the results of the FIR file using Audacity to record it.
Your FIR is excellent at removing DC offsets and normalizing the audio data, as shown in the image below (the 1st track is the file using FIR, the 2nd is the original file).
Awesome Normalization
(http://i62.tinypic.com/2inf4x.jpg)
Problems:
I tested it to see whether it was a problem in Audacity's recorder or in the FIR demo. The problem is with the demo. It is breaking the audio data in some places. (I opened the "untouched" file and listened to it. It is cracking the sound wave.)
The sound track under FIR was "broken" in a couple of points, and it extended the length by 2 secs.
Fig1: Cracked data
(http://i58.tinypic.com/246v2up.jpg)
Fig2: Extended time
(http://i62.tinypic.com/2hf3xx2.jpg)
Do you want me to send to you the audio file to test ?
Despite the cracking, the resultant data after using a noise remover (I used iZotope RX for that) is really awesome. (I passed the noise remover only twice)
Btw.... do you have any idea how to remove/reduce noise using a pattern (as in Audacity/iZotope RX)? It would be a very good idea to implement it in the demo, since it is doing an amazing job of normalizing the wave data. (A click remover would be good too)
Take a look, how clean became the file after using the noise remover on the resultant file generated with FIR.
(http://i60.tinypic.com/x24gf8.jpg)
I'm amazed. The sound "looks" incredible. No muffling or robotic sound whatsoever. (Only the cracking needs a fix)
Hi guga,
In my source I synchronize the FIR filter to the screen refresh rate because
it was intended to show the power of the frequency bands on the screen. ( what you hear, you see immediately )
If there are glitches in the screen refresh rate, the filter skips samples.
In my case this is wanted. ( sync. ears and eyes :eusa_dance:)
You need to filter your audio without the screen sync.
Just move the filter window constant forward in your audio data and there will be no cracks in your output.
QuoteJust move the filter window constant forward in your audio data and there will be no cracks in your output.
How do I do that?
You mean to add silence at the beginning of the wave file?
No need for that.
You could use a ring buffer, and write and filter the audio data the number of filter taps ahead of the save pointer,
then save the audio data once it has been filtered.
You need 2 pointers in your ring buffer: a save pointer and a write pointer (which is far enough ahead of the save pointer).
When you have written the audio data to the ring buffer you can immediately filter this block of data.
Be sure to have enough distance between the 2 pointers and make the ring buffer large enough.
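A minimal sketch of that two-pointer scheme in Python (here the loop index plays the role of the write pointer, and an output sample is "saved" only once a complete filter window lies behind it; names are illustrative):

```python
def fir_ringbuffer(samples, coeffs, ring_size=4096):
    """Filter samples through an FIR filter via a ring buffer,
    emitting an output sample only when the full window of taps
    behind the write position has been filled."""
    taps = len(coeffs)
    assert ring_size >= taps            # buffer must hold one full window
    ring = [0.0] * ring_size
    out = []
    for n, s in enumerate(samples):     # n acts as the write pointer
        ring[n % ring_size] = s
        if n >= taps - 1:               # save pointer: window is complete
            acc = 0.0
            for k in range(taps):       # convolve over the window
                acc += coeffs[k] * ring[(n - k) % ring_size]
            out.append(acc)
    return out
```

Moving the window forward one sample at a time like this, independent of any screen sync, is what avoids the cracks in the saved output.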
Hi Marinus.
Thanks, but I'm not able to assemble your file. I recently replaced ml.exe with JWasm, and when trying to assemble the code in RadAsm, this error shows up:
ml /c /coff /Cp "guga.asm"
JWasm v2.12pre, Nov 27 2013, Masm-compatible assembler.
Portions Copyright (c) 1992-2002 Sybase, Inc. All Rights Reserved.
Source code is available under the Sybase Open Watcom Public License.
guga.asm(243) : Error A2209: Syntax error: )
fpc(33)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(243): Main line code
guga.asm(243) : Error A2209: Syntax error: __0_5
fpc(35)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(243): Main line code
guga.asm(243) : Error A2150: Missing operator in expression
guga.asm(254) : Error A2209: Syntax error: )
fpc(33)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(254): Main line code
guga.asm(254) : Error A2209: Syntax error: __0_0
fpc(35)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(254): Main line code
guga.asm(254) : Error A2150: Missing operator in expression
guga.asm(281) : Error A2209: Syntax error: )
fpc(33)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(281): Main line code
guga.asm(281) : Error A2209: Syntax error: __0_46164
fpc(35)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(281): Main line code
guga.asm(281) : Error A2150: Missing operator in expression
guga.asm(282) : Error A2209: Syntax error: )
fpc(33)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(282): Main line code
guga.asm(282) : Error A2209: Syntax error: __0_53836
fpc(35)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(282): Main line code
guga.asm(282) : Error A2150: Missing operator in expression
guga.asm(589) : Error A2209: Syntax error: )
fpc(33)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(589): Main line code
guga.asm(589) : Error A2209: Syntax error: __512_0
fpc(35)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(589): Main line code
guga.asm(589) : Error A2150: Missing operator in expression
guga.asm(650) : Error A2209: Syntax error: )
fpc(33)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(650): Main line code
guga.asm(650) : Error A2209: Syntax error: __1000_0
fpc(35)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(650): Main line code
guga.asm(650) : Error A2150: Missing operator in expression
guga.asm(761) : Error A2209: Syntax error: )
fpc(33)[dx9macros.inc]: Macro called from
FLT8(1)[dx9macros.inc]: Macro called from
guga.asm(761): Main line code
guga.asm(761) : Error A2209: Syntax error: __10000000_0
fpc(35)[dx9macros.inc]: Macro called from
FLT8(1)[dx9macros.inc]: Macro called from
guga.asm(761): Main line code
guga.asm(761) : Error A2150: Missing operator in expression
guga.asm(762) : Error A2209: Syntax error: )
fpc(33)[dx9macros.inc]: Macro called from
FLT8(1)[dx9macros.inc]: Macro called from
guga.asm(762): Main line code
guga.asm(762) : Error A2209: Syntax error: __200_0
fpc(35)[dx9macros.inc]: Macro called from
FLT8(1)[dx9macros.inc]: Macro called from
guga.asm(762): Main line code
guga.asm(762) : Error A2150: Missing operator in expression
guga.asm(948) : Error A2209: Syntax error: )
fpc(33)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(948): Main line code
guga.asm(948) : Error A2209: Syntax error: __1024_0
fpc(35)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(948): Main line code
guga.asm(948) : Error A2150: Missing operator in expression
guga.asm(954) : Error A2209: Syntax error: )
fpc(33)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(954): Main line code
guga.asm(954) : Error A2209: Syntax error: __1_0
fpc(35)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(954): Main line code
guga.asm(954) : Error A2150: Missing operator in expression
guga.asm(956) : Error A2209: Syntax error: )
fpc(33)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(956): Main line code
guga.asm(956) : Error A2209: Syntax error: __1024_0
fpc(35)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(956): Main line code
guga.asm(956) : Error A2150: Missing operator in expression
guga.asm(960) : Error A2209: Syntax error: )
fpc(33)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(960): Main line code
guga.asm(960) : Error A2209: Syntax error: __20_0
fpc(35)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(960): Main line code
guga.asm(960) : Error A2150: Missing operator in expression
guga.asm(961) : Error A2209: Syntax error: )
fpc(33)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(961): Main line code
guga.asm(961) : Error A2209: Syntax error: __2_126033980826655
fpc(35)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(961): Main line code
guga.asm(961) : Error A2150: Missing operator in expression
guga.asm(962) : Error A2209: Syntax error: )
fpc(33)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(962): Main line code
guga.asm(962) : Error A2209: Syntax error: __128_0
fpc(35)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(962): Main line code
guga.asm(962) : Error A2150: Missing operator in expression
guga.asm(991) : Error A2209: Syntax error: )
fpc(33)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(991): Main line code
guga.asm(991) : Error A2209: Syntax error: __80_0
fpc(35)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(991): Main line code
guga.asm(991) : Error A2150: Missing operator in expression
guga.asm(992) : Error A2209: Syntax error: )
fpc(33)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(992): Main line code
guga.asm(992) : Error A2209: Syntax error: __112_0
fpc(35)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(992): Main line code
guga.asm(992) : Error A2150: Missing operator in expression
guga.asm(993) : Error A2209: Syntax error: )
fpc(33)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(993): Main line code
guga.asm(993) : Error A2209: Syntax error: __144_0
fpc(35)[dx9macros.inc]: Macro called from
FLT4(1)[dx9macros.inc]: Macro called from
guga.asm(993): Main line code
guga.asm(993) : Error A2150: Missing operator in expression
guga.asm(993) : Fatal error A1113: Too many errors
Error(s) occured.
I'm trying to recall the MASM syntax from memory, but I'm not able to assemble this. Is there an error in the FLT4 macro when using JWasm as ml.exe?
I'm not familiar with Jwasm. Try using this macro instead,
FLT4 MACRO float_number:REQ
LOCAL float_num
.data
align 4
float_num real4 float_number
.code
EXITM <float_num>
ENDM
Thanks, that worked under the good old qeditor. (I can't make RadAsm assemble the res file :( )
But the crackling got worse?
All I did was insert the macros into the .asm file (and removed them from the .inc one):
FLT4 MACRO float_number:REQ
LOCAL float_num
.data
align 4
float_num real4 float_number
.code
EXITM <float_num>
ENDM
FLT8 MACRO float_number:REQ
LOCAL float_num
.data
align 8
float_num real8 float_number
.code
EXITM <float_num>
ENDM
Now, how do I use the ring buffer to remove the crackling? I'm trying to follow the code, but it is hard to figure out where the audio buffer is stored (and I also need to relearn the MASM syntax).
Did you mean to increase this value?
FIR_MaxTaps equ 8192 ?
I do not hear any crackling when I run the exe in Audio.zip.
As far as I understand,
1) you want a program that imports a wav file,
2) filters the file,
3) exports the result as a wav file.
Is this correct?
No crackling? That's odd. Something here must be consuming memory. Did you test it on a large file (like the one I sent by PM)? If you don't hear the crackling even with a large file, it must be something on my PC. I'll reboot and try again.
Quote
As far as I understand,
1) you want a program that imports a wav file,
2) filters the file,
3) exports the result as a wav file.
Is this correct?
Yep. I need the app to filter the audio with FIR and then export it, without having to play the audio to the end. Just filtering and exporting.
Hi guga,
Example:
1.) create filter coefficients. ( see: CalculateFIRcoefficients proc )
2.) import the wav data. ( see: Open_WAVfile proc )
3.) convert left and right channel to 32bit floats. ( see: Int16LSBinterleaved_Float32 proc )
4.) filter the 2 audio data channels as explained below.
5.) convert back to interleaved 16bit wav data. ( see: Float32_Int16LSBinterleaved proc )
6.) export filtered audio as wav file. ( see: wav writer-routine )
FIR coeffs for an order 2 high-pass filter. (samplerate = 44100 Hz, cutoff frequency = 1000 Hz)
"use even order numbers, then you have 1 center coefficient"
invoke CalculateFIRcoefficients,2,44100,1000,NULL,FIR_HighPass,addr FIRcoefficients
your fir coeffs: (-0.069, 0.138, -0.069) ; order 2 has 3 taps. ( taps == order + 1 )
your audio data: (0.5, 1.0, 0.0, 0.4) ;left channel or right channel.
alloc mem for filtered audio: == audio size
audio mem = audio size + order size = 4 + 2 = 6 ;(all zeros)
audio offset = order / 2 = 1
copy audio channel data to allocated mem, starting at audio offset.
audio mem: (0.0, 0.5, 1.0, 0.0, 0.4, 0.0)
Now everything is set up to do the filtering.
filtering pseudo code:
f = fir coeffs
a = audio mem
b = filtered audio mem
b[0] = a[0]*f[0] + a[1]*f[1] + a[2]*f[2]
b[1] = a[1]*f[0] + a[2]*f[1] + a[3]*f[2]
b[2] = a[2]*f[0] + a[3]*f[1] + a[4]*f[2]
b[3] = a[3]*f[0] + a[4]*f[1] + a[5]*f[2]
note:
For simplicity this example uses order 2.
For a better "Frequency Response" you have to use higher order settings.
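As a cross-check, the worked example above can be expressed as a short Python sketch (Python is used here purely for illustration; the thread's code is assembly):

```python
# Direct-form FIR filtering, following the order-2 worked example above.
# The coefficient and sample values are taken verbatim from the post.

def fir_filter(samples, coeffs):
    """Pad the input with order/2 zeros on each side ("audio mem"),
    then dot each window of samples with the coefficients."""
    order = len(coeffs) - 1                  # taps == order + 1
    pad = [0.0] * (order // 2)
    a = pad + list(samples) + pad            # (0.0, 0.5, 1.0, 0.0, 0.4, 0.0)
    return [sum(c * a[i + k] for k, c in enumerate(coeffs))
            for i in range(len(samples))]

coeffs = [-0.069, 0.138, -0.069]             # order-2 high-pass from the post
audio = [0.5, 1.0, 0.0, 0.4]                 # one channel
filtered = fir_filter(audio, coeffs)         # b[0]..b[3] of the pseudo code
```

Each output sample corresponds to one pseudo-code line, e.g. b[0] = a[0]*f[0] + a[1]*f[1] + a[2]*f[2].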
Marinus
Hi Marinus, I'll give it a try. I'll need to go step by step to understand the terms you are using.
So, I'll create the FIR coefficients as they are created in CalculateAnalyzerFrequencyBands, but I only need to create them once, instead of using the other bands, right?
But why use a value of 2 as the order in "invoke CalculateFIRcoefficients,2,44100,1000,NULL,FIR_HighPass,addr FIRcoefficients"? Shouldn't it be a multiple of 4 as described here:
FIR_MaxTaps equ 8192 ; "Keep this a multiple of 4" If FIR_MaxTaps == 512 then maximum FIR_Order = 511
; If Order = 30, == 31 Taps (coefficients), keep the order number even so we have uneven Taps and thus 1 center coefficient
; Notice that a FIR filter is symmetric
OrdersBand1 equ 8190 ; Change the order numbers to narrow or widen the frequency band width ( the roll off factor )
OrdersBand2 equ 4094
OrdersBand3 equ 2022
OrdersBand4 equ 1022
OrdersBand5 equ 510
OrdersBand6 equ 254
OrdersBand7 equ 126
OrdersBand8 equ 62
OrdersBand9 equ 30
OrdersBand10 equ 14
And most important... what exactly does "order" stand for? Is it the frequency of the audio?
What does "tap" mean?
Question:
According to the wiki, you used these values (0.53836 and 0.46164):
QuoteHamming Window = coefficient * ( 0.53836 - 0.46164 * cos( PI2 * filtercounter / filterorder ))
Where can I find info about this formula, to understand what those alpha and beta values are for? I mean, is the precision of those constants only 5 digits, or can it be more precise?
All works well on my XP SP3 :eusa_clap:
Ok, I found the coefficients. The Hamming window cancellation is described here:
http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA034956 (http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA034956).
The document is entitled "On the Use of Windows for Harmonic Analysis with the Discrete Fourier Transform"
QuoteThe Hamming window can be thought of as a modified Hanning window. (Note the potential source of confusion in the similarities of the two names.) Referring back to Figs. 15 and 18, we note the inexact cancellation of the sidelobes from the summation of the three kernels. We can construct a window by adjusting the relative size of the kernels as indicated in Eq. (26a) to achieve a more desirable form of cancellation. Perfect cancellation of the first sidelobe (at theta = 2.5 [2*pi/N]) corresponds to the Hamming window as indicated in Eq. (26b).
w(n) = alpha + (1 - alpha)*cos( 2*pi*n / N )
(26a)
W(theta) = alpha*D(theta) + 0.5*(1 - alpha)*[ D(theta - 2*pi/N) + D(theta + 2*pi/N) ]
w(n) = 0.54 + 0.46*cos( 2*pi*n / N )
(26b)
or
w(n) = 0.54 - 0.46*cos( 2*pi*n / N ), n = 0, 1, 2, ... N-1
The Hamming window is shown in Fig. 19. Notice the deep attenuation at the missing sidelobe position. Note also that the small discontinuity at the boundary of the window has resulted in a 1/omega (6.0 dB per octave) rate of falloff. The better sidelobe cancellation does
result in a much lower initial sidelobe level of -42 dB. Table I lists the parameters of this window. Also note the loss of binary weighting; hence the loss of ease of a spectral convolution implementation.
The alpha constant of 0.53836 is equivalent to 13459/25000. So, since this is a cancellation, the value of 0.46164 is obtained as 1 - 13459/25000.
Hi guga,
QuoteSo, I'll create the FIR coefficients as they are created in CalculateAnalyzerFrequencyBands, but I only need to create them once, instead of using the other bands, right?
Yes.
QuoteBut why use a value of 2 as the order in "invoke CalculateFIRcoefficients,2,44100,1000,NULL,FIR_HighPass,addr FIRcoefficients"? Shouldn't it be a multiple of 4 as described here:
No, that is only when you calculate 4 samples at once.
QuoteAnd most important... what exactly does "order" stand for? Is it the frequency of the audio?
No. A 5th-order filter has 6 taps; likewise, a 121st-order filter has 122 taps.
QuoteWhat does "tap" mean?
Taps is the number of coefficients in the filter.
Ok, I guess I found the formula. As far as I understood, to derive the constants it is necessary to find the perfect cancellation in decibel levels.
According to the document, perfect cancellation is around -42 dB. The formula used to derive it is described here:
http://en.wikipedia.org/wiki/Side_lobe
R = 20*log10(sin(X)/X)
Where X = Hamming window constant
R = the cancellation we want to achieve; in this case -42 dB (or -42/100 if used in the formula)
I used the Hamming constant of 13459/25000 in wolframalpha (http://www.wolframalpha.com/input/?i=20*log10%28%28sin%28%2813459%2F25000%29%29%2F%28%2813459%2F25000%29%29) and got the predicted result for the cancellation.
So, to calculate the Hamming constant (X) for a given decibel cancellation (R), we can use the inverted formula.
To see the results, I used R = -42/100 for the cancellation, and got these results for X (http://www.wolframalpha.com/input/?i=-42%2F100+%3D+20*log10%28sin%28X%29%2FX%29)
X = +/- 0.536024817009894983757669554
So, the alpha value of the Hamming constant (if used for cancellation of exactly -42 dB) is: 0.536024817009894983757669554
Now another question: is the formula you are using for the Hamming constant supposed to give perfect cancellation of the side lobe, or is it simply the minimum value?
No, it's just one of the many windowing functions you can use to minimize the side lobes.
Just pick a windowing function that fits your needs. http://en.wikipedia.org/wiki/Window_function (http://en.wikipedia.org/wiki/Window_function)
There are also windowing functions that let you control the roll off factor of the side lobes. ( kaiser window for example )
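As an illustration of the Hamming formula discussed above, this Python sketch generates the window with the thread's alpha/beta constants (Python is used for illustration only, not in the thread's code):

```python
import math

def hamming(taps, alpha=0.53836):
    """Hamming window: w[k] = alpha - (1 - alpha)*cos(2*pi*k/(taps - 1)).
    alpha = 0.53836 and beta = 1 - alpha = 0.46164, as in the thread."""
    beta = 1.0 - alpha
    return [alpha - beta * math.cos(2.0 * math.pi * k / (taps - 1))
            for k in range(taps)]

w = hamming(31)   # 31 taps == order 30, so there is one center coefficient
```

The window peaks at 1.0 on the center coefficient and falls to alpha - beta (about 0.0767) at the edges; with alpha = 0.54 it is the textbook Hamming window, and with alpha close to 0.53602 the first side lobe cancels exactly, as discussed above.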
A simpler method to load a real:
Quote
FLT4 MACRO value
local etiquette
.data
etiquette REAL4 value
.code
; EXITM fpc( @CatStr( <REAL4 >,value ) )
EXITM<etiquette>
ENDM
FLT8 MACRO value
local etiquette
.data
etiquette REAL8 value
.code
; EXITM fpc( @CatStr( <REAL8 >,value ) )
EXITM<etiquette>
ENDM
Hi Marinus... I'm trying, but I'm getting a bit confused.
The function Float32_Int16LSBinterleaved is called whenever the app starts, even if no music has been loaded, because it is called from DSTimerThread without any check on whether WAVdataPTR is filled.
Is that normal behaviour?
Also, you said to get the left and right audio channels, but how do I do that? All I found was GetWavAudio, which internally has a pointer to the audio data. (Does WAVdataPTR already refer to both channels?)
And... the variable AudioBufferLeftChannel seems useless for this purpose, since I can't see any AudioBufferRightChannel being used.
One question regarding the GetWavAudio function.
Is this part of the code ever used? (At least when debugging, this part never breaks/pauses:)
neg edx ; edx = size of data to get from the start of the wav file to copy to last part of the buffer
sub ecx,edx ; ecx = size of dat from the last part of the wav file
mov WavDataSizeLast,ecx
mov WavDataSizeNew,edx
mov WAVdataPos,edx ; update next audio position in wav file ( wav file loops )
mov ecx,WavDataSizeLast
test ecx,ecx
jz CopyLastWavDataDone
call Int16LSBinterleaved_Float32
mov eax,WavDataSizeLast
and eax,15
test eax,eax
jz CopyLastWavDataDone ; do we have some samples left?
mov edx,16
sub edx,eax
sub esi,edx ; adjust pointer
sub edi,edx ; adjust pointer
mov ecx,16 ; copy last 1 2 or 3 unaligned samples
call Int16LSBinterleaved_Float32
One question: why are you using AUDIOBUFFERLENGTH as 65536? Isn't that too high a value, considering you are only converting the values from Int16LSBinterleaved_Float32?
I'm trying hard to understand, but I'm getting a bit confused here. I used another wave sample (attached below). It is really small, and FIRanalyzer crashes because when copying with "movups oword ptr[edi+eax+AUDIOBUFFERLENGTH+16],xmm1" there is not enough buffer for it.
I presume all Int16LSBinterleaved_Float32 does is copy (and convert) the values from one sample data block (3 dwords) to the destination buffer, right?
But... why copy 65536 bytes away? Isn't the interleaved audio channel simply a sequence of 3-dword blocks?
I'm asking because I presume the total number of chunks before filtering must be the same as after, right? I mean, the total size of the file must not change after applying the filter, if I understood correctly.
Like this:
seg000:00000024 aData db 'data'
seg000:00000028 ChunkSize dd 156 ; 156 = 39 elements *4
seg000:0000002C DataChunk RigthChunk <4184144228, 4177656065, 4158650335>
seg000:00000038 LeftChunk <4128962074, 4091343836, 4049269082>
seg000:00000044 RigthChunk <4006014662, 3965250648, 3929926205>
seg000:00000050 LeftChunk <3902531739, 3884246916, 3875923717>
seg000:0000005C RigthChunk <3877562142, 3888244673, 3906857181>
seg000:00000068 LeftChunk <3931630167, 3960335373, 3990285782>
seg000:00000074 RigthChunk <4019122062, 4044026122, 4063162926>
seg000:00000080 LeftChunk <4075418345, 4081185601, 4082430804>
seg000:0000008C RigthChunk <4082168656, 4084200303, 4092392428>
seg000:00000098 LeftChunk <4109432048, 4136105607, 4171233439>
seg000:000000A4 RigthChunk <4211276546, 4252237171, 4289265576>
seg000:000000B0 LeftChunk <24052079, 45154993, 58721152>
seg000:000000BC RigthChunk <67568647, 75564161, 85984544>
So, all that is necessary is to copy each 3-dword structure to float (for the right channel) and do the same for the left channel. Right?
Then, once it is copied, how do I filter it?
This is confusing me, because the sample data (right and left chunks) has 13 elements of the "Chunk" structure array, which means 6 for the left channel, 6 for the right channel, and 1 extra; what is that for?
What is the usage of the 1st sample?
For better understanding, I tried to make a structure similar to this (see the picture, in orange, marking 00 00 00 00 as Sample1):
https://ccrma.stanford.edu/courses/422/projects/WaveFormat (https://ccrma.stanford.edu/courses/422/projects/WaveFormat)
Hi guga,
Let me explain how things are done.
The soundcard is split up in 2 buffers of 4096 16bit interleaved stereo-samples, one part is playing and the other part will be filled with audio from the audio buffer.
The audio-buffer has two separate buffers ( left and right ) of 4 blocks of 4096 32bit float samples each.
The multimedia-timer checks every 5 milliseconds where the play-position of the soundcard is.
If it is still in the same buffer then it does nothing but if it is in the next buffer then it copies the next block of audio from the audio-buffer to the soundcard-buffer.
This is also called double-buffering.
After that it reads the next block of wav-data directly from the wav-file into the audio-buffer.
The audio-buffer gets the wav-data 2 blocks ahead so that if we have a large FIR filter let's say 8190 coefficients it doesn't fetch old audio data to do the calculations.
So we have no wrong old audiodata for the filter routine.
See in the picture how the FIR filter ( green colour ) is centered over the ScreenSynchronizedAudioPosition and doesn't mess with the old audio-data.
In the picture you can see the 4 steps of the soundcard buffer swaps and how the audio is handled.
The wav-loader is a very simplistic routine, it just reads 4096 stereo samples each time.
The only check is if it reaches the end of the wav-file and resets to the beginning of the wav-file.
Because the routine reads 4 samples at once, it can happen that there are 1, 2, or 3 remaining samples in the wav-file.
If this is the case it adjusts the pointers of the wav-file and the audio-buffer and copies the last 4 samples of the wav-file at once.
If the WAVdataPTR is NULL it copies silent 0.0 samples to the audio-buffer.
Because it copies blocks of 4096 stereo samples at once, the wav-file must have at least 4096 + 4 = 4100 stereo samples or else it crashes.
"I should have built a warning into this program" :bgrin:
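The polling scheme described above (two sound-card halves, refill whichever half just finished playing) can be sketched like this in Python; the class and method names and the poll granularity are illustrative, not from the source:

```python
BLOCK = 4096  # stereo samples per sound-card half, as in the post

class DoubleBuffer:
    """Minimal model of the timer-driven double-buffering described above."""

    def __init__(self):
        self.current_half = 0   # which half the play cursor was in last poll
        self.refills = 0        # number of blocks copied to the idle half

    def on_timer(self, play_pos):
        """Called every few milliseconds with the play position in samples.
        Returns True when a new block must be copied to the sound card."""
        half = (play_pos // BLOCK) % 2
        if half == self.current_half:
            return False        # still playing the same half: nothing to do
        self.current_half = half
        self.refills += 1       # copy the next audio block into the idle half
        return True

db = DoubleBuffer()
polls = [db.on_timer(p) for p in (10, 2000, 5000, 6000, 8300)]
```

Only the polls where the cursor has crossed into the other half trigger a copy; repeated polls inside the same half do nothing, exactly as in the explanation above.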
Another thing,
movups oword ptr[edi+eax+16],xmm1 is the pointer to the AudioBufferLeftChannel buffer
movups oword ptr[edi+eax+AUDIOBUFFERLENGTH+16],xmm1 is the pointer to the AudioBufferRightChannel buffer
It loads the address of AudioBufferLeftChannel in edi and adds "AUDIOBUFFERLENGTH+16" to get the address of AudioBufferRightChannel.
I wrote it this way to save a register....
The +16 is 16 extra bytes of headroom for the buffer overrun of the FIR filter routine.
Below a picture how the data is handled.
Marinus
(http://members.home.nl/siekmanski/AudioBuffers.png)
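The de-interleaving just described (one interleaved 16-bit stream split into separate left/right float buffers) can be sketched as follows; the 1/32768 scale factor is a common convention and an assumption here, not stated in the post:

```python
def deinterleave_to_float(interleaved):
    """Split interleaved 16-bit (left, right) sample pairs into two float
    lists scaled to roughly [-1.0, 1.0), in the spirit of
    Int16LSBinterleaved_Float32 (scale factor 1/32768 is an assumption)."""
    left, right = [], []
    for i in range(0, len(interleaved), 2):
        left.append(interleaved[i] / 32768.0)       # even slots: left channel
        right.append(interleaved[i + 1] / 32768.0)  # odd slots: right channel
    return left, right

left, right = deinterleave_to_float([0, 16384, -32768, 32767])
```

The assembly version avoids keeping two destination pointers by writing the right channel at a fixed AUDIOBUFFERLENGTH offset from the left one, as explained above.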
Good explanation, Marinus. :t
Gunther
Hi Marinus, I guess I made it work. I'll review the code tomorrow and post it here so you can check whether I did it right.
I'm not using any new memory. I'm simply using pointers to the samples themselves to apply the FIR coefficients. Each FIR coefficient is applied to a sample; I mean, the 1st part of the sample. (As far as I understood it, I'm treating each sample as a 3-dword array belonging to a structure, so the FIR applies to each of the dwords in the sample.)
One thing I noticed: if I change the order to its maximum of 8190 (for example), the resulting audio becomes much cleaner.
The only problem I'm facing is that no matter the order value I use, it always inserts low energy that needs to be removed. Ex:
(http://i62.tinypic.com/2irqmau.jpg)
Do I need to convert each sample to frequency to remove it? (I presume FFT is the way to do it, but I wonder if there is a better or easier way to convert samples to frequency and vice versa.)
If so, is the Goertzel filter algorithm better than FFT?
Note: one thing I noticed. Remember the discussion we had about the Hamming constants? (Hamming Window = coefficient * ( 0.53836 - 0.46164 * cos( PI2 * filtercounter / filterorder )))
I tested it with the proper values to cancel the side lobe, and the result is more accurate if you use this in the CalculateFIRcoefficientsReal4 function:
[Float_Hamming_Alpha: F$ 0.536024817009894983757669554]
[Float_Hamming_Beta: F$ 0.463975182990105016242330446]
QuoteI´m not using any new memory. I´m simply using the pointers to the sample themselves to apply the Fir coefficient on it. Each Fir is applied to a sample. I mean, the 1st part of the sample. (As long i understood it correctly. I´m assuming each sample as a 3 Dword array belonging to a structure.) Thus, FIR applies to each one of the dwords on the sample.
I don't know what you mean,
The calculation is very simple:
s = audio samples
f = fir coeffs
For 3 fir coeffs: insert 1 zero before and after your audio data
output(s0) = f0*input(s-1) + f1*input(s0) + f2*input(s1)
output(s1) = f0*input(s0) + f1*input(s1) + f2*input(s2)
etc....
For 5 fir coeffs: insert 2 zeros before and after your audio data
output(s0) = f0*input(s-2) + f1*input(s-1) + f2*input(s0) + f3*input(s1) + f4*input(s2)
output(s1) = f0*input(s-1) + f1*input(s0) + f2*input(s1) + f3*input(s2) + f4*input(s3)
etc....
Quote
One thing I noticed: if I change the order to its maximum of 8190 (for example), the resulting audio becomes much cleaner.
The only problem I'm facing is that no matter the order value I use, it always inserts low energy that needs to be removed. Ex:
There is no maximum for the number of filter coefficients; if you use more, the frequency band roll-off will be steeper and so the band will be narrower.
The unwanted low energy you want to remove can be reduced by using another window function such as Blackman-Harris; the frequency band will be wider but the side lobes will be much lower.
You can watch this behaviour in my FFT analyzer proggy by loading the sine sweep wav file and then trying the different window functions. ( turn off fft smoothing )
This is the Blackman-Harris window: BlackmanHarris = 0.35875 - 0.48829*cos(2.0*PI*step/(nn-1)) + 0.14128*cos(2*2.0*PI*step/(nn-1)) - 0.01168*cos(3*2.0*PI*step/(nn-1))
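For comparison with the Hamming window, the 4-term Blackman-Harris formula above can be sketched as (Python for illustration only):

```python
import math

def blackman_harris(taps):
    """4-term Blackman-Harris window: much lower side lobes than Hamming,
    at the cost of a wider main lobe, as noted above."""
    a0, a1, a2, a3 = 0.35875, 0.48829, 0.14128, 0.01168
    n = taps - 1
    return [a0
            - a1 * math.cos(2.0 * math.pi * k / n)
            + a2 * math.cos(4.0 * math.pi * k / n)
            - a3 * math.cos(6.0 * math.pi * k / n)
            for k in range(taps)]

w = blackman_harris(31)
```

Unlike the Hamming window, which keeps a small nonzero value (about 0.0767) at the edges, Blackman-Harris falls almost all the way to zero there, which is what buys the much deeper side-lobe attenuation.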
Quote
QuoteDo I need to convert each sample to frequency to remove it?
Yes, just add zeros ( total FIR coeffs / 2 ) at the beginning and the end of your audio data.
QuoteIs the Goertzel filter algorithm better than FFT?
If you want to check one ( or a few ) specific frequencies then it is faster.
But to filter a frequency band you need a band filter.
Marinus
QuoteYes, just add zeros ( total FIR coeffs / 2 ) at the beginning and the end of your audio data.
Ok, I'll try that and ask later whether it is correct.
But... first I'll start posting the routines I made from your FIR algo; can you help check whether they are correct? This 1st post is related to the FIR itself; it is a port of your algo. (The syntax is easy to follow. I changed very little in your algo, and I kept your comments for reference.)
The main routine is a variation of your "CalculateAnalyzerFrequencyBands" that is used inside WM_INITDIALOG (or WM_CREATE)
...Else_If D@Message = &WM_INITDIALOG
move D$hwnd D@Adressee
call CalculateAnalyzerFrequencyBands
_________________________________________________________________________
; a standard 1 octave band analyzer
; for frequency band info go to: http://www.sengpielaudio.com/calculator-octave.htm
; Just play around with the frequency bands and the order numbers, or even overlap the frequency bands to compensate for the
; roll-off factor of the filters.....
; If you change this example in a 9 band analyzer by combining band 1 and 2 you need much less calculations and you'll get a much faster analyzer
; In this example a 10 band has a total of 16324 orders to calculate each screen refresh
; A 9 band analyzer just 7238 orders
; Remove band 1 ( 8190 orders ) and 2 ( 4094 orders ) and replace it with 1 band ( 22 - 88 Hz and 3198 orders )
; Then this routine will be 2.25 times faster
; And then you can do multithreading and make it mega fast on multi core machines.......
; And for selected machines you may translate the "CalculateFrequencyBand" routine to AVX (Advanced Vector Extensions) code
; Important is you need to "design" a filter to your needs ( frequency response ) in combination with the order numbers and the windowing type
[FIR_ORDER 2] ; the higher the value, the more accurate it will be. It must be even, with a maximum of 8190. Ex: 4096, 8190
[FIR_LowPass 0]
[FIR_HighPass 1]
[FIR_BandReject 2]
[FIR_BandPass 3]
[<16 FIRcoefficientsBand1: F$ ? #FIR_MaxTaps] ; No longer needs to be aligned. But, aligned is faster
Proc CalculateAnalyzerFrequencyBands:
call CalculateFIRcoefficients FIR_ORDER, 44100, 1000, &NULL, FIR_HighPass, FIRcoefficientsBand1
EndP
_________________________________________________________________________
_________________________________________________________________________
[AUDIOBUFFERLENGTH 65536] ; byte size, always a power of two, and must hold minimal 4 times the samples as the soundcard buffer
[FIR_MaxTaps 8192] ; "Keep this a multiple of 4" If FIR_MaxTaps == 512 then maximum FIR_Order = 511
; If Order = 30, == 31 Taps (coefficients), keep the order number even so we have uneven Taps and thus 1 center coefficient
; Notice that a FIR filter is symmetric
[FIRcoefficientsTemp: F$ ? #FIR_MaxTaps]
; Introduction to Digital Filters: http://www.dspguide.com/ch14.htm
Proc CalculateFIRcoefficients:
Arguments @order, @samplerate, @cutofffrequency1, @cutofffrequency2, @filtertype, @coefficients
Uses edi, esi, ecx, eax
call 'kernel32.RtlZeroMemory' D@coefficients, FIR_MaxTaps ; My note: replace this later with a faster routine.
call CalculateFIRcoefficientsReal4 D@samplerate, D@order, D@cutofffrequency1, D@coefficients ; calculate LowPass
.If D@filtertype <> FIR_LowPass
If D@filtertype = FIR_HighPass
call InvertFIRcoefficients D@order, D@coefficients ; convert from LowPass to HighPass
ExitP
End_If
; calculate second set of coefficients for BandPass or BandReject
call CalculateFIRcoefficientsReal4 D@samplerate, D@order, D@cutofffrequency2, FIRcoefficientsTemp
call InvertFIRcoefficients D@order, FIRcoefficientsTemp
; calculate BandReject
lea esi D$FIRcoefficientsTemp
mov edi D@coefficients
mov ecx D@order
@Addcoefficients:
fld F$esi+ecx*4 | fadd F$edi+ecx*4 | fstp F$edi+ecx*4
dec ecx | jns @Addcoefficients
If D@filtertype <> FIR_BandReject
call InvertFIRcoefficients D@order, D@coefficients
End_If
.End_If
EndP
_________________________________________________________________________
_________________________________________________________________________
[Float_Half: F$ (1/2)]
[Float_Zero: F$ 0]
; Hamming Window = coefficient * ( 0.53836 - 0.46164 * cos( PI2 * filtercounter / filterorder ))
;[Float_Hamming_Alpha: F$ (13459/25000)] ; Try using 0.536024817009894983757669554
[Float_Hamming_Alpha: F$ 0.536024817009894983757669554] ; Try using 0.536024817009894983757669554
; 20*log10((sin((5*pi)/2))/((5*pi)/2))
; log(sin(B))-log(B) = -(A log(5))/2000-(A log(2))/2000
; {FindRoot[-21/50 - (20 Log[Sin[X]/X])/Log[10] == 0, {X, 0.530649, 0.572675}, WorkingPrecision -> 39], FindRoot[-21/50 - (20 Log[Sin[X]/X])/Log[10] == 0, {X, -0.562036, -0.52001}, WorkingPrecision -> 39]}
;[Float_Hamming_Beta: F$ 4.61640000343322754e-1] ; try using 0.463975182990105016242330446
[Float_Hamming_Beta: F$ 0.463975182990105016242330446] ; try using 0.463975182990105016242330446
[Float_Two_PI: F$ 6.2831853071795864769252867665590057683943387987502116]
Proc CalculateFIRcoefficientsReal4:
Arguments @samplerate, @order, @cutofffrequency, @coefficients
Local @CutoffRadian, @sum, @filterorder, @filterorder_half, @filtercounter
Uses edi, ecx, edx, eax
; cutoff frequency must have a value between 1 and (samplerate / 2)
finit
fclex
; CutoffRadian = 2 * PI * cutofffrequency / samplerate
fld F$Float_Two_PI | fimul D@cutofffrequency | fidiv D@samplerate | fstp F@CutoffRadian
; Sum of all coefficients used for normalisation of the coefficients.
fldz | fstp F@sum
fild D@order | fst F@filterorder | fmul F$Float_Half | fstp F@filterorder_half
mov D@filtercounter 0
mov edi D@coefficients
mov ecx D@order
@calculate_coefficients:
mov edx D@filtercounter
fild D@filtercounter | fsub F@filterorder_half
fcomp F$Float_Zero
fnstsw ax
sahf | jne @not_zero
fld F@CutoffRadian ; when 0, then coefficient == CutoffRadian
fstp F$edi+edx*4
jmp @Addcoefficients
@not_zero:
; sin( CutoffRadian * ( filtercounter - filterorderhalf )) / ( filtercounter - filterorderhalf )
fild D@filtercounter | fsub F@filterorder_half | fmul F@CutoffRadian | fsin
fild D@filtercounter | fsub F@filterorder_half | fdivp ST1 ST0 | fstp F$edi+edx*4
; if you need another window function insert it here....
; info http://en.wikipedia.org/wiki/Window_function
; start Hamming Window function
; Hamming Window = coefficient * ( 0.53836 - 0.46164 * cos( PI2 * filtercounter / filterorder ))
fild D@filtercounter | fmul F$Float_Two_PI | fdiv F@filterorder | fcos
fmul F$Float_Hamming_Beta | fsubr F$Float_Hamming_Alpha | fmul F$edi+edx*4 | fstp F$edi+edx*4
; end of Hamming Window function
@Addcoefficients:
; add coefficient for normalisation
fld F@sum | fadd F$edi+edx*4 | fstp F@sum
inc D@filtercounter
dec ecx | jns @calculate_coefficients
mov ecx D@order
; normalize all coefficients
@normalize_coefficients:
fld F$edi+ecx*4 | fdiv F@sum | fstp F$edi+ecx*4
dec ecx | jns @normalize_coefficients
EndP
_________________________________________________________________________
_________________________________________________________________________
Proc InvertFIRcoefficients:
Arguments @order, @coefficients
Uses esi, ecx
mov esi D@coefficients
mov ecx D@order
@Inverteer_coefficients:
fld F$esi+ecx*4 | fchs | fstp F$esi+ecx*4
dec ecx | jns @Inverteer_coefficients
EndP
_________________________________________________________________________
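For reference, the windowed-sinc procedure ported above (sinc centered at order/2, Hamming window, normalization) can be sketched in Python. Note that the high-pass helper below uses the standard spectral-inversion recipe (negate every tap and add 1.0 at the center), which is an assumption on my part rather than a line-by-line transcription of InvertFIRcoefficients:

```python
import math

def fir_lowpass(order, samplerate, cutoff):
    """Windowed-sinc low-pass in the spirit of CalculateFIRcoefficientsReal4:
    order+1 taps, Hamming-windowed sinc, normalized so the taps sum to 1."""
    wc = 2.0 * math.pi * cutoff / samplerate     # cutoff in radians per sample
    half = order / 2.0
    coeffs = []
    for k in range(order + 1):
        x = k - half
        s = wc if x == 0 else math.sin(wc * x) / x   # sinc; wc at the center
        w = 0.53836 - 0.46164 * math.cos(2.0 * math.pi * k / order)
        coeffs.append(s * w)
    total = sum(coeffs)
    return [c / total for c in coeffs]           # unity gain at DC

def to_highpass(coeffs):
    """Spectral inversion: negate every tap, add 1.0 at the center tap."""
    hp = [-c for c in coeffs]
    hp[len(hp) // 2] += 1.0
    return hp

lp = fir_lowpass(30, 44100, 1000)                # order 30 == 31 taps
hp = to_highpass(lp)
```

The resulting low-pass taps are symmetric around the center coefficient and sum to exactly 1 after normalization, matching the comments in the listing above.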
Marinus, if the above routines are OK, I'll post the rest of the code responsible for applying the FIR itself (including Int16LSBinterleaved_Float32, etc.).
Personal note: if everything is OK after all the tests, I'll try using a double (R$) instead of a single float (F$). It may give more accuracy.
I'm not familiar with the notation you use, but as far as I understand it should be ok.
Just give it a try. :t
Can you hear the difference between single and double floats ?
Quote"Can you hear the difference between single and double floats ?"
Nope. But maybe we can gain accuracy, especially concerning noise data. I mean, the more accurate, the more finely tuned the audio will be; that includes tones, voices, noises, etc.
From what I'm testing so far, setting the order to 8190 refines the resulting audio a lot. It probably also enhances background noises that were not audible before, but this is not a problem at all, considering they can be better removed with other software.
Also, let me ask one thing. What are the values of the samples? I mean, what notation do they use? Are they a magnitude in hertz, decibels, or what?
I mean, I'm considering each data sample as a structure formed of 3 dwords, as explained before. From what I saw in the tested audio, it is always formed as:
'data' ; data tag related to the audio data itself (a dword "D$")
XXXX ; the size of the subchunk (a dword "D$")
[RightChannel1: D$Sample1, Sample2, Sample3] ; < 1st sample data chunk. Right channel structure
[LeftChannel2: D$Sample1, Sample2, Sample3] ; < 2nd sample data chunk. Left channel structure
(...)
The total size of the data sample chunk is (3*4)*N structures, since in stereo the channels are formed in pairs (left and right). I mean, according to here (https://ccrma.stanford.edu/courses/422/projects/WaveFormat/wave-bytes.gif) and here (https://ccrma.stanford.edu/courses/422/projects/WaveFormat/)
This seems relevant, especially because the wave structure can be malformed. I mean, we can have one extra data channel (left or right) at the end of the data, or padding (a single dword) at the end. In both cases, perhaps, they can simply be zeroed out, or extended to recreate any missing data. For example, say we have audio data like this:
[RightChannel1: D$Sample1, Sample2, Sample3
LeftChannel2: D$Sample4, Sample5, Sample6
(...)
RightChannel50: D$Sample51, Sample52, Sample53
LeftChannel50: D$Sample54, Sample55, Sample56
RightChannel51: D$Sample57, Sample58, Sample59 ; < end of the data. It ends on a "isolated" right channel
]
In the above example, we can then add the missing LeftChannel, based on the last RightChannel, so the data is formed in pairs. Resulting in:
[RightChannel1: D$Sample1, Sample2, Sample3
LeftChannel2: D$Sample4, Sample5, Sample6
(...)
RightChannel50: D$Sample51, Sample52, Sample53
LeftChannel50: D$Sample54, Sample55, Sample56
RightChannel51: D$Sample57, Sample58, Sample59
LeftChannel51: D$Sample60, Sample61, Sample62 ; < the fix is a copy of the right channel above
]
And then change the value in the RIFF sub-chunk structure (the dword after the "data" tag), and also fix the value of WAVEFORMATEX.cbSize accordingly.
The same seems valid if we have only one isolated dword (or 2 isolated dwords, etc.), like:
[RightChannel1: D$Sample1, Sample2, Sample3
LeftChannel2: D$Sample4, Sample5, Sample6
(...)
RightChannel50: D$Sample51, Sample52, Sample53
LeftChannel50: D$Sample54, Sample55, Sample56
PaddingData: D$Sample57 ; < end of the data. It ends on an isolated padding dword
]
will be:
[RightChannel1: D$Sample1, Sample2, Sample3
LeftChannel2: D$Sample4, Sample5, Sample6
(...)
RightChannel50: D$Sample51, Sample52, Sample53
LeftChannel50: D$Sample54, Sample55, Sample56
RightChannel51: D$Sample57, Sample58, Sample59 ;< extended the rest of the data based on Sample57 to form the Right Channel
LeftChannel51: D$Sample60, Sample61, Sample62 ; < and copy the right channel above to form the left one.
]
Also, if we use 2 channels, why can we have odd values? I mean, like this:
seg000:00000024 aData db 'data'
seg000:00000028 ChunkSize dd 156 ; 156 = 39 dwords *4 = 13*3 structures * 4 => 6 Left Channels + 6 Right Channels + 1 Extra Structure ???
seg000:0000002C DataChunk RigthChunk <4184144228, 4177656065, 4158650335>
seg000:00000038 LeftChunk <4128962074, 4091343836, 4049269082>
seg000:00000044 RigthChunk <4006014662, 3965250648, 3929926205>
seg000:00000050 LeftChunk <3902531739, 3884246916, 3875923717>
seg000:0000005C RigthChunk <3877562142, 3888244673, 3906857181>
seg000:00000068 LeftChunk <3931630167, 3960335373, 3990285782>
seg000:00000074 RigthChunk <4019122062, 4044026122, 4063162926>
seg000:00000080 LeftChunk <4075418345, 4081185601, 4082430804>
seg000:0000008C RigthChunk <4082168656, 4084200303, 4092392428>
seg000:00000098 LeftChunk <4109432048, 4136105607, 4171233439>
seg000:000000A4 RigthChunk <4211276546, 4252237171, 4289265576>
seg000:000000B0 LeftChunk <24052079, 45154993, 58721152>
seg000:000000BC RigthChunk <67568647, 75564161, 85984544>
What is considered the start of the channels? The 1st structure? If so, since stereo works in pairs, we will have an extra "structure" at the end. Why?
My problem is understanding the notation of the "sample" values. What they really are, what they measure, etc. I mean, here it says a sample can be used to measure a magnitude of some sort.
http://stackoverflow.com/questions/3058236/how-to-extract-frequency-information-from-samples-from-portaudio-using-fftw-in-c (http://stackoverflow.com/questions/3058236/how-to-extract-frequency-information-from-samples-from-portaudio-using-fftw-in-c)
magnitude = sqrt(re^2 + im^2)
So, magnitude = sqrt (sample1^2 + sample2^2 + sample3^2) ???
Presuming this magnitude is the decibel amount, calculated in the form:
magnitude_dB = 20*log10(magnitude)
It implies that magnitude must always be a positive value, right?
But if the sum of all of them must be a positive value, why can adding the total FIR coeffs / 2 at the beginning and end result in a negative value? There are cases where the total sum of FIRs/2 is a negative value. In that situation, what does it mean?
I mean, when you said:
QuoteYes, just add zeros ( total FIR coeffs / 2 ) at the beginning and the end of your audio data.
You meant to compute it as:
Sum_Of_FIRs/2 + Sample1 + Sample3 = Frequency of sample ????
If so, what happens to sample2 ? Why it is not computed ?
And, what happens when the generated value is negative ? What does it means ?
And also, the generated result need to be a value between 0 and 1, right ? So, how to compute the frequency in hertz ??? I multiply this value with what ?
I mean, i tried to determine the frequency of the sample value as you said and created this function (Dispites the name, it is computing now the magntiude and checking if the result is positive or negative. I simply added the sum of firs/2 to that to compthe the magnitude, which can´t be a negative value due to the log10 on the formula above. So, If negative, an error ocurred):
[SampleMagnitude: F$ 0]
[Float_One: F$ 1]
[Float_InvertedSquareTwo: F$ 0.7071067811865475244008443621048490392848359376884740]; 1/sqrt(2)
; The SampleMagnitude value must be a positive value between 0 and 1.
Proc ConvertSampletoFrequency:
Arguments @pWaveData, @DataLen, @pAverageFir
finit
xorps xmm0, xmm0
mov ecx 0
If D@DataLen = 0
xor eax eax
ExitP
End_If
call Int16LSBinterleaved_Float32New D@pWaveData, D@DataLen
mov ecx D@DataLen
mov esi D@pWaveData
mov ebx D@pAverageFir
While ecx <> 0
; magnitude = sqrt(re^2 + im^2)) + Sum_Of_FIR/2 ????
.If_And D$esi <> 0, D$esi+8 <> 0
fld F$esi | fmul ST0 ST0
fld F$esi+8 | fmul ST0 ST0
faddp ST1 ST0 | fsqrt
fadd F$ebx | fstp F$SampleMagnitude
Fpu_If F$SampleMagnitude < F$Float_Zero ; Negative value ??? Try to fix it, based on the percentage that was decreased
; Try to find the correct value of sample1, sample2 or sample3 whose sum needs to be at least Sum_Of_FIRs/2
;sqr (A+B)+Coef > 0
; sqr (A+B) = Coef
; (A+B) = Coef^2 . If A = A^2, B=B^2, A=B
; 2*A^2 = Coef^2
; A = coef/(sqr(2))
; AMin = coef/(sqr(2))
;fld F$SampleMagnitude | fadd F$Float_One | fst F$esi | fst F$esi+4 | fstp F$esi+8
; calculate the minimum magnitude (original value)
;fld F$ebx | fmul F$Float_InvertedSquareTwo | fabs | fst F$esi | fst F$esi+4 | fstp F$esi+8
fld F$ebx | fabs | fadd F$Float_One | fmul F$ebx ; compensate with the percentage decreases. -0.6 = -50% of decrease
fmul F$Float_InvertedSquareTwo | fabs | fst F$esi | fst F$esi+4 | fstp F$esi+8
; Ok, now we have all positive values and retrieved the data of sample1 and sample3 but, still have created gaps ?????
Fpu_End_If
.End_If
add esi 12
sub ecx 12
End_While
call Float32_Int16LSBinterleavedNew D@pWaveData, D@DataLen
EndP
So that you can understand better what I tried to do:
I made this function (I know it doesn't work as expected... I'm trying to understand whether my assumptions from the link and your code are correct for computing the magnitude in decibels of a sample after the FIR filter is used) to determine if the resultant values of sample1, sample2 and sample3 after applying the FIR will have a correct magnitude_dB. Since magnitude_dB uses a log10, the result must be positive. If it somehow gives a negative value, something went wrong during the FIR computation.
The wav samples are 16 bit signed ints with a range from -32768 to 32767
Those are converted to 32 bit floats with a range from -1.0 to 0.999969482421875
Just do the fir calculations as explained in the previous posts and try to understand.
The output of the fir routine is also samples within a range of -1.0 to 0.999969482421875
To present the sound data as decibels:
dBFS = 20*log10(absolute SampleValue)
fld SampleValue ; range from -1.0 to 0.999969482421875
fabs ; remove the sign bit, values are now between 0.0 and 1.0
fldlg2
fxch
fyl2x
fmul FLT4(20.0)
fstp dBFSvalue ; 0 dBFS is the maximum possible digital level.
SampleValues:
1.0 = 0 dBFS ; loudest sound.
0.5 = -6 dBFS
0.25 = -12 dBFS
0.125 = -18 dBFS
.........
0.0000001 = -140 dBFS ; you can't hear this sound anymore.
dBFS = Decibels relative to full scale.
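As a cross-check of the table above, the same dBFS conversion can be sketched in a few lines (Python here instead of FPU code, purely for verification):

```python
import math

def dbfs(sample):
    """Convert a float sample in -1.0 .. 1.0 to decibels relative to full scale."""
    return 20.0 * math.log10(abs(sample))  # 0 dBFS at |sample| == 1.0

print(dbfs(1.0))    # 0.0 -> the loudest possible digital level
print(dbfs(0.5))    # about -6 dBFS
print(dbfs(0.125))  # about -18 dBFS
```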
If you want the RMS value:
RMSmagnitude = sqrt ((sample1^2 + sample2^2 + sample3^2)/3) ; notice: Don't forget to divide by the total number of samples.
Then decide if you want it to display the RMS value as decibels or linear.
Some wav-files have meta-data at the end of the file, so the size of the file can be odd, but the audio inside is always even.
In my wav-loader routine I try to strip the meta-data.....
Marinus
OK, I guess I misunderstood the data sample.
I found an app that displays the data samples at their real size:
http://www.jensign.com/showsamples (http://www.jensign.com/showsamples)
They are not a 3 dword structure as I thought.
It is a simple dword that contains the right and left channels, right?
So, a "sample" is in fact:
[Sample:
Sample.Channel.Left: W$ 0
Sample.Channel.Right: W$ 0]
Is that correct?
Oh...ok..you answered at the same time. :)
QuoteIn my wav-loader routine i try to strip the meta-data.....
Yeah, I saw it. I already converted it, but I'll have to convert some routines back, because I misunderstood the data sample. I'll restore it to the way you did it.
And how do I convert a sample to hertz?
Yes, the audio in the wav-file is like this:
leftchannel,rightchannel,leftchannel,rightchannel,leftchannel,rightchannel.................... each channel sample = 16 bit signed int
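That interleaved layout, together with the int16-to-float conversion mentioned above, can be sketched like this (a Python sketch; the helper name is just for illustration):

```python
def deinterleave(pcm16):
    """Split interleaved L,R,L,R,... 16 bit signed ints into two float channels."""
    left  = [s / 32768.0 for s in pcm16[0::2]]  # -32768..32767 -> -1.0..0.999969482421875
    right = [s / 32768.0 for s in pcm16[1::2]]
    return left, right

left, right = deinterleave([-32768, 0, 32767, 16384])
print(left)   # [-1.0, 0.999969482421875]
print(right)  # [0.0, 0.5]
```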
Quote from: guga on October 22, 2014, 12:21:42 PM
Oh...ok..you answered at the same time. :)
And how to convert sample to hertz ?
What do you mean exactly?
QuoteWhat do you mean exactly?
It is like I asked in the previous post:
"Do i need to convert each sample to Frequency to remove it ?"
"Yes, just add zeros ( total FIR coeffs / 2 ) at the beginning and the end of your audio data."
Since you explained how to convert each sample to decibels (dBFS), which is awesome because it can then remove some noise from the audio data, I'm wondering what the way is to convert each sample's data (the word value) to frequency (in hertz).
This way, once I know the frequency of each sample, I can remove unwanted frequencies. For example, if I find a sample that has 4000 Hz and I want to remove it (or simply change this frequency value to another one).
So, when you said to add zeros (total FIR coeffs/2) at the beginning and the end of my audio data, did you mean to add them to each Word pair? (That is what forms the sample.)
Like:
Frequency = (Sum_of_Firs/2) + Word1 (the word representing the left channel of the sample) + Word2 (the word representing the right channel of the sample)
Since a sample can be interpreted as a Word structure like
[SampleData:
Sample1.Channel.Left: W$ 0
Sample1.Channel.Right: W$ 0
Sample2.Channel.Left: W$ 0
Sample2.Channel.Right: W$ 0
(....)
]
To compute the frequency of each sample, can I do this?
Frequency of Sample1 = (Sum_OfFirs/2) + Sample1.Channel.Left + Sample1.Channel.Right
Frequency of Sample2 = (Sum_OfFirs/2) + Sample2.Channel.Left + Sample2.Channel.Right
If it can be done, then I presume the resultant value would be something between 0 and 1 (or -1 and 1), right?
If so, and the sum above for "Frequency of Sample1" gives me a result of, let's say, 0.1256879, how do I convert this value to hertz? I mean, how do I know that this "sample1" has a frequency of, let's say, 12 Hz for example?
You can not get a frequency value out of one sample !
To get the frequency response of the audio you should do a DFT or FFT.
In this FIR analyzer example I just showed that you can do a band-pass filter between, let's say, 1000 Hz and 2000 Hz.
It removes the frequency content below 1000 Hz and removes the frequency content above 2000 Hz
Now we have only the frequencies between 1000 Hz and 2000 Hz.
You could save this as a wav-file and the only sound you will hear is the frequency range of 1000 Hz to 2000 Hz, all other frequencies are filtered out.
But in this example I use the filtered frequencies to show them on screen, to see which frequencies are played.
I showed 3 ways to present those filtered frequencies per frequency band.
1) Peak = find the highest sample value.
2) RMS = multiply each sample with itself, add all multiplied samples and then divide by the number of samples, then take the sqrt.
3) Average = add all samples and then divide by the number of samples.
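The three presentation methods can be sketched as follows (a Python sketch, with plain lists standing in for one band's filtered samples; taking the absolute value in the average is my assumption, since raw samples of a symmetric waveform would average out near zero):

```python
import math

def peak(samples):
    return max(abs(s) for s in samples)  # 1) highest sample value

def rms(samples):
    # 2) square each sample, sum, divide by the number of samples, then sqrt
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def average(samples):
    # 3) sum the (absolute) samples, divide by the number of samples
    return sum(abs(s) for s in samples) / len(samples)

band = [0.5, -0.5, 0.25, -0.25]
print(peak(band))     # 0.5
print(average(band))  # 0.375
print(rms(band))      # about 0.3953
```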
Why use zero padding at the beginning and the end of the audio data ?
If you have 3 FIR coefficients, you need 1 zero before the audio-data and 1 zero after the audio-data.
Because you start filtering with the first audio sample, you need the sample numbers "-1,0,1" to do the filter calculations.
The same for the last audio sample: you need an extra sample, so there has to be an extra zero at the end of the audio-data.
So if you have 8191 FIR coefficients you need 4095 zeros before and 4095 zeros after the audio-data.
Now the whole sequence:
1) Create a FIR window:
; example 2nd order band-pass
invoke CalculateFIRcoefficients,2,44100,1000,2000,FIR_BandPass,addr FIRcoefficients ; get the frequencies between 1000 and 2000 Hz
2) Load your wav-file to memory ( reserve extra memory for the zero padding )
It looks like this:
0,L,R,L,R,L,R,L,R,0 ( add zeros for the FIR window size == number of FIR coeffs )
3) Now convert this to 2 separate 32 bit floating point audio channels.
It looks like this:
0,L,L,L,L,L,L,L,L,.....,0 ( Left channel )
0,R,R,R,R,R,R,R,R,.....,0 ( Right channel )
And also reserve memory for the 2 output channels if you'd like to save it as a wav-file.
4) Do the FIR calculations for both separate audio channels:
s = audio samples
f = fir coeffs
example with a 2nd order filter (3 FIR coeffs):
output(s0) = f0*input(s-1) + f1*input(s0) + f2*input(s1)
output(s1) = f0*input(s0) + f1*input(s1) + f2*input(s2)
etc....
Note: a 2nd order FIR filter has 3 coeffs, a 4th order FIR filter has 5 coeffs etc....
5) Your output now has the filtered audio data.
To save as wav-file, convert the two 32 bit float output channels back to interleaved (L,R,L,R,L,R,L,R,...) 16 bit signed ints.
6) You're done..... :biggrin:
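Steps 2 to 5 can be sketched end to end for one channel (a Python sketch; the coefficients here are placeholder smoothing values, not the output of CalculateFIRcoefficients):

```python
def fir_filter(channel, coeffs):
    """Zero-pad by order/2 on each side, then run the FIR over the channel."""
    order = len(coeffs) - 1          # taps == order + 1
    pad = [0.0] * (order // 2)       # e.g. 8191 coeffs -> 4095 zeros each side
    a = pad + list(channel) + pad
    return [sum(coeffs[j] * a[i + j] for j in range(len(coeffs)))
            for i in range(len(channel))]

coeffs = [0.25, 0.5, 0.25]                        # placeholder 2nd order low-pass-like taps
out = fir_filter([1.0, -1.0, 1.0, -1.0], coeffs)  # an alternating (high frequency) signal
print(out)  # the high frequency content is strongly attenuated
```

The output has the same length as the input channel, exactly because of the zero padding at both ends.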
There are 4 types of band filtering:
invoke CalculateFIRcoefficients,OrderNumber,Samplerate,1000,NULL,FIR_LowPass,addr FIRcoefficients ; remove the frequencies above 1000 Hz
invoke CalculateFIRcoefficients,OrderNumber,Samplerate,1000,NULL,FIR_HighPass,addr FIRcoefficients ; remove the frequencies below 1000 Hz
invoke CalculateFIRcoefficients,OrderNumber,Samplerate,1000,2000,FIR_BandPass,addr FIRcoefficients ; remove the frequencies below 1000 and above 2000 Hz
invoke CalculateFIRcoefficients,OrderNumber,Samplerate,1000,2000,FIR_BandReject,addr FIRcoefficients ; remove the frequencies between 1000 and 2000 Hz
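I can't speak for the exact internals of CalculateFIRcoefficients, but a textbook Hamming-windowed-sinc low-pass kernel (the dspguide.com method referenced in the source) looks roughly like this (a Python sketch; the function name and the DC normalization are my own choices):

```python
import math

def lowpass_coeffs(order, samplerate, cutoff):
    """Windowed-sinc low-pass: order+1 taps, Hamming window, unity gain at DC."""
    m = order                              # taps == order + 1
    fc = cutoff / samplerate               # normalized cutoff, 0.0 .. 0.5
    h = []
    for i in range(m + 1):
        n = i - m / 2.0
        sinc = 2.0 * fc if n == 0 else math.sin(2.0 * math.pi * fc * n) / (math.pi * n)
        hamming = 0.54 - 0.46 * math.cos(2.0 * math.pi * i / m)
        h.append(sinc * hamming)
    total = sum(h)
    return [c / total for c in h]          # normalize so the coefficients sum to 1.0

coeffs = lowpass_coeffs(14, 44100, 1000)   # order 14, like ORDERSBAND10
print(len(coeffs))  # 15 taps
```

A high-pass, band-pass, or band-reject kernel is then built from such low-pass kernels by spectral inversion and/or combination, as the dspguide chapters describe.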
Marinus
Quote from: Siekmanski on October 12, 2014, 09:40:31 PMyour fir coeffs: (-0.069, 0.138, -0.069) ; order 2 has 3 taps. ( taps == num. orders + 1)
your audio data: (0.5, 1.0, 0.0, 0.4) ;left channel or right channel.
alloc mem for filtered audio: == audio size
audio mem = audio size + order size = 4 + 2 = 6 ;(all zeros)
audio offset = order / 2 = 1
copy audio channel data to allocated mem, starting at audio offset.
audio mem: (0.0, 0.5, 1.0, 0.0, 0.4, 0.0)
Now everything is set up to do the filtering.
filtering pseudo code:
f = fir coeffs
a = audio mem
b = filtered audio mem
b[0] = a[0]*f[0] + a[1]*f[1] + a[2]*f[2]
b[1] = a[1]*f[0] + a[2]*f[1] + a[3]*f[2]
b[2] = a[2]*f[0] + a[3]*f[1] + a[4]*f[2]
b[3] = a[3]*f[0] + a[4]*f[1] + a[5]*f[2]
note:
For simplicity this example uses order 2.
For a better "Frequency Response" you have to use higher order settings.
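The quoted pseudo code can be checked numerically with those exact values (a quick Python sketch):

```python
f = [-0.069, 0.138, -0.069]          # the fir coeffs from the quote
a = [0.0, 0.5, 1.0, 0.0, 0.4, 0.0]   # audio mem: the zero-padded channel data

# b[i] = a[i]*f[0] + a[i+1]*f[1] + a[i+2]*f[2], one output per original sample
b = [a[i] * f[0] + a[i + 1] * f[1] + a[i + 2] * f[2] for i in range(4)]
print(b)  # filtered audio, same length as the original 4 samples
```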
Marinus
Hi Marinus. I continued to analyze this app, but I'm still struggling to understand how to apply the FIR coefficients. I succeeded in creating a save routine as you described years ago, and I can export wave audio correctly. But I didn't understand how the FIR filters are applied.
(https://i.ibb.co/XjMCV8b/Image1.png) (https://ibb.co/XjMCV8b)
One small thing. On your app, here:
CalculateFIRcoefficients proc uses esi edi order:DWORD,samplerate:DWORD,cutofffrequency1:DWORD,cutofffrequency2:DWORD,filtertype:DWORD,coefficients:DWORD
; Introduction to Digital Filters: http://www.dspguide.com/ch14.htm
invoke RtlZeroMemory,coefficients,FIR_MaxTaps
The size passed to RtlZeroMemory must be a multiple of 4, since each FIR tap is a dword. So it must be:
invoke RtlZeroMemory,coefficients, (FIR_MaxTaps*4)
OK, now let's get back to understanding this FIR algorithm. The routine you described (create another buffer in memory, etc.) seems a little bit different from what is explained here: https://youtu.be/jL_1DwUMD2w?t=327 (https://youtu.be/jL_1DwUMD2w?t=327) and also here: https://www.sciencedirect.com/topics/engineering/fir-filters (https://www.sciencedirect.com/topics/engineering/fir-filters)
That's why I'm getting a bit confused about what to do. Let's say I have an array of only 6 coefficients, and a data chunk of 13 values to apply them to.
Presuming order 2 (if I understood it correctly), we must multiply the data at the 1st offset with the one at the same offset in the FIR table, and multiply the other values by the preceding positions ((X-1) pos for order 2), right?
Order2
Original data
X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
17 12 1 7 19 25 22 77 13 11 150 2 111
Coefficient data (say we have only 6 coefficients)
A0 A1 A2 A3 A4 A5
22 145 26 88 99 101
Y0 = A0*X0 + A1*(X0-1)+A2*(X0-2)+A3*(X0-3) ....
Y0 = 22*17 + 145*0 + 26*0. Since the previous positions contain nothing, we treat them as 0, right?
Y0 = 22*17
On the next pos, at X1, we have:
Y1 = A0*X1 + A1*(x1-1) + A2*(x1-2)....
Y1 = A0*X1 + A1*(X0) + A2*(0)....
Y1 = 22*12 + 145*17 + 26*0
On the pos X2 we have:
Y2 = A0*X2 + A1*(X2-1) + A2*(X2-2) + A3*(X2-3) + ....
Y2 = A0*X2 + A1*(X1) + A2*(X0) + A3*(0) + ....
Y2 = 22*1 + 145*12 + 26*17+88*0 + .....
but, on pos X7 we have more samples than coefficients:
Y7 = A0*X7 + A1*(X7-1) + A2*(X7-2) + A3*(X7-3) + ....
Y7 = A0*X7 + A1*(X6) + A2*(X5) + A3*(X4) + A4*X3 + A5*X2 + A6*X1 + A7*X0 + 0+0+0
Y7 = 22*77 + 145*22 + 26*25+88*19+99*7+101*1 +
??*12 + ????*22
See? We have no more coefficients to multiply, since the total size of our data is bigger than the amount of coefficients.
How does this work, exactly?
Can you show an example using the same values? I mean, a table of 6 coefficients and a table of, let's say, 14 samples. And also show, with the same example, how it works for order 2, 3, or 10?
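In code form, my best guess at the calculation I'm describing is this (a Python sketch; I'm assuming that positions before X0 count as zero, and that the sum simply stops at the last coefficient, so no A6 or A7 is ever needed):

```python
A = [22, 145, 26, 88, 99, 101]                           # the 6 coefficients
X = [17, 12, 1, 7, 19, 25, 22, 77, 13, 11, 150, 2, 111]  # the 13 data values

def Y(n):
    # Y[n] = A0*X[n] + A1*X[n-1] + ... ; positions before X[0] are taken as 0
    return sum(A[k] * X[n - k] for k in range(len(A)) if n - k >= 0)

print(Y(0))  # 22*17 = 374
print(Y(7))  # 22*77 + 145*22 + 26*25 + 88*19 + 99*7 + 101*1 = 8000
```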
One last thing. In your example you set the order values as a multiple of 2 minus 2, but the
OrdersBand3 equ 2022 seems to be incorrect.
; Change the order numbers to narrow or widen the frequency band width ( the roll off factor )
[ORDERSBAND1 8190] ; (16*2^9)-2
[ORDERSBAND2 4094] ; (16*2^8)-2
;[ORDERSBAND3 2022] ; ((16*2^7)-2-24), but it should be: (16*2^7)-2
[ORDERSBAND3 2046] ; (16*2^7)-2. This seems to be the correct value
[ORDERSBAND4 1022] ; (16*2^6)-2
[ORDERSBAND5 510] ; (16*2^5)-2
[ORDERSBAND6 254] ; (16*2^4)-2
[ORDERSBAND7 126] ; (16*2^3)-2
[ORDERSBAND8 62] ; (16*2^2)-2
[ORDERSBAND9 30] ; (16*2^1)-2
[ORDERSBAND10 14] ; (16*2^0)-2
And also... why did you decrease the equate values by 2?