Audio Compression & Limiting - low CPU%

For general discussion related to FlowStone

Re: Audio Compression & Limiting - low CPU%

Postby steph_tsf » Mon Feb 17, 2020 11:47 am

Martin, concerning the limiter, we need to cope with the required gain, which will vary from 0 dB to 50 dB depending on the ear malfunction and the frequency band. I don't know whether the mono4 arithmetic can cope with the 100 dB dynamic of the input signal, plus the 50 dB of gain setting. We need to remain "near HiFi" on a 150 dB scale.
Therefore, I initially considered it foolish to rely on the envelope follower for "previewing" the envelope after 50 dB of gain. The limiter attenuation must be based on the "excess" amplitude in dB after the 50 dB gain.
I intuitively felt it was necessary to observe and follow the 50 dB amplified envelope, instead of assuming what it would be after amplification, because the final decision is based on a difference (in dB) between the observed envelope (say 145 dB SPL) and the limiting value (say 130 dB SPL).
Thus, intuitively, the pipeline that I identified as bulletproof was:
raw input ---> first envelope follower ---> first compressor ---> required gain (ranging between 0 dB and 50 dB) ---> second envelope follower ---> second compressor acting as limiter ---> output.
There are thus two envelope followers, and two compressors.
The first envelope follower could be an RMS one, dealing with small signals, needing a high-order Bessel ripple filter and possibly a delay compensation.
The second envelope follower could be a PEAK one, specialized in big, amplified signals. As it only operates as an emergency exit, less care is required.
This would be ideal. Hope it won't require much CPU%.
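A minimal Python sketch of this two-follower / two-compressor chain (illustrative only: the one-pole follower, the static gain law and all constants are my placeholders, not FlowStone DSP code):

```python
# Illustrative sketch of: input -> follower -> compressor -> gain ->
#                         follower -> limiter -> output.
import math

def follower(x, state, coeff=0.999):
    # one-pole smoothing of the squared signal -> RMS-style envelope
    state = coeff * state + (1.0 - coeff) * x * x
    return math.sqrt(state), state

def compress_db(level_db, threshold_db, ratio):
    # static gain computer: reduce the dB excess over the threshold
    if level_db <= threshold_db:
        return 0.0
    return -(level_db - threshold_db) * (1.0 - 1.0 / ratio)

def process(samples, gain_db=50.0, comp_thr=-60.0, ratio=2.0, lim_thr=-3.0):
    out, s1, s2 = [], 0.0, 0.0
    for x in samples:
        env1, s1 = follower(x, s1)                        # first envelope follower
        lvl1 = 20.0 * math.log10(max(env1, 1e-12))
        g1 = compress_db(lvl1, comp_thr, ratio) + gain_db # compressor + gain
        y = x * 10.0 ** (g1 / 20.0)
        env2, s2 = follower(y, s2)                        # second envelope follower
        lvl2 = 20.0 * math.log10(max(env2, 1e-12))
        g2 = compress_db(lvl2, lim_thr, 1000.0)           # huge ratio = limiter
        out.append(y * 10.0 ** (g2 / 20.0))
    return out
```

With silence at the input the chain stays silent; with a loud input, the second stage pulls the amplified level back toward the limiter threshold.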
Your comments are welcome.
Last edited by steph_tsf on Mon Feb 17, 2020 4:28 pm, edited 2 times in total.
steph_tsf
 
Posts: 248
Joined: Sun Aug 15, 2010 10:26 pm

Re: Audio Compression & Limiting - low CPU%

Postby wlangfor@uoguelph.ca » Mon Feb 17, 2020 3:36 pm

I forgot to mention, don't forget to look for the custom compressors made by cytosonic. They're the best.

They're available on the old synthmaker backup site.
wlangfor@uoguelph.ca
 
Posts: 792
Joined: Tue Apr 03, 2018 5:50 pm
Location: North Bay, Ontario, Canada

Re: Audio Compression & Limiting - low CPU%

Postby steph_tsf » Tue Feb 18, 2020 2:49 am

the old synthmaker backup site
Many thanks for the info. I was unaware that the SynthMaker forum is still available as an archive.

Re: Audio Compression & Limiting - low CPU%

Postby steph_tsf » Tue Feb 18, 2020 2:59 am

Martin, I see you faced a complication regarding the allpass filters.
A small tweak in the x86 SSE assembly will improve this.
Do not rely on an IIR biquad that outputs a "constant skirt" BP.
Instead, rely on an IIR biquad that outputs a "zero dB" BP.
Let such an IIR biquad deliver the three components LP, BP and HP.
See the attached proposition.

Re: Audio Compression & Limiting - low CPU%

Postby wlangfor@uoguelph.ca » Tue Feb 18, 2020 8:35 pm

steph_tsf wrote:Martin, I see you faced a complication regarding the allpass filters.
A small tweak in the x86 SSE assembly will improve this.
Do not rely on an IIR biquad that outputs a "constant skirt" BP.
Instead, rely on an IIR biquad that outputs a "zero dB" BP.
Let such an IIR biquad deliver the three components LP, BP and HP.
See the attached proposition.
ZDF AP - the tweak that will save CPU in the Linkwitz-Riley bandsplitter.fsm


Yep, NP.

Hmm, this sounds intuitive. Interesting idea.


Re: Audio Compression & Limiting - low CPU%

Postby steph_tsf » Wed Feb 19, 2020 1:42 am

They're available in the old synthmaker backup site.

Indeed, I found a few compressor schematics. Here is the list:

7823_Sketch_of_a_compressor
7334_compressornew
7298_compressor
4930_compressor4u3
3140_sidechain compressor
3074_sidechain compressor

I am in search of an RMS detector that is not plagued by its own noise, or by its own DC drift.
I am in search of a truly linear side-chain compressor, placed after the RMS detector.

The audio system I am designing features a quite benign 120 dB dynamic range, as it currently relies on a consumer-grade USB soundcard. For some reason, I cannot rely on an analog volume control. Later on, I may hook up a USB-controlled relay switching a 0 dB / 40 dB analog attenuation, just before the analog power amplifiers, or preferably between the power amplifier output and the passive speaker input (this way, the rumble and the hiss also get attenuated by 40 dB). This will indeed provide the required 160 dB dynamic range. Yes, a 160 dB dynamic range, obtained with consumer-grade audio gear (and a relay, and a couple of resistors).

Let us concentrate on the system as it is at the moment: no analog attenuation. There is only 120 dB available between the "no audio" situation (corresponding to the hearing threshold of a young person) and the extremely loud, possibly dangerous 120 dB SPL situation. One may need to apply such a level to the eardrum of a hearing impaired person.

Without the 40 dB analog attenuation, one cannot measure a hearing threshold with such an apparatus. Let me explain why. Most of the time, the hearing threshold of a young boy or girl is 100 dB below the "digital zero dB" corresponding to 120 dB SPL on the eardrum. The pure sine tone may then only exist within a 20 dB dynamic range (in the absence of an analog volume control at the output, as explained above). Consequently, the "pure sine tone" will resemble a staircase, as if it were 3-bit coded. The 40 dB analog attenuation is thus only required when measuring the hearing threshold of good-hearing people.
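The "3-bit" figure follows from the usual ~6.02 dB-per-bit rule; a quick check with the numbers above (the 120 dB and 100 dB mappings are the ones stated in the post):

```python
# Quick check of the "3-bit staircase" claim, using the ~6.02 dB-per-bit rule.
import math

headroom_db = 120.0 - 100.0             # dynamic range left for the test tone
db_per_bit = 20.0 * math.log10(2.0)     # ~6.02 dB per bit
bits = headroom_db / db_per_bit
print(round(bits, 1))                   # about 3.3 effective bits
```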

Fortunately, the aim of the equipment is not to measure a hearing threshold. Case closed.

The aim of the equipment is to explore in great detail the comfortable audibility zone, in search of the most adapted hearing correction. For a normally hearing person, this zone lies between 60 dB SPL A-weighted and 80 dB SPL A-weighted. We are dealing with a social reality: people continuously and naturally adjust the loudness of their speech, sending 60 dB SPL(A) to 80 dB SPL(A) to the intended listeners, depending on the mood and the distance. And on the ambient noise, but that is another aspect we will see later.

The aim of a hearing correction is to process this 20 dB dynamic band, actually lying between 60 dB(A) and 80 dB(A), very carefully, in order to preserve speech intelligibility. There is no longer any question of amplifying everything by 50 dB and hooking an ALC (Automatic Level Control) at the input, in such a way that whatever sound the microphone picks up, the amplified sound remains at, say, 100 dB on the eardrum.

Today, one finely adjusts the gain, not to compensate the severely degraded hearing threshold of a hearing impaired person, but to ensure that 60 dB(A) speech gets easily perceived and understood. Say the eardrum requires 90 dB(A): a gain of 30 dB is thus required. This can be 20 dB less than what the degraded hearing threshold may have suggested. The anti-Larsen job (feedback counter-measure device) becomes easier. We thus know that 90 dB(A) is the beginning of the comfort zone of the hearing impaired person's eardrum.

Now that the gain is set, let's deal with the compression ratio. We present 80 dB(A) to the mike. In case the person complains about such a sound being perceived as very loud, stressing, or rumbling, we apply the required compression. Say a ratio 2.0 compression is required. This means that the upper limit of the comfort zone of the eardrum is only 100 dB(A). The ratio 2.0 compression is the confirmation that, among the causes of the hearing impairment, there is a neuronal cause (nerves or brain).

The metrics we just took (required gain, required compression ratio, upper limit of the eardrum comfort zone) help in following up the hearing impairment. This can be done very quickly, contrary to determining the hearing threshold using pure tones, which are known to stress the hearing system and to provoke tinnitus after a few minutes. In parallel, one becomes perfectly aware that the hearing correction must be cautious when the sound environment is above 80 dB(A).
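The gain and ratio arithmetic above can be checked with the generic static compression law (out = threshold + (in - threshold) / ratio above the threshold; a textbook law, not a specific FlowStone block):

```python
# Checking the paragraph's numbers: 30 dB gain, compression ratio 2.0
# above a 90 dB(A) comfort-zone start.
def compressed_output_db(in_db, threshold_db, ratio):
    if in_db <= threshold_db:
        return in_db
    return threshold_db + (in_db - threshold_db) / ratio

gain_db = 30.0        # fitted gain from the example
comfort_start = 90.0  # start of the eardrum comfort zone, dB(A)
mike_level = 80.0     # level presented to the microphone, dB(A)

eardrum = compressed_output_db(mike_level + gain_db, comfort_start, 2.0)
print(eardrum)        # 100.0 -> the upper limit of the comfort zone
```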

A limiter is thus required after the compressor. Such a limiter must guarantee that the eardrum will never experience a sound exceeding a certain safe value, say 95 dB(A), which is only 5 dB above the upper limit of the eardrum comfort zone. Consequently, there is a band as narrow as 5 dB left for allowing the hearing impaired person to remain able to identify strong alarm sounds and danger sounds as such.

Consequently, the hearing correction doesn't impair the social and vital capabilities of the hearing impaired person.

When following such a methodology, optimally fitting a multichannel digital hearing aid only takes minutes.
The beauty and the importance of this is that next-gen multichannel hearing aids will get their silicon organized to embed such a methodology. They will be the sound generator, the sound level meter, the FFT spectrum meter, and the transfer function analyzer. The aim of my work is to emulate such a next-gen hearing aid, and to streamline the user interface. Possibly on a smartphone.

Other people are working on a fundamental question that got raised here: how to optimally correct an asymmetric hearing impairment. One ear lost 20 dB of sensitivity at 2 kHz, reaching a 40 dB sensitivity loss at 4 kHz. The other ear lost 40 dB of sensitivity at 2 kHz, reaching an 80 dB sensitivity loss at 4 kHz. How to ensure the synchronization of the two hearing aids? How to avoid the left/right balance fluctuating, causing vertigo and other annoyances? How to avoid over-stressing the better ear, so it doesn't quickly become as bad as the other one? Can the better ear benefit from some permanent re-education, maximizing its life expectancy?

A company like Signia (formerly Siemens Hearing Aids) is against a plain Bluetooth wireless link for synchronizing the two hearing aids. By synchronization, I mean ensuring that the gains, the compression and the limiting never diverge; the stereo combination needs to remain stable. Signia issued a report saying that the 2.4 GHz waves get stuck in the head, instead of travelling around the head. Increasing the 2.4 GHz emission power is not considered a solution by Signia. After the "DieselGate", they certainly don't want a "BlueGate". Attaching the two hearing aids to a common neckloop acting as a 2.4 GHz conveyor and spare battery is not considered a solution either. Getting a proper, reliable, latency-free digital link between the two hearing devices is currently regarded as a Grail, as it allows the required synchronization, and allows hooking up many wireless microphones, synthesizing a highly directional, adaptive sound capture.

Medical expertise and supervision are mandatory, because prior to applying intense sound levels to an eardrum, one needs to be sure that the hearing loss is caused by a mechanical middle ear malfunction. All precautions must be taken to never overload, and never damage, the delicate auditory (nerve) cells found inside the inner ear, namely the cochlea. A cochlea is not repairable, albeit vaguely re-educable. This is living matter. The fact that there may be a deep hearing loss at a particular frequency never gives the right to boost the sound at that particular frequency. Most of the time, the exact opposite is required. The more you stimulate the ear with frequencies that do not get properly perceived and discriminated, the more you generate aberrations inside the inner ear (the nerves), and this prevents the brain from correctly understanding speech in a noisy environment. Worth noting, completely halting some frequencies may also be a bad practice.

The hearing system, and possibly the cochlea in particular, is like a sieve. The filtered Dirac wavelets (case of voiced speech) arrive all mixed at the eardrum, then enter the cochlea acting as a biological sieve adding energy to the ambient, and such supplementary energy enters the filtered Dirac wavelets in some mysterious way, still to be discovered. The energized wavelets experience two different interaction paths, just like inside a sieve. There is the substrate interaction path (think of the grid of a sieve, going asymmetrically, brutally, up and down). And there is the wavelet auto-interaction path (think of the grains - the wavelets - bumping against each other, merging into shapes, a kind of extrusion we could say, thanks to the pseudo-chaotic energy that is required for them to "move" and "find" a receptacle). This is the way one could regard the cochlea. This is the way one could do innovative biological research. Anyway, it is not so simple, because the ear is also sensitive to pure sine sounds.
Pure sine sounds are the contrary of filtered Dirac pulses (wavelets). But, more and more, I am convinced that the ear feels uncomfortable when meeting a pure sine sound. The ear is always in search of harmonics, especially high-order harmonics like those found in formants (voiced speech). The ear prefers finely grained, micro-pulsed sounds such as pink noise. Pink noise is relaxing. A pure sine is stressing. Listen to the subjective texture difference between a pink noise lowpass filtered using an 8th-order Butterworth filter, and a pink noise lowpass filtered using an 8th-order Bessel filter. The subjective difference, perceived as an emotional difference, seems disproportionate. Surely there is a little more high-frequency content in the Bessel transition band, but why, oh why, is the perceived emotional difference so big? Time-domain symmetry may play a role. The greater the Bessel filter order, the smaller the difference between a forward-filtered wavelet and a reverse-filtered wavelet. A kind of clue. There can be organized chaos, and there can be biased chaos. The ear may excel in discriminating between the two kinds.

Statistics govern adaptive filters (see the Widrow-Hoff LMS algorithm). The LMS algorithm is incredibly simple. It is a FIR filter, wired as an analyzer, seeing the input and the output of a given filter - any kind of filter, provided its transition time is reasonably short. The coefficients of such a FIR filter are allowed to update themselves, by integrating the successive errors existing between a) what they materialize together, as a community, as a FIR filter (the fundamental Gestalt concept in German, a kind of emerging consciousness), and b) the process, the other filter, that they are monitoring. After a couple of milliseconds, the FIR LMS adaptive filter behaves exactly like the process it is monitoring.
One can stop the learning process anytime, for instance when it becomes clear that the FIR filter is perfectly reproducing the process it is monitoring. The process (the human) can take a holiday, getting replaced by the FIR filter (the robot). This is to say how "intelligent" some straightforward structure can be. The ear probably embeds a variant, specialized in monitoring sound wavelets (filtered impulses), and building into our mind (our consciousness) a very accurate model of the "process" that originated the data. Fortunately, there is welcome help. We humans embed within ourselves a sound emitting process: our mouth, and the brain department controlling it. Thus, a close copy of the process that the ear is monitoring, that the ear is trying to make emerge as a duplicate, is always available, at hand. Thus, a whole class of hearing impairments may be caused by a malfunction or degeneration in the wiring existing between the mouth and the ear. You may hear parasitic sounds that do not physically exist. This is tinnitus. The best cure against tinnitus is to listen to the sea, a kind of pink noise, possibly FIR-filtered for time-domain shaping the wavelets in a certain way. There are (legendary?) cases exhibiting "audible" tinnitus. This is like physical sound going *out* of the ear. Possibly this is the eardrum tensor muscle vibrating. Possibly this is the stapedius muscle vibrating. This remains unclear. The ear feels "out of equilibrium" upon brutally receiving less sound. Try this at home. Listen to loud music, say 85 dB(A) at your ears. Then rapidly dim the volume by, say, 12 dB. During 0.1 or 0.2 seconds, your ear will feel like it is starving. It will still hear the dimmed sound, but will report a completely wrong frequency. In case there is a singer, he or she will appear to have changed key. It takes a small second for the ear to regain its equilibrium.
Although this is known, the phenomenon remains hard to measure and quantify, because it is essentially perceptual.
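The Widrow-Hoff LMS structure described above can be sketched in a few lines; the 3-tap "process", the learning rate and the white-noise input below are arbitrary illustrations:

```python
# Minimal Widrow-Hoff LMS sketch: an adaptive 3-tap FIR learns to imitate an
# unknown 3-tap FIR "process" by integrating the successive errors.
import random

unknown = [0.5, -0.3, 0.2]        # the "process" being monitored
w = [0.0, 0.0, 0.0]               # adaptive FIR coefficients
buf = [0.0, 0.0, 0.0]             # shared input history
mu = 0.05                         # learning rate

random.seed(1)
for _ in range(5000):
    x = random.uniform(-1.0, 1.0)
    buf = [x] + buf[:-1]
    d = sum(h * v for h, v in zip(unknown, buf))  # process output
    y = sum(h * v for h, v in zip(w, buf))        # adaptive filter output
    e = d - y                                     # instantaneous error
    w = [h + mu * e * v for h, v in zip(w, buf)]  # LMS coefficient update

print([round(h, 3) for h in w])   # converges close to `unknown`
```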

I found various kinds of compressors on FlowStone. Obviously the only good ones (for such a purpose) are side-chain compressors. They are the ones that can implement a precise compression ratio above a certain threshold. Am I right?

On the FlowStone forum, I found various side-chain compressors.
They all operate in the "decibel domain", in the "blue domain". This means a lin2db "blue block" is required, and a db2lin "blue block" is also required. Martin Vicanek designed a lin2db routine (log10(x) approximation) written in x86 SSE assembly, providing a 1e-7 precision. Martin Vicanek designed a db2lin routine (10^x approximation) written in x86 SSE assembly, providing a 1e-7 precision. There is also a lin2dB "blue block" (rustyou 2-band compressor) implemented in DSP code (probably dating from the SynthMaker era) based on a 2048-entry (or 1024-entry?) table. There is also a db2lin "blue block" (again, the rustyou 2-band compressor) based on an x86 assembly routine approximating Pow(x,n).
They all rely on an RMS detector (envelope generator) or a PEAK detector (envelope generator). Of course they deliver a lot of output ripple. They only get "clean" when they react very slowly. There are various implementations. Some are plagued by internal noise, or by internal DC drift. The RMS envelope generator I have relied on was probably designed by Martin Vicanek. It is the one organized as a feedback loop - true RMS thus - permanently computing the (a^x power approximation) that Martin Vicanek authored. The layout looks seductive. Such an RMS detector works very well, provided the audio level is above -40 dB (ref digital zero dB). When the audio level is below -47 dB (ref digital zero dB), such an RMS detector pretends that there is still audio at -47 dB. It never goes below -47 dB. It reports -47 dB even when there is no audio at the input.

I can't rely on such an RMS detector anymore, because the compression threshold must be set to -60 dB (ref digital zero dB) most of the time. I know this looks crazy, but please remember that the social comfort level is approximately 60 dB SPL, which is 60 dB below the 120 dB SPL corresponding to digital zero dB.

While writing this, I realize that in the context of multiband systems - say 8 frequency bands, each covering something like 1 octave - a Hilbert filter will shine, providing two outputs whose phase difference is 90 degrees +/- 5 degrees in the considered octave. It then suffices to compute the square root of the sum of the squares, in the time domain, to know the envelope. A kind of envelope, of course, because the time-domain signal has lost its shape. But remember, the time-domain signal shape is essentially sinusoidal, as the frequency band only covers 1 octave. I guess there will be less than 5% ripple at the output. Martin Vicanek, are you there?
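The square-root-of-sum-of-squares envelope is easy to verify on an ideal quadrature pair (this assumes a perfect 90-degree shift, which a real Hilbert filter only approximates within the stated +/- 5 degrees):

```python
# Envelope as sqrt(I^2 + Q^2) from a 90-degree (quadrature) pair.
import math

def quadrature_envelope(i, q):
    return math.hypot(i, q)   # sqrt(i*i + q*q), computed safely

amp, freq, fs = 0.5, 1000.0, 48000.0
for n in range(32):
    ph = 2.0 * math.pi * freq * n / fs
    i, q = amp * math.sin(ph), amp * math.cos(ph)        # ideal 90-degree pair
    assert abs(quadrature_envelope(i, q) - amp) < 1e-9   # ripple-free envelope
```

With an imperfect 90-degree shift, the envelope picks up a small ripple, consistent with the few-percent figure guessed above.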

By the way, I am confused by the vocabulary in use. Instead of finding high-order Bessel lowpass filters removing the huge RMS detector ripple, or removing the huge PEAK detector ripple, one finds various kinds of "envelope followers" exploiting "ballistics", aka "attack time" and "release time", behaving non-linearly (containing comparison instructions), written in x86 SSE assembly.

Then, quite a surprise: despite the high degree of sophistication, I could not find a single "Transfer Function" incorporating the user gain (to be considered as known), allowing such a "Transfer Function" to deal not only with the compression, but also with the limiting.

Oh, I forgot to mention: the "envelope follower" I am talking about above doesn't take place between the RMS detector and the "Transfer Function" aka attenuation governor. The "envelope follower" takes place at the end of the side-chain that governs the gain. In other words, the "envelope follower" is placed between the "Transfer Function" and the "db2lin" block that issues the attenuation coefficient, to be exploited by the multiplier that actually does the attenuation.

I remain astonished by the sophistication, and disappointed by the real-world performance.

Is there a Hilbert filter I can rely on, as an RMS detector or PEAK detector replacement?
What is the best square root calculation in the "blue domain"?
Is there a square root in x86 SSE2? If yes, can FlowStone access it?

I guess I know why one cannot find a high-order Bessel filter in the "blue domain" processing the output of an RMS detector or PEAK detector. Being forced to operate in the "blue domain", clocked at Fs, its Fc will appear very low, provoking noise and saturation.

Please allow me to suggest a general solution. One should opt for Hal Chamberlin's digital state variable filter, presented in his book "Musical Applications of Microprocessors", dating back to 1985. There is an article dating from 2003 about such a topology.
Musical Applications of Microprocessors (Hal Chamberlin).jpg

https://www.earlevel.com/main/2003/03/02/the-digital-state-variable-filter/
As you can see, this is a strict adaptation of the analog state variable filter. The differences are minimal. The integrating capacitors are now delays. The resistors that define the time constant along with the integrating capacitors are now a gain (or attenuation).
It suffices to put two such blocks in series, setting the two Fc and the two Q, to implement a 4th-order Bessel lowpass filter whose -20 dB frequency is equal to the lowest frequency of the processed band.
Thus, as a rule of thumb, a band processing 63 Hz to 125 Hz requires a Hilbert filter causing less than 5 degrees of phase error between 63 Hz and 125 Hz (it may eventually get implemented using the Hal Chamberlin digital state variable filter), requires a square root x86 SSE routine, and requires a 4th-order Bessel lowpass filter whose -20 dB frequency is 63 Hz, meaning it may be a Bessel lowpass filter cutting at 10 Hz - something easy for a 32-bit Chamberlin filter operating at 48 kHz.
A band processing 500 Hz to 1 kHz requires a Hilbert filter causing less than 5 degrees of phase error between 500 Hz and 1 kHz (a Hal Chamberlin digital state variable implementation is not required), requires a square root x86 SSE routine, and requires a 4th-order Bessel lowpass filter whose -20 dB frequency is 500 Hz (a Hal Chamberlin digital state variable implementation is not required).
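For reference, a minimal Python transcription of the Chamberlin state variable filter discussed above (the class name and test values are mine; for a 4th-order Bessel lowpass, one would cascade two sections with the normalized Bessel pole Qs, approximately 0.52 and 0.81, and suitably scaled cutoff frequencies):

```python
# Chamberlin digital state variable filter: delays replace the analog
# integrating capacitors, a gain replaces the time-constant resistors.
import math

class ChamberlinSVF:
    def __init__(self, fc, q, fs):
        self.f = 2.0 * math.sin(math.pi * fc / fs)  # frequency coefficient
        self.q = 1.0 / q                            # damping coefficient
        self.lp = 0.0
        self.bp = 0.0

    def tick(self, x):
        # one sample; returns the three simultaneous outputs LP, BP, HP
        self.lp += self.f * self.bp
        hp = x - self.lp - self.q * self.bp
        self.bp += self.f * hp
        return self.lp, self.bp, hp

svf = ChamberlinSVF(fc=10.0, q=0.707, fs=48000.0)   # 10 Hz ripple filter
for _ in range(200000):
    lp, bp, hp = svf.tick(1.0)
print(round(lp, 3))   # the lowpass output settles at the DC input value
```

Note that the topology is only accurate for Fc well below Fs, which is exactly the "ripple filter at 10 Hz, Fs = 48 kHz" situation described above.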
Quite reassuring in such a conception is that, in case the detector is an RMS detector, there are no harsh "attack time" and "decay time" elements. Everything is smooth and progressive, no hiccups, quasi-linear, and computed "in blue". And no de-zipping is required.

Re: Audio Compression & Limiting - low CPU%

Postby steph_tsf » Wed Feb 19, 2020 2:48 am

the old synthmaker backup site

I guess I found the compressors you are referring to.

Unfortunately, the RMS detector or PEAK detector is missing.
Hence the filename "gain reduction".
And there is no limiter embedded in such a transfer function.
CytoSonic, the author, wrote "this has not been integrated into the transfer function to allow the use of "creative" compression methods: side-chaining, ducking, etc".

Can somebody tell me what "ducking" is?

6385_gainReduction(cytoSonic)update1.osm

6379_gainReduction(cytoSonic).osm

I can understand such a transfer function implementation being very precise in the context of adjusting the compression threshold between -26 dBFS and 0 dBFS.
Unfortunately, I need a guaranteed precision in the context of adjusting the intervention threshold from -60 dBFS to -30 dBFS. The real improvement I need is an RMS detector or PEAK detector that remains accurate down to -60 dBFS, along with a transfer function that does the compression starting from -60 dBFS, does the main amplification (say 30 dB, to give an idea), and does the limiting.
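A sketch of that requested static transfer function, in the dB ("blue") domain: compression starting at -60 dBFS, then the main gain, then a hard limit. The thresholds, the 2.0 ratio and the 30 dB gain are the example values from this post; the law itself is the generic textbook one:

```python
# Static curve: unity below comp_thr, compression above it, then makeup
# gain, then a hard limiter ceiling. All values in dBFS.
def static_curve_db(in_db, comp_thr=-60.0, ratio=2.0, gain_db=30.0, lim_db=-1.0):
    if in_db > comp_thr:
        out = comp_thr + (in_db - comp_thr) / ratio   # compression region
    else:
        out = in_db                                   # unity below threshold
    out += gain_db                                    # main amplification
    return min(out, lim_db)                           # hard limiting

print(static_curve_db(-60.0))   # -30.0 : threshold point, gain only
print(static_curve_db(-20.0))   # -10.0 : compressed then amplified
print(static_curve_db(0.0))     #  -1.0 : caught by the limiter
```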

I recently realized that, in the context of compressing the dynamics of narrow octave bands, the winning combination may be a Hilbert filter, a square root routine (in blue), and a digital state variable filter (Chamberlin).
Following such simplicity, I see no drawback in installing a limiter, completely separate, after the gain and after the compressor. I'll try lighting one LED when the compressor threshold is exceeded, and another LED when the limiter threshold is exceeded. Hope the two LEDs won't cause a major CPU% increase.

By the way, in x86 SSE assembly, how does one organize the "hops", so that the execution start points get evenly spread over the available time?

Re: Audio Compression & Limiting - low CPU%

Postby wlangfor@uoguelph.ca » Wed Feb 19, 2020 3:19 pm

ducking allows sensing something and providing a certain amount of decibels which will automatically be reduced before compression etc. sometimes it's used before makeup.
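For what it's worth, the usual meaning of "ducking" is: attenuate the main signal whenever a side-chain (key) signal is present, e.g. music under a voice-over. A minimal sketch of that reading (all constants illustrative, not CytoSonic's implementation):

```python
# Duck `main` by depth_db whenever the sidechain envelope exceeds `thresh`,
# with one-pole smoothing of both the envelope and the gain ("de-zipping").
def duck(main, sidechain, depth_db=-12.0, thresh=0.05, coeff=0.99):
    out, env, gain = [], 0.0, 1.0
    ducked = 10.0 ** (depth_db / 20.0)      # target gain while ducking
    for m, s in zip(main, sidechain):
        env = coeff * env + (1.0 - coeff) * abs(s)    # side-chain envelope
        target = ducked if env > thresh else 1.0
        gain = coeff * gain + (1.0 - coeff) * target  # smoothed gain change
        out.append(m * gain)
    return out
```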

Re: Audio Compression & Limiting - low CPU%

Postby steph_tsf » Wed Feb 19, 2020 5:00 pm

ducking allows sensing something and providing a certain amount of decibels which will automatically be reduced before compression etc. sometimes it's used before makeup.
I am afraid my IQ and my audio culture prevent me from understanding. Sorry. What is "something", what is "providing a certain amount of decibels", and what is "makeup"? Don't waste your time trying to explain. I'll manage. And come back soon.
