101 changes: 101 additions & 0 deletions doc/BufVoiceAllocator.rst
@@ -0,0 +1,101 @@
:digest: Buffer-Based Dynamic Voice Allocation
:species: buffer-proc
:sc-categories: Libraries>FluidDecomposition, UGens>Buffer
:sc-related: Guides/FluidCorpusManipulation, Classes/SinOsc
:see-also: BufSines, BufSineFeature
:description: Dynamic frame-based voice allocation on buffers.
:discussion:
This process takes in buffers of related frequency and magnitude data, and just like :fluid-obj:`BufSines`, first tracks them as peaks to check if they are a continuation of a previous peak.

After this track assignment, the number of peaks is capped to the user-defined ``numVoices`` in order of lowest frequency or loudest magnitude. The final step then assigns these peaks to voices and tracks their states.
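
The capping step described above can be pictured with a minimal Python sketch. It is not the library's implementation: the function name `cap_peaks` and the `prioritise` labels are hypothetical, and the sketch only shows the idea of ranking peaks by lowest frequency or loudest magnitude and keeping the first ``numVoices`` of them.

```python
from typing import List, Tuple


def cap_peaks(
    freqs: List[float],
    mags: List[float],
    num_voices: int,
    prioritise: str = "lowest_frequency",
) -> List[Tuple[float, float]]:
    """Keep at most num_voices (frequency, magnitude) peaks.

    Hypothetical helper: the 'prioritise' labels are illustrative and do not
    correspond to documented parameter values.
    """
    peaks = list(zip(freqs, mags))
    if prioritise == "lowest_frequency":
        # Lowest frequencies are kept first.
        ranked = sorted(peaks, key=lambda p: p[0])
    elif prioritise == "loudest_magnitude":
        # Largest magnitudes (in dB) are kept first.
        ranked = sorted(peaks, key=lambda p: p[1], reverse=True)
    else:
        raise ValueError(f"unknown priority: {prioritise}")
    return ranked[:num_voices]


freqs = [440.0, 110.0, 880.0, 220.0]
mags = [-12.0, -30.0, -6.0, -20.0]
by_frequency = cap_peaks(freqs, mags, 2)
by_magnitude = cap_peaks(freqs, mags, 2, prioritise="loudest_magnitude")
```

With four input peaks and two voices, the two priorities keep different pairs, which is why the choice matters whenever the input holds more peaks than there are voices.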

:process: The non-real-time version of the object.
:output: The names of the three buffers in which the data has been stored: [0] is a buffer containing the frequencies, [1] is a buffer containing the magnitudes and [2] is a buffer containing their respective states. Each buffer has a channel per voice.

:control voiced:

The buffer in which to store the processed voice state data.

:control birthhighthreshold:

The threshold in dB above which to consider a peak to start tracking, for the high end of the spectrum. It is interpolated across the spectrum until birthlowthreshold at DC.

:control birthlowthreshold:

The threshold in dB above which to consider a peak to start tracking, for the low end of the spectrum. It is interpolated across the spectrum until birthhighthreshold at half-Nyquist.
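
As a rough illustration of how the two birth thresholds relate, the Python sketch below interpolates between them. Several things here are assumptions and not taken from the implementation: the interpolation is taken to be linear in frequency, `top_hz` is a hypothetical name for the frequency at which the high threshold is reached, and the numeric values are illustrative only.

```python
def birth_threshold(freq_hz: float, low_db: float, high_db: float, top_hz: float) -> float:
    """Threshold in dB for a peak at freq_hz.

    Assumption: linear interpolation from low_db at DC to high_db at top_hz,
    held constant above top_hz.
    """
    if top_hz <= 0:
        raise ValueError("top_hz must be positive")
    position = min(max(freq_hz / top_hz, 0.0), 1.0)
    return low_db + (high_db - low_db) * position


def can_start_track(freq_hz: float, mag_db: float,
                    low_db: float = -24.0, high_db: float = -60.0,
                    top_hz: float = 11025.0) -> bool:
    """True when the peak magnitude is above the interpolated threshold."""
    return mag_db > birth_threshold(freq_hz, low_db, high_db, top_hz)


at_dc = birth_threshold(0.0, -24.0, -60.0, 11025.0)
at_mid = birth_threshold(5512.5, -24.0, -60.0, 11025.0)
at_top = birth_threshold(11025.0, -24.0, -60.0, 11025.0)
```

Under these assumptions, a quiet peak that would be rejected near DC can still start a track higher up the spectrum, where the threshold is lower.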

:control freqed:

The buffer in which to store the processed frequency data.

:control frequencies:

The buffer from which to take frequency data.

:control magned:

The buffer in which to store the processed magnitude data.

:control magnitudes:

The buffer from which to take magnitude data.

:control maxnumvoices:

Up to how many voices can be reported, by allocating memory at instantiation time. This cannot be modulated.

:control mintracklen:

The minimum duration, in frames, for a track to be considered for a voice. It allows the removal of bubbly pitchy artefacts, but is more CPU intensive and might reject quick pitch material.

:control numchansa:

For multichannel input buffers, how many channels should be processed from the first buffer.

:control numchansb:

For multichannel input buffers, how many channels should be processed from the second buffer.

:control numframesa:

How many frames should be processed from the first buffer.

:control numframesb:

How many frames should be processed from the second buffer.

:control numvoices:

The number of voices to keep track of and output. It is capped by ``maxnumvoices``.

:control prioritisedvoices:

The order in which to prioritise peaks for voice assignment if an input array is bigger than ``numvoices``.

:control startchana:

For multichannel input buffers, which channel should be processed first in the first input buffer.

:control startchanb:

For multichannel input buffers, which channel should be processed first in the second input buffer.

:control startframea:

Where in the first input buffer the process should start, in samples.

:control startframeb:

Where in the second input buffer the process should start, in samples.

:control trackfreqrange:

The frequency difference allowed for a track to diverge between frames, in Hertz.

:control trackmagrange:

The amplitude difference allowed for a track to diverge between frames, in dB.
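
One simple reading of the two range parameters is a per-frame tolerance test, sketched below in Python. This is an assumption for illustration, not the tracking algorithm itself: the function name `continues_track` is hypothetical, and the real tracker may weigh candidates differently before assigning them.

```python
def continues_track(prev_freq: float, prev_mag_db: float,
                    freq: float, mag_db: float,
                    track_freq_range: float, track_mag_range: float) -> bool:
    """True when a new peak stays within both tolerances of the previous frame.

    Assumption: both differences are compared as absolute values, in Hertz
    and in dB respectively.
    """
    within_freq = abs(freq - prev_freq) <= track_freq_range
    within_mag = abs(mag_db - prev_mag_db) <= track_mag_range
    return within_freq and within_mag


# Illustrative tolerances: 50 Hz and 15 dB.
small_drift = continues_track(440.0, -12.0, 445.0, -14.0, 50.0, 15.0)
big_jump = continues_track(440.0, -12.0, 520.0, -12.0, 50.0, 15.0)
```

In this reading, widening either range lets a track follow faster glides or swells, at the cost of being more likely to join unrelated peaks.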

:control trackprob:

The probability of the tracking algorithm to find a track.
59 changes: 59 additions & 0 deletions doc/VoiceAllocator.rst
@@ -0,0 +1,59 @@
:digest: Dynamic Voice Allocation
:species: transformer
:sc-categories: Libraries>FluidDecomposition
:sc-related: Guides/FluidCorpusManipulation, Classes/SinOsc
:see-also: Sines, SineFeature
:description: Dynamic frame-based voice allocation.
:discussion:
This process takes in arrays of related frequency and magnitude data, and just like :fluid-obj:`Sines`, first tracks them as peaks to check if they are a continuation of a previous peak.

After this track assignment, the number of peaks is capped to the user-defined ``numVoices`` in order of lowest frequency or loudest magnitude. The final step then assigns these peaks to voices and tracks their states.

:process: The control rate version of the object.
:output: An array of three control streams: [0] is the frequency of each voice, [1] is their respective magnitudes, and [2] is their respective states. The latency between the input and the output is 0 samples.


:control in:

The input to be processed.

:control numVoices:

The number of voices to keep track of and output. It is capped by ``maxNumVoices``.

:control prioritisedVoices:

The order in which to prioritise peaks for voice assignment if an input array is bigger than ``numVoices``.

:control birthLowThreshold:

The threshold in dB above which to consider a peak to start tracking, for the low end of the spectrum. It is interpolated across the spectrum until birthHighThreshold at half-Nyquist.

:control birthHighThreshold:

The threshold in dB above which to consider a peak to start tracking, for the high end of the spectrum. It is interpolated across the spectrum until birthLowThreshold at DC.

:control minTrackLen:

The minimum duration, in frames, for a track to be considered for a voice. It allows the removal of bubbly pitchy artefacts, but is more CPU intensive and might reject quick pitch material.

:control trackMethod:

Currently not implemented.
The algorithm used to track peak continuity between frames. 0 is the default, "Greedy", and 1 is a more expensive [^"Hungarian"]( Neri, J., and Depalle, P., "Fast Partial Tracking of Audio with Real-Time Capability through Linear Programming". Proceedings of DAFx-2018. ) one.

:control trackMagRange:

The amplitude difference allowed for a track to diverge between frames, in dB.

:control trackFreqRange:

The frequency difference allowed for a track to diverge between frames, in Hertz.

:control trackProb:

The probability of the tracking algorithm to find a track.

:control maxNumVoices:

Up to how many voices can be reported, by allocating memory at instantiation time. This cannot be modulated.
40 changes: 40 additions & 0 deletions example-code/sc/VoiceAllocator.scd
@@ -0,0 +1,40 @@
code::

b = Bus.audio(s);

(
c = {var input = In.ar(b);
var sines = FluidSineFeature.kr(input,5);
var voices = FluidVoiceAllocator.kr(sines[0], sines[1], 6);
SendReply.kr(Impulse.kr(1),"/sourcesines", [sines ++ voices].flat);
input.dup;
}.play;
)

(
o = OSCFunc({
arg msg;
"freqI + magI ".post; msg[3..12].round(0.01).postln;
"freqO + magO ".post; msg[13..24].round(0.01).postln;
"voice states ".post; msg[25..].round(1).postln;
},"/sourcesines");
)

// observe the voices as you add sines...
d = {Out.ar(b,SinOsc.ar(440,mul: 0.1))}.play
e = {Out.ar(b,SinOsc.ar(550,mul: 0.05))}.play
f = {Out.ar(b,SinOsc.ar(330,mul: 0.1))}.play
g = {Out.ar(b,SinOsc.ar(220,mul: 0.15))}.play
h = {Out.ar(b,SinOsc.ar(110,mul: 0.2))}.play

// or remove them
e.free
d.free
g.free

// add 6 sines that are too quiet and too high, so they should not steal voices from the 2 loudest, lowest remaining ones

i = {Out.ar(b,SinOsc.ar(700.series(750,950),mul: 0.02).sum)}.play
i.free

::
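
The index ranges used in the OSC function above follow from how the reply is packed. A small Python sketch of the same slicing may make the layout easier to check. It assumes the message layout used by `SendReply` (address, node ID, reply ID, then the values) and the sizes from the example (5 input peaks, 6 voices); the function name `split_reply` is hypothetical.

```python
from typing import Dict, List, Sequence


def split_reply(msg: Sequence, num_peaks: int = 5, num_voices: int = 6) -> Dict[str, List]:
    """Split a flat reply into its five groups.

    Assumed order of values: input frequencies, input magnitudes,
    voice frequencies, voice magnitudes, voice states.
    """
    values = list(msg[3:])  # skip address, node ID and reply ID
    expected = 2 * num_peaks + 3 * num_voices
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    bounds = [0, num_peaks, 2 * num_peaks,
              2 * num_peaks + num_voices,
              2 * num_peaks + 2 * num_voices,
              expected]
    names = ["freq_in", "mag_in", "freq_out", "mag_out", "states"]
    return {name: values[bounds[i]:bounds[i + 1]] for i, name in enumerate(names)}


# A stand-in message: three header items followed by 28 placeholder values.
message = ["/sourcesines", 1000, -1] + list(range(28))
groups = split_reply(message)
```

With 5 peaks and 6 voices there are 28 values after the three header items, which matches the `msg[3..12]`, `msg[13..24]` and `msg[25..]` ranges in the example.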