diff --git a/doc/BufVoiceAllocator.rst b/doc/BufVoiceAllocator.rst
new file mode 100644
index 0000000..8774d6c
--- /dev/null
+++ b/doc/BufVoiceAllocator.rst
@@ -0,0 +1,101 @@
+:digest: Buffer-Based Dynamic Voice Allocation
+:species: buffer-proc
+:sc-categories: Libraries>FluidDecomposition, UGens>Buffer
+:sc-related: Guides/FluidCorpusManipulation, Classes/SinOsc
+:see-also: BufSines, BufSineFeature
+:description: Dynamic frame based voice allocation on buffers
+:discussion:
+   This process takes in buffers of related frequency and magnitude data, and just like :fluid-obj:`BufSines`, first tracks them as peaks to check if they are a continuation of a previous peak.
+
+   After this track assignment, the number of peaks is capped to the user-defined ``numVoices`` in order of lowest frequency or loudest magnitude. The final step then assigns these peaks to voices and tracks their states.
+
+:process: The non-real-time version of the object.
+:output: The names of the three buffers in which the data has been stored: [0] is a buffer containing the frequencies, [1] is a buffer containing the magnitudes and [2] is a buffer containing their respective states. Each buffer has a channel per voice.
+
+:control voiced:
+
+   The buffer in which to store the processed voice state data.
+
+:control birthhighthreshold:
+
+   The threshold in dB above which to consider a peak to start tracking, for the high end of the spectrum. It is interpolated across the spectrum until birthlowthreshold at DC.
+
+:control birthlowthreshold:
+
+   The threshold in dB above which to consider a peak to start tracking, for the low end of the spectrum. It is interpolated across the spectrum until birthhighthreshold at half-Nyquist.
+
+:control freqed:
+
+   The buffer in which to store the processed frequency data.
+
+:control frequencies:
+
+   The buffer from which to take frequency data.
+
+:control magned:
+
+   The buffer in which to store the processed magnitude data.
+
+:control magnitudes:
+
+   The buffer from which to take magnitude data.
+
+:control maxnumvoices:
+
+   The maximum number of voices that can be reported. Memory for them is allocated at instantiation time. This cannot be modulated.
+
+:control mintracklen:
+
+   The minimum duration, in frames, for a track to be considered for a voice. It allows the removal of bubbly pitchy artefacts, but is more CPU intensive and might reject quick pitch material.
+
+:control numchansa:
+
+   For multichannel srcBuf, how many channels should be processed from the first buffer.
+
+:control numchansb:
+
+   For multichannel srcBuf, how many channels should be processed from the second buffer.
+
+:control numframesa:
+
+   How many frames should be processed from the first buffer.
+
+:control numframesb:
+
+   How many frames should be processed from the second buffer.
+
+:control numvoices:
+
+   The number of voices to keep track of and output. It is capped by ``maxnumvoices``.
+
+:control prioritisedvoices:
+
+   The order in which to prioritise peaks for voice assignment if an input array is bigger than ``numvoices``.
+
+:control startchana:
+
+   For multichannel srcBuf, which channel should be processed first for the first input buffer.
+
+:control startchanb:
+
+   For multichannel srcBuf, which channel should be processed first for the second input buffer.
+
+:control startframea:
+
+   Where in the srcBuf the process should start, in samples, for the first input buffer.
+
+:control startframeb:
+
+   Where in the srcBuf the process should start, in samples, for the second input buffer.
+
+:control trackfreqrange:
+
+   The frequency difference allowed for a track to diverge between frames, in Hertz.
+
+:control trackmagrange:
+
+   The amplitude difference allowed for a track to diverge between frames, in dB.
+
+:control trackprob:
+
+   The probability of the tracking algorithm to find a track.
diff --git a/doc/VoiceAllocator.rst b/doc/VoiceAllocator.rst
new file mode 100644
index 0000000..decedca
--- /dev/null
+++ b/doc/VoiceAllocator.rst
@@ -0,0 +1,59 @@
+:digest: Dynamic Voice Allocation
+:species: transformer
+:sc-categories: Libraries>FluidDecomposition
+:sc-related: Guides/FluidCorpusManipulation, Classes/SinOsc
+:see-also: Sines, SineFeature
+:description: Dynamic frame based voice allocation.
+:discussion:
+   This process takes in arrays of related frequency and magnitude data, and just like :fluid-obj:`Sines`, first tracks them as peaks to check if they are a continuation of a previous peak.
+
+   After this track assignment, the number of peaks is capped to the user-defined ``numVoices`` in order of lowest frequency or loudest magnitude. The final step then assigns these peaks to voices and tracks their states.
+
+:process: The control-rate version of the object.
+:output: An array of three control streams: [0] is the frequency of each voice, [1] is their respective magnitudes, and [2] is their respective states. The latency between the input and the output is 0 samples.
+
+
+:control in:
+
+   The input to be processed.
+
+:control numVoices:
+
+   The number of voices to keep track of and output. It is capped by ``maxNumVoices``.
+
+:control prioritisedVoices:
+
+   The order in which to prioritise peaks for voice assignment if an input array is bigger than ``numVoices``.
+
+:control birthLowThreshold:
+
+   The threshold in dB above which to consider a peak to start tracking, for the low end of the spectrum. It is interpolated across the spectrum until birthHighThreshold at half-Nyquist.
+
+:control birthHighThreshold:
+
+   The threshold in dB above which to consider a peak to start tracking, for the high end of the spectrum. It is interpolated across the spectrum until birthLowThreshold at DC.
+
+:control minTrackLen:
+
+   The minimum duration, in frames, for a track to be considered for a voice.
It allows the removal of bubbly pitchy artefacts, but is more CPU intensive and might reject quick pitch material.
+
+:control trackMethod:
+
+   Currently not implemented as
+   The algorithm used to track peak continuity between frames. 0 is the default, "Greedy", and 1 is a more expensive [^"Hungarian"]( Neri, J., and Depalle, P., "Fast Partial Tracking of Audio with Real-Time Capability through Linear Programming". Proceedings of DAFx-2018. ) one.
+
+:control trackMagRange:
+
+   The amplitude difference allowed for a track to diverge between frames, in dB.
+
+:control trackFreqRange:
+
+   The frequency difference allowed for a track to diverge between frames, in Hertz.
+
+:control trackProb:
+
+   The probability of the tracking algorithm to find a track.
+
+:control maxNumVoices:
+
+   The maximum number of voices that can be reported. Memory for them is allocated at instantiation time. This cannot be modulated.
\ No newline at end of file
diff --git a/example-code/sc/VoiceAllocator.scd b/example-code/sc/VoiceAllocator.scd
new file mode 100644
index 0000000..94038d2
--- /dev/null
+++ b/example-code/sc/VoiceAllocator.scd
@@ -0,0 +1,40 @@
+code::
+
+b = Bus.audio(s);
+
+(
+c = {var input = In.ar(b);
+    var sines = FluidSineFeature.kr(input,5);
+    var voices = FluidVoiceAllocator.kr(sines[0], sines[1], 6);
+    SendReply.kr(Impulse.kr(1),"/sourcesines", [sines ++ voices].flat);
+    input.dup;
+}.play;
+)
+
+(
+o = OSCFunc({
+    arg msg;
+    "freqI + magI ".post; msg[3..12].round(0.01).postln;
+    "freqO + magO ".post; msg[13..24].round(0.01).postln;
+    "voice states ".post; msg[25..].round(1).postln;
+},"/sourcesines");
+)
+
+// observe the voices as you add sines...
+d = {Out.ar(b,SinOsc.ar(440,mul: 0.1))}.play
+e = {Out.ar(b,SinOsc.ar(550,mul: 0.05))}.play
+f = {Out.ar(b,SinOsc.ar(330,mul: 0.1))}.play
+g = {Out.ar(b,SinOsc.ar(220,mul: 0.15))}.play
+h = {Out.ar(b,SinOsc.ar(110,mul: 0.2))}.play
+
+// or remove them
+e.free
+d.free
+g.free
+
+// add 6 sines that are too quiet and too high, so they should not steal voices from the loudest, lowest 2 remaining
+
+i = {Out.ar(b,SinOsc.ar(700.series(750,950),mul: 0.02).sum)}.play
+i.free
+
+::
\ No newline at end of file