Audio pipeline dependencies are reduced, bug fixed, room acoustics made more tunable and demo improved #57

qanastek · 2025-07-29T09:36:36Z

No description provided.

qanastek · 2025-07-30T16:51:45Z

Bug found: Do not merge

codecov · 2025-10-24T13:41:36Z

Codecov Report

❌ Patch coverage is 74.16295% with 463 lines in your changes missing coverage. Please review.
✅ Project coverage is 55.19%. Comparing base (756b18b) to head (84f7da4).
⚠️ Report is 166 commits behind head on main.

Files with missing lines	Patch %	Lines
src/sdialog/audio/room.py	70.81%	171 Missing ⚠️
src/sdialog/audio/pipeline.py	44.23%	116 Missing ⚠️
src/sdialog/audio/dialog.py	71.94%	39 Missing ⚠️
src/sdialog/audio/voice_database.py	88.88%	34 Missing ⚠️
src/sdialog/audio/tts_engine.py	36.17%	30 Missing ⚠️
src/sdialog/audio/__init__.py	45.00%	22 Missing ⚠️
src/sdialog/audio/acoustics_simulator.py	78.57%	18 Missing ⚠️
src/sdialog/audio/impulse_response_database.py	78.75%	17 Missing ⚠️
src/sdialog/audio/dscaper_utils.py	87.50%	8 Missing ⚠️
src/sdialog/__init__.py	16.66%	5 Missing ⚠️
... and 2 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #57      +/-   ##
==========================================
+ Coverage   46.82%   55.19%   +8.36%     
==========================================
  Files          20       34      +14     
  Lines        4171     5963    +1792     
==========================================
+ Hits         1953     3291    +1338     
- Misses       2218     2672     +454

Files with missing lines	Coverage Δ
src/sdialog/audio/jsalt.py	`100.00% <100.00%> (ø)`
src/sdialog/audio/turn.py	`100.00% <100.00%> (ø)`
src/sdialog/audio/utils.py	`100.00% <100.00%> (ø)`
src/sdialog/audio/processing.py	`96.66% <96.66%> (ø)`
src/sdialog/audio/room_generator.py	`94.59% <94.59%> (ø)`
src/sdialog/__init__.py	`63.08% <16.66%> (-0.79%)`	⬇️
src/sdialog/audio/dscaper_utils.py	`87.50% <87.50%> (ø)`
src/sdialog/audio/impulse_response_database.py	`78.75% <78.75%> (ø)`
src/sdialog/audio/acoustics_simulator.py	`78.57% <78.57%> (ø)`
src/sdialog/audio/__init__.py	`45.00% <45.00%> (ø)`
... and 5 more

... and 1 file with indirect coverage changes

…inference
- Add display function in AudioDialog
- Make __str__ for RecordingDevice
- Add some comments to the tutorials
- Run all tutorials again
- Add local impulse response database in tutorial 16
- Add test data for IR

sergioburdisso · 2025-11-07T16:04:10Z

requirements.txt

 # Used to interface with local or hosted large-language-model backends
 ollama
 openai
+


Better to move these into their own requirements-audio.txt, so that it is then easy to support pip install sdialog[audio].

I added requirements-audio.txt; now we need to add support for pip install sdialog[audio] and add it to the GitHub pipeline.
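
One possible way to wire this up, as a sketch rather than the project's actual packaging code: read requirements-audio.txt and expose its contents as an "audio" extra. The helper names read_requirements and build_extras are hypothetical; extras_require is the standard setuptools keyword.

```python
from pathlib import Path


def read_requirements(path):
    """Return requirement specifiers from a pip requirements file.

    Blank lines, comments and pip options (lines starting with "-") are skipped.
    """
    requirements = []
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if line and not line.startswith(("#", "-")):
            requirements.append(line)
    return requirements


def build_extras(root="."):
    """Build an extras_require mapping; "audio" is only added if the file exists."""
    extras = {}
    audio_file = Path(root) / "requirements-audio.txt"
    if audio_file.exists():
        extras["audio"] = read_requirements(audio_file)
    return extras


# In setup.py this could be passed as: setup(..., extras_require=build_extras())
```

With something like this in place, pip install sdialog[audio] would pull the audio dependencies on top of the core ones. In shells that treat square brackets as glob patterns (zsh, for example), the argument needs quoting: pip install "sdialog[audio]".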

src/sdialog/__init__.py

src/sdialog/sdialog

tutorials/audio/3.accoustic_simulation-customer_service.ipynb

- Fix weird error in dialog2flow during merging
- Update to_audio docs and comments, and add an error requiring the user to install the audio submodule dependencies
- Update all the relative paths in the tutorials
- Fix tutorial 7 not using the downloaded archive
- Update .gitignore with new paths
- Build the changelog locally
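
The "error requiring the user to install the audio submodule dependencies" mentioned in this commit could look roughly like the guard below. This is a minimal sketch, not the code from the PR: the function name require_audio_dependencies and the default module names are placeholders.

```python
import importlib


def require_audio_dependencies(modules=("numpy", "soundfile")):
    """Raise an ImportError with an install hint if any listed module is missing.

    The default module names are placeholders, not the real dependency list.
    """
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    if missing:
        raise ImportError(
            "Audio support needs extra packages (missing: "
            + ", ".join(missing)
            + "). Install them with: pip install sdialog[audio]"
        )
```

Calling a guard like this at the top of an audio entry point keeps the core package importable without the audio stack, and gives a clear message instead of a bare ModuleNotFoundError.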

…to-speech" feature
- Resample the TTS output audio when it is not at self.sampling_rate
- Pass extra arguments to the TTS during inference
- Rename the "voice" parameter of the TTS generators to "speaker_voice" so that it is unique and does not clash with kwargs
- Add comments and simplify the code of the save utterances audios function
- Update the documentation
- Run the tests
- Handle import errors for each BaseTTS subclass
- Remove the speed argument from KokoroTTS inference and put it in the constructor
- Make the sampling rate of the turns public and save it in the serialized file
- Add an alternative for HF TTS in audio tutorial 1
- Remove useless code in from_turn
- Test serialization in both saving and loading for the AudioDialog
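
To illustrate why the rename to "speaker_voice" helps, here is a small sketch. SketchTTS is a stand-in and not the real BaseTTS interface; the extra keyword names ("voice", "pitch") are made up for the example.

```python
class SketchTTS:
    """Stand-in for a TTS engine; not the real BaseTTS interface."""

    def generate(self, text, speaker_voice, **tts_kwargs):
        # speaker_voice selects the speaker; everything else is forwarded
        # untouched to the (hypothetical) backend call.
        return {
            "text": text,
            "speaker_voice": speaker_voice,
            "backend_kwargs": tts_kwargs,
        }


engine = SketchTTS()
# A backend option that happens to be called "voice" no longer collides
# with the parameter that picks the speaker.
result = engine.generate("Hello", speaker_voice="speaker_a", voice="warm", pitch=2)
```

If the speaker parameter were still named "voice", a backend option with the same name could not be passed through the extra keyword arguments.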

sergioburdisso · 2025-11-11T08:39:43Z

.gitignore

-#audio files
 *.wav
 *.mp3
+*.png


We do have PNGs that need to be tracked, like the images in the docs and in the README, so I wouldn't ignore them. If you have PNGs somewhere, it may be better to ignore them inside that specific folder, but not globally.

Now that we have a subfolder for the audio tutorials, I can simply stop tracking the images in it.

sergioburdisso · 2025-11-11T08:39:52Z

.gitignore

+tutorials/audio/demo_dialog_doctor_patient.json
+tutorials/audio/demo_dialog_doctor_patient_no_age_no_gender.json
+tutorials/audio/customer_support_dialogue.json
+tutorials/audio/=0.9.4


Is this one OK? tutorials/audio/=0.9.4

It's caused by this part of the tutorials using kokoro:

pip install -q kokoro>=0.9.4
apt-get -qq -y install espeak-ng > /dev/null 2>&1
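
For context on where the stray file comes from: in a POSIX shell an unquoted > starts an output redirection, so kokoro>=0.9.4 is read as "install kokoro and write the output to a file named =0.9.4". The sketch below uses Python's shlex module to approximate how the shell splits the command; shlex is only an approximation of real shell parsing, and shell_tokens is a helper name made up for this example.

```python
import shlex


def shell_tokens(command):
    """Tokenize a command roughly like a POSIX shell, keeping operators such as '>' separate."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    return list(lexer)


# Unquoted: '>' becomes its own token and '=0.9.4' is the redirect target.
unquoted = shell_tokens("pip install -q kokoro>=0.9.4")

# Quoted: the version specifier stays a single argument for pip.
quoted = shell_tokens('pip install -q "kokoro>=0.9.4"')
```

Quoting the requirement in the tutorial cell (pip install -q "kokoro>=0.9.4") should avoid creating the file, which would make the .gitignore entry unnecessary.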

sergioburdisso · 2025-11-11T08:40:29Z

.gitignore

+tutorials/audio/dscaper_data_impulse_response/*
+*audio_dialog.json
+./src/sdialog/sdialog
+tutorials/9.generating_data.ipynb


Not sure if it is a good idea to add a tutorial that doesn't exist as part of the global gitignore of the library

It's a tutorial I have locally only for now

qanastek and others added 27 commits July 23, 2025 21:54

Create voice dataset and add room position LLM

34cad6e

Merge branch 'main' of https://github.com/qanastek/sdialog

579e0fd

Handle room accoustics and make optionnal for dscaper / room accoustics

5f65d4e

Merge branch 'idiap:main' into main

d379cc7

Merge branch 'idiap:main' into main

e9add47

Merge branch 'idiap:main' into main

2a4f11e

Add voice database fetching from from HF

8ee8325

Merge branch 'main' of https://github.com/qanastek/sdialog

898753a

Full pipeline completed

dc101cb

Add xvectors to the speaker database

f67756a

Update notebooks

79e3c82

Add the paths to audios in the object

e620fc6

Make the steps 2 and 3 optionnal and add a test of each of them in th…

5a37bd3

…e jupyter notebook.

Make HF voices path dynamic and add description to the audio generati…

e43bf1c

…on pipeline in the jupyter notebook

- Make the import called only when needed for the audio pipeline.

aafa2b9

- Clean pavel fuction header. - Add datasets as a dependency

Fix last import and switch back to kokoro

c66d783

Fix some bugs and make jean zay script for generating 200 dialog in t…

5fc3a12

…he audios form

Add few dummy voice database and slurm script for jean zay

e781bf6

Update

3878e3b

Update

62aa807

Update

e38c322

Improve instructions for installation of dScaper and PyRoomAcoustics

6fc3b9a

Flake 8 pass

8f28e38

Merge branch 'main' into main

a6e8f2a

Flake 8 round 2

148eed7

Merge branch 'main' of https://github.com/qanastek/sdialog

6e7d788

Add metrics and plots scripts

b9c3734

qanastek and others added 2 commits August 18, 2025 19:21

Add last metrics scripts before JSALT oral

9661cd3

Better notebook

05da092

qanastek added 12 commits October 27, 2025 03:20

Add and test the feature for recording devices quality emulation.

c0d59b3

Send IR data

cc408c0

Update IR tutorial

b16fe85

Update documentation with IR content

02f9bd5

Improve coverage of the new features

bc6ee71

Move and rename the function into persona_to_voice and add tests.

c9b812e

Move the code of save_utterances_audios from __init__ to AudioDialog

bd128a5

Make the gain level to match the original audio rather than -10dB

63abab9

Fix tests after removing gain reduction

fead022

Update documentation

04b535d

Fix some bugs

c939c06

qanastek force-pushed the main branch from a609074 to c939c06 Compare October 31, 2025 11:54

Merge branch 'idiap:main' into main

757058b

sergioburdisso reviewed Nov 7, 2025

src/sdialog/__init__.py (outdated, resolved)

sergioburdisso reviewed Nov 7, 2025

src/sdialog/sdialog (outdated, resolved)

sergioburdisso reviewed Nov 7, 2025

tutorials/audio/3.accoustic_simulation-customer_service.ipynb (resolved)

qanastek and others added 6 commits November 7, 2025 18:43

Update before merge

9e663bf

Remove symbolic link

dba2580

Merge branch 'idiap:main' into main

91ede6e

Update the documentation for the custom tts engine

4bd9892

sergioburdisso reviewed Nov 11, 2025

Fix the CI/CD and .gitignore

84f7da4

sergioburdisso merged commit 93fd82d into idiap:main Nov 12, 2025
2 checks passed
