For the train-clean-100, train-clean-360, dev-clean, and test-clean subsets of LibriSpeech, the speech from each speaker is concatenated into a single utterance, resulting in a total of 1252 utterances, one per speaker (251, 921, 40, and 40 from the respective subsets).
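The per-speaker concatenation described above can be sketched as follows. This is only an illustrative Python sketch, not the code used in this repository (the scripts here are MATLAB). It assumes the standard LibriSpeech file naming `<speaker>-<chapter>-<utterance>.flac`; the function names `group_by_speaker` and `concatenate` and the example file list are hypothetical, and audio loading is left out.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_speaker(paths):
    """Map speaker ID -> utterance paths, ordered by chapter and utterance index."""
    groups = defaultdict(list)
    for p in paths:
        stem = PurePosixPath(p).stem  # e.g. "19-198-0000"
        speaker, chapter, index = stem.split("-")
        groups[speaker].append((int(chapter), int(index), p))
    return {spk: [p for _, _, p in sorted(items)] for spk, items in groups.items()}


def concatenate(waveforms):
    """Join one speaker's waveforms (sequences of samples) end to end."""
    joined = []
    for w in waveforms:
        joined.extend(w)
    return joined


# Hypothetical file list; real paths would come from scanning the LibriSpeech folders.
paths = [
    "train-clean-100/19/198/19-198-0001.flac",
    "train-clean-100/19/198/19-198-0000.flac",
    "train-clean-100/26/495/26-495-0000.flac",
]
groups = group_by_speaker(paths)
print({spk: len(files) for spk, files in groups.items()})
```

Applied to the four subsets, a grouping of this kind would yield one entry per speaker, which is consistent with the 1252 concatenated utterances mentioned above.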
- The RIRs for the Training Data and Development Data are generated using `genRIR_ForFrmLvSINR_TrainingData.m`.
- The RIRs for the Test Data are generated using `genRIR_ForFrmLvSINR_TestData.m`.
- The RIRs for the Sensor-Selection Data are generated using `genRIR_ForFrmLvSINR_NodeSlct_Data.m`.
- To be continued.
The code and datasets are currently under preparation.
(The original plan was to release them before November 20, but I’ve been too busy recently.)
S. Guan, M. Wang, Z. Bai, J. Wang, J. Chen and J. Benesty, "Smoothed Frame-Level SINR and Its Estimation for Sensor Selection in Distributed Acoustic Sensor Networks," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, doi: 10.1109/TASLP.2024.3477277.
Pan, Chao, et al. "An Anchor-Point Based Image-Model for Room Impulse Response Simulation with Directional Source Radiation and Sensor Directivity Patterns." arXiv preprint arXiv:2308.10543 (2023).