Generated Audio with a lot of noise #69
Comments
By the way, the audio is very loud, so be careful!
It seems the result is not noisy; it is just being clipped when exported to WAV. With most libraries, you need to ensure that your signal always stays between -1 and 1 before exporting the audio. That's why the problem disappeared when you normalized it.
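As a minimal sketch of the point above, the snippet below peak-normalizes a floating-point signal into [-1, 1] and then converts it to 16-bit PCM. The helper names `peak_normalize` and `to_int16`, the 0.99 headroom factor, and the synthetic sine input are assumptions made for illustration; they are not part of gpuRIR or of the code in this thread.

```python
import numpy as np

def peak_normalize(signal, peak=0.99):
    """Scale a float signal so its largest absolute sample equals `peak`."""
    signal = np.asarray(signal, dtype=np.float64)
    max_abs = np.max(np.abs(signal))
    if max_abs == 0:
        return signal  # silence: nothing to scale
    return signal * (peak / max_abs)

def to_int16(signal):
    """Convert a float signal in [-1, 1] to int16 PCM, clipping anything outside."""
    clipped = np.clip(signal, -1.0, 1.0)
    return np.round(clipped * 32767).astype(np.int16)

# Synthetic stand-in (assumption) for a simulated signal whose peak is far above 1.0
fs = 16000
t = np.arange(fs) / fs
loud = 400.0 * np.sin(2 * np.pi * 440 * t)

pcm = to_int16(peak_normalize(loud))
print(pcm.dtype, int(np.max(np.abs(pcm))))
```

The resulting `pcm` array can then be passed to a writer such as `scipy.io.wavfile.write`. Leaving a little headroom below full scale is a design choice rather than a requirement.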
Thank you for your response. My explanation was not clear enough. What I meant is that, even after normalizing the audio to fix the excessive loudness, the audio still sounds clipped.
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
import time
import soundfile as sf
import gpuRIR
gpuRIR.activateMixedPrecision(False)
fs, source_signal = wavfile.read('/datasets/AVSBench/v1m/vYGVDaZan_8_1/audio.wav')
print(np.max(np.abs(source_signal[:, 0])),fs) # 9471 16000
source_signal = source_signal[:, 0]
room_sz_4 = [20,20,20]
traj_pts = 64
pos_traj_1 = np.tile(np.array([1.5,3.0,1.0]), (traj_pts,1))
pos_traj_1[:,1] = np.linspace(0.2, 2.7, traj_pts)
nb_rcv = 2
pos_rcv = np.array([[1.45,0.01,1.5],[1.55,0.01,1.5]]) # Position of the receivers [m]
orV_rcv = np.array([[-1,0,0],[1,0,0]])
mic_pattern = "card" # Receiver polar pattern
T60 = 0.05 # Time for the RIR to reach 60dB of attenuation [s]
att_diff = 30.0 # Attenuation when start using the diffuse reverberation model [dB]
att_max = 60.0 # Attenuation at the end of the simulation [dB]
pos_traj_2 = np.tile(np.array([1.5,3.0,1.0]), (traj_pts,1))
pos_traj_2[:,1] = np.linspace(2.7, 5.2, traj_pts)
pos_traj_3 = np.tile(np.array([1.5,3.0,1.0]), (traj_pts,1))
pos_traj_3[:,1] = np.linspace(5.2, 8.5, traj_pts)
def create(room_sz, pos_traj, i, global_max):
    beta = gpuRIR.beta_SabineEstimation(room_sz, T60)
    Tdiff = gpuRIR.att2t_SabineEstimator(att_diff, T60)
    Tmax = gpuRIR.att2t_SabineEstimator(att_max, T60)
    nb_img = gpuRIR.t2n(Tmax, room_sz)
    RIRs = gpuRIR.simulateRIR(room_sz, beta, pos_traj, pos_rcv, nb_img, Tmax, fs, Tdiff=Tdiff, orV_rcv=orV_rcv, mic_pattern=mic_pattern)
    filtered_signal = gpuRIR.simulateTrajectory(source_signal, RIRs)
    data = filtered_signal.astype(np.int16)
    wavfile.write(f"no_normal_{i}.wav", fs, data)
    max_value = np.max(np.abs(filtered_signal))
    print(max_value)
    if global_max < max_value:
        global_max = max_value
    return filtered_signal, global_max
global_max = 0
signals = []
for i, pos_traj in enumerate([pos_traj_1, pos_traj_2, pos_traj_3], start=1):
    filtered_signal, global_max = create(room_sz_4, pos_traj, i, global_max)
    signals.append(filtered_signal)
for i, signal in enumerate(signals, start=1):
    sig1 = signal / global_max * 32767
    wave_data_a = sig1.astype(np.int16)
    wavfile.write(f'normal_{i}.wav', fs, wave_data_a)
    plt.plot(signal)
    plt.title(f"Audio Signal {i}")
    plt.show()
print(f"Global Max Value: {global_max}")
# Printed output:
# 9471 16000
# 429.41535210609436
# 115.81631326675415
# 97.17903812229633
# Global Max Value: 429.41535210609436
Hello, I couldn't run your code, but a reverberation time of 0.05 seconds is probably physically impossible in such a big room. I'm not sure whether requesting it generates artifacts; you're probably getting RIRs that only contain the direct path. Can you send me the audio you're using as the source signal so I can run the code?
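To make the point about the reverberation time concrete, here is a rough sketch based on Sabine's formula, T60 ≈ 0.161 · V / (S · α), with V the room volume in m³, S the total wall surface in m², and α the average absorption coefficient. Setting α = 1 (fully absorbing walls) gives the shortest T60 the formula allows for a given room. The helper `sabine_min_t60` is hypothetical and is not a gpuRIR function; how gpuRIR actually reacts to an infeasible T60 is not shown here.

```python
def sabine_min_t60(room_sz):
    """Smallest T60 [s] allowed by Sabine's formula for a shoebox room,
    reached when every surface has an absorption coefficient of 1."""
    lx, ly, lz = room_sz
    volume = lx * ly * lz                          # [m^3]
    surface = 2 * (lx * ly + lx * lz + ly * lz)    # [m^2]
    return 0.161 * volume / surface

requested_t60 = 0.05  # value used in the script above
for room_sz in ([20, 20, 20], [3, 4, 2.5]):
    t60_min = sabine_min_t60(room_sz)
    feasible = requested_t60 >= t60_min
    print(room_sz, round(t60_min, 3), "feasible" if feasible else "too short")
```

For the 20 m × 20 m × 20 m room this bound is about 0.54 s, roughly ten times the requested 0.05 s, which is consistent with the comment above.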
Hi David,
This is an exciting project for me. However, when I ran the code [simulate_trajectory.py] in the examples folder, the result didn’t sound great—it seemed to have some grainy noise. Below are the parameters I used:
T60 = 0.1
att_diff = 50.0
att_max = 50.0
I also tried different T60 values, such as 0.6 and 0.5, as well as varying room_sz, but the results still didn’t sound right. Here are the room sizes I tested:
room_sz_1 = [3, 4, 2.5]
room_sz_2 = [5, 6, 4.5]
room_sz_3 = [7, 8, 6.5]
room_sz_4 = [50, 50, 50]
Could you help me figure out what might be causing this issue?
Here is the audio I generated: https://www.filemail.com/d/labjxdizwkmilwf