Generated Audio with a lot of noise #69

Open
BuJiaLaDiii opened this issue Dec 12, 2024 · 4 comments

Comments

@BuJiaLaDiii

Hi David,
This is an exciting project for me. However, when I ran the code [simulate_trajectory.py] in the examples folder, the result didn’t sound great—it seemed to have some grainy noise. Below are the parameters I used:

T60 = 0.1
att_diff = 50.0
att_max = 50.0
I also tried different T60 values, such as 0.6 and 0.5, as well as varying room_sz, but the results still didn’t sound right. Here are the room sizes I tested:

room_sz_1 = [3, 4, 2.5]
room_sz_2 = [5, 6, 4.5]
room_sz_3 = [7, 8, 6.5]
room_sz_4 = [50, 50, 50]
Could you help me figure out what might be causing this issue?

Here is the audio I generated https://www.filemail.com/d/labjxdizwkmilwf

@BuJiaLaDiii
Author

By the way, the audio is very loud, so be careful!
Normalizing it with the following code resolves that problem:
max_num = np.max(np.absolute(filtered_signal))
sig1 = filtered_signal/max_num* 32767

@DavidDiazGuerra
Owner

It seems the result is not noisy; it is just being clipped when exported to WAV. With most libraries, you need to ensure that your signal always stays between -1 and 1 before exporting the audio. That is why the problem disappeared when you normalized it.
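As a minimal sketch of the normalization described above, the snippet below peak-normalizes a floating-point signal and converts it to 16-bit PCM with an explicit clip. The helper name `to_int16`, the `peak` headroom parameter, and the synthetic sine used in place of a simulated signal are assumptions made for illustration; they are not part of gpuRIR.

```python
import numpy as np


def to_int16(signal, peak=0.99):
    """Peak-normalize a float signal and convert it to int16 PCM.

    `peak` is the fraction of full scale used by the loudest sample
    (a hypothetical headroom parameter chosen for this sketch).
    """
    signal = np.asarray(signal, dtype=np.float64)
    max_abs = np.max(np.abs(signal))
    if max_abs == 0:
        # A silent signal cannot be normalized; return silence.
        return np.zeros(signal.shape, dtype=np.int16)
    scaled = signal / max_abs * peak * 32767.0
    # Clip defensively so that no value can wrap around when cast to int16.
    return np.clip(np.round(scaled), -32768, 32767).astype(np.int16)


# Synthetic stand-in for a simulated signal: a 440 Hz tone with a large peak.
fs = 16000
t = np.arange(fs) / fs
filtered_signal = 429.0 * np.sin(2 * np.pi * 440.0 * t)

pcm = to_int16(filtered_signal)
print(pcm.dtype, int(np.max(np.abs(pcm.astype(np.int32)))))
```

The resulting array can then be passed to `scipy.io.wavfile.write(filename, fs, pcm)`, as in the code posted in this thread. Casting a float array straight to `np.int16` without scaling or clipping is where out-of-range samples would get corrupted.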

@BuJiaLaDiii
Author

Thank you for your response. My explanation was not clear enough. What I meant is that, even after normalizing the audio to fix the excessive loudness, the audio still sounds clipped.

import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
import time
import soundfile as sf

import gpuRIR
gpuRIR.activateMixedPrecision(False)

fs, source_signal = wavfile.read('/datasets/AVSBench/v1m/vYGVDaZan_8_1/audio.wav')
print(np.max(np.abs(source_signal[:, 0])),fs) # 9471 16000

source_signal = source_signal[:, 0]

room_sz_4 = [20,20,20]

traj_pts = 64
pos_traj_1 = np.tile(np.array([1.5,3.0,1.0]), (traj_pts,1))
pos_traj_1[:,1] = np.linspace(0.2, 2.7, traj_pts)

nb_rcv = 2
pos_rcv = np.array([[1.45,0.01,1.5],[1.55,0.01,1.5]])	 # Position of the receivers [m]
orV_rcv = np.array([[-1,0,0],[1,0,0]])
mic_pattern = "card" # Receiver polar pattern

T60 = 0.05 # Time for the RIR to reach 60dB of attenuation [s]
att_diff = 30.0	# Attenuation when start using the diffuse reverberation model [dB]
att_max = 60.0 # Attenuation at the end of the simulation [dB]



pos_traj_2 = np.tile(np.array([1.5,3.0,1.0]), (traj_pts,1))
pos_traj_2[:,1] = np.linspace(2.7, 5.2, traj_pts)

pos_traj_3 = np.tile(np.array([1.5,3.0,1.0]), (traj_pts,1))
pos_traj_3[:,1] = np.linspace(5.2, 8.5, traj_pts)

def create(room_sz, pos_traj, i, global_max):
    beta = gpuRIR.beta_SabineEstimation(room_sz, T60)
    Tdiff = gpuRIR.att2t_SabineEstimator(att_diff, T60)
    Tmax = gpuRIR.att2t_SabineEstimator(att_max, T60)
    nb_img = gpuRIR.t2n(Tmax, room_sz)
    RIRs = gpuRIR.simulateRIR(room_sz, beta, pos_traj, pos_rcv, nb_img, Tmax, fs, Tdiff=Tdiff, orV_rcv=orV_rcv, mic_pattern=mic_pattern)
    filtered_signal = gpuRIR.simulateTrajectory(source_signal, RIRs)
    data = filtered_signal.astype(np.int16)
    wavfile.write(f"no_normal_{i}.wav",fs,data)

    max_value = np.max(np.abs(filtered_signal))
    print(max_value)
    if global_max < max_value:
        global_max = max_value

    return filtered_signal, global_max

global_max = 0
signals = []

for i, pos_traj in enumerate([pos_traj_1, pos_traj_2, pos_traj_3], start=1):
    filtered_signal, global_max = create(room_sz_4, pos_traj, i, global_max)
    signals.append(filtered_signal)

for i, signal in enumerate(signals, start=1):
    sig1 = signal / global_max * 32767
    wave_data_a = sig1.astype(np.int16)
    wavfile.write(f'normal_{i}.wav', fs, wave_data_a)
    plt.plot(signal)
    plt.title(f"Audio Signal {i}")
    plt.show()

print(f"Global Max Value: {global_max}")

# Printed output:
# 9471 16000
# 429.41535210609436
# 115.81631326675415
# 97.17903812229633
# Global Max Value: 429.41535210609436

https://www.filemail.com/d/odztxamjygunonu

@DavidDiazGuerra
Owner

Hello, I couldn't run your code, but a reverberation time of 0.05 seconds is probably physically impossible in such a big room. I'm not sure whether requesting it might generate artifacts; you're probably getting RIRs that contain only the direct path. Can you send me the audio you're using as the source signal so I can run the code?
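To illustrate why such a short reverberation time is implausible, here is a rough check using the classic Sabine formula, T60 = 0.161 · V / (S · α), solved for the average absorption coefficient α. This is a sketch under stated assumptions: the helper names `sabine_absorption` and `min_sabine_t60` are hypothetical, and gpuRIR's own `beta_SabineEstimation` may differ in its details.

```python
def sabine_absorption(room_sz, t60):
    """Average absorption coefficient that Sabine's formula would need
    to reach `t60` seconds in a shoebox room of size `room_sz` (meters)."""
    lx, ly, lz = room_sz
    volume = lx * ly * lz
    surface = 2.0 * (lx * ly + lx * lz + ly * lz)
    return 0.161 * volume / (surface * t60)


def min_sabine_t60(room_sz):
    """Shortest T60 Sabine's formula allows, i.e. with total absorption (alpha = 1)."""
    return sabine_absorption(room_sz, 1.0)


room_sz = [20, 20, 20]  # room size used in the code posted in this thread
alpha = sabine_absorption(room_sz, 0.05)
print(f"required absorption: {alpha:.2f}")            # values above 1 are not physical
print(f"minimum Sabine T60: {min_sabine_t60(room_sz):.3f} s")
```

For the 20 m cube, the required absorption coefficient is roughly 10.7, far above the physical limit of 1, and the shortest T60 that Sabine's formula allows is about 0.54 s. Under this estimate, T60 values of 0.5 s or more would be a more plausible request for a room of that size.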
