Feature Request: Offer audio normalization filter #45
Comments
As a heads up, this is currently not a high-priority thing for me to figure out as a standalone option in the GUI, because it should already be possible to add audio track changes and filters via the "Custom ffmpeg options". Filters are based on the output stream number, which can be found beside the audio tracks at their start, in the format "incoming_stream_#:outgoing_stream_#". In the above example the Stereo track outdex is … Though for loudnorm specifically, it looks like dual pass is highly recommended for files, and the ffmpeg-normalize tool can be used for that.
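For illustration only (filenames and the stream index are made up, not taken from this thread), a filter applied to a single output audio stream via custom ffmpeg options might look roughly like this:

```sh
# Hypothetical example: run loudnorm on the first output audio stream only,
# leaving video untouched; adjust the a:0 index to match the target outdex.
ffmpeg -i input.mkv -map 0 \
  -c:v copy \
  -filter:a:0 loudnorm -c:a:0 aac \
  output.mkv
```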
Yeah, I totally agree. My intention behind this feature request was that FastFlix in its current state tempts you (I was actually thinking of newbies) to do downmixing without any clipping filters, and that is a really, really bad idea. Maybe a temporary solution could be to display a small hint when selecting downmixing, like "We discourage downmixing without a clipping filter". Otherwise there is the danger of people messing up their media library because they just didn't know better. (Somewhat like people transcoding HDR stuff with HandBrake's 10-bit encoder. Quite a similar pitfall.)
Interesting, I thought FFmpeg's built-in down-mixing was considered good by default / adheres to the ATSC standards? https://trac.ffmpeg.org/wiki/AudioChannelManipulation
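For reference, simply requesting a stereo output is enough to trigger ffmpeg's built-in default downmix coefficients; a minimal sketch (filenames are placeholders):

```sh
# ffmpeg picks its default downmix matrix when you just ask for two channels.
ffmpeg -i surround_input.mkv -c:v copy -ac 2 -c:a aac stereo_output.mkv
```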
Hmm, I'm a bit confused. I remember that I've read this page too, and I agree it says they stick to the ATSC standard. The standard says that it prevents any overflow, so no clipping. I'm not certain if I just got it wrong or if there was another reason for me to stick to that normalization filter. In that case, it seems to be not as important as I said in my last comment.
If you find more info please send it my way, I don't know much on the audio side of things and want to provide the best / safest defaults possible 😄
Please check out these interesting discussions about "correct" FFmpeg stereo downmixing (and related approaches):
Hope that helps.
So maybe instead of some advanced filter system / page, it might be easier to just add a few more downmix options / filters and allow the user to generate their own and save them? For example, we could add the nightmode stereo, stereo with LFE, and a 2.1 ("nightmode" stereo + LFE). Then have a list in the config file with those as examples and allow more to be added?
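As a rough sketch of what such a saved downmix entry could look like (the 0.30 coefficients and the BL/BR channel names are illustrative assumptions, not values taken from this thread or from any standard):

```sh
# Sketch of a "nightmode"-style stereo downmix via the pan filter.
# Coefficients are placeholders; sources decoded to a 5.1(side) layout
# would use SL/SR instead of BL/BR.
ffmpeg -i input.mkv -c:v copy \
  -af "pan=stereo|FL=FC+0.30*FL+0.30*BL|FR=FC+0.30*FR+0.30*BR" \
  -c:a aac output.mkv
```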
Sorry, closed by mistake lol
After some quick experimentation, the custom filters will probably work for stereo only. "2.1" is a bad idea: codecs like AAC don't have a notion of "2.1", so if you do a pan filter to 2.1 with FL+FC+LFE, it seems to actually map those (in order) to C+FL+FR.
Agreed, downmix to (HQ) stereo is the way to go.
OK, here are tebasuna51's suggestions about (proper) multichannel audio downmixing with FFmpeg:
Hope that inspires! Source: Audio encoding @ Doom9's Forum
It would be extremely interesting (and almost unique) to implement a "replaygain-based" normalization. Some interesting resources: (python) Hope that inspires!
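One possibly relevant building block: stock ffmpeg ships a replaygain scanner filter that passes the audio through unchanged and prints track_gain / track_peak at the end of the run, which could cover the measurement half of such an approach (the filename is a placeholder):

```sh
# Measurement only: no output file is written, the gain/peak values are logged.
ffmpeg -i input.flac -af replaygain -f null -
```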
Bump. Just found some interesting info about Audacity's approach...
Maybe it's possible to implement those through FFmpeg... EDIT:
EDIT2: EDIT3:
Hope that inspires!
Thanks for the mention. The only thing I want to leave here is that 2-pass normalization is the preferred way of doing it. Essentially you just parse the first-pass output and apply it to the second pass. The complex part is handling multiple audio streams, channel layouts (as mentioned above), metadata, etc., and the fact that not all files and settings will yield nice results. Since this project uses Python, it may be a simple call to …
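For context, a typical ffmpeg-normalize invocation is a one-liner along these lines (flags and filenames are illustrative; the current option set is documented in that project):

```sh
# Two-pass EBU R128 normalization handled entirely by ffmpeg-normalize.
ffmpeg-normalize input.mkv -o normalized.mkv -c:a aac -b:a 192k
```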
Thanks for the mention here! Just want to point out that my commands are a mess. :p The preferred way is to use 2-pass normalization (as @slhck mentioned). In my case, FFnorm uses a simpler audio gain command for faster/simpler manual normalization (I couldn't get 2-pass normalization to work as fast), as it mainly focuses on normalizing hundreds of media files; the other stuff is just me trying to preserve as much metadata as possible. Also, please don't use … In the actual code I use either of them, depending on the situation. Sorry if I made any mistakes, I'm still a noob when it comes to FFmpeg commands.
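Not the actual FFnorm command, but as a hedged sketch of the simpler gain-based pattern described here: measure once with volumedetect, then apply a fixed gain with the volume filter (the 5 dB value is a placeholder):

```sh
# Pass 1: print mean_volume / max_volume to the log, write nothing.
ffmpeg -i input.mkv -af volumedetect -f null -
# Pass 2: apply a gain chosen from that measurement (placeholder value).
ffmpeg -i input.mkv -c:v copy -af "volume=5dB" -c:a aac output.mkv
```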
Hi everyone, thanks for all the clarifications you've brought here, I think they will be useful for @cdgriffith's software. First of all: I believe it's clear to all that normalization should be performed only AFTER the channel downmixing. According to @cpuimage's FFmpeg_Loudness Overview:
So I would choose it by default (the first pass could be performed by default if the source is stereo, or right after the downmix process, even without asking users), but, for normalization algorithms, the most reasonable approach might be implementing only some "commonly used" ones (as with channel downmixing). Here are some standards listed by @nlebedevinc in his Audio normalization git:
Some have already been implemented (as XML) by @hz37 in his r128v3 project. Hope this knowledge exchange will benefit all projects. OT
That quoted explanation (and the entire repository) seems to have been created by ChatGPT, so take it with a grain of salt. It's at least a bit misleading in terms of what it counts as different "standards": ITU-R BS.1770 is simply what defines LUFS and common target levels, and EBU R 128 builds on BS.1770 for its underlying measurements. The de-facto industry standard is R 128.
You're probably right; anyway, it gives an idea of how many approaches to audio normalization have been developed.
...does this mean that the 1st pass (detection/measurement) can be the same for all?
I see. Thanks for your efforts! Note: there's another interesting command-line helper for performing audio loudness normalization with ffmpeg's loudnorm audio filter that aims to be a simpler alternative to the ffmpeg-normalize Python script, because it performs the loudness scanning pass of the given file and outputs the string of desired loudnorm options to be included in the ffmpeg arguments (by @indiscipline).
All of them, ideally, because users may want to choose one over the other for particular use cases.
This one is quite simple, yes, and should do the job, but it does not check for the various edge cases in terms of option thresholds etc. Essentially it's similar to the ffmpeg-supplied one: https://github.com/FFmpeg/FFmpeg/blob/master/tools/normalize.py
No, I meant Peak, RMS, and R 128. Both peak and RMS are super easy to compute and implement (just beware of clipping), and R 128 uses BS.1770 under the hood. (AES41 has nothing to do with normalization; it's a standard for embedding metadata. I guess you can thank ChatGPT for causing the confusion.)
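For reference, all three measurements can be taken with stock ffmpeg filters; a minimal sketch (filenames are placeholders):

```sh
# Peak and RMS levels per channel (volumedetect is a simpler alternative).
ffmpeg -i input.mkv -af astats -f null -
# EBU R 128 (BS.1770-based) integrated loudness and loudness range.
ffmpeg -i input.mkv -af ebur128 -f null -
```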
You're right; here's the AES page about normalization where, in the Normalization Targets section, they refer to the ANSI A/85, EBU R128 and AES71 standards. Anyway, Peak, RMS, and R 128 seem to be sufficient for all. Thanks again.
@slhck: there are two different presets in r128v3... EBU R128:
And EBU R128 conservative:
...do you think both are useful? The other interesting preset approach from @hz37 is the Loudness check one: it switches the normalizer into a loudness checker by just setting the … Dunno if @cdgriffith is interested in implementing a dedicated panel to let users customize all those parameters (it would be great for flexibility, even if almost useless for many), but "injecting" such presets into FF's config file shouldn't be that difficult.
I think it would be fine to keep the default ffmpeg settings and allow the user to override the individual filter values. The XML preset options are also nice but require a different backend as far as I can tell. I am not passionate about this; in any case, I am just weighing in here without really contributing :)
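As a sketch of that idea: to the best of my knowledge loudnorm's defaults are I=-24 LUFS, LRA=7 and TP=-2 dBTP, so an override UI would only need to substitute the user's values into something like this (filenames are placeholders):

```sh
# loudnorm with its default targets written out; a GUI override would just
# swap in the user's I / LRA / TP values here.
ffmpeg -i input.mkv -c:v copy -af "loudnorm=I=-24:LRA=7:TP=-2" -c:a aac output.mkv
```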
Same here (since I haven't coded in the last 25 years), but I think discussing these techniques/approaches is itself a contribution, and not just to FF. Let's wait for @cdgriffith's opinion on it. Thanks
A lot of these options sound cool, but way past any scope I envisioned for FastFlix, I must admit! For options that could be added as audio filters, I am currently working towards being able to add those, with #551 finally giving a dedicated audio conversion popup, where those could live. Things that would need multiple passes would present an issue if the video encode isn't set to two-pass as well, as FastFlix is still heavily video-first (as in how the internal code is driven). If there are ways I can incorporate any of these as check boxes and just drop them into the ffmpeg command (or nvencc if possible), that would be ideal. The best way to help with that is to give straightforward ffmpeg examples (like @Type-Delta's)! Anything requiring a full page is not out of the question from the "could we ever see it in FastFlix?" side, but sadly it is way outside the time I have to put towards this project myself, and would have to be done by other contributors.
First of all, a heartfelt thank you to @cdgriffith, who shows himself to be a developer that carefully listens to the community's needs and opinions. Methods aside, the most important thing about audio normalization is to restrict its application to the "right" signals: it can be done on stereo ones, but not on multichannel ones. About multipass normalization: well, if it doesn't upset the current coding chain too much, the 1st step (analysis/measurement) should be performed BEFORE the video stream encoding, and the 2nd after it.

```mermaid
flowchart TD;
    A[input file audio stream] --> B{is stereo ?};
    B -- Yes --> C[audio analysis - 1st pass];
    B -- No --> D[perform downmix];
    D --> B;
    C --> E[video encoding];
    E --> F[audio normalization - 2nd pass];
    F --> G[audio encoding];
    G --> H[mux a/v streams];
```

(note: flowcharting through markup is fun!) Last but not least, yesterday I "coded" (with ChatGPT's help) this batch script, which performs FFmpeg's 2-pass audio normalization correctly:
(note 1: the -drc_scale 0 parameter is only there for better, i.e. full-range, AC3 decoding, so FFmpeg automatically skips it for other audio codecs.) Let me know, in the end, if you need me to modify it to replicate the flowchart above (video coding aside). Hope that helps/inspires! EDIT: an interesting (and simple) page about Audio normalization with FFmpeg
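The batch script itself isn't reproduced above; purely as a hedged illustration of the general two-pass loudnorm pattern it describes (targets, filenames and the measured numbers below are placeholders, not the author's script):

```sh
# Pass 1: measurement only; the JSON block printed at the end holds the values
# needed to make pass 2 run in linear mode.
ffmpeg -i input.mkv -map 0:a:0 \
  -af "loudnorm=I=-23:TP=-1:LRA=7:print_format=json" -f null -

# Pass 2: feed the measured values back in (the numbers here are placeholders).
ffmpeg -i input.mkv -c:v copy \
  -af "loudnorm=I=-23:TP=-1:LRA=7:measured_I=-19.5:measured_TP=-0.3:measured_LRA=11.0:measured_thresh=-29.7:offset=0.2:linear=true" \
  -c:a ac3 -b:a 448k output.mkv
```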
Not quite sure what you mean by
Oh, the ffmpeg script has been recently updated (after more than 10 years); I haven't seen the new version yet. It's still not as user-friendly as ffmpeg-loudnorm-helper and has most options hardcoded. It also performs the conversion itself, which goes against the separation-of-concerns principle and hardly permits integrating it into dynamic workflows. The script is more of a usage example than a tool. Regarding the script above: there's no point in creating a temporary …

Regarding the flowchart, I suppose the correct approach would be to process audio in parallel. Moreover, it would be convenient for a proper user-facing tool to perform the loudness measurement in the background as soon as the option is selected, as it's not exactly a fast process. The results can be cached, as they won't change unless the input file changes.

ffmpeg's loudnorm filter is rather capricious and falls back to dynamic normalization quickly, which is often undesirable, as it changes the dynamics of the audio and may introduce unwanted loudness fluctuations around drastic changes in loudness in the source material. On the other hand, it usually performs well enough, and having a standard tool with more or less predictable behaviour is better than learning the nuances and corner cases of a novel reimplementation. Since FastFlix seems to be built around ffmpeg, it's only natural to stick to its normalization capabilities.
This is what I was referring to. Basically trying to detect when it would fall back to dynamic mode based on the measured variables and user-set thresholds, and at least warning the user that dynamic mode will be used. It's definitely not required and complicates the code, but some users have asked for it.
In this case you weren't completely correct, as ffmpeg-loudnorm-helper does check that, although the check is as basic as it gets. It just computes the available threshold (the delta between the current and requested TP levels) and sees if there's enough headroom for the requested linear Integrated Loudness change. It does warn the user in case there's no headroom, but I haven't done any rigorous testing to determine whether it catches all the cases of ffmpeg switching to Dynamic Normalization. The way to make it solid is to just copy the way ffmpeg itself triggers the switch.
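A minimal sketch of that kind of headroom check, under the assumption that the measured values come from a loudnorm print_format=json first pass (variable names and numbers are placeholders, not ffmpeg-loudnorm-helper's actual code):

```sh
measured_I=-19.5   # placeholder, from the first-pass JSON
measured_TP=-0.3   # placeholder, from the first-pass JSON
target_I=-23
target_TP=-1

gain=$(echo "$target_I - ($measured_I)" | bc -l)        # linear gain needed (dB)
headroom=$(echo "$target_TP - ($measured_TP)" | bc -l)  # true-peak headroom (dB)
if [ "$(echo "$gain > $headroom" | bc)" -eq 1 ]; then
  echo "warning: not enough headroom, loudnorm will likely fall back to dynamic mode"
fi
```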
Hi everybody, just found this (GPL-licensed) standalone audio batch processing software that embeds:
Code: https://sourceforge.net/p/lastar/code/HEAD/tree/ Since I'm not a developer, I honestly don't know if/how it can effectively help, but, as a multimedia content creator/editor/publisher, I find the functions it offers quite interesting (i.e. I would like to see them implemented in encoder GUIs like FastFlix). Hope that inspires.
Since there is an option to downmix audio streams, it should also be possible to apply at least a basic normalization filter in order to prevent clipping, e.g. "-ar {samplingrate} -af loudnorm".