Realtime Audio Normalisation


Haiku could benefit from an automatic real-time audio normalization system (i.e. automatic volume adjustment) at the mixer level, applied on a per-channel basis. It could use adjustable normalization schemes tailored to particular scenarios or channel purposes.


Real-time audio normalization could be implemented in Haiku's system mixer. The system would boost or soften sounds produced throughout the system to suit the listener's tastes or the current conditions. For example, a great deal of digital media is recorded at extremely low sound levels, yet continually adjusting the volume of the player application is impractical. With this system in place, that media's volume could be scaled to an acceptable level without affecting other audio applications in the system, and without requiring user intervention when switching to media that already has appropriate audio levels.

An easily accessible volume range setting, incorporating a maximum volume and a preferred volume, would be ideal. The maximum volume setting should be exactly that: no sound should ever exceed that level, primarily for comfort but also for safety. (A Dr. Dobb's column relates the story of a user who was playing soft music through headphones, received an extremely loud sound clip and was left almost deaf.) The system should keep the volume as close to the preferred level as possible at all times, without ever exceeding the maximum volume.
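The preferred/maximum rule above can be sketched in a few lines. This is only an illustration under stated assumptions: samples are floats in the range [-1.0, 1.0], the "volume" of a block is measured as its RMS level, the maximum is enforced on the peak sample, and the names normalise_block and max_gain are hypothetical rather than part of any Haiku API. A real mixer would also need to smooth gain changes from one block to the next.

```python
import math


def rms(samples):
    """Root-mean-square level of a block of samples in [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalise_block(samples, preferred_level, max_level, max_gain=8.0):
    """Scale a block toward preferred_level (RMS) while keeping every
    sample at or below max_level (peak).

    max_gain is an assumed safety cap so that near-silent blocks are
    not boosted without limit.
    """
    level = rms(samples)
    if level == 0.0:
        return list(samples)  # pure silence: nothing to scale

    # Gain that would bring the block to the preferred level.
    gain = min(preferred_level / level, max_gain)

    # The maximum always wins: reduce the gain if the loudest sample
    # would otherwise exceed it.
    peak = max(abs(s) for s in samples)
    if peak * gain > max_level:
        gain = max_level / peak

    return [s * gain for s in samples]
```

With preferred_level=0.2 and max_level=0.8, a quiet block is boosted and a loud block is softened toward 0.2, while a mostly quiet block containing a single 0.5 spike is boosted less so that the spike lands on the 0.8 ceiling instead of above it.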

While this basic system would perform reasonably well, normalization schemes would give greater control over sound levels. A scheme would encapsulate settings such as delay times, how quickly volumes scale and by how much. Schemes could then be saved and applied to different situations. For example, there could be one for watching video that leaves sounds such as a gun crack in a suspenseful scene unaltered, another for listening to music, and another for TV designed to soften loud commercials. Schemes would be applied on a per-channel basis, possibly falling back to a "default scheme" unless otherwise specified. Direct manipulation of schemes should not be necessary to use the system; it should be provided as a means for advanced users to fine-tune rather than as a general tool.
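To make the idea of a scheme more concrete, here is one possible shape for it. Everything in this sketch is an assumption made for illustration: the field names, the example values, the channel names and the GainFollower helper are hypothetical. It models the delay as a number of audio blocks and the scaling speed as the fraction of the remaining gain gap closed per block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    name: str
    delay_blocks: int  # blocks the target must differ before the gain moves
    rate: float        # fraction of the gain gap closed per block (0..1)
    min_gain: float    # strongest cut the scheme allows
    max_gain: float    # strongest boost the scheme allows


# Example schemes; the numbers are placeholders, not tuned values.
MOVIE = Scheme("movie", delay_blocks=20, rate=0.05, min_gain=0.5, max_gain=2.0)
MUSIC = Scheme("music", delay_blocks=5, rate=0.1, min_gain=0.25, max_gain=4.0)
TV = Scheme("tv", delay_blocks=0, rate=0.5, min_gain=0.1, max_gain=4.0)

DEFAULT_SCHEME = MUSIC
channel_schemes = {"media_player": MOVIE, "tv_tuner": TV}


def scheme_for(channel_name):
    """Per-channel lookup that falls back to the default scheme."""
    return channel_schemes.get(channel_name, DEFAULT_SCHEME)


class GainFollower:
    """Moves a channel's gain toward a target as its scheme dictates."""

    def __init__(self, scheme):
        self.scheme = scheme
        self.gain = 1.0
        self._pending = 0  # consecutive blocks with a differing target

    def update(self, target_gain):
        s = self.scheme
        target = min(max(target_gain, s.min_gain), s.max_gain)
        if abs(target - self.gain) < 1e-9:
            self._pending = 0
            return self.gain
        self._pending += 1
        if self._pending > s.delay_blocks:
            self.gain += (target - self.gain) * s.rate
        return self.gain
```

With these placeholder values, the "movie" scheme ignores a short burst such as a gun crack because the gain does not move until the target has differed for more than 20 blocks, whereas the "tv" scheme reacts immediately and halves the remaining gap on every block, which suits a sudden loud commercial.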

Saving much of this information in file attributes looks promising. It could be quite effective to store, for example, a normalization scheme as an attribute of each file in a user's music collection, giving each file its own customized normalization. The same could of course be done for all media, and extended to audio applications by saving the last normalization settings as attributes of the binary.

For system alerts and similar sounds, a dynamic normalization scheme could make use of information such as the current average volume. That way, sounds from sources such as system alerts would not greatly disrupt the user.
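One way to read "dynamic" here is that the alert is played at roughly the level the user is already hearing. The sketch below assumes the mixer keeps an exponential moving average of recent output RMS; the AverageLevel class, the alert_gain function, and the floor_level and max_level parameters are all hypothetical names and values chosen for illustration.

```python
import math


class AverageLevel:
    """Exponential moving average of the RMS level of recent output."""

    def __init__(self, smoothing=0.1):
        self.smoothing = smoothing  # weight given to the newest block
        self.value = 0.0

    def feed(self, samples):
        if samples:
            level = math.sqrt(sum(s * s for s in samples) / len(samples))
            self.value += (level - self.value) * self.smoothing
        return self.value


def alert_gain(alert_samples, average_level, floor_level=0.05, max_level=0.8):
    """Gain that plays an alert at about the current average level.

    floor_level keeps alerts audible when the system has been silent;
    max_level is the hard ceiling that no sample may exceed.
    """
    if not alert_samples:
        return 1.0
    alert_rms = math.sqrt(sum(s * s for s in alert_samples) / len(alert_samples))
    if alert_rms == 0.0:
        return 1.0
    target = max(average_level, floor_level)
    peak = max(abs(s) for s in alert_samples)
    return min(target / alert_rms, max_level / peak)
```

If the user has been listening to quiet music at an average level of about 0.1, an alert recorded at an RMS of 0.5 would be played at a gain of about 0.2 instead of at full strength.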

One way to implement such a system would be to use VST plugins at the mixer level. With a preset number of plugin slots per channel, as well as for the mixer outputs, it would be quite easy to develop a normalization system with support from additional Haiku kits.
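The arrangement described above, with a chain of effects on each channel and another on the mixed output, can be sketched as follows. This does not use the VST API or the Haiku Media Kit: plain Python callables stand in for plugins, and the function names are hypothetical. The hard-clipping limiter is a deliberately crude stand-in for a proper output stage.

```python
def apply_chain(samples, chain):
    """Run a block of samples through each effect in order."""
    for effect in chain:
        samples = effect(samples)
    return samples


def make_gain(gain):
    """Stand-in for a per-channel normalization plugin."""
    return lambda samples: [s * gain for s in samples]


def make_limiter(max_level):
    """Stand-in for an output plugin that enforces the maximum volume."""
    return lambda samples: [max(-max_level, min(max_level, s)) for s in samples]


def mix(channels, output_chain):
    """Mix equal-length blocks, one per channel.

    channels is a list of (samples, chain) pairs: each channel is
    processed by its own chain, the results are summed frame by frame,
    and the sum then passes through the output chain.
    """
    processed = [apply_chain(samples, chain) for samples, chain in channels]
    mixed = [sum(frame) for frame in zip(*processed)]
    return apply_chain(mixed, output_chain)
```

For example, mixing a quiet channel that has a gain effect with an untouched loud channel, followed by a limiter at 0.8 on the output, boosts only the quiet channel while the combined signal never leaves the range [-0.8, 0.8].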