Professional Sound API

Forum thread started by DavieB on Fri, 2010-04-23 15:34

I'm interested to know what API model Haiku will be implementing to allow professional sound applications to be written.

On Windows we have a variety of APIs such as WDM from Microsoft, ASIO and VST/VSTi from Steinberg, and ReWire from Propellerhead. On Apple we have AU and JACK OS X, and on Linux we have quite a few, but the best appear to be JACK and ALSA. There are others from Pro Tools too.

So on Haiku what's it going to be?

A professional-level sound API must be low latency to allow real-time DSP of audio, and must allow audio to be routable between hardware and software, and between software and software, etc.

Obviously the appeal of Haiku is its ultra-fast use of multi-core processing and its heritage coming from BeOS, the media OS.

I'm not too concerned about MIDI, since this appears to be already done and in the API, but what about sound? I noticed OSS has come along, but is this the best on offer? I realise that hardware support is needed, and it seems to come with a lot of driver support. But what I am talking about is professional-level implementations for I/O, routing and DSP processing.

So what are the long-term goals for getting audio up to scratch in Haiku? Will this become part of Haiku for professional sound and have its own native API?

Comments

Re: Professional Sound API

Haiku implements the BeOS API.

Ten years ago BeOS provided extremely low-latency capabilities (a well-designed API combined with a good soft-realtime kernel), much better than Windows, Linux or even the Mac.
Haiku aims for this goal.

The BeOS API documentation for media processing can be found here:
www.haiku-os.org/legacy-docs/bebook/TheMediaKit_Overview.html

...and you should have a look at Haiku's Cortex demo program. It shows media processing concepts very well. (IMHO)

Note that the OSS API is not Haiku's native audio API. OSS is just a fall-back solution for missing drivers (and so provides good hardware support for a wide range of sound devices).

Re: Professional Sound API

I've just had a quick glance at the media kit from the Be Book. This seems to have everything you would need for developing professional sound-production programs. Has this media kit actually been implemented in Haiku? Does Haiku come with this media kit? I presume it does. Is this Be Book documentation the same as for Haiku's media kit?

Re: Professional Sound API

DavieB wrote:

Has this media kit actually been implemented in Haiku? Does Haiku come with this media kit? I presume it does.

Yes, it has been implemented and it is certainly in Haiku. I do not know whether it is already feature complete though (it might be). But being alpha software, it will most likely still have some bugs.

You may be interested to know that, back in the day, Tascam used BeOS for some of their systems. :)

Re: Professional Sound API

Yes, I am aware that a lot of professional hardware solutions were built on top of BeOS. Radar is another highly regarded solution. Obviously these companies would have utilised the media kit that came with BeOS to produce their software. I wonder if they will release their recording solutions for Haiku as native solutions?

I have just downloaded the CD image and am running it now. It appears to be fine and I'm really impressed with the Cortex program. This is exactly what I was hoping would be available; obviously this shows routing possibilities and I noticed a few nice touches such as an example FlangerNode.

I only briefly read through the documentation for the media kit and MIDI kit, but as far as I can see the API seems complete, which is good, since it would be crazy to have this and have people bring in their own media frameworks such as VST etc. So it's "nodes" then? Cortex is essentially Jack on Linux.

Re: Professional Sound API

DavieB wrote:

Cortex is essentially Jack on Linux.

I would rather say that Jack on Linux attempts to be like Cortex on BeOS/Haiku, but I am obviously biased. :)

I personally know of at least one company that is eagerly waiting for R1 to be released so that they can migrate their professional audio system(s) from BeOS to Haiku. I would not be surprised if more follow suit.

Re: Professional Sound API

Well, I would suggest that Tascam and iZ (Radar) might be keen on releasing their hardware solutions for Haiku, since they probably used the media and MIDI kits to develop them for BeOS in the first place. I don't know, of course, but it makes sense. Would it be a good idea to contact them and suggest they release their software DAWs as Haiku-native solutions? They don't have to be open source, of course; it's just better if a professional supplier of DAW solutions adopts Haiku early on. Maybe someone at Haiku could contact them about this?

Re: Professional Sound API

While I'm not sure these professional companies have much interest in going over 10-year-old code - they moved on long ago - it won't hurt to mail them about Haiku. And "someone at Haiku" could as well be you. :)
This is an open, community-driven project; everyone can help. In this case, I even think an email from a user in the audio field, showing interest in Haiku and some specialized software, may be even better. If it's done well by more than one person, it may even have an impact... :)

Regards,
Humdinger

Re: Professional Sound API

Don't know if you have seen this write-up of Cortex, but you might be interested in the VST plug-in information near the bottom, and how VST plug-ins and the media kit can work together:

http://betips.net/1997/09/09/fun-with-cortex/

Re: Professional Sound API

Humdinger wrote:

While I'm not sure these professional companies have much interest in going over 10-year-old code - they moved on long ago...

The one that I was referring to has not moved on and still uses BeOS for their systems. It's not TuneTracker that I am referring to, btw. :)

Re: Professional Sound API

I suspected as much. TT isn't actually the kind of audio application I had in mind. TT is more of an automation software that schedules the playback of audio files with some live mixing. Cool stuff, but not in the same boat as iZ/Tascam with their DAWs or any other audio-manipulating app.
TT has been running on BeOS for the past 10 years. Migrating to Haiku should be relatively easy and is a sensible step to stay in business. Others would have to resurrect code that hasn't been touched for 10 years, while their money-making apps have evolved on other OSes.

Regards,
Humdinger

Re: Professional Sound API

Humdinger wrote:

I suspected as much...

You misunderstood what I said. As I said, I was NOT referring to TuneTracker. Also, as I said, the company I was referring to is still using BeOS for its product NOW; that is, they have NOT moved on to another OS, and thus DO NOT need to resurrect old code to migrate to Haiku.

I may add that the company's products are professional audio systems for studios, the film industry, live performances, etc. Just thought I would clarify.

Re: Professional Sound API

Oops, missed the "not"...
Is there a reason not to name the company?

Re: Professional Sound API

This is of interest and is relevant.

Ben Loftis at Harrison was involved in the creation of MixBus, which is a Harrison product for OS X based on Ardour. What's interesting is that he wrote BeSting and IKIS, which were originally written for BeOS.

http://www.benloftis.com/portfolio.html

He'd be a great person to get involved in developing software for Haiku, if he isn't already involved, that is. I sent him a quick email, but the developers should definitely contact him; he'd be a great person to get behind this.

Re: Professional Sound API

What about latency issues with the MediaServer?

Anyway, I'm in the process of porting (S)LV2, LADSPA, DSSI and other sound-processing-related stuff to Haiku; I'll probably release binaries in June or so (the LADSPA and DSSI SDKs and some plugins are already working, but I have no time right now - damn university).

Re: Professional Sound API

When you say you are porting sound-related stuff to Haiku, what specifically do you mean? LV2, LADSPA etc. are programming kits for making plug-ins on Linux. On Haiku this is not required, since the API for writing plug-ins is in the media and MIDI kits. Having VST, ASIO, AU, LV2, LADSPA etc. on Haiku is just not needed. It's also going to mean that the sound infrastructure will become messy. Short term, to allow things to run, then yeah, good idea, but long term it's a big no-no in my opinion.

I am interested to know what you are actually porting though :)

Re: Professional Sound API

He is probably porting just the APIs for now, and will probably make media node adapters that allow you to use the plugins written against those APIs in the media kit. This does not really make anything messy, but allows the use of other plugins in Haiku, thus avoiding rewriting them unless they are really desired... It would be nice to have an open-source audio processing workstation on Haiku, but that will take time and will need someone to be dedicated.

Re: Professional Sound API

I think that short term, the idea of porting various third-party APIs to allow ported plug-ins and other software to run is OK, but long term it would be better to have native low-level API "add-ons" - is this the correct term for plug-ins in Haiku? I think there is already a VST wrapper for the media kit.

Re: Professional Sound API

TechnoMancer wrote:

He is probably porting just the APIs for now, and will probably make media node adapters that allow you to use the plugins written against those APIs in the media kit. This does not really make anything messy, but allows the use of other plugins in Haiku, thus avoiding rewriting them unless they are really desired... It would be nice to have an open-source audio processing workstation on Haiku, but that will take time and will need someone to be dedicated.

Exactly.

To be more precise, I'm porting the LADSPA and DSSI SDKs (DSSI also needs libdssialsacompat, and that one is already done too), SLV2, some plugins and other stuff I wrote, including a compiler for a DSP language (it compiles to LV2 plugin source code), a LADSPA-to-LV2 bridge and a DSSI-to-LV2 bridge (a VST-to-LV2 bridge will come too, sooner or later), so that there only needs to be an LV2 media kit add-on (and this will probably be done by another guy from the Italian Haiku user group).

Regarding creating a Haiku-specific API, I would rather suggest the use of LV2: it's simple, decentralized, powerful and, above all, not UNIX-specific and extremely extensible.

Re: Professional Sound API

The Media Kit is already a generic API for audio and video.
I don't mind having LV2 as an API for plug-ins, since it means they can be portable and we can use ones not expressly written for Haiku. I would suggest that anything meant only for Haiku should use the Media Kit and make media nodes for the filters and other things that it provides. I would really, really like someone to make Cortex better, and to add more programs that can be used together as an audio processing and composition workstation - not that I have a need for such a thing. :/

Re: Professional Sound API

A plug-in on Haiku is simply a media kit "node". There is no requirement for a sound API for plug-ins on Haiku, it's all provided for by the media kit.

Re: Professional Sound API

While this is true, it is also good to let people use their favourite plugins from other platforms as well. We do not require a new API; however, if something must be portable, or we are using a filter written elsewhere, then an API like LV2 is not a bad idea.

Re: Professional Sound API

I suppose if someone has a plug-in they want to use then having a wrapper to the media API is ok. But will these plug-ins ever become native to Haiku? Will they be able to truly latch into the API and run as efficiently as possible? What will the latency be on a wrapped LV2 plug-in against the latency on a correctly implemented media kit node?

Re: Professional Sound API

DavieB wrote:

But will these plug-ins ever become native to Haiku?

Short answer: yes. Long answer: the LV2 core specification does not contain anything platform-specific; some LV2 extensions do (for example GtkUI), but those can easily be replaced by Haiku-specific extensions. There's no technical reason, at least, preventing LV2 plugins from being as "native" as you want them to be.

DavieB wrote:

Will they be able to truly latch into the API and run as efficiently as possible?

Yes, the only overhead is an extra function call each time the audio buffer(s) is (are) processed, and this is negligible.

DavieB wrote:

What will the latency be on a wrapped LV2 plug-in against the latency on a correctly implemented media kit node?

I'm not familiar with the media kit API, but I wrapped several sound processing APIs and I never had any extra latency - on the contrary, the media server introduces latency for each node connected in a chain-like fashion, so the bottleneck in this case is actually in the way Haiku handles inter-application audio data routing.

Re: Professional Sound API

zanga wrote:

I'm not familiar with the media kit API, but I wrapped several sound processing APIs and I never had any extra latency - on the contrary, the media server introduces latency for each node connected in a chain-like fashion, so the bottleneck in this case is actually in the way Haiku handles inter-application audio data routing.

How does the Media Server compare with JACK in terms of latency? I tried JACK on Windows recently and it was atrocious - not surprising on Windows, really. The model on Windows is essentially ASIO, which is a direct connection to the audio driver - not really a media server as such, but very low latency on a capable machine.

The media server is more like a way of patching nodes together, a node being anything you like: a physical audio input, a software "plug-in", a mixer, etc. It seems to me that the media API is in fact just one big audio patch-bay/mixer - a virtual mixing desk. So if you know anything about mixing desks, you will know what the media API can provide. This is kind of what I saw in Cortex: the media API being one big mixer.

So as far as latency goes, it's going to be a trade-off. If you want a mixer/media server then you are going to have latency, no question; this is true in the real world of pro-level mixing desks too, albeit very low. But if you want hyper-fast I/O then you don't get the mixer, just a driver to your hardware: super low latency without the patching of nodes.

I like the idea of having the virtual mixing desk media server thing as long as it's fast enough. How fast is the media server? Anyone tried chaining together some heavy DSP plug-ins?

Re: Professional Sound API

DavieB wrote:
zanga wrote:

I'm not familiar with the media kit API, but I wrapped several sound processing APIs and I never had any extra latency - on the contrary, the media server introduces latency for each node connected in a chain-like fashion, so the bottleneck in this case is actually in the way Haiku handles inter-application audio data routing.

How does the Media Server compare with JACK in terms of latency? I tried JACK on Windows recently and it was atrocious - not surprising on Windows, really. The model on Windows is essentially ASIO, which is a direct connection to the audio driver - not really a media server as such, but very low latency on a capable machine.

The media server is more like a way of patching nodes together, a node being anything you like: a physical audio input, a software "plug-in", a mixer, etc. It seems to me that the media API is in fact just one big audio patch-bay/mixer - a virtual mixing desk. So if you know anything about mixing desks, you will know what the media API can provide. This is kind of what I saw in Cortex: the media API being one big mixer.

So as far as latency goes, it's going to be a trade-off. If you want a mixer/media server then you are going to have latency, no question; this is true in the real world of pro-level mixing desks too, albeit very low. But if you want hyper-fast I/O then you don't get the mixer, just a driver to your hardware: super low latency without the patching of nodes.

I like the idea of having the virtual mixing desk media server thing as long as it's fast enough. How fast is the media server? Anyone tried chaining together some heavy DSP plug-ins?

I think you are maybe confusing latency and throughput here.

Very roughly speaking, latency is the amount of time it takes for a system to produce the output corresponding to a given input, while throughput is the "processing power" of a system. In other words, in audio, latency is the (unwanted) delay due to various kinds of buffering (hardware and software) and unavoidable algorithmic delays (access to future input data is simulated by adding delay and processing current data as if it were past data), while throughput is how much stuff you can process at the same time.

In practice, latency and throughput are competing concerns, so you need to make tradeoffs.

Now, AFAIK, the Haiku media server exchanges data among directly connected nodes at each "cycle", while JACK executes all the nodes taking their dependencies into account. This means that if I have something like input -> A -> B -> C -> output, in the Haiku media server it goes like:

t=1: input -> A
t=2: A -> B
t=3: B -> C
t=4: C -> output

while in JACK it would be:

t=1: input -> A -> B -> C -> output

So the latency introduced by the Haiku media server is in this case 4 - 1 = 3, while in JACK it is always 0.

This, of course, comes at the expense of throughput. If you use ALSA directly on Linux you have better throughput than when using JACK, no doubt about that, but the total latency (hardware + software) is exactly the same.

Re: Professional Sound API

Your analysis seems a bit off... JACK is like a single-cycle CPU with low throughput: to the user/programmer an instruction takes one cycle, but the cycle must be much longer. Yes, it has zero-cycle latency, but the cycles are much longer, hence the low throughput. No CPUs are designed this way anymore, because it's a bad idea.

If Haiku operates as you describe, it has an audio pipeline which spreads out the latency between each step, instead of delaying the whole pipe while one chunk goes through, as you described JACK doing.

A pipelined audio system would, IMO, have better throughput with minimal degradation in latency, especially on SMP machines, which are widespread at this point and were obtainable back in the BeOS days, though a bit more expensive. In fact, the improved throughput would probably have the effect of reducing system load, thereby reducing latencies... assuming high-load audio processing. If you assume lower load, JACK should be able to keep up, but will reach its limits faster. From this perspective I see a pipelined audio kit as more robust.

I have no idea how these are actually implemented, but your description doesn't make much sense to me. Maybe you can set *my* thinking straight if I have missed something or other.

t=1: input -> A
t=2: A -> B
t=3: B -> C
t=4: C -> output

while in JACK it would be:

t=1: input -> A -> B -> C -> output

While your diagram is OK, you have to remember that t isn't the same in both of those; most likely JACK's t will be very large and Haiku's t very small.

PS: I think the system referred to earlier is RADAR V... which I believe can still run BeOS as an option (going from what I have read; I've never seen one myself).

Re: Professional Sound API

Radar is a hardware DAW built on top of BeOS, yes.

Can we make the example a bit more accessible please?

What does "t" represent? Is it a variable? Does it mean "thread"?

input - I understand this; it could be anything from a file playing to a signal coming in from a hardware sound card.

A, B and C - are these DSP processes, such as a "gain control" or an "EQ"?

output - again, this is the end of the path; it could be a file or a physical output on the sound card.

So correct me if I am wrong.

Jack

soundcard -> EQ -> Chorus -> Delay -> monitors

or

Wav file -> EQ -> Chorus -> Delay -> Wav file

Haiku

soundcard -> EQ
EQ -> Chorus
Chorus -> Delay
Delay -> monitors

???

soundcard ->

start threads
thread1 = EQ -> Chorus -> Delay
thread2 = EQ -> Chorus -> Delay
thread3 = EQ -> Chorus -> Delay
end threads

-> monitors

Feel free to step in at any time, haha...

Re: Professional Sound API

t is a unit of time as the DSP system sees it.
What I believe it means is that in JACK the whole chain is processed in one step, and then it happens again and again over time, but the time step is larger.
In Haiku, it passes the buffers through each node, and then the nodes give them back to be passed on to the next node, and so on. This means that each "time step" is smaller, and the delay of moving the data from one place to another is spread out over the whole chain.
At least that's what I understand :/

Re: Professional Sound API

zanga wrote:

So the latency introduced by the Haiku media server is in this case 4 - 1 = 3, while in JACK it is always 0.

If your buffer length is 5 ms, your latency is at least 5 ms.
Then you have to wait for the scheduler to execute your thread. For this kind of job, a (soft-)realtime-capable scheduler is much better: your thread waits less, and then your buffer can be smaller. This was the main reason for BeOS' excellence.

So I wouldn't say JACK's latency is 0.

Re: Professional Sound API

Oh I see.

This is about how a "unit of time" is being managed by the Haiku media server or Jack.

So in Haiku is the following roughly correct?

A media roster (mr) object manages 4 nodes; 1-input, 2-EQ, 3-chorus and 4-file output

A single cycle is as follows ...

input returns to mr -> mr feeds EQ -> EQ returns to mr -> mr feeds chorus -> chorus returns to mr -> mr feeds file output -> file output returns to mr

... repeat

Re: Professional Sound API

Some of what's been written above is... confused.

LADSPA/LV2 and JACK are complementary. Consider a serious DAW system: you may have twenty plugins running in realtime, and you don't want the overhead (minimal as it is) of a context switch for each plugin, so they run in-process. But also, you may see that you're running short of CPU throughput, and so you decide to "cook" some of the plugins - those whose parameters you haven't really been changing. The same API can be run faster (or indeed slower) than realtime, applying any automation and resulting in an intermediate image of a track, to which any remaining plugins are still applied in realtime.

Now, as to the hypothetical benefit of sort of "sharing" the latency between applications, there are several nasty problems with this approach, although none of them apply to the real world uses of the BeOS media kit, which were mostly to play some music or game sounds, and not to do serious audio processing.

• Synchronisation. In the JACK system all the components are processing the same time period at the same time. A "sharing" scheme screws this up. Of course you can still have an audio clock but there is no well-defined synchronisation between that clock and non-audio events. We can easily imagine such a scheme causing the audible effect of two simultaneous MIDI events to be separated in time, which would be intolerable from a live music point of view.

• Worst case wins. In the JACK system all the work must be scheduled within each period. Most decent audio software is designed with realtime considerations in mind, which means a predictable workload (the workload may vary when parameters are tweaked, but it shouldn't vary for other reasons). A lot of work was put into identifying JACK software that emitted, or was strangled by, denormals - valid but unusual IEEE floating-point values that are often costly to process in hardware. But with the "sharing" scheme, each period must be long enough for the most CPU-intensive of your processes to complete.

E.g. suppose process A takes 5ms for a 2048-frame buffer, process B takes 8ms, and process C takes 22ms. These are very intensive processes; perhaps process C is a convolver with a complex impulse from a cathedral. Or maybe we just aren't using a very powerful computer. Let's further suppose we're working with 48kHz PCM audio, so a 2048-frame buffer equates to just under 43ms of latency.

in JACK, 5ms + 8ms + 22ms = 35ms of work fits in one period, and we incur just 2048 frames of total latency: 43ms

in the hypothetical sharing system (I hope the BeOS media kit is not in fact this terrible) the worst process must fit in one cycle - the worst is 22ms - and we have three cycles of 2048 frames, for a total latency of 128ms

In reality small latencies (< 30ms) are often inevitable anyway, not least because sound travels relatively slowly through air (about one foot per millisecond) - if you perform live you have to get used to the fact that it takes a noticeable amount of time for sound to travel across the stage, and compensate for that. But of course it's appropriate to try to keep this to a minimum - which is why JACK was designed the way it is.

Re: Professional Sound API

OK, I run a Cubase mix and have plug-ins inserted into the mixer, etc. The CPU is being beaten to death, so I "freeze" one or two tracks to free up some CPU throughput. This process of freezing can happen in realtime or offline (faster than realtime).

Sharing latency between applications will be done by a process that must synchronize all the processes to some sort of clock. The media roster, I think, is the class that manages nodes, and it is slaved to some sort of system-level clock; other media roster objects can run in parallel and will also be in sync. I think that the biggest media roster object - or rather, the one with the largest latency - will mean that the other media rosters have to wait for that largest one to finish. At least that's my understanding of it. I'm very new to the media kit, so I'm currently reading the documentation. Serious audio processing seems achievable using the media kit - why do you say it isn't? Can you give some reasons?

Synchronization: I think in the Haiku system the same is also true - there is a master clock that synchronizes the nodes under a media roster object. MIDI can be synchronized to this clock also. Live, MIDI events are recorded in sync with the audio that is being monitored. The monitored audio will have a latency, but when the performer hits a key on their MIDI keyboard there is no latency, and the sound is heard by them in time with the monitored audio; it is recorded exactly in sync with the audio. This is not entirely true for virtual instruments, though.

I don't think the sharing system you describe is in fact how the media kit works. I think the media roster object owns the nodes, acting kind of like a cable between them, and it is slaved to a system-level clock. The biggest media roster object determines total system latency. I don't actually think the latency you describe would be any different; the media roster is just jacking together the nodes.

I think 43ms in Jack is 43ms in Haiku more or less.

Re: Professional Sound API

All media nodes are slaved to a time source; I believe this fixes the sync issue.
There is a system default time source that the media kit uses as the default for nodes, though I believe you can use others.

Re: Professional Sound API

Oh my, what a mess! I'll try to be as clear as possible this time, but this is the last time I try to explain things, since I was only interested in knowing what the developers think about this issue, and I hope they at least know what I am talking about.

So, let's say that we are reading and writing audio buffers from/to the soundcard and each buffer represents x milliseconds of audio.

Now, let's consider a program which reads stuff from the microphone input, processes it and writes it back to the soundcard: every x milliseconds the sound card driver gives you the data corresponding to the last x milliseconds of microphone input (so your input data goes from (now - x milliseconds) to (now)), you process it (let's suppose instantly, for simplicity) and you write it back to the soundcard. Ignoring other latencies, the soundcard can only start to reproduce the processed sound from the moment you sent it, so from (now) to (now + x milliseconds). This means that the output sample corresponding to the input sample at (now - x milliseconds) plays at (now), so we have x milliseconds of latency - which in this case is related only to the buffer size and the sample rate.

Now we move to the case of inter-application audio data routing: we have three independent programs, A, B and C, and a signal flow like: input -> A -> B -> C -> output, which means A takes its input from the sound card and outputs to B, B outputs to C, C outputs to the soundcard.

Once a system like JACK or the Media Server receives data, it lets all nodes process some data; the difference is in which data is actually processed.

AFAIK, the Media Server does not care about processing chains: it just "calls" nodes in arbitrary order, giving them as input the output that the previous node produced in the previous cycle, and keeping their output in memory for the next cycle.

If you do the math, you will discover that each node in a chain introduces a latency equal to the amount of time corresponding to the buffer size (the input of B is the past output of A, and the input of C is the past output of B, which corresponds to A's output from two cycles ago).

In our case, since we have 3 nodes in a chain, the Media Server introduces an additional latency which is three times the time corresponding to the buffer size (so our software latency, excluding algorithmic latencies, is now 3 or 4 times the buffer size in time - I don't know whether the output to the sound card is actually the current or the past output of C).

JACK, instead, analyses the connection graph and basically does a topological sort, so that at each cycle it can give each node the data produced by the previous node in the chain, corresponding to the "global" output that is going to be produced in this whole cycle.

So, in our case, it will execute A, B and C in order, using the current output of A as the input of B and the current output of B as the input of C, which means it introduces no latency at all (total software latency, excluding algorithmic latencies, is just the buffer size in time).

Now let's make some considerations:
1. JACK is probably more resource-intensive than the Media Server, so it steals some more throughput, but I have no measurements so I can't really tell.
2. The number of nodes executed at each cycle by both systems is probably the same (all nodes), unless they have some mechanism which excludes from execution nodes that are not needed to generate the total output (theoretically there are ways to do that in both systems, but in practice it might even be harmful).
3. I/O buffer sizes are the same, or at least comparable, in both systems, since in any system smaller buffer sizes correspond to less latency but also to less throughput (more interrupts or reads/writes, and more code executed, in practice).
4. The threshold for hearing separate sounds is somewhere around 10 ms, which corresponds to buffer sizes of 441 samples at 44.1 kHz, 480 samples at 48 kHz, 960 samples at 96 kHz, etc. In practice the hardware (A/D, D/A, and other stuff) introduces even more latency, so you should typically stay at least below 5 ms of software latency. In JACK you can just stick to that; with the Media Server you would have to further divide your buffer size by the maximum number of chained nodes, which gives you worse throughput.
5. A nasty side effect of introducing latency for each node is the synchronization of different streams, and that IS important if you want graph-like connections, because human hearing can perceive differences between sounds no matter how small they are in time (well, almost, but for practical purposes that's what it is).

Re: Professional Sound API

I don't think I really know what you are talking about because you are being too technical. If you could make it more accessible with analogies to real-world things, such as mixing desks, monitoring, inputs, etc., it would be easier to follow. It's obvious you have specialist skills in the area of LV2, JACK, etc. and have developed for them, but I have absolutely no experience whatsoever in this area of programming. I am a music producer and songwriter, but I work mainly as a professional software developer writing database systems, so I can follow a coding guideline if it's given to me, or some well-commented code examples or well-documented help files. It would be good if you could discuss this with the developers of the media kit. Anyway, trying to make sense of all this ...

Ok, a soundcard's input-to-output latency is the same as the soundcard's buffer size (in time) for a given sample rate.

Now, we have three programs linked in series.

In your example is a "node" a program; A, B or C?

The media server doesn't care about processing chains? I think I know what you are getting at now. On Linux you can string your programs together in series, such as Renoise -> Ardour -> soundcard, and JACK treats this as one thing? How does the media server achieve this? Well, each program A, B and C uses the media roster object. This roster object owns individual "nodes" for FX, inserts, tracks, anything you like. The roster object is aware of a master clock so that they can run in sync. This is fine when things are synced in parallel, but you want to know how the media roster objects run in series. This is an important point. I don't actually know.

Simple high level view of the media kit

Media Kit is the studio.
A "node" is an effect, input, output, track, channel, etc.
A media roster is an application, such as a sequencer, DAW or beatbox program, that can use nodes and can be synced to a studio clock, etc.

So can media rosters be chained together in series? One feeding the next etc?

Re: Professional Sound API

DavieB wrote:

I don't think I really know what you are talking about because you are being too technical. If you could make it more accessible with analogies to real-world things, such as mixing desks, monitoring, inputs, etc., it would be easier to follow. It's obvious you have specialist skills in the area of LV2, JACK, etc. and have developed for them, but I have absolutely no experience whatsoever in this area of programming. I am a music producer and songwriter, but I work mainly as a professional software developer writing database systems, so I can follow a coding guideline if it's given to me, or some well-commented code examples or well-documented help files. It would be good if you could discuss this with the developers of the media kit. Anyway, trying to make sense of all this ...

Ok, let's see if I can make things more clear.

DavieB wrote:

Ok, a soundcard's input-to-output latency is the same as the soundcard's buffer size (in time) for a given sample rate.

Ok.

DavieB wrote:

Now, we have three programs linked in series.

In your example is a "node" a program; A, B or C?

Yes.

DavieB wrote:

The media server doesn't care about processing chains? I think I know what you are getting at now. On Linux you can string your programs together in series, such as Renoise -> Ardour -> soundcard, and JACK treats this as one thing? How does the media server achieve this? Well, each program A, B and C uses the media roster object. This roster object owns individual "nodes" for FX, inserts, tracks, anything you like. The roster object is aware of a master clock so that they can run in sync. This is fine when things are synced in parallel, but you want to know how the media roster objects run in series. This is an important point. I don't actually know.

Simple high level view of the media kit

Media Kit is the studio.
A "node" is an effect, input, output, track, channel, etc.
A media roster is an application, such as a sequencer, DAW or beatbox program, that can use nodes and can be synced to a studio clock, etc.

So can media rosters be chained together in series? One feeding the next etc?

Well, I said I have no experience with the media kit itself, apart from resampling; my source of information is: http://www.haiku-os.org/legacy-docs/bebook/TheMediaKit_Overview_Introduc...

I don't know if media rosters can be chained themselves (doesn't Cortex do that?), but even inside one roster, connecting nodes in series creates latency.

Real-world example: say I have 3 ms of buffering latency and I'm using a guitar rack program which uses a node for each effect. I want to apply five effects in this order: wah, distortion, phaser, flanger, FIR convolver. If each node introduces latency, I have a total of at least 12 ms of latency. If I used JACK to connect individual applications, each doing one of these effects, I would have only the 3 ms of latency.

Why? Because JACK is capable of executing the effects in the "correct" order, while the media kit does not take the order of effects into account. So the media kit has to use past buffers for each effect to avoid processing bad data, and each buffer is 3 ms long.

Result: 12 ms is audible; 3 ms is not.

I hope this is clear enough.

Re: Professional Sound API

I've been going through the documentation too.

I did notice in the documentation that when a media roster links its nodes together, the link itself introduces a latency of 2 microseconds. Is this what you mean?

This is the quote from the media kit http://www.haiku-os.org/legacy-docs/bebook/TheMediaKit_Overview_Introduc...

"Let's consider a case in which three nodes are connected. The first node has a processing latency of 3 microseconds, the second has a processing latency of 2 microseconds, and the last has a processing latency of 1 microsecond.

This example uses the term "microseconds" because the Media Kit measures time in microseconds; however, the latencies used in this example may not be indicative of a real system.

In addition, 2 microseconds is required for buffers to pass from one node to the next. The total latency of this chain of nodes, then, is 3 + 2 + 2 + 2 + 1 = 10 microseconds."

This is kind of making sense to me now, but I don't care about the low-level stuff at the moment. I'm interested in the high-level plan for professional audio on Haiku.

Assuming that the media kit is capable, something you are currently trying to find out, I envisage it this way ...

Haiku is the recording studio.

At the centre of the Haiku sound studio is, yes you guessed it, a mixing desk. If you are working with professional sound applications you need a mixing desk, it is an essential thing to have.

The mixing desk provides I/O, level setting, EQ, compression, and "add-on" FX, etc. It is also where you will find the master clock and transport controls for the studio.

It is the hub of the sound environment in Haiku: DAWs, editors, beatboxes, sequencers and instrument racks all output to the mixing desk. Even the sound from the media player goes to the mixing desk.

There is no situation that cannot be catered for in this design because it comes from a professional studio.

Cortex is the nearest example of what I am thinking about but it's badly done.

What Haiku needs first and foremost is a mixing desk ...

Thoughts please...

Re: Professional Sound API

zanga wrote:

1. JACK is probably more resource intensive than the Media Server, so it steals some more throughput, but I have no measurements so I can't really tell.

The overhead for JACK is really tiny; I'd be very surprised if it's as much as the Media Server's in a like-for-like comparison, let alone more. Where possible JACK uses FIFO scheduling, so the OS scheduler automatically runs the next needed JACK thread(s) whenever there's work to be done, and it uses carefully aligned and wired shared memory so that everything is in RAM and there are zero copies. Probably the biggest source of "overhead" is the choice of single-precision floating point for PCM representation, but if you're doing anything really serious you have to pay that price anyway, so better to pay it only once inside JACK itself, where they have highly tuned conversion code.

You might think the graph ordering problem would add overhead, but JACK's design ensures this can be done outside the tight audio loop where it isn't time critical - and without locking too.

The FIFO scheduler is potentially dangerous (nothing else will be scheduled while a FIFO process is ready to run), but modern Linux lets processes be given a "limited" privilege to run with this scheduler for a finite fraction of time. Because JACK is all about low latency a process that abuses FIFO scheduling to stay running too long is undesirable anyway, so the interests of JACK performance and system stability are aligned.

All this is great - so long as you're doing pro audio. The moment you've got people who don't know about audio writing the software, you lose. Such people aren't going to spend hours analysing their work in cachegrind and they certainly aren't going to worry if their code sometimes incurs a disk seek (typically 10ms each) when it's supposed to be emitting audio. Their approach is fine for listening to music while reading your email, but it's not pro audio, and in JACK what will happen is the xrun counter will start climbing and the program gets kicked out of the graph.

The consensus has long been that we aren't going to get general application developers to "pay their taxes" (to use Raymond Chen's preferred phrase) on pro audio APIs, so you will always want a separate "no compromise" pro audio system for doing real work because the system used by general application developers will have to accept relatively long latencies and unpredictable latency jitter from less than excellent software. From what I have read and understood about it, the Media Kit is the latter type of system.

Re: Professional Sound API

NoHaikuForMe wrote:

The consensus has long been that we aren't going to get general application developers to "pay their taxes" (to use Raymond Chen's preferred phrase) on pro audio APIs, so you will always want a separate "no compromise" pro audio system for doing real work because the system used by general application developers will have to accept relatively long latencies and unpredictable latency jitter from less than excellent software. From what I have read and understood about it, the Media Kit is the latter type of system.

Are you saying the media kit is not up to scratch for pro audio applications? Have I misunderstood?

Re: Professional Sound API

DavieB wrote:

Are you saying the media kit is not up to scratch for pro audio applications? Have I misunderstood?

Well it's always possible to stretch a point. After all, a ZX Spectrum is not what we'd generally consider a pro audio system, but a crowd of paying guests dancing for an hour to sounds coming (in part at least) from a ZX Spectrum is hard to argue with. I'd say if you care about low latency, the Media Kit is probably not a good choice.

But to be fair the primary criterion for most users, whether they do pro audio or just blog about skateboarding is always application availability. If you like Reason then an OS without Reason is no good to you, even if it has really low latency.

Re: Professional Sound API

You haven't really answered the question and your comments seem vague and indirect. If you think the media kit is not up to scratch then give us some reasons.

Let's say for now that the media kit is written for bog-standard audio use and that it sucks for pro audio. The better API is JACK, which is already in use by Ardour and Harrison's MixBus program. It has made its way onto Mac OS X and Windows. This seems to be the best API for "pro" audio.

Re: Professional Sound API

I DO think that using the Media Kit for pro audio (extreme low latency) is not appropriate, because it will inevitably introduce jitter and also because it does not allow direct hardware access, the hardware being hidden behind the Media Kit API (please correct me if I am wrong).

I have some interest in audio myself, and having thought about this before and read the Be Book, I reached these conclusions:

- Using the Media Kit is only for non-time-critical (non-pro-audio) software, such as media players, because of the overhead of context switching and non-optimised sound routing calls (again, correct me if I am wrong).

- Pro audio is similar in some ways to 3D graphics: it needs the fastest access to the device and the best hardware acceleration, so it needs some direct-access API for sound, something like DirectSound, ASIO, etc. So I found some answers not in the Media Kit section but in the Game Kit instead. Here is what it says about BPushGameSound:

"[These classes] are used to let you fill buffers flowing to the speakers (or headphones, or whatever audio output device is being used). Their missions are the same, but their methods are different; BPushGameSound also provides a way to play sound in a cyclic loop, keeping ahead of the playback point in the buffer. This is the extreme in high-performance, low-latency audio, but does require some extra work on the programmer's part."

- My personal approach for a pro audio system would be an implementation of the JACK API on top of the Game Kit, providing a higher-level API instead of using BPushGameSound and BStreamingSound directly. It would be for the "careful, know-what-they're-doing programmers"; for less critical tasks, there's the Media Kit.

My 2 cents

Re: Professional Sound API

I'm in favour of JACK, but I think it can be extended to be not just a way of routing but also the hub of the mix. Consider the fact that most programs have their own mixer built in: Pro Tools, Cubase, Sonar, Ardour, MixBus, Logic all have one. Why? If you want to JACK your audio between all your programs and hardware, why not mix it there as well? Just have one system-level, professionally implemented mixer.

So you open a DAW and create a track. This track shows up in the mixer as a channel strip, with everything a professional mixer has: inserts, sends, master fader, metering, etc. In the DAW you edit and move the audio around, but you control the mix from the system mixer. Then you open a drum machine instrument and this shows up in the mixer also. The mixer has automation built into it, which can be edited. The set-up can be saved as a project. It has a master transport control and clock so everything can run in sync.

This mixer would come with a rich high-level API not unlike JACK's. In fact you could think of it as "JACK with a pro mixer", or as Ardour with the mixer removed and stuck into JACK instead. The great thing about MixBus is that it was written by Harrison, so the implementation of the mixer is exemplary, but the problem is that most people just want the mixer part, not the DAW. If you had a JACK mixer implemented by Harrison, that would be the centrepiece of a pro audio system on a computer; everything would go through its API ...

Another thought on the subject of the media kit: how do you explain iZ Corp's RADAR system and Tascam's SX-1 professional hardware DAWs being based on BeOS? Would they have made use of the various media and MIDI kits to create their software?

http://www.tascam.com/products/sx-1.html
http://www.izcorp.com/radar.php

Re: Professional Sound API

Hi all. First time poster here. I was a developer back in the BeOS days (Harrison used BeOS extensively in our products, along with the iZ, Tascam, and LCS guys). In all of these cases, BeOS was the control/automation computer and the heavy-lifting was done by DSP hardware. I don't believe any professional product ever used the Media Kit.

The Media Kit is very well designed (overdesigned?) for media playback and simple music production. It is not ideally suited for professional use because of the latency issues discussed before, the relative complexity of some simple tasks (such as plugin hosting and parameters) and the lack of features such as time sync, video sync pulse, netjack, etc. Of course all of these issues _could_ be incorporated into the Media Kit, and probably very neatly.

On Linux it has become fairly common for systems to use PulseAudio for desktop sounds and JACK for professional audio. I'm a big fan of the "cleanliness" of BeOS/Haiku so I'd prefer that Haiku didn't have this split. However, practically speaking, Haiku has the opportunity to gain a lot of developers and a lot of existing code if they adopt JACK. If they don't do this, then there is going to be a long and hard road to re-implement everything that has been happening in Linux Audio for the last 5 years.

DavieB, regarding your comment about a separate mixer: there have been many examples of this over the years on all platforms. BeOS had a simple software mixer that allowed you to adjust the level of apps that were playing back on the system. There have been mixer-only apps for CoreAudio, ReWire, and JACK. Ardour/Mixbus has often been used as a mixer-only system with no track playback. I'm even aware of people using Ardour and Mixbus as live mixing consoles(!) However in most cases, the inconvenience of saving the mixer state in addition to the various other programs makes it impractical. There is an effort to incorporate "session management" into JACK to provide just this sort of inter-operability. (yet another thing that the Media Kit will want to re-implement in the future...)

I could imagine a cool software mixer that automatically adds an input strip(s) every time you launch a new app. But you'd want to separate the "pro" apps (whose settings could be saved/recalled and automated) from the desktop apps such as the web browser.

Best Regards,
Ben Loftis

edit by admin: removed the URLs.

Re: Professional Sound API

It would be good to have a professional implementation on Haiku, Jack seems to be the current contender.

On the separate mixer idea, I know what you mean when you say you want to save the mixer state with the project's audio and MIDI sequencing, etc. But I would say this is still broken if you are using instruments or FX feeding into the DAW from external JACK inputs. The state of the entire project would mean saving the DAW and mixer as one project, the instruments as another, the JACK routing as another, etc.

The other reason why I think a separate mixer is a good idea is that the user would never have to change their mixer settings if they decided to move from one DAW sequencer to another. It's a real pain to work in Ableton Live with a mixer set-up and then move to Cubase for recording audio, having to move your mixer settings over as well. Interchange of audio and MIDI is fine, but how can a user easily interchange their channel strips, plug-in inserts and overall mixer set-up? I don't think you can, and copying one mixer to another is slow and painful. In the real world there is only ever one mixer at the heart of any recording studio.

I agree that a pro level software mixer would be for "pro audio" software only.

Re: Professional Sound API

I was thinking about the issue a bit lately, and I think I came to the conclusion (which would anyway be better discussed with the developers on the development mailing list) that an easy and clean way to get something working for professional audio on Haiku could be to have JACK as a node/roster/whatever in the media server. Simple applications could just use the Media Kit, and professional apps could use JACK. It would also be preferable to be able to connect normal media server nodes/rosters to JACK's inputs/outputs (and maybe have some setting, perhaps even dynamic, to specify how many JACK I/O ports one wants to have).

This is as far as audio data goes, and in the hope that direct connections in the media server to soundcard I/O do not introduce latency either (if that's not the case, it could probably be changed without breaking APIs, I hope) and that Haiku is capable of making JACK work decently.

When it comes to MIDI, I don't really know; I've been told that the Media Kit has problems for professional MIDI usage too, but I actually have no idea.

Re: Professional Sound API

As a professional composer and musician using Linux, I'd add a couple of things to this thread.

Ben Loftis makes some good points in his post, and hits the nub of any discussion about professional sound.

1.) A central mixer is essential. I've gone the route he described in the past, using an app's built-in mixer just to try to have a central mix core into which I can add any number of apps. Now I use a little app called non-mixer, as it's the only mixer I've found which does the job as a standalone for what I do. (As a writer of orchestral and film music, my port requirements are in the hundreds, not the tens, and many professional audio and MIDI apps are written for "headbangers", excluding my unique requirements and lacking the extensibility to use many more tracks, ports, etc.)

2.) JACK, for all its strengths and weaknesses, is (more or less) ideal for professional audio use. It seeks to fulfil the requirements of a multi-app working environment, and does so at low latency. Those in the know will appreciate the value of this: recording not only from live sources, but building scores, tunes, etc. with software tools inside the box and hardware tools outside (synths, samplers, blah blah blah).

3.) Extending what is a domestic API may well seem like a good idea, but its basic flaw is in user requirements: domestic users need generous buffers and high latency for great playback of movies and tunes, versus the time-critical low-latency requirements of the professional user. PulseAudio, for all its strengths and weaknesses (and I assume this is akin to the Haiku API), is built for domestic users, and there are many trainwrecks in forums as the discussions rage about the "worth" of PulseAudio versus JACK, or indeed using ALSA directly. It's FUD to assume that a domestic API will satisfy the needs of a professional user, and this FUD is often spread by well-meaning devs who think they know what professional requirements might be, although they have no experience in that field.

4.) This also applies to video playback. As someone who writes to image, all too often the sync between audio and video devices is less than optimal, because the sound API doesn't necessarily sync to the video frame rate. JACK has a common transport API that can be implemented in both audio and video apps; in the Linux world, Xine is one of these JACK-transport-capable apps, able to show video synced via JACK transport with any other JACK-transport-capable app. A domestic API lacks this sync, or a framework for syncing, and renders itself more or less useless for such sync-critical work.

5.) MIDI is the hairy red-headed stepsister of any professional working environment, and all too often this protocol is poorly done, not only at app level but at server or framework level in APIs. JACK at least attempts to rectify this in some way with JACK MIDI, a sample-accurate MIDI framework that syncs by default with JACK audio, using timestamped events delivered within a specified period, as for audio. Any consideration of a professional-grade sound/video/MIDI API should, by default, include the components required to deliver sample-accurate MIDI without jitter or uncertain timing.

There are many users, professional and domestic alike, who use MIDI, yet all too often it's tacked on as an afterthought rather than considered as part of the whole professional API and integrated from initial development. Some developers' vision of "professional" is a studio with a desk/console, but that's just one part of a much wider picture: in the 21st century you're more likely to see a film score or recording studio setup on a box with a widescreen. Large console studios are still viable, of course, but there are a lot of users working with a software mixer and plugins at a paid level. A colleague of mine has a large console and a Pro Tools rig, yet his editing work for the last 2 years has been done on a computer using another DAW, because it does the job. He's on Windows, poor soul, but he makes a valid point when he says it's cheaper and quicker to edit this way, to the same finished-product standard. (Which makes yet another case for a central software mixer as part of any professional sound API.)

6.) With the rise of smaller devices, like phones, handhelds of some description, netbooks, laptops, and so on, there is a trend to build to these small devices, as a default, and considerations in design, and implementation seem to be heading in this direction. That's fine for the 6 to 24 track engineer (some may be a little more using a powerful laptop for example), but as designs change, it's also a trend that limitations are imposed to cater for these devices, and by the nature of those limitations, the larger setup users are being slowly excluded. I urge anyone building apps or sound APIs not to forget many users are still using desktop computers or server farms to render their material, and this doesn't look like changing soon. A Haiku professional sound API should be just as valid, and easy to use for a multi box user, as it may be for a laptop or netbook user, imho.

7.) Finally, no limitations. If you think a user will never use more than 100 audio ports or 50 MIDI ports, you're wrong, and profoundly so. My regular template, and this is also true for my colleagues doing the same work, is at least 400 audio tracks and 128 MIDI ports. I have a friend in the UK who writes for bigger work than I do, using 600-900 MIDI tracks, over 250 MIDI ports, and a farm of boxes running samplers to feed his main DAW. If you think there's "enough", then double it, then triple it, then use no limitations at all. :)

A few thoughts if you chaps are considering a professional API from a user who does this for a living.

Good luck.

Alex.

Re: Professional Sound API

Buffers are not passed in and out of nodes by the MediaRoster (no "xxx returns to mr / mr feeds yyy"). Once nodes are connected and started, they know their output(s) perfectly well. They simply hand each buffer to their connected output(s), which triggers a BufferReceived() event on the corresponding node(s) in the graph, each running in a different thread.
The BMediaRoster lets you query, instantiate and control nodes, but is not involved in buffer processing.

There is no main/central looping processing thread managing the pipeline(s). As in many BeOS/Haiku areas, it's an asynchronous multithreaded model, where each node works in parallel (actually so, on SMP hardware) and cooperates via messaging to produce buffers in time to perform them when due.
This is very visible in ProcessController: the media_server doesn't do that much work itself; instead the load is spread across several threads, in media_server_addon's threads for system media nodes like the mixer and physical nodes, and in client apps (Media Player, CortexAddOnHost, Clockwerk...) for the others.

The timing design is kind of reversed compared to JACK: it's a performance-time-driven data flow, not a process-driven cycle. Also, JACK's design focuses mostly on live audio, while the Media Kit's design doesn't/can't.

I dunno if the Media Kit design is up to pro audio, but when I consider that JACK is not an end-user audio/video framework (GStreamer is more on a par with the Media Kit here), that its design is optimized for live audio yet had to evolve from a single RT processing thread to a dual-thread model to actually benefit from SMP systems, and that it had to add memory locking and RT memory allocation support to avoid VM swapping, I find more similarities between the two frameworks than one might think at first.

The fact remains that nobody will know what Be's Media Kit design and the Haiku re-implementation are worth until someone actually tries to stress them as a pro audio solution.

Any volunteer?
:-)

Re: Professional Sound API

The Media Kit doesn't do MIDI.
The Midi Kit does, but don't ask me whether it's up to professional usage or not...

Re: Professional Sound API

zanga wrote:
DavieB wrote:

I don't think I really know what you are talking about because you are being too technical. If you could make it more accessible with analogies to real-world things, such as mixing desks, monitoring, inputs, etc., it would be easier to follow. It's obvious you have specialist skills in the area of LV2, JACK, etc. and have developed for them, but I have absolutely no experience whatsoever in this area of programming. I am a music producer and songwriter, but I work mainly as a professional software developer writing database systems, so I can follow a coding guideline if it's given to me, or some well-commented code examples or well-documented help files. It would be good if you could discuss this with the developers of the media kit. Anyway, trying to make sense of all this ...

Ok, let's see if I can make things more clear.

DavieB wrote:

Ok, a soundcard's input-to-output latency is the same as the soundcard's buffer size (in time) for a given sample rate.

Ok.

DavieB wrote:

Now, we have three programs linked in series.

In your example is a "node" a program; A, B or C?

Yes.

DavieB wrote:

The media server doesn't care about processing chains? I think I know what you are getting at now. On Linux you can string your programs together in series, such as Renoise -> Ardour -> soundcard, and JACK treats this as one thing? How does the media server achieve this? Well, each program A, B and C uses the media roster object. This roster object owns individual "nodes" for FX, inserts, tracks, anything you like. The roster object is aware of a master clock so that they can run in sync. This is fine when things are synced in parallel, but you want to know how the media roster objects run in series. This is an important point. I don't actually know.

Simple high level view of the media kit

Media Kit is the studio.
A "node" is an effect, input, output, track, channel, etc.
A media roster is an application, such as a sequencer, DAW or beatbox program, that can use nodes and can be synced to a studio clock, etc.

So can media rosters be chained together in series? One feeding the next etc?

Well, I said I have no experience with the media kit itself, apart from resampling; my source of information is: http://www.haiku-os.org/legacy-docs/bebook/TheMediaKit_Overview_Introduc...

I don't know if media rosters can be chained themselves (doesn't Cortex do that?), but even inside one roster, connecting nodes in series creates latency.

Real-world example: say I have 3 ms of buffering latency and I'm using a guitar rack program which uses a node for each effect. I want to apply five effects in this order: wah, distortion, phaser, flanger, FIR convolver. If each node introduces latency, I have a total of at least 12 ms of latency. If I used JACK to connect individual applications, each doing one of these effects, I would have only the 3 ms of latency.

Why? Because JACK is capable of executing the effects in the "correct" order, while the media kit does not take the order of effects into account. So the media kit has to use past buffers for each effect to avoid processing bad data, and each buffer is 3 ms long.

Result: 12 ms is audible; 3 ms is not.

I hope this is clear enough.

Dude, I have done research on this: latency doesn't even become perceptible until 150+ ms.

But believe what you want.

12 ms is

0.012 seconds