cortex

Forum thread started by skarmiglione on Sat, 2013-11-02 08:21

Hello, one of the things that made me fall in love with Haiku OS is "Cortex". I've never used it for anything, but I've always suspected that thing should rock. But how? Can anybody explain it to me? I can imagine software that lacks one trick obtaining it from another program whose ports are open to be used in Cortex. Is it like I imagine, or better?
Why is there so little information about this tool? I want to write an article; can you help me with information?

Comments

Re: cortex

jua wrote:

In the end, you keep arguing that the Media Kit is fundamentally flawed, as if it was just a theoretical concept which has never been tried out in practice. Fact is, two implementations exist, the one in BeOS and the one in Haiku. Believe it or not, they do work.

On the one hand, I really am mostly interested in explaining the theory. It's an opportunity for people to learn stuff that's useful outside of the present context, stuff that I wish I'd known twenty years ago before I wrote my first audio software.

But on the other hand, in practice yes, the Media Kit design really is pretty terrible. Does it work? After a fashion, sure, and so did the BeOS R5 netserver.

Re: cortex

What exact theory are you discussing? Processing latency, transport latency, A/D conversion latency? Which latencies, where do they exist in the audio chain, and what are you rambling on about? There is nothing fundamentally flawed with the Media Kit design; it works rather well in practice. It's based on the BeOS design, which BTW inspired the JACK2 implementation, the new Windows implementation, and most of the Linux implementations.

That's reality. Does it need some updates, features and enhancements? Sure, but it's the basis for every other media backend design I can see, and it was the first design to actually conquer latency in a practical and sane way. You can spend 20 years doing something and still suck at it, and you can spend 20 years doing something and master it.

Re: cortex

jua wrote:

I had lost interest in replying due to the increasingly aggressive discussion style. So, if you want to know more about the Media Kit, I'd just refer you to the plethora of available documentation.

In the end, you keep arguing that the Media Kit is fundamentally flawed, as if it was just a theoretical concept which has never been tried out in practice. Fact is, two implementations exist, the one in BeOS and the one in Haiku. Believe it or not, they do work.

+1

Re: cortex

Haiku needs an intuitive interface for controlling audio streams, and the software should be very easy to learn. Cortex will take musicians little effort to master :). It needs at least a semblance of Logic Studio, Adobe Audition 3 (CoolEditPro), or FL Studio: simple, comfortable music editors. (translated)

Re: cortex

Quote:

This probe can only measure the state when it is performed, not the state it's being asked to predict. You haven't offered (and indeed the available Haiku software makes no attempt to offer) a solution for that. The entire premise (if we are to believe jua) of the Media Kit is undermined by these estimates.

Check for example the equalizer node provided with Haiku: its latency estimate is calculated by simply running its calculations over a test buffer at the moment of connection. This is exactly the sort of probe you're talking about, and jua would have us believe that it shouldn't do this, because the result of running alone now will of course not reflect what happens if it is later competing with another node for timeslices. But what should it do instead?

The solution is simply to perform this check a few times programmatically and update the latencies accordingly. But for most nodes it's not needed.
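A connection-time probe of the kind discussed above can be sketched in a few lines. This is a hypothetical illustration using only the C++ standard library; the function names are invented and this is not the actual Haiku equalizer node code:

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch of a connection-time latency probe: run the node's
// processing pass once over a dummy buffer and time it with a monotonic
// clock.  All names here are invented for illustration.
int64_t ProbeProcessingLatencyUsecs(void (*process)(float*, size_t),
                                    size_t frameCount)
{
    std::vector<float> testBuffer(frameCount, 0.0f);
    auto start = std::chrono::steady_clock::now();
    process(testBuffer.data(), testBuffer.size());
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(end - start)
        .count();
}

// Example processing pass: double the linear amplitude of each sample.
void DoubleGain(float* samples, size_t count)
{
    for (size_t i = 0; i < count; i++)
        samples[i] *= 2.0f;
}
```

The number such a probe returns is only valid for the load conditions at the moment it runs, which is exactly the objection raised above.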

I'll try to use different words to explain what you aren't understanding (or are refusing to). Before all it's false that latencies aren't not taken into consideration in jack. If you go through the documentation, at some point this question is explained very well. Basically the latency callback is not declared by clients with only input or output ports, or when the input is completely unrelated to the output. In BeOS the situation is no different; for (pure) consumers latency isn't something that has meaning at all.

Quote:

For audio on a PC type system? Do all the work and race to idle. This was the approach taken by JACK for good reason. Note that if you lose this race then by definition you could not have avoided a buffer underrun by any means, whereas you can't be sure with the Media Kit approach described above -- so JACK is justified in deciding to kick a node from the graph for running too slowly if it misses the deadline. In the Media Kit design you can at best rely on heuristics for managing lateness.

Extending what I said before, let's dispel some myths. The approach taken by Haiku is performance driven, and that's true. But by itself this only means that you have more control over what will happen in the node. This is probably not needed in an audio system, but the media_kit is not strictly that type of beast. So I'll show you the current approach a BMediaEventLooper usually takes:

1 - the new buffer event is scheduled at time x
2 - receive a new buffer event at time x
3 - create the buffer and send it

Theoretically there's nothing preventing you from doing this:

1 - create the buffer
2 - the new buffer event is scheduled at time x
3 - receive a new buffer event at time x, retrieve the buffer and send it

From the latency perspective, I think JACK uses some method to establish how much time it will take to process the client (when the latency_callback is not declared), and a similar technique may be used on top of the media_kit. I don't really think those are the problems out there.

Re: cortex

NoHaikuForMe wrote:

On the one hand, I really am mostly interested in explaining the theory. It's an opportunity for people to learn stuff that's useful outside of the present context, stuff that I wish I'd known twenty years ago before I wrote my first audio software.

But on the other hand, in practice yes the Media Kit design really is pretty terrible. Does it work? After a fashion sure, and so did the BeOS R5 netserver.

Did you have to work with the Media Kit, or is this just from looking into the code? And which framework do you think is "well" designed? What is your opinion? Because there are always two sides: some frameworks are "well" designed in the backend but the API is horrible, while for others maybe the API is well designed but the backend is way too slow.

JMF , GStreamer, Phonon, xine, DirectShow, Quicktime, ...?

Re: cortex

Barrett wrote:

The solution is simply to perform this check a few times programmatically and update the latencies accordingly. But for most nodes it's not needed.

Please give a variety of examples of nodes for which no solution is needed. You should explain your working. If you don't have any examples to give, you should reconsider your argument that "most nodes" don't need to solve this problem. Note that for the overrun examples "updating the latencies accordingly" just results in latencies climbing forever; you still can only do finite work in finite time, as before. This is why jua's clever examples don't work: they assume you can do infinite work in finite time just by slicing it up more. Maybe that's a nice argument for a theoretical mathematician, but it's quite obviously wrong for a computer programmer.

Quote:

I'll try to use different words to explain what you aren't understanding (or are refusing to).

OK?

Quote:

Before all it's false that latencies aren't not taken into consideration in jack.

Please try to rewrite this so that it doesn't contain a double negative with unclear valency. I recommend trying to write things in positive terms when you can, for example perhaps you meant here "It's true that JACK takes latency into consideration" ?

Quote:

If you go through the documentation, at some point this question is explained very well. Basically the latency callback is not declared by clients with only input or output ports, or when the input is completely unrelated to the output.

As the documentation explains the latency callback is only to be used by clients which introduce an algorithmic delay. This occurs when the algorithm unavoidably needs to use "future" input sample frames to calculate the value of "past" output sample frames. A "lookahead limiter" is a popular and useful example.

You should not confuse it with delays related to the audio hardware (which JACK tracks separately), or with the "latency" values used by the Media Kit for scheduling decisions.

Quote:

In BeOS the situation is no different; for (pure) consumers latency isn't something that has meaning at all.

The situation is quite different in BeOS because the latency reported is used for scheduling decisions. That would be nonsensical in JACK.

Quote:

Extending what I said before, let's dispel some myths. The approach taken by Haiku is performance driven, and that's true. But by itself this only means that you have more control over what will happen in the node. This is probably not needed in an audio system, but the media_kit is not strictly that type of beast. So I'll show you the current approach a BMediaEventLooper usually takes:

1 - the new buffer event is scheduled at time x
2 - receive a new buffer event at time x
3 - create the buffer and send it

Theoretically there's nothing preventing you from doing this:

1 - create the buffer
2 - the new buffer event is scheduled at time x
3 - receive a new buffer event at time x, retrieve the buffer and send it

For a BMediaEventLooper implementing a trivial effect (say, doubling the linear amplitude of an audio stream), which of these steps is deriving the new values from the old? I suppose it is "create the buffer"? In this case you are wrong: you are prevented from moving this step before "receive a new buffer" because you cannot proceed until you have the input data to be processed.

Of course you can sidestep this by incurring an entire period of additional latency for each such node. That's actually more or less what Be's engineers were recommending when the BMediaEventLooper was introduced. Their plan was to come up with a fix for the next release of BeOS. Alas, events overtook them and BeOS R5 missed quite a lot of things more obvious than this. The Media Kit you have today leaves the problem unsolved.

Quote:

From the latency perspective, I think JACK uses some method to establish how much time it will take to process the client (when the latency_callback is not declared), and a similar technique may be used on top of the media_kit. I don't really think those are the problems out there.

No. JACK doesn't need to "establish how much time it will take to process the client". It just runs the client, synchronously as described previously. If the client takes too long to run (causing xruns) it is dropped from the graph.

Because JACK does not have this imaginary "method" you cannot incorporate it "on top of the media_kit".

Re: cortex

Paradoxon wrote:

Did you have to work with the Media Kit, or is this just from looking into the code? And which framework do you think is "well" designed? What is your opinion? Because there are always two sides: some frameworks are "well" designed in the backend but the API is horrible, while for others maybe the API is well designed but the backend is way too slow.

I have spent some small amount of time playing with the Media Kit (writing actual code, though nothing of value), and somewhat more time reading the documentation and source code. I had actually assumed, years ago when it was current, that it was somewhat more capable than in fact it is.

I have also written non-trivial programs for JACK and fooled around with GStreamer. I have read about, but never had cause to use, JMF and CoreAudio.

Problems like being "way too slow" are called Quality of Implementation issues. Haiku scores very badly here. For example, over the years there have been several bug reports (e.g. #1351, #9438) about Haiku's interpolating resamplers. From time to time it is pointed out that the entire approach of the resampler is wrong (e.g. it needs to know the actual sample rates in order to function correctly) and must be replaced, but as with most things in Haiku nobody finds the time to do it, although they've got time for bike-shedding about it.

The API is nothing special, it's certainly not the worst part of the Media Kit. The programmer is (as several others have observed) left doing a lot of unnecessary housekeeping that could be taken care of by the OS / the kit. There are various things missing, and no evidence anyone is looking to add them (such as monitors and session management). It looks more or less how you'd expect given that it was all abandoned, unfinished, almost 15 years ago.

Re: cortex

NoHaikuForMe wrote:

Please give a variety of examples of nodes for which no solution is needed. You should explain your working.

Any BBufferConsumer is an example of a node which doesn't need complex latency handling, because its latency is usually fixed. The consumer latency (much like JACK's driver latency) is used to establish how much time will pass before the buffer is heard at the speakers.

Speaking more generally, if the node's processing cost is not variable (in the sense that it's always Ω(n) or Ω(n log n) and so on), its latency may be calculated perfectly using a phantom cycle, just as I've seen Clockwerk do.

So at this point you may say (as before) "What happens when the load becomes too high?" Well, the media_kit notifies the producer that it's producing buffers too late. At this point the producer should update its latency accordingly. I would also like to point out that this way of doing things is more reasonable for video than for audio.

NoHaikuForMe wrote:

Please try to rewrite this so that it doesn't contain a double negative with unclear valency.

You understood perfectly. There's no need to comment on a typo by a non-native speaker. This just shows your mocking way of conducting discussions.

Quote:

As the documentation explains the latency callback is only to be used by clients which introduce an algorithmic delay. This occurs when the algorithm unavoidably needs to use "future" input sample frames to calculate the value of "past" output sample frames. A "lookahead limiter" is a popular and useful example.

You should not confuse it with delays related to the audio hardware (which JACK tracks separately), or with the "latency" values used by the Media Kit for scheduling decisions.

NO. It's used to get the latency of a port when it is not trivial, for example if the internal signal path is not unique and JACK can't trivially establish how to operate. The callback is needed, for example, in a client processing audio from two separate inputs to two separate outputs.

Quote:

The situation is quite different in BeOS because the latency reported is used for scheduling decisions. That would be nonsensical in JACK.

Absolutely not, scheduling is another story. BeOS uses the latency to determine when the buffer should be played out.

Quote:
Quote:

1 - create the buffer
2 - the new buffer event is scheduled at time x
3 - receive a new buffer event at time x, retrieve the buffer and send it

For a BMediaEventLooper implementing a trivial effect (say, doubling the linear amplitude of an audio stream), which of these steps is deriving the new values from the old? I suppose it is "create the buffer"? In this case you are wrong: you are prevented from moving this step before "receive a new buffer" because you cannot proceed until you have the input data to be processed.

The only difference with JACK in this case is that JACK processes everything in a single cycle, where BeOS instead uses different threads to do it in a cooperative fashion. But if a JACK client blocks, the result will be the same (an xrun).

Quote:

Of course you can sidestep this by incurring an entire period of additional latency for each such node. That's actually more or less what Be's engineers were recommending when the BMediaEventLooper was introduced.

That's something like an xrun.

Quote:

No. JACK doesn't need to "establish how many time it will take to process the client". It just runs the client, synchronously as described previously. If the client takes too long to run (causing xruns) it is dropped from the graph.

Because JACK does not have this imaginary "method" you cannot incorporate it "on top of the media_kit".

Anyway, in sync mode JACK always waits for the clients to finish. In async mode it doesn't wait and goes on to the next cycle. If JACK were not taking latencies into account, how would it recognize xruns?

Well, I remember it working this way in the driver:

1 - the buffer is 10 ms long
2 - process the graph
3 - if processing took longer than 10 ms, we have an xrun

But this way of working is also replicable in Haiku. JACK is like a media_node itself and can be compared to one. Its clients aren't comparable to media_nodes, only to a small subset of one.
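The three-step driver check described above fits in a few lines. This is an illustrative sketch only, using a wall-clock timer from the C++ standard library; the real JACK driver compares against the hardware's sample clock, and the names here are invented:

```cpp
#include <chrono>
#include <functional>

// Sketch of the driver-side check: run the whole graph synchronously,
// then compare the elapsed time against the period budget.
bool RunCycleAndCheckXrun(const std::function<void()>& processGraph,
                          std::chrono::microseconds period)
{
    auto start = std::chrono::steady_clock::now();
    processGraph();                      // run every client, synchronously
    auto elapsed = std::chrono::steady_clock::now() - start;
    return elapsed > period;             // overran the budget -> xrun
}
```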

Re: cortex

NoHaikuForMe wrote:

This is why jua's clever examples don't work: they assume you can do infinite work in finite time just by slicing it up more.

No. Neither I nor my examples ever assumed that.

I'm not interested in getting into this discussion with you again, but please, stop spreading false information.

Re: cortex

Barrett wrote:

BBufferConsumer is an example of a node which doesn't declare a latency.

BBufferConsumers make up rather less than "most" nodes. In many cases the BBufferConsumer is paired with a BBufferProducer which does declare a latency.

Quote:

Speaking more generally, if the node's processing cost is not variable (in the sense that it's always Ω(n) or Ω(n log n) and so on), its latency may be calculated perfectly using a phantom cycle, just as I've seen Clockwerk do.

It seems as if this recycles the same argument we demolished earlier in the thread; go back and read it.

Quote:

So at this point you may say (as before) "What happens when the load becomes too high?" Well, the media_kit notifies the producer that it's producing buffers too late. At this point the producer should update its latency accordingly. I would also like to point out that this way of doing things is more reasonable for video than for audio.

You run into a tautology. If you can't get all the work done, you can't get all the work done. In general if you can't get 10ms of work done in less than 10ms, you won't have any more luck trying to do the remaining work and the next 10ms of work in the subsequent ten milliseconds. This is a conceptual error by Be engineers that we've been banging our heads against on and off for the last half of this thread.

Quote:

You understood perfectly. There's no need to comment on a typo by a non-native speaker. This just shows your mocking way of conducting discussions.

I can't be sure whether I understand you when you write things that make no sense. Your claim here was wrong, as I explained, so it was also possible you didn't mean to claim that at all; I gave you the benefit of the doubt.

Quote:

NO. It's used to get the latency of a port when it is not trivial, for example if the internal signal path is not unique and JACK can't trivially establish how to operate. The callback is needed, for example, in a client processing audio from two separate inputs to two separate outputs.

I don't think you've understood why this latency for most clients is "trivial". It's ZERO. Remember, this is strictly algorithmic latency. For all clients without algorithmic latency, regardless of how many inputs or outputs they have, no callback is needed.

I am concerned that you still haven't grasped what's going on here, because the Media Kit does such a lousy job of explaining the realities of real time audio. Algorithmic latency here is the latency that's unavoidably incurred by the mathematics of what we want to do. Not by whether you use an overclocked Xeon, or whether you used the latest high-speed DIMMs or whatever. So let's do a worked example.

We'll choose a lookahead limiter because although in practice they can be quite complicated both their purpose and the fundamental method of operation are fairly accessible. The purpose of the lookahead limiter is to ensure that, no matter how loud the input sounds are, the output is kept under a particular threshold, usually configured as a parameter, and further, the sound is distorted as little as possible. Imagine you're on stage and there is a sound guy controlling the mixing desk, but he is psychic, so that a fraction of a second before you play something very loud he adjusts the gain so that it doesn't deafen the audience. That's what this limiter is for.

Obviously computer software, unlike the imaginary sound guy, is not psychic. It applies gain based on the trend of samples, to give smooth changes and thus reduce distortion. So our limiter cannot set the gain correctly for a particular sample N until it knows what the next few samples N+1, N+2 ... N+M are. The value of M for a lookahead limiter might be something like 48.

If we're sending and receiving sample frames in periodic blocks (as in both the Media Kit and JACK) then we can't just use the received samples to calculate the samples we're sending, because we need the next M samples too. So instead we insert M samples of delay, always outputting samples that are M samples later than the ones we received, this is the algorithmic latency of our limiter and that's what JACK's callback is for.
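As a rough sketch of where those M samples of delay come from (illustrative C++ only, not any real limiter; a real lookahead limiter would also derive a gain curve from the buffered "future" samples):

```cpp
#include <cstddef>
#include <deque>

// Minimal M-sample delay line: each output sample is the input sample
// from M frames earlier, which is precisely the algorithmic latency a
// lookahead limiter must report.
class LookaheadDelay {
public:
    explicit LookaheadDelay(size_t lookahead)
        : fifo_(lookahead, 0.0f)         // pre-fill with M zeros of delay
    {
    }

    float Process(float in)
    {
        fifo_.push_back(in);             // store the "future" sample
        float out = fifo_.front();       // emit the sample from M frames ago
        fifo_.pop_front();
        return out;
    }

private:
    std::deque<float> fifo_;
};
```

With M = 48 at 48 kHz this is a millisecond of latency that no faster CPU can remove; it is inherent in the mathematics.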

Does that make sense now?

Quote:

Absolutely not, scheduling is another story. BeOS uses the latency to determine when the buffer should be played out.

Exactly, this is a wrong-headed design, and the Media Kit pays dearly for it. It's hard to tell exactly when this muddle started, we do not have the source code history of BeOS. Algorithmic latency is largely irrelevant to real-time scheduling, so the BeOS "latency" measurements end up proxying as a way to schedule the processing work.

Quote:

The only difference with JACK in this case is that JACK processes everything in a single cycle, where BeOS instead uses different threads to do it in a cooperative fashion.

Yes, so your suggested "improvement" doesn't work, unless as explained below you're willing to incur a one period per node additional latency. This is the decision Be's developers didn't like the look of back in 1999, but we don't know if they could have come up with a solution (for example, a more rational way to schedule things in the Media Kit) because soon after the "focus shift" was announced, BeOS R5 PE was pushed out the door and developers who weren't focused on making cheap "Network Appliances" work with BeOS were fired.

Quote:

That's something like an xrun.

No, it's additional latency, an entire period (ie some agreed number of frames) of additional latency, per node. This is a very high cost to pay, and one Haiku developers have previously claimed is not paid in Haiku. Feel free to either dispute this with them, or accept it as true.

Quote:

Anyway, in sync mode JACK always waits for the clients to finish. In async mode it doesn't wait and goes on to the next cycle. If JACK were not taking latencies into account, how would it recognize xruns?

You seem to have answered the question yourself. If a new cycle begins and we are still processing the previous period then we have an xrun.

Quote:

Well, I remember it working this way in the driver:

1 - the buffer is 10 ms long
2 - process the graph
3 - if processing took longer than 10 ms, we have an xrun

Indeed? Well I suppose in a way you are almost right. The xrun is detected because the hardware driver (for example ALSA) reports it to userspace. JACK needn't try to guess. You might think "but it's not a guess" alas you would be wrong, the clock that matters for xruns is the sample clock on the hardware, which may or may not match wall clock time. The hardware driver has access to this clock (we hope) but the userspace application does not directly.

Quote:

But this way of working is also replicable in Haiku. JACK is like a media_node itself and can be compared to one. Its clients aren't comparable to media_nodes, only to a small subset of one.

In what sense is JACK "like a media_node itself" ?

Re: cortex

jua wrote:

No. Neither I nor my examples ever assumed that.

Actually you wrote:

many nodes can work concurrently, even on a single-processor system.

and


It should not take twice as long if the node was written correctly.

How else can we interpret this? I think I offered adequate time for you to explain it some other way but you declined. The node has no way, when it gives the estimate, of knowing the actual runtime conditions under which, according to you "if the node was written correctly" it must meet the deadline. In practice then it will unavoidably fall short of your requirement, and all today's Haiku Media Kit producers behave this way.

Quote:

I'm not interested in getting into this discussion with you again, but please, stop spreading false information.

I'm happy to limit myself to just quoting your exact words in future if you prefer.

Re: cortex

NoHaikuForMe wrote:

How else can we interpret this?

Two nodes doing the same work at the same time doubles the CPU load, not the processing time. I explained this in detail before, I won't do it again.

NoHaikuForMe wrote:

The node has no way, when it gives the estimate, of knowing the actual runtime conditions under which, according to you "if the node was written correctly" it must meet the deadline.

Estimating latency is tricky, but by no means impossible. Same as before, I won't go over it all again.
If you want to know more about it, read the source code of actually existing nodes. You see, Media Kit nodes exist, and they work.

Re: cortex

NoHaikuForMe wrote:

It seems as if this recycles the same argument we demolished earlier in the thread; go back and read it.

You run into a tautology. If you can't get all the work done, you can't get all the work done. In general if you can't get 10ms of work done in less than 10ms, you won't have any more luck trying to do the remaining work and the next 10ms of work in the subsequent ten milliseconds. This is a conceptual error by Be engineers that we've been banging our heads against on and off for the last half of this thread.

If the node can't get the work done in 10 ms, the final consumer notifies it, and then the node can choose its way of working.

Quote:

I don't think you've understood why this latency for most clients is "trivial". It's ZERO. Remember, this is strictly algorithmic latency. For all clients without algorithmic latency, regardless of how many inputs or outputs they have, no callback is needed.

If there's no algorithmic latency, the same holds on Haiku. You have just the scheduling latency plus some small constant. This is completely unrelated to the discussion.

Quote:

Obviously computer software, unlike the imaginary sound guy, is not psychic. It applies gain based on the trend of samples, to give smooth changes and thus reduce distortion. So our limiter cannot set the gain correctly for a particular sample N until it knows what the next few samples N+1, N+2 ... N+M are. The value of M for a lookahead limiter might be something like 48.

If we're sending and receiving sample frames in periodic blocks (as in both the Media Kit and JACK) then we can't just use the received samples to calculate the samples we're sending, because we need the next M samples too. So instead we insert M samples of delay, always outputting samples that are M samples later than the ones we received, this is the algorithmic latency of our limiter and that's what JACK's callback is for.

Does that make sense now?

The client has to calculate that latency itself; we're in the same situation as a media_node. No difference from the media_kit.

Quote:
Quote:

Absolutely not, scheduling is another story. BeOS uses the latency to determine when the buffer should be played out.

Exactly, this is a wrong-headed design, and the Media Kit pays dearly for it. It's hard to tell exactly when this muddle started, we do not have the source code history of BeOS. Algorithmic latency is largely irrelevant to real-time scheduling, so the BeOS "latency" measurements end up proxying as a way to schedule the processing work.

While I understand your reasons, the BeOS latency is not used to schedule any work; stop saying it.

Quote:
Quote:

The only difference with jack in this case, is that jack process everything in a unique cycle. Where the BeOS instead, use different threads to do that in a cooperative fashion.

Yes, so your suggested "improvement" doesn't work, unless as explained below you're willing to incur a one period per node additional latency. This is the decision Be's developers didn't like the look of back in 1999, but we don't know if they could have come up with a solution (for example, a more rational way to schedule things in the Media Kit) because soon after the "focus shift" was announced, BeOS R5 PE was pushed out the door and developers who weren't focused on making cheap "Network Appliances" work with BeOS were fired.

The cost paid is the additional latency that gets reported.

Quote:

You seem to have answered the question yourself. If a new cycle begins and we are still processing the previous period then we have an xrun.

That is what happens in Haiku: if we are at the performance time but have no buffers to play, we have a late producer.

Quote:

JACK needn't try to guess. You might think "but it's not a guess" alas you would be wrong, the clock that matters for xruns is the sample clock on the hardware, which may or may not match wall clock time. The hardware driver has access to this clock (we hope) but the userspace application does not directly.

BeOS uses time sources, which are exactly that.

Quote:

In what sense is JACK "like a media_node itself" ?

The media_node has control over things which, in the JACK environment, only JACK itself has.

Re: cortex

Barrett wrote:

If the node can't get the work done in 10 ms, the final consumer notifies it, and then the node can choose its way of working.

In practice all of the supplied nodes and example code have the same "way of working": they increase their internal latency estimate (up to an arbitrary limit) and press on regardless. Since most "lateness" reports are speculative, they often get away with this.

As these internal latencies grow beyond one buffer period (for example, default Media Players alone can easily end up reporting a latency over 140ms, despite a buffer period of 40ms) the system starts to experience buffer bloat. Despite a requirement in the Be Book, Haiku's BufferGroup code does not block but instead always returns an error if no block is yet available, so sooner or later the system will trip over its own nose in this fashion, and indeed various bugs have been reported that amount to this problem.
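The inflate-and-press-on behaviour described above can be caricatured in a few lines (hypothetical names and cap, not actual Haiku code):

```cpp
#include <algorithm>
#include <cstdint>

// Caricature of the strategy: every lateness report grows the internal
// latency estimate, clamped to an arbitrary ceiling.  The 150 ms cap is
// invented for illustration.
struct LatencyEstimate {
    int64_t usecs;
    static constexpr int64_t kArbitraryCapUsecs = 150000;

    void ReportLate(int64_t howLateUsecs)
    {
        // Grow by the observed lateness; never shrink, never give up.
        usecs = std::min(usecs + howLateUsecs, kArbitraryCapUsecs);
    }
};
```

Once the estimate exceeds a buffer period, extra buffers pile up in flight, which is the bloat described above.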

Quote:

If there's no algorithmic latency, the same holds on Haiku. You have just the scheduling latency plus some small constant. This is completely unrelated to the discussion.

It's important because this algorithmic latency must be tracked in some pro audio applications. As we'll see Haiku ends up not providing any way to do that. If you want to handle algorithmic latency in Haiku you'll need to abolish (or entirely re-think) the Media Kit.

Quote:

The client has to calculate that latency itself; we're in the same situation as a media_node. No difference from the media_kit.

This is an attractive option, and it seems (though we can't be sure at this distance of time) to be how Be's engineers conceived "event latency" in the Media Kit very early on. By the time it was actually shipping, however, they had switched to the present approach, in which reporting algorithmic latency here doesn't work.

Quote:

While I understand your reasons, the BeOS latency is not used to schedule any work; stop saying it.

The BMediaEventLooper's ControlLoop() dispatches B_HANDLE_BUFFER events (thus, the actual work of the node) based on the node's reported "latency" and its own estimate of scheduling latency. So yes, this is used to schedule work.

This is where (what I believe was) the original scheme that would have handled algorithmic latency fell apart. To get the control loop to behave in a desirable way, Be's engineers began recommending that people use SetEventLatency such that the reported "latency" was almost an entire buffer period. By the time BeOS R4.5 shipped, all example code assumed that "latency" is really just a curious way of saying "how long before performance time this node should be run", and all concept of actual algorithmic latency was gone.

Quote:

That is what happens in Haiku: if we reach the performance time but have no buffers to play, we have a late producer.

Critically, in Haiku every consumer makes this decision independently. The idea seems to have been that this propagates lateness detection through the network, but in practice the effect is that reported latencies grow until they bump into the arbitrary limits hacked in by various developers over the years.

In JACK xrun detection is a problem solely for the driver, at the edge of the graph. So long as the graph executes in its entirety within a buffer period, we're OK.

Quote:

BeOS uses time sources, which are exactly that.

BTimeSources are software, they estimate a relationship between real time tracked by the OS and, for the cases we're interested in, a frame counter increased by the Multi Audio Node. Whilst more useful than just relying on the real time reported by the OS directly, they are not the hardware, just a proxy.

Haiku actually ignores the behaviour dictated by the Be Book for these time sources in a bunch of places, but it hardly matters because in practice so did BeOS.

The Multi Audio Node's consumer does make its lateness decisions based on whether it was able to write buffers in time, as far as I can tell. But other consumers don't have that option, so most "lateness" reports in Haiku are speculative.

Re: cortex

NoHaikuForMe wrote:

In practice all of the supplied nodes and example code have the same "way to work": They increase their internal latency estimate (up to an arbitrary limit) and press on regardless. Since most "lateness" reports are speculative they often get away with this.

As these internal latencies grow beyond one buffer period (for example, default Media Players alone can easily end up reporting a latency over 140ms, despite a buffer period of 40ms) the system starts to experience buffer bloat. Despite a requirement in the Be Book, Haiku's BufferGroup code does not block but instead always returns an error if no block is yet available, so sooner or later the system will trip over its own nose in this fashion, and indeed various bugs have been reported that amount to this problem.

The additional buffer is requested only in offline mode. If we are not in offline or recording mode, the buffer is ignored, and the producer is notified that it is too late. The latency is not arbitrary but calculated this way:

lateness = (performance_time - latency - real_time)*-1

So it's not arbitrary at all; it's just the same as what is done in JACK. I haven't looked into BBufferGroup, but from when I was developing I remember it blocking when there aren't enough frames. About the MediaPlayer latency, it's a bit more complex. The calculation is done this way: media_player_latency + downstream_latency. In your case, more than 100 ms is caused by the long buffers which Haiku uses by default. If you apply the hda_audio patch to your Haiku build, you will reduce it a lot.
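The lateness formula above can be written out as a standalone sketch in C++. This is illustrative only, not the actual media_kit source; the function name and the `bigtime_t` typedef are assumptions for the example:

```cpp
#include <cassert>
#include <cstdint>

typedef int64_t bigtime_t;  // microseconds, as in the Be API

// Illustrative sketch of the lateness calculation quoted above:
// a buffer is late by the amount that real time has already passed
// the moment we needed to start (performance time minus latency).
static bigtime_t Lateness(bigtime_t performance_time,
                          bigtime_t latency,
                          bigtime_t real_time)
{
    return (performance_time - latency - real_time) * -1;
}
```

With a performance time of 50,000 µs, a latency of 1,000 µs and a real time of 60,000 µs, this yields a lateness of 11,000 µs, which matches the worked example later in the thread.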

Quote:

If you want to handle algorithmic latency in Haiku you'll need to abolish (or entirely re-think) the Media Kit.

You may say that it's doomed, but the latency mechanism of Haiku separates various latencies, and one of them is the processing latency.

Quote:

The BMediaEventLooper's ControlLoop() dispatches B_HANDLE_BUFFER events (thus, the actual work of the node) based on the node's reported "latency" and its own estimate of scheduling latency. So yes, this is used to schedule work.

Well, it's not the latency that is used to schedule work, but the event_time.

Quote:

Critically, in Haiku every consumer makes this decision independently.
In JACK xrun detection is a problem solely for the driver, at the edge of the graph. So long as the graph executes in its entirety within a buffer period, we're OK.

It's true, but also not. Take as an example a situation where the producer is connected to the sound card: there, the decision is made by the audio card's consumer. This isn't bad but reasonable, because the media_kit was conceived to do different work at different times and with different format needs. It also lowers the CPU load, since the OS has more opportunity to balance it correctly.

Quote:

BTimeSources are software, they estimate a relationship between real time tracked by the OS and, for the cases we're interested in, a frame counter increased by the Multi Audio Node. Whilst more useful than just relying on the real time reported by the OS directly, they are not the hardware, just a proxy.

Do you remember the "DAC time source"? It's different from the system one. If the sound card driver provides it, it will use the card's clock.

Quote:

Haiku actually ignores the behaviour dictated by the Be Book for these time sources in a bunch of places, but it hardly matters because in practice so did BeOS.

Could you file bugs in the bug tracker? At least then you would help us understand your reasons against the media_kit, and someone will probably find a solution. Otherwise, your words are just words. Verba volant, scripta manent.

Quote:

But other consumers don't have that option, so most "lateness" reports in Haiku are speculative.

I don't know where you are looking, but those are not speculative. They are calculated in the way I showed at the beginning of this post. If you find otherwise in one of Haiku's nodes, file a bug report.
And in the end, we are talking about BMediaEventLooper, but a node isn't required to derive from it, so this part of the system could be improved by providing a better-designed class. Since you seem to be such an expert, why not suggest a new design? At least to solve what we can solve; if the idea is worthwhile, it may end up as an implementation.

Re: cortex

jua wrote:

Two nodes doing the same work at the same time doubles the CPU load, not the processing time.

Do you understand what CPU load is? When you do more work, it takes more time. Perhaps this time is not perceptible to you on human timescales, but the reason the CPU load goes up is that the CPU is spending more time working. Processing time has increased.

Quote:

Estimating latency is tricky, but by no means impossible. Same as before, I won't go over it all again.
If you want to know more about it, read the sourcecode of actually existing nodes.

There are basically two "methods" used by existing nodes. One method is to pick a number out of thin air, say, one millisecond. The node declares its latency will always be one millisecond no matter what. This "estimate" is worthless.

The other, which we'd already addressed, is to measure once at startup how long it takes to run the function that generates the output; by its nature this cannot take any account of future changes.

Quote:

You see, Media Kit nodes exist, and they work.

Yes, you've said that before. And I even agreed with you, they do exist, they're just awful.

Haiku gets away with a lot because hardware has become much more capable since the days of BeOS. The poorly implemented resampler I mentioned earlier, for example, doesn't sound as terrible on a PC with 192kHz audio; it's still bad, but the higher sample rate hides the mistakes better.

Re: cortex

NoHaikuForMe wrote:

Do you understand what CPU load is? When you do more work, it takes more time. Perhaps this time is not perceptible to you, [...] Processing time has increased.

Before we continue misunderstandings, let's clear up definitions: when CPU load goes up, the increasing measure is CPU time. CPU time is the time the CPU uses to do actual work as opposed to being in its idle thread. By 'processing time' I mean something different: the wall-clock time it takes for a buffer to travel through a node. Whether a node takes 1 second at 20% CPU load or 1 second at 40%: that doesn't matter, it's still 1 second. The node could also use the 100% CPU for 0.1s and then sleep the remaining 0.9s before sending out the buffer -- still 1 second.

Quote:

There are basically two "methods" used by existing nodes.

There's a third one used by nodes: look at the buffer duration and make a calculation based on that.

Re: cortex

jua wrote:

There's a third one used by nodes: look at the buffer duration and make a calculation based on that.

Good point. The System Mixer uses two different numbers in this calculation to determine its internal "latency". I had considered these to be essentially picking a number out of the air, but you could treat them more kindly if you were generous.

For a FormatChangeRequested it chooses either the buffer duration plus 4.5ms or 1.5 times the buffer duration, whichever is larger.

At Connect it chooses buffer duration plus 3.5ms or 1.5 times the buffer duration plus 1.5ms, whichever is larger.

As with many nodes if it receives a "late" notice it increases the internal "latency" by the reported lateness, unless it reaches an arbitrary limit (here 150ms) and then further late notices are ignored.

These choices mean that under normal circumstances after a particular buffer B has been created by the System Mixer, but before that buffer is handled by the Multi Audio Node, the System Mixer will calculate a further buffer B+1, and queue that too, so there's always an extra buffer "in the air". I have not found any commentary to explain this decision, if indeed there ever was a conscious decision to have it work this way.
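In code, those two choices amount to something like the following. This is a paraphrase of the logic described above, not the actual mixer source; the function names are made up for the example, and the constants are in microseconds:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

typedef int64_t bigtime_t;  // microseconds, as in the Be API

// Paraphrase of the System Mixer's internal "latency" choices.
// On FormatChangeRequested: max(duration + 4.5ms, 1.5 * duration).
static bigtime_t LatencyOnFormatChange(bigtime_t buffer_duration)
{
    return std::max(buffer_duration + 4500,
                    (bigtime_t)(buffer_duration * 3 / 2));
}

// At Connect: max(duration + 3.5ms, 1.5 * duration + 1.5ms).
static bigtime_t LatencyAtConnect(bigtime_t buffer_duration)
{
    return std::max(buffer_duration + 3500,
                    (bigtime_t)(buffer_duration * 3 / 2 + 1500));
}
```

Note that for a 40 ms buffer both results exceed one buffer period (60 ms and 61.5 ms respectively), which is why there is always an extra buffer "in the air".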

Re: cortex

jua wrote:

Whether a node takes 1 second at 20% CPU load or 1 second at 40%: that doesn't matter, it's still 1 second. The node could also use the 100% CPU for 0.1s and then sleep the remaining 0.9s before sending out the buffer -- still 1 second.

On the whole, actual nodes don't block (sleep while working) like your examples. They're running at far higher priority than almost everything else (most at B_URGENT_PRIORITY, with some at B_REAL_TIME_PRIORITY), so they can't lose the lottery against lower-priority threads the way an ordinary B_NORMAL_PRIORITY thread occasionally can, and their memory is (supposed to be) wired so they don't wait for paging.

Any loads you may see reported in the user interface are averages over a considerable time, usually a second or more. So they're taking into account long (relatively speaking) periods when everything is asleep, not because a node blocked while working on a buffer, but because there was no work to be done.

So, when the node is actually doing any work, it always does that at 100% "CPU load". If two nodes need to do some work, they each do their work at 100% "CPU load", but they have to take turns and so it takes longer in wall clock time. Is this clearer for you now? The appearance of everything running simultaneously in a modern desktop OS is an illusion, like the appearance of continuous motion on a movie screen achieved by projecting a series of still images.

Re: cortex

Barrett wrote:

The additional buffer is requested only in offline mode.

Yes, but buffers are created spontaneously as the event time triggers. Offline mode is... vestigial at best at this point. Most stuff does not work properly in offline mode; ignore it.

Quote:

If we are not in offline or recording mode, the buffer is ignored, and the producer is notified that it is too late. The latency is not arbitrary but calculated this way:

lateness = (performance_time - latency - real_time)*-1

Did you mean to say lateness rather than latency, here? I can't tell what you think you're telling me.

Quote:

So it's not arbitrary at all; it's just the same as what is done in JACK

I said the limit is arbitrary. It was imposed by Axel back in 2010 and is set to 150 milliseconds; no further rationale was ever provided or asked for. It's an arbitrary choice.

Quote:

About the MediaPlayer latency, it's a bit more complex. The calculation is done this way: media_player_latency + downstream_latency. In your case, more than 100 ms is caused by the long buffers which Haiku uses by default. If you apply the hda_audio patch to your Haiku build, you will reduce it a lot.

The reported latency starts significantly lower and grows. The buffer period is, as I already said, about 40ms, and yes, this leaves more than one buffer "in flight". The code actually acknowledges this: it ensures extra buffers are allocated to allow for it, hence buffer bloat.

Quote:

You may say that it's doomed, but the latency mechanism of Haiku separates various latencies, and one of them is the processing latency.

I thought I'd explained what algorithmic latency is. It's just a little further back up the page, maybe go back and read it again.

Quote:

Well, it's not the latency that is used to schedule work, but the event_time.

The event time just cranks monotonically here. The RealTimeFor() function basically converts the event time to a real time and then subtracts the declared "latency", and that is how long the BMediaEventLooper will sleep. Thus, Media Kit latency controls when the work is actually scheduled to be done: increase the "latency" and the work will be scheduled sooner. It's a self-fulfilling prophecy.
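That scheduling relationship can be sketched like this. It is illustrative only, not the real BMediaEventLooper code; `WakeTime` and its parameters are names invented for the example:

```cpp
#include <cassert>
#include <cstdint>

typedef int64_t bigtime_t;  // microseconds, as in the Be API

// Illustrative: the control loop wakes up "latency" (plus its own
// scheduling-latency estimate) before the event's real time, so a
// larger declared latency means the work is dispatched earlier.
static bigtime_t WakeTime(bigtime_t event_real_time,   // RealTimeFor(event)
                          bigtime_t declared_latency,
                          bigtime_t scheduling_latency)
{
    return event_real_time - declared_latency - scheduling_latency;
}
```

Increase `declared_latency` and the wake-up time moves earlier, which is the self-fulfilling prophecy described above.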

Quote:

Do you remember the "DAC time source"? It's different from the system one. If the sound card driver provides it, it will use the card's clock.

That's actually the example I was looking at while writing. The card clock is not directly exposed as "DAC time source", instead, exactly as I wrote, the Multi Audio Node updates a frame counter which acts as the "clock" for this time source. It's a subtle difference, but it's worth knowing about.

Quote:

Could you file bugs in the bug tracker? At least then you would help us understand your reasons against the media_kit, and someone will probably find a solution. Otherwise, your words are just words. Verba volant, scripta manent.

There are plenty of bugs already filed about the Media Kit. Maybe somebody will fix some more of them this year. I am not speaking at all; any voices you're hearing are in your head. This is just text, as a bug report would be text.

Quote:

I don't know where you are looking, but those are not speculative. They are calculated in the way I showed at the beginning of this post.

Nothing was actually late in a sense that would matter to the user, but the producer is sent a late notification anyway. That's why I call it speculative. Remember that all the extant nodes respond to this by increasing their reported latency (so that, as we saw above, they will run "earlier").

Re: cortex

Real-time audio typically isn't more than 44.1kHz, though -- extra time for calculations on a processor tens of thousands of times faster doesn't come close to milliseconds. It would help to speak in terms of real sample rates (in place of "realtime") versus real CPU cycles. Otherwise, no matter how adamantly one might appeal to reality, we're not discussing real use, and every number brought up is completely baseless.

Re: cortex

NoHaikuForMe wrote:

Is this clearer for you now?

I understand scheduling and load tracking well, no worries. Your description there isn't really correct though, at least not for Haiku (hint: a B_URGENT_PRIORITY thread still plays "the lottery" -- lower-priority threads still get time-slices as well, no matter how much CPU the urgent one demands).

Anyway, I didn't even want to get lured into this discussion again, I will stop now.

Re: cortex

jua wrote:

I understand scheduling and load tracking well, no worries.

Oh?

jua wrote:

Your description there isn't really correct though, at least not for Haiku (hint: a B_URGENT_PRIORITY thread still plays "the lottery" -- lower-priority threads still get time-slices as well, no matter how much CPU the urgent one demands).

"The Be Book" wrote:

Real-time (100 and greater). A real-time thread is executed as soon as it's ready. If more than one real-time thread is ready at the same time, the thread with the highest priority is executed first. The thread is allowed to run without being preempted (except by a real-time thread with a higher priority) until it blocks, snoozes, is suspended, or otherwise gives up its plea for attention.

Both the old "simple" scheduler and Pawel's scheduler largely‡ honour this expectation, both according to their authors and my review of the code. In Pawel's scheduler, threads at B_URGENT_PRIORITY or other real-time priorities are exempt from the busy/greediness penalties used to punish normal threads. To be quite clear: lower-priority threads get no time-slices unless/until the urgent thread blocks. The exact opposite of what you wrote.

‡ So far as I can see, Pawel doesn't strictly honour the case where multiple threads are all runnable at, say, B_URGENT_PRIORITY. Be's intention is that this be treated like POSIX SCHED_FIFO rather than SCHED_RR, but Pawel seems to have implemented SCHED_RR. It's a small nitpick under the circumstances.

Re: cortex

NoHaikuForMe wrote:

Both the old "simple" scheduler and Pawel's scheduler largely‡ honour this expectation

While I don't have time to look at the code in detail now, it seems you are right on this one. I somehow remembered it as applying only to B_REAL_TIME_PRIORITY itself, not to everything above 100.
Anyway, you might want to reread what I said about processing time.

As I said, I won't discuss things further in here.

Re: cortex

Quote:

Did you mean to say lateness rather than latency, here? I can't tell what you think you're telling me.

Suppose we receive an event with a performance time of 50 ms, but real time is at 60 ms, and suppose our latency is 1 ms: 50 - 1 - 60 = -11; then multiply this value by -1 to get a lateness of 11 ms.

Quote:

That's actually the example I was looking at while writing. The card clock is not directly exposed as "DAC time source", instead, exactly as I wrote, the Multi Audio Node updates a frame counter which acts as the "clock" for this time source. It's a subtle difference, but it's worth knowing about.

I don't see how this is a design flaw; it seems natural to have a hardware abstraction.

Quote:
Quote:

Could you file bugs in the bug tracker? At least then you would help us understand your reasons against the media_kit, and someone will probably find a solution. Otherwise, your words are just words. Verba volant, scripta manent.

There are plenty of bugs already filed about the Media Kit. Maybe somebody will fix some more of them this year. I am not speaking at all, any voices you're hearing are in your head, this is just text, as a bug report would be text.

The topic is becoming annoying, and you are more or less talking with a hobbyist. If you are so sure about what you say, I think you should talk with the core developers. And yes, there are various tickets for the media_kit, but none related to the problems we discussed. That said, I'll stop talking with you.