World's fastest Haiku box
Well almost. The owner of this 8-core Mac Pro successfully booted Haiku on this PC and it is for sale:
You could do some very fast Haiku work here. I'm working on some SMP code that will use all processors.
Comments
Re: World's fastest Haiku box
Pretty decent, yes, but dollar for dollar it's not the most speed you can buy with your money.
A friend of mine upgraded to 32 cores (quad socket) and 32 GB of quad-channel RAM relatively cheaply. All you have to do is start looking at server boards instead of desktop boards and wait for the right price. For now, at least, it wouldn't matter what the video card is.
There are also already 12-core-per-socket CPUs, which you can get for a bit more $$.
Re: World's fastest Haiku box
Well I just built a new machine with the latest Intel 8 core Sandy Bridge CPU and 8GB RAM and it booted Haiku. But the video and network didn't work and I think there were interrupt issues that slowed it down. With that fixed and with Haiku installed on the SSD I got for this system as well it should be pretty darn fast. And since I got a deal on the CPU and SSD total cost was only around $600. No need to waste money on overpriced Mac hardware.
Re: World's fastest Haiku box
@ Ryan, you mean an Intel quad core w/HT. For the desktop, Intel only makes CPUs with up to 6 cores. HT (Hyper-Threading) shows up as a core in OSes but is not a true core, i.e. you get a 25-30% performance boost compared to 85+% with a real core.
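To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes a very simple model that is not from any vendor data: the first core counts as 1.0, each extra real core adds 85%, and HT multiplies the total by 1.25 to 1.30. The function name and the model itself are made up for illustration.

```python
def core_equivalents(physical_cores, ht_gain=0.0, extra_core_gain=0.85):
    """Rough throughput in 'single-core units' under the assumed model.

    The first core counts as 1.0, every additional physical core adds
    extra_core_gain, and Hyper-Threading scales the total by (1 + ht_gain).
    """
    base = 1.0 + (physical_cores - 1) * extra_core_gain
    return base * (1.0 + ht_gain)


# Quad core with HT, using the 25% and 30% figures from the post.
quad_ht_low = core_equivalents(4, ht_gain=0.25)
quad_ht_high = core_equivalents(4, ht_gain=0.30)

# Eight real cores without HT.
eight_real = core_equivalents(8)

print(f"4C/8T: {quad_ht_low:.2f} to {quad_ht_high:.2f} core equivalents")
print(f"8C/8T: {eight_real:.2f} core equivalents")
```

Under these assumptions the quad core with HT lands well short of eight real cores, even though both show up as 8 CPUs in the OS.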
http://en.wikipedia.org/wiki/Sandy_bridge#Desktop_processors
Be careful because Xeons come with 1 to 10 real CPU cores.
http://en.wikipedia.org/wiki/Xeon
versus the 2-6 real cores for the Core i7
http://en.wikipedia.org/wiki/Core_i7
This system uses 2 Xeon quad-core CPUs with a server motherboard, which is more costly than desktop parts.
http://en.wikipedia.org/wiki/Mac_Pro
@ rest,
People should only get the computer that they will fully use. No need to get 8 or more cores unless you really use them. Many cores and high CPU speeds also use up lots of electricity so it puts a greater strain on the electrical grid.
I really would like to know why people really need all this computing power. Except for compiling, 3D graphic design, video/audio conversion, scientific computing or running a server, the CPU is used fairly little by most desktop users.
Re: World's fastest Haiku box
I would like to say that although you may be able to build a server box from parts, that doesn't mean it would be good enough to put into production in a data warehouse. Mac Pros are indeed expensive, but the parts are premium. Mac Pros have a lot of cores because Adobe software like Photoshop can use them all and more. You may not need all those cores all the time, but if you are a professional on deadline, it's nice to know you have the headroom.
I look forward to the day when we have more Haiku software that uses 8, 12, 16, or more cores. It's what BeOS was designed for.
Re: World's fastest Haiku box
I would also like to say that it is nice to know that Haiku boots on Mac Pro. The core developers are doing good work and it shows.
Re: World's fastest Haiku box
My friend went with a board that supports the latest 12-core AMD CPUs (for 48 cores), however he went with the 8-core ones for the value... for now.
Also, the Apple software really can use the machine; he said the difference between his machine and a quad-core machine is immense.
Intel may be faster at single-threaded stuff, but there is a reason AMD dominates on supercomputers :) cost/performance is much better.
Re: World's fastest Haiku box
I would like to say that although you may be able to build a server box from parts, that doesn't mean it would be good enough to put into production in a data warehouse.
Advanced computer guys know the high-quality brands and can build a quality system from scratch. I've built quite a few systems myself which have lasted for years without issues, but it requires a lot of knowledge of brands & parts. The only good thing with Apple is that you do not have to worry about the system quality. For a typical user it might be worth paying the premium to know they are getting a very good quality system, but for expert computer techs it is not worth it because they can build a system of the same or better quality for less money.
You may not need all those cores all the time, but if you are a professional on deadline, it's nice to know you have the headroom.
For professional use, then yes, dual- or quad-socket systems make sense if they do CPU-intensive work, but not for your typical (or average) computer user. Only a small number of desktop super users will take advantage of all that computing horsepower.
I look forward to the day when we have more Haiku software that uses 8, 12, 16, or more cores. It's what BeOS was designed for.
BeOS had an 8 CPU (or core) limit, and Haiku has the same limit. I tested with QEMU: with the CPU count set to 16, Haiku booted but only showed 8 CPUs.
http://haiku.it.su.se:8180/source/xref/headers/os/kernel/OS.h#409
I believe this was done to stay compatible with BeOS. R1 Haiku will *officially* support only 8 cores.
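The effect I saw in QEMU can be pictured with a tiny sketch. This is not Haiku code; it just assumes the kernel clamps whatever the hardware reports to a compile-time maximum, and `MAX_CPU_COUNT` and `usable_cpus` are names I made up for illustration.

```python
# Assumption: stands in for the compile-time cap defined in the OS.h header
# linked above; the value 8 is the limit discussed in this thread.
MAX_CPU_COUNT = 8


def usable_cpus(detected, limit=MAX_CPU_COUNT):
    """Return how many CPUs the OS would actually schedule on."""
    return min(detected, limit)


for detected in (2, 8, 16):
    print(f"{detected} CPUs detected -> {usable_cpus(detected)} CPUs used")
```

Raising the limit would then just mean changing that one constant and recompiling, as long as nothing else depends on the old value.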
The number of cores used will depend on how well and how smartly the application is threaded. Yes, having more multi-threaded software on Haiku (or other OSes) would be very cool, but only CPU-intensive apps benefit from it.
Re: World's fastest Haiku box
Also, the Apple software really can use the machine; he said the difference between his machine and a quad-core machine is immense.
SMP support is also available in Linux, BSD, Haiku, Windows NT family, etc. Not just on MacOS X. Depends on which applications are used and how well they are multi-threaded. Without knowing what software or use for the system it is impossible to say anything.
I will say that is a very powerful system and hopefully he is using lots of multi-threaded software on it.
Re: World's fastest Haiku box
SMP support is also available in Linux, BSD, Haiku,
Tones, you can't get Photoshop or Final Cut Pro for these ones. In the hands of a pro, you can charge $125+/hour. You would pay for a Mac Pro system in 2 weeks.
Re: World's fastest Haiku box
I don't recall the software; he was running it for a friend that was in a hurry and didn't have time to wait weeks for his quad core to finish.
As far as what he runs, I know for a fact he is getting his bang for his buck LOL... he had it building all of the Red Hat repos from scratch, mostly vanilla with some important fixes, but I think Scientific Linux fixed up most of what he wanted working, as did the slow release of CentOS 6, heh. Apparently at one point he even ran out of RAM trying to build too many kernels at once, which is mind-boggling, but it happened O.o
It's overall the most impressive box I have ever seen, though he built one for someone else with far more resources, along the lines of 32-48 cores, 256 GB RAM and 3x SLI.
My mind has been on other things tonight and well I need sleep and am not getting it so I end up posting here... heh good night all and no I didn't hallucinate this post I am sure.
Personally I would sink my money into a TYAN board... I have an ancient one (2x300 MHz) that runs like a tank; it was an amazing board in its day. It's a Tyan Thunder 2 ATX, very similar to the board the BeOS demos ran on, though a few MHz faster.
Re: World's fastest Haiku box
SMP support is also available in Linux, BSD, Haiku,
Tones, you can't get Photoshop or Final Cut Pro for these ones. In the hands of a pro, you can charge $125+/hour. You would pay for a Mac Pro system in 2 weeks.
However for say, Red Hat Enterprise Linux you can get things like Autodesk's Maya
Although the direct reason for Haiku being limited to 8-way SMP is binary compatibility with BeOS, more pragmatically neither BeOS nor Haiku are really designed for many-way parallelism. Pure compute loads (where the operating system does almost nothing) should still run OK but as your needs grow the kernel will start to trip over its own feet so to speak.
Re: World's fastest Haiku box
So Haiku is limited somehow? Sure, there are scalability issues, but I'm not sure I follow what you are saying with regard to Haiku specifically. I guess it could be similar to how I/O performance was hampered on BeOS due to the suboptimal block size which was being used.
Re: World's fastest Haiku box
SMP support is also available in Linux, BSD, Haiku,
Tones, you can't get Photoshop or Final Cut Pro for these ones. In the hands of a pro, you can charge $125+/hour. You would pay for a Mac Pro system in 2 weeks.
Yes for professional use on MacOS X or Windows or as Linux server makes perfect sense but Haiku lacks the applications to make real use of a system like this.
Also, Final Cut Pro is made by Apple for MacOS X so no other choice for the OS. =)
You will not see FCP software on Windows.
Windows and MacOS X control most of the desktop market with 80% & 15% respectively or close to that. That's why you see most commercial software on those two OSes. ie, because they are popular and have most of the desktop OS market.
There is still some software available for Linux (open-source type) which does similar jobs.
http://en.wikipedia.org/wiki/Comparison_of_video_editing_software
http://en.wikipedia.org/wiki/Comparison_of_3D_computer_graphics_software
So a couple of options are still available, just not the big-name ones. These shorts done with Blender look interesting, but of course Blender is not a popular video editing choice like FCP.
http://en.wikipedia.org/wiki/Blender_3D#Use_in_the_media_industry
Re: World's fastest Haiku box
"Although the direct reason for Haiku being limited to 8-way SMP is binary compatibility with BeOS"
From previous discussion with Axel it is a simple recompile to increase this. It is not likely to break anything by increasing to 16 or even 32. Considering that you can soon get an AMD Bulldozer with 8 cores on a chip, and 4 chips to a motherboard, we might want to increase this number.
"more pragmatically neither BeOS nor Haiku are really designed for many-way parallelism. Pure compute loads (where the operating system does almost nothing) should still run OK but as your needs grow the kernel will start to trip over its own feet so to speak."
I disagree here. I think the massive threading in the OS makes it a good candidate for 'many-way' parallelism and I hope to release some code for testing and benchmarking. I do agree that the kernel will need some improvements. Like for instance I think the scheduler could use affinity. Right now I think I see a single load bouncing between two cores. That is probably sub-optimal.
Perhaps you have some other ideas on how to improve the kernel for larger CPU processing loads?
Re: World's fastest Haiku box
"http://en.wikipedia.org/wiki/Comparison_of_video_editing_software"
For best results, any video editing on Haiku should be 100% native. It should make full use of the rewritten media server and should use a native Haiku interface. We have source code for the old BeOS UltraDV video editor. It is a lot of code, it needs an experienced developer. It could be a shining star for Haiku when done right.
Re: World's fastest Haiku box
"http://en.wikipedia.org/wiki/Comparison_of_video_editing_software"
For best results, any video editing on Haiku should be 100% native. It should make full use of the rewritten media server and should use a native Haiku interface. We have source code for the old BeOS UltraDV video editor. It is a lot of code, it needs an experienced developer. It could be a shining star for Haiku when done right.
Sure, native software for Haiku will work best when you can find developers to write it. I believe Clockwerk is another option but I have not used either of these.
If open source, you should list it on OSDrawer. That way any developer can have a look and get it fixed up or working on Haiku.
http://dev.osdrawer.net/projects
OSDrawer is a site for open source BeOS & Haiku code.
Re: World's fastest Haiku box
The original developers have not made the decision to open source UltraDV yet.
Thanks for the Clockwerk reference, I need to check that out. As always, Tones, you are a walking 'memory bank' :-)
Re: World's fastest Haiku box
Sure, native software for Haiku will work best when you can find developers to write it.
A point worth repeating. Obviously we'd all love to have native software good enough to rival any ported offerings, but in practice that is simply impossible.
Given the limited number of developers, it's better to put the effort into software where a native version can really shine. Perhaps Clockwerk is such a piece of software; I don't know what the open source competition is in this field and how much it could make use of Haiku's possibilities.
WebPositive is a nice proof-of-concept, and certainly better overall than the now 5+(?) years old BezillaBrowser, but even here I can't help but think that an up-to-date port of Firefox/Chrome would be of much better use to end users. That said, since there is no such port, obviously WebPositive fills a need.
However, even with ports (and let's not kid ourselves, Haiku will be heavily dependent on ports for the foreseeable future) there are often possibilities to make the port appear/feel quite native.
Re: World's fastest Haiku box
I disagree here. I think the massive threading in the OS makes it a good candidate for 'many-way' parallelism and I hope to release some code for testing and benchmarking. I do agree that the kernel will need some improvements. Like for instance I think the scheduler could use affinity. Right now I think I see a single load bouncing between two cores. That is probably sub-optimal.
Soft-affinity would be an improvement even on the very common dual-core systems. But there are lots of other problems. Locks for example. To scale from one CPU to two or more, the locks ensure that access to critical data structures and code paths is serialised. But this serialisation becomes a bottleneck for the most frequently accessed structures as you scale up further.
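To make "soft-affinity" concrete, here is a toy sketch of the placement decision. It is not how Haiku's scheduler works; it just assumes a scheduler that remembers the CPU a thread last ran on and prefers it when it is idle. `pick_cpu` and its arguments are hypothetical names.

```python
def pick_cpu(last_cpu, busy):
    """Choose a CPU for a thread that has just become runnable.

    last_cpu: index of the CPU the thread last ran on, or None if it never ran.
    busy:     list of booleans, busy[i] is True if CPU i is running something.

    Soft affinity: prefer the previous CPU (its caches may still be warm),
    but fall back to any idle CPU rather than waiting. Returns None if every
    CPU is busy.
    """
    if last_cpu is not None and not busy[last_cpu]:
        return last_cpu
    for cpu, is_busy in enumerate(busy):
        if not is_busy:
            return cpu
    return None


# A single runnable thread on an otherwise idle dual core stays where it was
# instead of bouncing between the two cores.
print(pick_cpu(last_cpu=1, busy=[False, False]))

# If its old CPU is taken, it migrates rather than stalling.
print(pick_cpu(last_cpu=1, busy=[False, True]))
```

The "soft" part is the fallback: the preference is only a hint, so a thread never sits waiting for its favourite CPU while another one is free.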
Re: World's fastest Haiku box
I disagree here. I think the massive threading in the OS makes it a good candidate for 'many-way' parallelism and I hope to release some code for testing and benchmarking. I do agree that the kernel will need some improvements. Like for instance I think the scheduler could use affinity. Right now I think I see a single load bouncing between two cores. That is probably sub-optimal.
Soft-affinity would be an improvement even on the very common dual-core systems. But there are lots of other problems. Locks for example. To scale from one CPU to two or more, the locks ensure that access to critical data structures and code paths is serialised. But this serialisation becomes a bottleneck for the most frequently accessed structures as you scale up further.
It's not just a problem for Haiku, it's a problem for pretty much all operating systems. That's why GPU compute is so promising for big, heavy DSP-type loads.
Re: World's fastest Haiku box
Indeed, on Linux it was the BKL (big kernel lock), which was introduced when SMP support was added; only recently has it been removed and replaced with mutexes, I believe.
From what I gather a mutex is better than a lock, as it doesn't stall other things; it only protects access to the component that requires serialization?
Re: World's fastest Haiku box
Locks for example. To scale from one CPU to two or more, the locks ensure that access to critical data structures and code paths is serialised. But this serialisation becomes a bottleneck for the most frequently accessed structures as you scale up further.
Unlike Linux, BeOS was multiprocessor since 1.0. Can you give an example of a lock that is causing a bottleneck in Haiku now?
Re: World's fastest Haiku box
Locks for example. To scale from one CPU to two or more, the locks ensure that access to critical data structures and code paths is serialised. But this serialisation becomes a bottleneck for the most frequently accessed structures as you scale up further.
Unlike Linux, BeOS was multiprocessor since 1.0. Can you give an example of a lock that is causing a bottleneck in Haiku now?
Sure, what's the time to clear the lock? I.e. how much time is spent, in terms of clock cycles, on acquiring and releasing locks? It's about 20% of a thread's CPU demand, IIRC.
Look into Amdahl's law.
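For anyone who hasn't met it, Amdahl's law says the speedup on N CPUs is 1 / (s + (1 - s) / N), where s is the fraction of the work that stays serial. A quick sketch follows; treating the 20% figure above as the serial fraction is purely an assumption for illustration, not a measurement of Haiku.

```python
def amdahl_speedup(parallel_fraction, cpus):
    """Best-case speedup on `cpus` processors when only
    `parallel_fraction` of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cpus)


# Assumed for illustration: 20% of the time is serialized (e.g. on locks).
for cpus in (2, 4, 8, 16):
    print(f"{cpus:2d} CPUs: {amdahl_speedup(0.8, cpus):.2f}x")
```

With a 20% serial share the curve flattens quickly and can never exceed 5x, no matter how many cores are added.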
Anyway, I think GPGPU is a much better way to go. It doesn't make sense to parallelize a lot of operations because the overhead exceeds the performance benefit. The things we need to process faster can be done better on a GPU-style processor anyway, and given the changes that AMD and Nvidia plan to bring to the table soon, programming for them will get a lot easier in the future.
Re: World's fastest Haiku box
Sure whats the time to clear the lock ?
I'm not sure we are talking about the same thing here. I am looking for a specific example of a bottleneck in the Haiku kernel that affects SMP today...
Anyways I think gpgpu is a way better way to go
For what, cracking passwords? Running magma simulations? GPGPU processing is fairly specialized at this point. You can only offload very specific tasks, mostly math. It's not as useful as having a second, third, or fourth core.
Re: World's fastest Haiku box
Indeed, on Linux it was the BKL (big kernel lock), which was introduced when SMP support was added; only recently has it been removed and replaced with mutexes, I believe.
From what I gather a mutex is better than a lock, as it doesn't stall other things; it only protects access to the component that requires serialization?
"lock" and "mutex" are synonyms in this context, they're both just names for a method to control access to a data structure or subroutine, usually on a voluntary basis (ie the programmer is responsible for ensuring they "take the lock" before accessing the protected data).
The BKL is a strategy used in many once uni-processor Unix systems to bootstrap SMP support, it is sometimes also called a "giant lock". All access to the kernel is initially wrapped with a single lock (the BKL) so that only one CPU at a time can be running kernel code, but as many CPUs as are available can be running userspace code.
Although this is a good start, it's a long way from ideal. Large improvements can be made by releasing the BKL in common code paths and taking a mutex that's more specific. For example by having separate locks for SCSI controllers and TCP/IP, one CPU can be sending a TCP/IP packet while another is reading data from disk. Having a separate mutex for each network card would allow several CPUs to be sending a TCP/IP packet simultaneously, at the cost of making the networking code slightly more complex. The term for using each mutex to protect only a very specific codepath or data structure is "fine-grained locking". Such work began on Linux almost straight away in the Linux 2.0 kernel series.
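The difference between a giant lock and fine-grained locking can be sketched in a few lines. This is a toy model, not kernel code: `ToyKernel`, `read_disk` and `send_packet` are invented names, and since Python threads are themselves serialized by the interpreter, it only shows the structure of the locking, not a real speedup.

```python
import threading


class ToyKernel:
    """Two subsystems protected either by one giant lock or by one lock each."""

    def __init__(self, fine_grained):
        giant = threading.Lock()
        # Coarse: both subsystems share the giant lock.
        # Fine-grained: each subsystem gets its own, more specific lock.
        self.disk_lock = threading.Lock() if fine_grained else giant
        self.net_lock = threading.Lock() if fine_grained else giant
        self.disk_reads = 0
        self.packets_sent = 0

    def read_disk(self):
        with self.disk_lock:
            self.disk_reads += 1

    def send_packet(self):
        with self.net_lock:
            self.packets_sent += 1


def run_workload(kernel, n=1000):
    """One thread reads from disk while another sends packets."""
    def disk_worker():
        for _ in range(n):
            kernel.read_disk()

    def net_worker():
        for _ in range(n):
            kernel.send_packet()

    threads = [threading.Thread(target=disk_worker),
               threading.Thread(target=net_worker)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return kernel.disk_reads, kernel.packets_sent


coarse = ToyKernel(fine_grained=False)
fine = ToyKernel(fine_grained=True)
print("giant lock shared:", coarse.disk_lock is coarse.net_lock)
print("separate locks:   ", fine.disk_lock is not fine.net_lock)
print(run_workload(coarse), run_workload(fine))
```

Both variants produce the same counts; the point is that in the fine-grained variant the disk thread never has to wait for the network thread, because they no longer contend for the same lock.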
In the last five years or so there haven't really been many more opportunities to improve performance through fine-grained locking in Linux. The BKL retreated to the dustier corners of the kernel, like VT switching (when you press Ctrl-Alt-F2) or ioctl() calls. It was no longer a bottleneck, and for a while it seemed as though it was mostly harmless.
Ultimately replacing even these more obscure uses with various finer grained locks was desirable because the BKL makes interdependencies within the kernel trickier to understand. It's a recursive lock and it's automatically released when sleeping, so determining whether the lock is always held when calling a particular routine could be very difficult. Removing it altogether completed the job which began when it was introduced, Linux is now probably the most scalable OS kernel that ever existed.
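To show what "recursive lock" means, here is a small sketch using Python's standard `threading.RLock` as a stand-in. It only demonstrates the re-entrancy; it does not model the BKL's release-on-sleep behaviour, and the `inner`/`outer` functions are made up for the example.

```python
import threading

recursive = threading.RLock()   # may be re-acquired by the thread holding it
plain = threading.Lock()        # may not


def inner():
    # Works whether or not the caller already holds the lock, which is
    # convenient but hides whether the lock is really needed here.
    with recursive:
        return "inner ran"


def outer():
    with recursive:
        return inner()          # second acquisition by the same thread


print(outer())

# A plain lock refuses a second acquisition by its holder.
plain.acquire()
second_attempt = plain.acquire(blocking=False)
plain.release()
print("plain lock re-acquired:", second_attempt)
```

Because `inner` succeeds either way, reading the code doesn't tell you whether its callers are expected to hold the lock already, which is the kind of ambiguity that made the BKL hard to reason about.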
To get there though, something quite different had been happening meanwhile.
Re: World's fastest Haiku box
Unlike Linux, BeOS was multiprocessor since 1.0. Can you give an example of a lock that is causing a bottleneck in Haiku now?
My old boss would say "That's a problem we'd like to have". Meaning of course "The problems we have now are much worse". If you ran Haiku on a 16 CPU system with a workload that wasn't inherently CPU bound, the first few obstacles you'd probably run into don't have anything to do with locking, or at least not explicitly.
But once you have cleared those obstacles you'd start to hit things like the extensive reliance on a single block cache per volume with a mutex.
Re: World's fastest Haiku box
If you ran Haiku on a 16 CPU system with a workload that wasn't inherently CPU bound, the first few obstacles you'd probably run into don't have anything to do with locking, or at least not explicitly.
But once you have cleared those obstacles you'd start to hit things like the extensive reliance on a single block cache per volume with a mutex.
Yes but at least for now desktop users do not have to worry except for Intel i7 EE which is 6C/12T (or 6 physical, 12 logical CPUs). AMD desktop CPUs are 6 core & less and most Intel 4 core + HT (ie, 8 logical) or less.
From what I have seen, with 8 CPUs it still seems to run well. Only with benchmarks can you see whether SMP holds up and how well.
Also, since this is just code it can always be fixed with the right programmers. So, even if an issue today, that could change later on by some developer improving the code.
As for Linux scalability, well, if I ever need a system with 256 cores then I will keep that in mind. Or even with 16 or more cores but for now I am happy using my 2C/4T systems. In future we may have 8, 12 & 16 core on desktop but who can say when that will happen. The i7 EE is selling for $1,000 so too expensive and may take 2 years to drop to a better price.
Use "installoptionalpackage clockwerk" to install it and try it out, Andrew. Clockwerk was included with A1 & maybe A2 but was not part of A3.
Re: World's fastest Haiku box
As for video editing and other applications, native software is always best but Haiku will have to use many ports until post-R1 when more developers jump onboard and start writing Haiku software.
Also, trying to write software from scratch does not always make sense and better to port some apps over to Haiku.
People may also be familiar with certain software, like Blender, and having it also available on Haiku means someone can have choice whether to use native or ported version of the program.
Re: World's fastest Haiku box
I'm not sure we are talking about the same thing here. I am looking for a specific example of a bottleneck in the Haiku kernel that affects SMP today...
For what, cracking passwords? Running magma simulations? GPGPU processing is fairly specialized at this point. You can only offload very specific tasks, mostly math. It's not as useful as having a second, third, or fourth core.
I don't really know enough about the innards to know for sure, but from benchmarking I have done it's likely comparable to Windows and Linux in most of the SMP-heavy apps that are around, mainly HandBrake.
As to GPGPU, GCN and CUDA address many of the problems you're hinting at, and when a good programming model is developed, things like audio DSP, video DSP, etc. will show massive improvements. Actually some already do, BTW.
It's just a matter of a good programming model, and Haiku, as a fairly clean, unencumbered system, has a very good opportunity to make that happen. So if Haiku wants to really make that jump, that big leap, they can likely, given the rough construction of the OS and the APIs, make the jump to GPGPU processing post-R1, or even in a dev version, and likely really do it right. Currently it's sort of a hacky add-on everywhere else, unless it's custom-written software for a specific purpose, and supercomputers are showing just how powerful the GPU can be.
However, some apps won't benefit: highly serialized GUIs won't, but Haiku does well there already. It's the heavy lifting for audio, video, etc. types of processing where those workloads make the most sense to be on the GPU in the first place.
Look around the world is changing.
Re: World's fastest Haiku box
As to GPGPU, GCN and CUDA address many of the problems you're hinting at, and when a good programming model is developed, things like audio DSP, video DSP, etc. will show massive improvements. Actually some already do, BTW.
Well, CUDA would be pointless since it's NVidia ONLY. AMD however is going all-in with the cross-gpu framework OpenCL, not surprising perhaps since they also seem to be focusing on cpu's with gpu's in them. OpenCL allows the code to run on both cpu and gpu, originally drafted by Apple it's now backed heavily by Apple, AMD and also to some extent Intel.
It's the heavy lifting for audio, video, etc. types of processing where those workloads make the most sense to be on the GPU in the first place.
But these areas are handled in user-space, not by the Haiku system so I'm not sure I follow what you are suggesting, what exactly would the Haiku system use the gpu for?
Re: World's fastest Haiku box
People may also be familiar with certain software, like Blender, and having it also available on Haiku means someone can have choice whether to use native or ported version of the program.
I'm not sure exactly what you mean by a 'native' Blender, are you talking a pure native Blender equivalent? Haiku native equivalents of such specialized software like Blender, Inkscape, etc seems very unlikely to happen, even if Haiku were to attract lots of developers. I'd say this kind of software really works best as cross-platform projects.
If it's about making Blender look/feel native, I think Blender is a poor candidate for that given it uses its own OpenGL-driven interface.
Inkscape (which I mentioned) would be a much better candidate for being made to look/feel native. Still I'm not sure someone would find it worth doing it even there. Maintaining non-official ports is hard work enough I gather, so I'd be glad just to have the ports at all.
Re: World's fastest Haiku box
Not talking about making Blender more native.
Change out:
"native or ported version of the program" to "native or ported software application". ie, someone can choose to use native video editing software (like Clockwerk) or a ported version like Blender. If familiar with Blender then can use the software on Haiku right away without having to relearn it.
To me, making ported software look & feel native is unimportant and what really matters is that it runs well and is available on Haiku.
"Maintaining non-official ports is hard work enough I gather, so I'd be glad just to have the ports at all."
I agree. The user should decide if they want to use native software or ported software on Haiku. Having both gives better and more choice to Haiku users. Ported software can also run pretty well on Haiku except for a few, larger applications.
Re: World's fastest Haiku box
Well, CUDA would be pointless since it's NVidia ONLY. AMD however is going all-in with the cross-gpu framework OpenCL, not surprising perhaps since they also seem to be focusing on cpu's with gpu's in them. OpenCL allows the code to run on both cpu and gpu, originally drafted by Apple it's now backed heavily by Apple, AMD and also to some extent Intel.
But these areas are handled in user-space, not by the Haiku system so I'm not sure I follow what you are suggesting, what exactly would the Haiku system use the gpu for?
Yes, AMD is going for a more open and versatile approach, whereas CUDA is very vendor-specific. That makes a great case for supporting it.
I am talking about having userland APIs, kernel structures, etc. to make the GPGPU paradigm more effective: a closer tie-in versus a pure user-space approach, with a good programming method and design to make it work. OpenCL is a start; IIRC Gallium plans to support OpenCL at some point.