Unfortunately, some people are still unaware that Java's performance is comparable to that of C++. This post collects evidence to support that claim.

The mistaken perception that Java is slow persists largely because Java 1 in 1995 was indeed slower than C++. Java has improved a lot since then, e.g., with HotSpot. It is now at version 6 and will soon be at version 7. Java is now a competitive technology compared to C/C++. In fact, to realistically optimize C/C++ code, you need to find the “right” programmer to write it. This programmer needs to be aware of all the performance issues of C/C++, know profiling and code optimization techniques such as loop unrolling, and may even need to write code snippets in assembly. Code written by an average Joe in C/C++ is probably no faster than the same code written in Java.

(I am in general against code optimization techniques because they make the code unreadable to humans and hence unmaintainable, as is the case for a lot of the FORTRAN/C/C++ code found in Netlib and Statlib.)

More importantly, most modern software runs on multiple cores. The gains from code optimization techniques are dwarfed by those from parallel computing. It is significantly easier and more efficient (and more enjoyable) to write concurrent code in Java than in C++. Therefore, to write high-performance software, I personally prefer to code for multi-core, multi-CPU, and cloud environments in Java rather than doing code optimization in C/C++.
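To make the multi-core point concrete, here is a minimal sketch of splitting a CPU-bound loop across threads using the standard `java.util.concurrent` package. The class name, method names and the sum-of-squares workload are made up for illustration, and the lambda syntax assumes Java 8 or later; on older versions an anonymous `Callable` would do the same job.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; not taken from any library.
final class ParallelSumOfSquares {

    // Sums i*i over [from, to) on the calling thread.
    static long sumOfSquares(int from, int to) {
        long sum = 0;
        for (long i = from; i < to; i++) {
            sum += i * i;
        }
        return sum;
    }

    // Splits [0, n) into one chunk per worker and adds up the partial sums.
    static long parallelSumOfSquares(int n, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            int chunk = (n + workers - 1) / workers; // ceiling division
            for (int w = 0; w < workers; w++) {
                final int from = Math.min(n, w * chunk);
                final int to = Math.min(n, from + chunk);
                Callable<Long> task = () -> sumOfSquares(from, to);
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // blocks until that chunk is done
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("worker task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int workers = Runtime.getRuntime().availableProcessors();
        System.out.println(parallelSumOfSquares(1_000_000, workers));
    }
}
```

The thread pool, the task queue and the `Future` results all come from the standard library, so the only hand-written logic is how to cut the range into chunks. Whether this beats a tuned single-threaded loop depends on the workload and the machine.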

(I AM NOT SURE WHY FORTRAN SURVIVES IN 2011. HOW ARE YOU SUPPOSED TO READ THOUSANDS OF LINES OF CODE ALL IN UPPER/LOWER CASE WITH A BUNCH OF C’S AND GOTO’S EVERYWHERE?)

Briefly, my conclusion is that, among the general-purpose programming languages (hence excluding Scala, etc.), we should use Java instead of C/C++, FORTRAN, assembly, etc. whenever possible, because Java is the easiest programming language to learn and work with, without a sacrifice in performance.

(For me, Java is an easier language than C# because the Java IDE technology is far better than its C# counterparts.)

The evidence I have collected is listed below. Please feel free to expand the list.

  1. Java has recently won some major benchmark competitions.
    1. http://developer.yahoo.com/blogs/hadoop/posts/2008/07/apache_hadoop_wins_terabyte_sort_benchmark/
    2. http://developer.yahoo.com/blogs/hadoop/posts/2009/05/hadoop_sorts_a_petabyte_in_162/
    3. http://news.cnet.com/8301-13846_3-10242392-62.html
  2. Recent independent studies seem to show that Java's performance in high-performance computing (HPC) is similar to FORTRAN's on computation-intensive benchmarks.
  3. http://blog.cfelde.com/2010/06/c-vs-java-performance/
  4. http://www.amazon.com/Fixed-Income-Analytics-Developer-Circa/lm/R3FV39FJRU3FE9

 

10 Comments

  1. Well said.

    For high-performing code in a modern object-oriented language, what really matters is the data structures, well-controlled threading, and an optimized code execution path. The folks at LMAX developed the Disruptor, which can handle 6 million messages per second, and it’s purely in Java…

  2. Jia, thanks for mentioning the “Disruptor”, a concurrent programming framework. It is an excellent example of how Java is not slower than any other language.

    I am particularly impressed with:

    “The Disruptor is the result of our research and testing. We found that cache misses at the CPU-level, and locks requiring kernel arbitration are both extremely costly, so we created a framework which has ‘mechanical sympathy’ for the hardware it’s running on, and that’s lock-free.”

    And, this is all done in Java!

  3. “An opinion exists that Java is not suitable for computational modeling and finite
    element programming because of its slow execution speed. It is true that Java
    is slower than C in performing “multiply-add” arithmetic inside double and triple
    loops. However, tuning of important Java code fragments provides computational
    speed comparable to that of C.”

    – Programming Finite Elements in Java(TM) by G. P. Nikishkov (Jan 12, 2010)

  4. Hmm, I think that we just hit a language barrier. Looking at your original post, you wrote: “I’m having trouble deciding if I should start out with tutorials that cover the basics of J2ME programming, or maybe even the basics of Java programming?” I’ve seen almost that exact same question asked many times in other places, and I’ve interpreted it as “I’m trying to learn programming, where should I start?” In fact, I’ve personally been asked that question more times than I can count. Given the lack of context, I read that as though you were looking to *learn* how to program. If there were other clues in your post that indicated otherwise, then I must have totally missed them. Sorry for that, and the resulting confusion. Doug

  5. Fortran survives today because it is still (after 60+ years) the fastest of all!
    http://shootout.alioth.debian.org/u64q/which-programming-languages-are-fastest.php

    Yes, Fortran is NUMBER ONE. Java comes sixth in the list.

    You are grossly mistaken about Fortran: it stopped using all caps back in 1990. It is “Fortran 2008” now, not “FORTRAN 77” (which was mostly used in Netlib). Today’s Fortran has modules, OOP, function/operator overloading, runtime polymorphism and everything needed for modern software development of numerical libraries.

  6. Fortran survives because of the large amount of legacy code that is reused. For new developments I don’t know if it’s worth it.

  7. I like both Fortran and Java. For prototyping I use MATLAB, then translate into Fortran most of the time. I may then use JNA to mix it with Java.

  8. I know that INET, the large stock exchange system powering NASDAQ on Wall Street, is written in Java. Also, their derivative system Genium INET has the time-critical matching engine part written in Java. The trick to getting extreme performance is to never trigger the garbage collector, and to skip real-time Java. This can be done by preallocating a lot of objects and then constantly reusing them.

  9. The reason Fortran is faster than C/C++ is that Fortran does not have any pointers. Its matrices are of fixed size, which means Fortran can optimize heavily. C/C++ has pointers, which means the matrix sizes are not really known to the compiler (dynamic size), so the compiler cannot do such heavy optimizations as a Fortran compiler can.

    If you know that a struct is always 32 bytes, you can use that to your advantage when optimizing. In C/C++ the struct does not necessarily have 32 bytes; the size can vary, so you need to take precautions and cannot optimize heavily – all because of pointers.

  10. Nice article!
    What about C++? It has many libraries with the same performance as C, doesn’t it? You have shared valuable information here that will help me choose the best language to learn first.
    Thanks a lot for sharing.
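
As a follow-up to comment 8, the preallocate-and-reuse idea can be sketched roughly as below. This is only a minimal, single-threaded illustration and makes no claim about how INET or Genium INET is actually built: `OrderPool`, `Order` and its fields are hypothetical names, and a real matching engine would also need a thread-safe or per-thread pool.

```java
import java.util.ArrayDeque;

// Hypothetical fixed-size object pool; not thread-safe.
final class OrderPool {

    // Hypothetical mutable message type; fields are overwritten on reuse.
    static final class Order {
        long id;
        double price;
        int quantity;
    }

    private final ArrayDeque<Order> free;

    // Allocates every Order up front so the hot path never calls `new`.
    OrderPool(int capacity) {
        free = new ArrayDeque<>(capacity);
        for (int i = 0; i < capacity; i++) {
            free.push(new Order());
        }
    }

    // Hands out a preallocated Order, or fails fast if the pool is exhausted.
    Order acquire() {
        Order order = free.poll();
        if (order == null) {
            throw new IllegalStateException("pool exhausted");
        }
        return order;
    }

    // Returns an Order to the pool so a later acquire() can reuse it.
    void release(Order order) {
        free.push(order);
    }

    // Number of Orders currently waiting in the pool.
    int available() {
        return free.size();
    }

    public static void main(String[] args) {
        OrderPool pool = new OrderPool(1024);
        for (long id = 0; id < 1_000_000; id++) {
            Order order = pool.acquire();
            order.id = id;
            order.price = 100.0;
            order.quantity = 10;
            // ... process the order here ...
            pool.release(order);
        }
        System.out.println(pool.available());
    }
}
```

After the constructor has run, the loop in `main` creates no new `Order` objects, so it gives the garbage collector nothing to collect on that path. The cost is that callers must remember to call `release` and must not keep using an object after returning it.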

