
Summer (in a nutshell)


I benchmarked the Dhrystone program on several flavors of the Java Virtual Machine.

The system used was a Compaq with a 2.0 GHz clock speed and 608 MB of RAM. All the tests were run repeatedly on a clean (non-tainted) machine. The number of Dhrystone cycles was 1000000.
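A minimal sketch of how such a repeated measurement could be set up in Java. The names `BenchTimer` and `bestOfMillis`, the choice to keep the fastest run, and the placeholder workload are all assumptions for illustration; this is not the harness actually used for these tests.

```java
// Hypothetical timing helper: runs a workload several times and keeps the
// fastest wall-clock time, which reduces noise from other processes.
class BenchTimer {
    /** Returns the shortest elapsed time in milliseconds over `runs` executions. */
    static long bestOfMillis(Runnable workload, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            workload.run();
            long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
            best = Math.min(best, elapsedMs);
        }
        return best;
    }

    public static void main(String[] args) {
        final int cycles = 1_000_000;  // same cycle count as in the text
        long[] sink = new long[1];     // keeps the loop result observable
        long ms = bestOfMillis(() -> {
            long acc = 0;
            for (int i = 0; i < cycles; i++) {
                acc += i % 7;          // placeholder work, NOT the Dhrystone body
            }
            sink[0] = acc;
        }, 5);
        System.out.println("best of 5 runs: " + ms + " ms (checksum " + sink[0] + ")");
    }
}
```

Taking the best of several runs is only one possible policy; averaging the runs would be an equally reasonable reading of "repeatedly".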


Summary of Results


The GCJ JVM

total time: 18828ms

Result: 53112 dhrystone/sec.

SUN JVM

total time: 677ms

Result: 1477104 dhrystone/sec.

JIKES RVM

total time: 6694ms

Result: 149387 dhrystone/sec.

Kaffe JVM

total time: 6404ms

Result: 156152 dhrystone/sec.
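The Result figures for the four JVMs follow from the cycle count and the total time: cycles divided by elapsed seconds, rounded down. A small sketch of that arithmetic (the class and method names are hypothetical):

```java
// Hypothetical helper that recomputes the "Result" lines from the totals.
class DhryRate {
    /** Dhrystones per second for `cycles` iterations completed in `totalMillis` ms. */
    static long dhrystonesPerSecond(long cycles, long totalMillis) {
        if (totalMillis <= 0) {
            throw new IllegalArgumentException("totalMillis must be positive");
        }
        return cycles * 1000L / totalMillis; // integer division rounds down
    }

    public static void main(String[] args) {
        long cycles = 1_000_000L;
        System.out.println("GCJ:   " + dhrystonesPerSecond(cycles, 18828)); // 53112
        System.out.println("Sun:   " + dhrystonesPerSecond(cycles, 677));   // 1477104
        System.out.println("Jikes: " + dhrystonesPerSecond(cycles, 6694));  // 149387
        System.out.println("Kaffe: " + dhrystonesPerSecond(cycles, 6404));  // 156152
    }
}
```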


Benchmark test for the C version of Dhrystone.


GNU C compiler

total time: 367ms

Result: 3467104 dhrystone/sec.


Compared with the Java results above, the C version shows an enormous increase in performance, and the same holds for C++. On average, C and C++ perform 8-10 times faster than Java.

Next I tried to modify the Dhrystone program to use native interfaces. Integrating the code completely with a native language like C wasn't easy. To make the conversion easier, I tried a couple of the tools that promise to generate JNI wrappers automatically. Just getting some of them to work was really tedious.

Most of the tools required a lot of paths to be set before they could be run. Below is a list of the tools that I tried.

Tools like jace and janet did not quite work for me on the Linux box.

Noodleglue seemed very promising but did not work: it threw an uncaught error. I could not do much about it, as it only worked in the Windows environment and used .exe files.

Startj worked on Windows. It is a very small but powerful tool, and only a Windows version is available. It successfully generated all the wrappers, but the result was very slow when put together, because startj used plenty of unwanted exceptions. Even for simple programs like HelloWorld, it would generate too many files, not to mention the exceptions involved in them. On the other hand, it was very good as an IDE and worked very well for its size.

Jnipp was the next tool that I tried my hand at. I had to set a million paths before it actually worked; the attached file mypath.txt lists all the paths that were eventually set to get it to work. Jnipp generated all the files, including a makefile. Initially it didn't work; I fixed a couple of bugs and then it finally worked. It performed really badly, though. The Linux GCJ hung a couple of times. The Sun JDK would run it through but took a long time. I am still trying to optimize the code. Sometimes it just says "aborted", and I don't understand why. The folder mychap has all the files produced by jnipp, along with the makefile.

Hand coding was another option. I hand coded the wrappers, and they seemed to perform better than any of the automatically generated code. The attached folder also has the source code that I coded by hand and a makefile to compile it. It has a readme too, which asks you to set a few paths before it actually works.
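For readers unfamiliar with JNI, the Java side of a hand-written wrapper might look roughly like the sketch below. This is not the code from the attached folder: the class name `NativeDhry`, the method `addOffset`, and the library name `dhrynative` are hypothetical, and the pure-Java fallback is added only so the example runs without a compiled native library.

```java
// Hypothetical Java side of a hand-coded JNI wrapper.
class NativeDhry {
    // True only if the (hypothetical) native library could be loaded.
    private static final boolean NATIVE_LOADED = tryLoad();

    private static boolean tryLoad() {
        try {
            System.loadLibrary("dhrynative"); // e.g. libdhrynative.so on Linux
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false; // library not on java.library.path, or loading not permitted
        }
    }

    // Implemented in C. By the JNI naming convention, the C function for this
    // method in a default-package class would be Java_NativeDhry_addOffsetNative.
    private static native int addOffsetNative(int a, int b);

    // Pure-Java equivalent, used when the native library is unavailable.
    static int addOffsetJava(int a, int b) {
        return b + (a + 2);
    }

    /** Public entry point: callers never see whether native code was used. */
    static int addOffset(int a, int b) {
        return NATIVE_LOADED ? addOffsetNative(a, b) : addOffsetJava(a, b);
    }

    public static void main(String[] args) {
        System.out.println("native library loaded: " + NATIVE_LOADED);
        System.out.println("addOffset(3, 4) = " + addOffset(3, 4));
    }
}
```

Each call through a wrapper like this crosses the Java/native boundary, so a method this small is unlikely to gain anything from going native; the crossing cost is one plausible reason wrapper-heavy code ends up slow.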

The aim was to achieve 90% utilization, but at most I could get to 65%. I am still working on optimizing the code.

I did a lot of reading during the summer.

Papers

Automated and Portable Native Code Isolation

by Grzegorz Czajkowski, Laurent Daynes and Mario Wolczko


Improving Java Performance through Semantic Inlining

by Peng Wu, Sam Midkiff, Jose Moreira and Manish Gupta


High Performance Computing in Java

by P.V. Artigas, M.Gupta, S.P.Midkiff, J.E. Moreira


Semi Automatic Parallelization of Java Applications

by Pascal A. Felber.



Automated and Portable Native Code Isolation

by Grzegorz Czajkowski, Laurent Daynes and Mario Wolczko

This paper gave me a lot of insight into JNI and helped me in coding the JNI wrappers too. It's pretty cool; it has small code extracts too.


An Instruction Cache Architecture for Parallel Execution of Java Threads

by Wanming Chu


Implementing a Java Virtual Machine in the Java Programming Language.

by Antero Taivalsaari

(current)


A study of Java Virtual Machine Optimization and Implementation in Hardware

by Austin Kim

(current)

Read the VAST-C/AltiVec article (code optimizations).




Agenda for Further Research


The first goal is to optimize the JNI code to reach the 90% mark. Once I'm through with this, I wish to take the javax.vecmath package and generate wrappers for the methods it implements.


The next goal is to implement all of the methods in a native language such as C or C++. Once I do this, I could optimize this code too. In this case:

java source -> native code -> myprogram ->optimized native code -> new JNI wrappers -> use them on other programs.

I have an alternative here: instead of using native code, can I work directly on the bytecodes only (without involving any native code)? In other words:

java source -> bytecode -> myprogram -> vectorized byte code -> execute


As the next step, I would like to recognize computation-intensive patterns (in either the bytecodes or the native code) and manipulate these computation-intensive instructions to actually use the Vector Processing Unit; in other words, force the JVM to use the VPU.
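To make "computation-intensive pattern" concrete, here is a sketch of the kind of loop that is a typical candidate: an element-wise array operation with no dependence between iterations. The second method regroups the same work into blocks of four, assuming a 128-bit vector unit such as AltiVec that holds four 32-bit floats per register. Both methods are plain Java and do not touch the VPU themselves; the class and method names are hypothetical, and the code only shows the shape a vectorizer would look for and produce.

```java
import java.util.Arrays;

// Hypothetical illustration of a vectorizable loop and its 4-wide regrouping.
class VectorPattern {
    // Assumption: four 32-bit floats fit in one 128-bit vector register.
    static final int LANES = 4;

    /** Scalar form: one addition per iteration, iterations are independent. */
    static void addScalar(float[] a, float[] b, float[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] + b[i];
        }
    }

    /** Same computation grouped into blocks of LANES elements. */
    static void addBlocked(float[] a, float[] b, float[] out) {
        int limit = out.length - (out.length % LANES);
        int i = 0;
        for (; i < limit; i += LANES) {
            // These four additions correspond to one vector add on the VPU.
            out[i]     = a[i]     + b[i];
            out[i + 1] = a[i + 1] + b[i + 1];
            out[i + 2] = a[i + 2] + b[i + 2];
            out[i + 3] = a[i + 3] + b[i + 3];
        }
        for (; i < out.length; i++) {
            out[i] = a[i] + b[i]; // leftover elements handled one at a time
        }
    }

    public static void main(String[] args) {
        float[] a = new float[10];
        float[] b = new float[10];
        for (int i = 0; i < a.length; i++) {
            a[i] = i;
            b[i] = 2 * i;
        }
        float[] scalar = new float[10];
        float[] blocked = new float[10];
        addScalar(a, b, scalar);
        addBlocked(a, b, blocked);
        System.out.println("results match: " + Arrays.equals(scalar, blocked));
    }
}
```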

As in the first case, if I generated native code and vectorized it, I could generate new wrappers for the methods. These would reference the vectorized code directly and could be used with any program, aiming at a javax.vecmath package customized for the underlying hardware. The native method is more generic, whereas the direct (bytecode) method is more specific to a single program.


Once this phase is over, I need to test the results and measure the speed-ups.


Another thing that I really intend to work on:

I have been working with the open source Kaffe Virtual Machine. I have enrolled in all of its forums and kept track of some of the source code and the latest patches, which are all written in C. I intend to find a way to modify the Kaffe Virtual Machine to look up a library, perhaps, and generate vectorized bytecodes directly whenever a program is compiled. I have tried several times to come up with a layout for this, without success so far. I have also tried modifying the source code to do a couple of silly things, and some of them actually work. In this case I would have a virtual machine that actually generates machine-specific code.