Maximizing Java Performance on AIX: Part 5 - References and Conclusion![]() |
![]() |
![]() |
Level: Intermediate Amit Mathur (amitmat@us.ibm.com), Senior Technical Consultant and Solutions Enablement Manager, IBM 17 May 2004 This is the conclusion of the 5-part series providing tips and techniques that are commonly used for tuning Java? applications for optimum performance on AIX?. We touch upon other interesting areas of Java performance tuning for AIX, look at a few case studies, and then end the series with a list of useful references. This is the conclusion of the 5-part series on Java Performance.The first article in the series laid the foundation for performance tuning, and parts 2, 3 and 4 looked at various bottlenecks that can affect a system's scalability and throughput. This article covers two important topics that were not covered previously, along with providing case studies and references. A frequently asked question (FAQ) is about translating Sun-specific command-line switches to IBM-specific switches. Also, any serious performance tuning exercise, like benchmarks, cannot ignore system-wide tuning. We touch upon these topics briefly in the next section. This is followed by a few case studies that attempt to illustrate how the tools and tips described in the series are applied to solve problems in the field. The emphasis is on understanding and learning how to use the tools and techniques mentioned in the series. The article, and the series, concludes with a recap of useful references.
Other Important Considerations This section talks about translating Sun Java configuration to IBM Java configuration, and System-wide tuning for AIX applications. The scope of both of these topics is quite vast, so we only touch them briefly. If you have an application that has been tuned for Sun Java, and you are attempting to migrate your application to AIX (or, for that matter, any platform running IBM Java), you may already have done the hard work. Understanding the application characteristics is half the battle. You can use the characteristics-based tuning tips explained in Part 2 and Part 3 based on the understanding obtained by the tuning exercise with Sun Java. However, we frequently receive queries about how to translate specific Sun Java command-line switches to equivalent IBM Java command-line switches. These switches almost always correspond to Garbage Collection, as a well-tuned GC is essential to any Java-based application's performance. A mapping between Sun and IBM switches is difficult because of the difference in JVM architecture. The IBM Java does not contain a Generational Garbage Collector, and does not understand any command-line switches that start with The table below attempts to translate Sun Java GC command-line switches to equivalent IBM switches. This mapping is based on the functionality of Sun switches as described in the Sun-specific article "Tuning Garbage Collection with the 1.4.2 Java Virtual Machine". You should attempt to use this table only for the very specific purpose of locating an equivalent (or close) switch for IBM Java. This is not meant to replace the tuning exercise, as even heap size requirements can be quite different. For general GC tuning tips with IBM Java, as well as for information on these and other GC switches for IBM Java, please refer to Fine-tuning Java garbage collection performance. The creation of this table did not involve any performance-related testing, but was based entirely on the documented use of the Sun switches in the reference quoted above.
As the above table demonstrates, translating the switches in most cases will involve simply discarding the Sun switches for IBM platforms. This makes the GC tuning exercise for IBM Java quite painless, while still providing superior performance. Using AIX tools like But for most multi-tiered applications and especially benchmarks, system-wide tuning is unavoidable. There are several excellent resources available for AIX performance tuning that you can consider. To get an idea of the kind of tuning normally required, you can look at actual published benchmarks. If you look at a recent SpecJBB 2000 result for IBM Java on AIX, say http://www.spec.org/osg/jbb2000/results/res2003q3/jbb2000-20030624-00194.html, the OS tunings are mentioned below:
Warning: These settings must not be applied without carefully understanding the consequences, as an improper use of these settings can actually worsen the system performance. So what do the above settings do? Referring to AIX documentation, you can quickly get a better understanding of what each of these settings is doing. Let us examine each of these in turn. The The The The Finally, the So you can see that the system was switched to large pages and a fixed priority scheduler, to get record numbers with SpecJBB 2000 benchmark. Can you use these same settings in your application? Probably not, but you are now in a position to examine these commands and the scheduling policies, and experiment to suit your application characteristics. This is the next step in Performance tuning.
In this section we look at a few examples, taken from actual issues handled by Java service team. These examples should give you a good idea of how to approach performance tuning, and how to use various tools to gather information that can be used for tuning exercise. Note that the cases for this section were not chosen based on how frequently the problem is encountered in the field. The emphasis is on understanding how to use the various tools and techniques discussed in the series to locate and correct performance issues. Case 1: Bad application response time The reported issue was that the Java-based application's response time was unacceptable. Using Looking at GC logs confirmed that the heap was expanding very often. This, combined with multiple allocation failures in quick succession and a very full heap, resulted in the Java application spending a lot of time just trying to locate a free chunk, not finding it, expanding the heap, and then satisfying only the current allocation request. This was fixed by specifying a larger value for The first step, using AIX tools, confirmed this to be a Java-related issue. The second step, guided by the fact that AIX tools indicated this to be a problem related with GC, could concentrate on GC logs directly. The third step used the available tuning parameters to break the unusual cycle that the application was getting into, allowing the excessive CPU time to be recovered. Another interesting scenario was raised as a performance issue, with CPU being busy most of the time. Looking at verbosegc, we could see that a GC cycle was being called a bit too frequently, resulting in the application spending most of its time doing just GC. The verbosegc traces showed that most of the GC activity was being caused due to multiple allocations of very large objects, roughly 10 MB or more in size. These were fragmenting the heap, making the GC cycles longer and thus affecting the performance. But looking at the verbosegc cycle, the customer could not say what these objects were. The easiest way to locate the culprit would have been to analyze the heapdump using HeapRoots tool. But there was another twist to the situation: the large objects were not surviving the GC cycle. So the heapdump did not show any objects of such size. This is a classic example of how profiling can be a very useful ally for locating and correcting problems in application sources. The Java Virtual Machine Profile Interface makes this problem trivial. For this particular example, we used a variation of the method described at Using JVMPI to Identify Large Memory Allocations, and were able to quickly identify the code that was doing this allocation. As the final case study for this article, we discuss a scenario that showed up looking like a simple sizing problem. An attempt was being made to scale the application to a 1000 users, and the application would run out of Java heap. Calculating the heap requirements based on the number of users, the heap size was increased from 1 GB to 1.5 GB. But this triggered OOM errors not coming because of Java heap. The Java heap would show enough free space, but the application logs would show that an OOM occurred. Using To dig further, the command-line switch For some time we tried to work around the problem by increasing the number of JNI references (if you specify a higher -Xoss value, it increases the limit in a proportional manner). But this only delayed the inevitable, and the severe Java heap fragmentation caused by the large number of pinned JNI references did not help either. A closer look at the application design revealed the true cause: an unbounded number of threads being created by the application. As the tests proceeded, the threads would wait for finalizers, and since finalizers are not predictable, there would be a large number of these threads waiting to release their JNI references. The only feasible solution in this case was to change the application code to correct these two problems. Once the application replaced the unbounded threads with a thread pool, and replaced finalizers wherever possible, the sizing work completed with flying colors. This case shows how you can sometimes end up trying to hit a moving target. An issue reported as Java heap exhaustion eventually turned out to be a design problem. Balancing the Java and Native heaps is usually a critical part of performance tuning, but in this case it was not sufficient. Having so many tools and techniques at your disposal gives you a much broader picture, allowing you to make informed decisions about what to tune.
This article concludes the series. We hope you find this series a valuable guide in maximizing the performance of your Java applications on AIX. The authors thank Ashok Ambati, Rajesh Jeyapaul, Sharad Ballal, Roger Leuckie and Mark Bluemel for their input and advice on these articles. A special note of thanks goes to John Tesch, whose collection of information on AIX Java performance was a major source of inspiration to the series.
Java on AIX
|