Putting GlassFish v3 in Production: Essential Survival Guide

November 27, 2009

On December 10, GlassFish v3 GA will be released to the world. As you are aware, the marketing vehicle for this release will be Java EE 6 and the fact that GlassFish is now a full OSGi runtime/container! Both are innovative technologies, but they will not save your life once you put GlassFish in production, hence this survival guide :-). In the end, once your OSGi/EE 6 application is ready, you still want the same great performance you got with GlassFish v2. This post gives some hints about how to configure and prepare GlassFish v3 for production use.

[Figure: v3runtime.png]

New Architecture

With v3, the role of the Grizzly Web Framework has increased significantly compared with v2. In v2, its role was to serve HTTP requests in front of the Tomcat-based WebContainer. In v3, Grizzly is used as an extensible micro-kernel that handles almost all real-time operations, including dispatching HTTP requests to Grizzly’s Web-based extensions (Static Resource Container, Servlet Container, JRuby Container, Python Container, Grails Container), Administrative extensions (Java WebStart support, Admin CLI), the WebService extension (EJB) and the Monitoring/Management REST extensions.

[Figure: v3-diagram.pgn]

At runtime, Grizzly will do the following:

[Figure: v3runtime.png]

If you are familiar with Grizzly’s internals:

[Figure: v3runtime.png]

As you can see, it is critical to properly configure GlassFish in order to get the expected performance for your application and GlassFish in general.

Debugging GlassFish

Before jumping into the details, I recommend you always run GlassFish with the following property, which displays the Grizzly internal configuration for both the NIO and Web Framework in the server log:

-Dcom.sun.grizzly.displayConfiguration=true or
network-config>network-listeners>network-listener>transport#display-configuration

If you need to see what Grizzly is doing under the hood, such as the request headers received and the response written, you can turn on snoop so you don’t need to use Wireshark or ngrep:

-Dcom.sun.grizzly.enableSnoop=true or 
network-config>network-listeners>network-listener>transport#enable-snoop 

Note that enabling this mechanism makes performance drop significantly, so use it only for debugging purposes.

Configuring the VM

Make sure you remove the following jvm-options from domain.xml:

-Xmx512m -client

and replace them with

-server -XX:+AggressiveHeap -Xmx3500m -Xms3500m -Xss128k
-XX:+DisableExplicitGC

For anything other than Solaris/SPARC, 3500m needs to be 1400m. On a multi-CPU machine, add:

-XX:ParallelGCThreads=N -XX:+UseParallelOldGC

where N is the number of CPUs if there are fewer than 8 (so really, you can leave it out altogether in that case) and N = number of CPUs / 2 otherwise. On a Niagara, add:

-XX:LargePageSizeInBytes=256m
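The sizing rule for -XX:ParallelGCThreads given above can be sketched in a few lines of Java. This is only an illustration of the arithmetic; the class GcThreadSizing and its method names are made up for this example and are not part of GlassFish or the JVM.

```java
// Sketch of the -XX:ParallelGCThreads sizing rule described above.
// GcThreadSizing is a hypothetical helper, not a GlassFish or JVM API.
final class GcThreadSizing {

    // N = number of CPUs when there are fewer than 8, otherwise CPUs / 2.
    static int parallelGcThreads(int cpus) {
        if (cpus < 1) {
            throw new IllegalArgumentException("cpus must be >= 1");
        }
        return cpus < 8 ? cpus : cpus / 2;
    }

    // Builds the jvm-options string for the given CPU count.
    static String gcOptions(int cpus) {
        return "-XX:ParallelGCThreads=" + parallelGcThreads(cpus)
                + " -XX:+UseParallelOldGC";
    }

    public static void main(String[] args) {
        // Uses the CPU count the JVM reports for the current machine.
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println(gcOptions(cpus));
    }
}
```

Run it on the target machine and paste the printed line into domain.xml. With fewer than 8 CPUs, the computed value is simply the CPU count, which is why the option can be left out in that case.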

You can also install the 64-bit JVM and use

-XX:+UseCompressedOops

with JDK 6u16 and later. A 64-bit JVM with

-XX:+UseCompressedOops

will allow you to specify larger Java heaps. This is especially useful on Windows x64, where a 32-bit JVM is limited to about

-Xmx1400m

of max Java heap. Note that running a 64-bit JVM means you need a 64-bit operating system. That's not an issue with Solaris, but many people who run Linux only run the 32-bit version, and Windows users will need a 64-bit Windows in order to use a 64-bit Windows JVM. A 64-bit JVM with -XX:+UseCompressedOops will give you larger Java heaps with 32-bit performance. A 64-bit JVM also makes additional CPU registers available on Intel/AMD platforms.
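After editing domain.xml, it can be worth confirming which options the JVM actually received. The sketch below uses the standard java.lang.management API to list the arguments of the running JVM and report which recommended flags are missing. The class name JvmFlagCheck and the list of flags checked are assumptions chosen for this example; run the same logic inside the server (for instance from a test servlet) to inspect GlassFish itself.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// JvmFlagCheck is an illustrative helper, not part of GlassFish.
final class JvmFlagCheck {

    // Returns the wanted flags that do not appear in the actual JVM arguments.
    static List<String> missingFlags(List<String> actual, List<String> wanted) {
        List<String> missing = new ArrayList<String>();
        for (String flag : wanted) {
            if (!actual.contains(flag)) {
                missing.add(flag);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // A subset of the options recommended above; adjust to your setup.
        List<String> wanted = Arrays.asList(
                "-XX:+AggressiveHeap", "-XX:+DisableExplicitGC", "-Xss128k");
        // Arguments passed to the JVM this code is running in.
        List<String> actual =
                ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Missing flags: " + missingFlags(actual, wanted));
        System.out.println("Max heap (MB): "
                + Runtime.getRuntime().maxMemory() / (1024 * 1024));
    }
}
```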

Configuring the Thread Pool

Make sure you take a look at "what changed" since v2 and how you can properly configure Grizzly in v3. The settings you should care about are the acceptor threads

network-config>transports>transport>tcp#acceptor-threads

and the number of worker threads

network-config>thread-pools>http-threadpool

The recommended value for acceptor threads is the number of cores/processors available on the machine you deploy on. I recommend you always run a sanity performance test with the default value (1) and with the number of cores, just to make sure.

Next, decide the number of threads required per HTTP port. With GlassFish v2, the thread pool configuration was shared amongst all HTTP ports, which was problematic, as some ports/listeners didn't need as many threads as port 8080. We fixed that in v3, so you can configure the thread pool per listener. The ideal value for GlassFish v3 should always be between 20 and 500, as Grizzly uses a non-blocking I/O strategy under the hood and you don't need as many threads as you would with a blocking I/O server like Tomcat. I can't recommend a specific number here; it always depends on what your application is doing. For example, if you do a lot of database queries, you may want a higher number of threads just in case the connection pool/JDBC blocks on a database and "wastes" threads until they unlock.

In GlassFish v2, we saw a lot of applications hang because all the worker threads were locked by the connection pool/JDBC. The good thing in v3 is that those "wasted" threads will eventually time out, something that wasn't available in v2. The default value is 5 minutes, and it is configurable:

configs.config.server-config.thread-pools.thread-pool.http-thread-pool.idle-thread-timeout-seconds
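As a starting point for the sanity tests mentioned above, the two rules of thumb (acceptor threads equal to the number of cores, worker threads kept between 20 and 500) can be written down as a small helper. ThreadPoolSizing is a hypothetical class for illustration only; it computes candidate values and does not talk to GlassFish.

```java
// ThreadPoolSizing is a hypothetical helper that encodes the rules of thumb
// from this section; it is not a GlassFish or Grizzly API.
final class ThreadPoolSizing {

    static final int MIN_WORKERS = 20;
    static final int MAX_WORKERS = 500;

    // Candidate acceptor-threads value: one per core, never less than 1.
    static int acceptorThreads(int cores) {
        return Math.max(1, cores);
    }

    // Keeps a requested worker thread count inside the 20..500 range.
    static int clampWorkerThreads(int requested) {
        return Math.max(MIN_WORKERS, Math.min(MAX_WORKERS, requested));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("acceptor-threads candidate: " + acceptorThreads(cores));
        // 64 is an arbitrary example request, not a recommendation.
        System.out.println("worker threads candidate: " + clampWorkerThreads(64));
    }
}
```

The numbers it prints are candidates to benchmark against the defaults, not values to apply blindly.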

I/O strategy and buffer configuration

In terms of buffers used by Grizzly to read and write I/O operations, the default (8192) should be the right value but you can always play with the number

network-config>protocols>protocol>http#header-buffer-length-bytes
network-config>protocols>protocol>http#send-buffer-size-bytes

If your application is doing a lot of I/O operations like write, you can also tell Grizzly to use an asynchronous strategy

-Dcom.sun.grizzly.http.asyncwrite.enabled=true

When this strategy is used, all I/O writes are executed using a dedicated thread, freeing the worker thread that executed the operation. Again, it could make a big difference. Another alternative to consider, if you notice that some write operations seem to take more time than expected, is to increase the pool of "write processors" by increasing the number of NIO Selectors:

-Dcom.sun.grizzly.maxSelectors=XXX

Make sure this number is never smaller than the number of worker threads, as that will give disastrous performance. You should increase this number if your application uses the new Servlet 3.0 Async API, the Grizzly Comet Framework or Atmosphere (recommended). When asynchronous APIs are used, GlassFish needs more "write processors" than without.
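The constraint above (selectors never fewer than worker threads) is easy to get wrong when both values are tuned separately, so a quick check can help. The sketch below reads the com.sun.grizzly.maxSelectors system property from the JVM it runs in and compares it with a worker thread count passed on the command line. SelectorCheck and its default of 20 worker threads are assumptions for this example.

```java
// SelectorCheck is an illustrative helper, not part of Grizzly or GlassFish.
final class SelectorCheck {

    static final String PROPERTY = "com.sun.grizzly.maxSelectors";

    // The rule from the text: selectors must not be fewer than worker threads.
    static boolean isSafe(int maxSelectors, int workerThreads) {
        return maxSelectors >= workerThreads;
    }

    public static void main(String[] args) {
        // Worker thread count to check against; 20 is just an example default.
        int workerThreads = args.length > 0 ? Integer.parseInt(args[0]) : 20;
        // Integer.getInteger returns null when the system property is not set.
        Integer maxSelectors = Integer.getInteger(PROPERTY);
        if (maxSelectors == null) {
            System.out.println(PROPERTY + " is not set");
        } else if (isSafe(maxSelectors, workerThreads)) {
            System.out.println("OK: " + maxSelectors + " selectors for "
                    + workerThreads + " worker threads");
        } else {
            System.out.println("WARNING: " + maxSelectors + " selectors is fewer than "
                    + workerThreads + " worker threads");
        }
    }
}
```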

Let Grizzly magically configure itself

Grizzly also has two properties, "unsupported" in GlassFish, that can always be used to let GlassFish auto-configure itself. Those properties may or may not make a difference, but you can always try them with and without your configuration. The first one will configure the buffers, acceptor threads and worker threads for you:

-Dcom.sun.grizzly.autoConfigure=true

The second one will tell Grizzly to change its threading strategy to leader/follower

-Dcom.sun.grizzly.useLeaderFollower=true

It may or may not make a difference, but it is worth a try. You can also force Grizzly to terminate all its I/O operations using a dedicated thread:

-Dcom.sun.grizzly.finishIOUsingCurrentThread=false

It may make a difference if your application does a small amount of I/O operations under load.

Cache your static resources!

Now by default, the Grizzly HTTP File Caching is turned off. To get decent static resources performance, I strongly recommend you turn it on (it makes a gigantic difference)

network-config>protocols>protocol>http>file-cache#enabled

Only for JDK 7

Now, if you are planning to use JDK 7, I recommend you switch Grizzly’s ByteBuffer strategy and allocate memory outside the VM heap by using direct byte buffers:

-Dcom.sun.grizzly.useDirectByteBuffer=true

Do this only on JDK 7: with JDK 6, allocating heap byte buffers gives better performance than native ones. If you notice GlassFish is allocating too much native memory, just add

-Dcom.sun.grizzly.useByteBufferView=false

That should reduce the native memory usage.
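For readers less familiar with the distinction, the difference between heap and direct (native) byte buffers comes straight from the standard java.nio API. The sketch below only illustrates the two allocation modes; it does not reproduce how Grizzly manages its buffers internally, and BufferKinds is a made-up class name.

```java
import java.nio.ByteBuffer;

// BufferKinds illustrates heap versus direct allocation with plain java.nio.
// It is not Grizzly code.
final class BufferKinds {

    // Allocates a buffer on the Java heap or in native memory outside it.
    static ByteBuffer allocate(int size, boolean direct) {
        return direct ? ByteBuffer.allocateDirect(size) : ByteBuffer.allocate(size);
    }

    public static void main(String[] args) {
        ByteBuffer heap = allocate(8192, false);
        ByteBuffer nativeBuffer = allocate(8192, true);
        // Heap buffers count against -Xmx; direct buffers use native memory.
        System.out.println("heap buffer direct?   " + heap.isDirect());
        System.out.println("native buffer direct? " + nativeBuffer.isDirect());
    }
}
```

Because direct buffers live outside the heap, their memory does not show up in the -Xmx budget, which is why native memory usage is the thing to watch after enabling the property above.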

WAP and Slow Network

If your application will be used by phones using the WAP protocol or from a slow network, you may extend the default timeout Grizzly applies when executing I/O operations:

-Dcom.sun.grizzly.readTimeout or
network-config>network-listeners>network-listener>transport#read-timeout

for read, and

-Dcom.sun.grizzly.writeTimeout or
network-config>network-listeners>network-listener>transport#write-timeout

for write. The default for both is 30 seconds. That means Grizzly will wait 30 seconds for incoming bytes to come from the client when processing a request, and 30 seconds when writing bytes back to the client, before closing the connection. On a slow network, 30 seconds for executing the read operations may not be enough and some clients may not be able to execute requests. But be extra careful when changing the default value: if the value is too high, a worker thread will be blocked waiting for bytes and you may end up running out of worker threads. Note also that a malicious client may produce a denial of service by sending bytes very slowly. It may take as long as 5 minutes (see the thread timeout configuration above) before Grizzly reclaims the worker threads. If you experience write timeouts, e.g. the remote client is not reading the bytes the server is sending, you may also increase the value, but I would instead recommend you enable the async I/O strategy described above to avoid locking worker threads.

Configuring the keep alive mechanism a-la-Tomcat

The strategy Grizzly uses for keeping a remote connection open is to pool the file descriptor associated with the connection and, when an I/O operation occurs, get the file descriptor from the pool and spawn a thread to execute the I/O operation. As you can see, Grizzly isn't blocking a thread that waits for an I/O operation (the next client's request). Tomcat's strategy is different: when Tomcat processes requests, a dedicated thread blocks between client requests for a maximum of 30 seconds. This gives really good throughput but it doesn't scale very well, as you need one thread per connection. But if your OS has tons of threads available, you can always configure Grizzly to use a similar strategy:

-Dcom.sun.grizzly.keepAliveLockingThread=true

Tomcat also has an algorithm that reduces the time a thread can block waiting for new I/O operations, so under load threads don't get blocked too long, giving other requests the chance to execute. You can enable a similar algorithm with Grizzly:

-Dcom.sun.grizzly.useKeepAliveAlgorithm=true

Depending on what your application is doing, you may get a nice performance improvement by enabling those properties.

Ask questions!

As I described here, I will no longer work on GlassFish by the time you read this blog, so make sure you ask your questions on the GlassFish mailing list (users@glassfish.dev.java.net), or you can always follow me on Twitter and ask your questions there!
