Writing scalable server applications in Java technology has always been difficult. Before the advent of NIO, thread-management issues made it impossible for an HTTP server to scale to thousands of users. I'm going to start blogging about Grizzly, the HTTP Connector based on NIO shipped in GlassFish.
First, the truth: this is my first blog ever, I'm Québécois and make a lot of typos (and speak with an ugly accent)… and I'm tempted to write in French… so there is still time to hit the back button of your browser!
For my first blog ever (and hopefully not the last), I will describe a new HTTP Connector based on NIO, called Grizzly, which I'm working on. Grizzly is currently the HTTP front-end for SJSAS PE 8.1 (the throttled version), and is included in the GlassFish project.
Grizzly has been designed to work on top of the Apache Tomcat Coyote HTTP Connector. The Coyote Connector is used in Tomcat 3/4/5 and has proven to be a highly performant HTTP Connector when measuring raw throughput. But like other Java-based HTTP Connectors, its scalability is limited by the number of available threads, and when keep-alive is required, it suffers from the one-thread-per-connection paradigm. Because of this, scalability is most of the time bounded by the platform's maximum number of threads. To work around this, people usually put Apache in front of Java, or use a cluster to distribute requests among multiple Java servers.
With NIO available in J2SE 1.4, I started thinking about exploring a new HTTP connector based on NIO. Grizzly started as a side project, and based on the throughput numbers I surprisingly got (I wasn't worried about scalability), I replaced Coyote with Grizzly. At the time, I didn't find any open source framework I could rely on to start my work, so I wrote my own (a couple of weeks after I started, Mike Spille blogged about EmberIO). EmberIO looked promising, but little activity seems to have occurred since then.
Another reason I started the work was that I was tired of hearing that NIO wasn't appropriate for HTTP without seeing a single real implementation. I wanted to see real stuff, not the judgments people have. If the performance hadn't been there, I would never have blogged about Grizzly…
Grizzly differs from Coyote in two areas. First, Grizzly allows the pluggability of any kind of thread pool (three are currently available in the workspace). I will not give details now, but will blog later about the results I'm getting with those thread pools, and why you may want to change the default one based on what you are doing with GlassFish. Second, Grizzly supports two modes: traditional IO and non-blocking IO. Why? The main reason is SSL. At the time I started, non-blocking SSL wasn't available, so I needed to use Coyote's SSL implementation (which is quite good). I should soon add support for non-blocking SSL.
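To make the pluggable-pool idea concrete, here is a minimal sketch. It is not Grizzly's actual API (the Dispatcher name and methods are mine, and Grizzly ships its own pool implementations); it just uses java.util.concurrent to show how a connector can treat its thread pool as a strategy swapped at startup without touching the dispatch logic:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class PluggablePoolDemo {
    // The connector only sees this abstraction, never a concrete pool.
    interface Dispatcher { void dispatch(Runnable task); }

    // Wrap any ExecutorService as a Dispatcher; swapping pools is one line.
    static Dispatcher withPool(final ExecutorService pool) {
        return new Dispatcher() {
            public void dispatch(Runnable task) { pool.execute(task); }
        };
    }

    // Push n no-op "requests" through whichever pool was plugged in
    // and report how many completed.
    public static int runTasks(Dispatcher d, int n) throws InterruptedException {
        final CountDownLatch done = new CountDownLatch(n);
        final AtomicInteger count = new AtomicInteger();
        for (int i = 0; i < n; i++) {
            d.dispatch(new Runnable() {
                public void run() {
                    count.incrementAndGet();
                    done.countDown();
                }
            });
        }
        done.await(); // wait for all tasks to finish
        return count.get();
    }

    public static int demo(int n) throws InterruptedException {
        // Could just as easily be newCachedThreadPool() or a custom pool.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try { return runTasks(withPool(pool), n); }
        finally { pool.shutdown(); }
    }
}
```

The point is that the code calling `dispatch()` never changes when the pool strategy does, which is the property that lets you benchmark different pools against each other.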
OK, enough words. Let me try to explain Grizzly's architecture. The problem with non-blocking NIO and HTTP is that you never know when all the bytes have been read (this is how non-blocking works). With blocking IO, you just wait on the socket's input stream. With non-blocking IO, you cannot wait on the socket's input stream, because a read returns as soon as you have consumed all the currently available data. But that doesn't mean you have received all the request bytes. Technically, you have no guarantee that all the bytes have been read from the socket channel:
count = socketChannel.read(byteBuffer);
This may not read the entire stream, and may require several extra read operations. Unfortunately, this situation occurs frequently when reading HTTP requests. I've explored several strategies (which I will detail in my next blog) and kept the most performant one, which consists of a state machine used to parse the Content-Length header and predict the end of the stream.
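To illustrate the idea, here is a deliberately simplified sketch, not Grizzly's actual parser: it ignores chunked encoding, assumes ASCII headers, and re-scans the buffer instead of keeping state between reads. After each partial read, you can only declare the request complete once the header terminator has been seen and Content-Length body bytes have been buffered:

```java
public class RequestBoundary {
    // Returns true once buf[0..len) holds a full request: headers
    // terminated by CRLFCRLF plus Content-Length body bytes.
    // Simplification: no chunked transfer-encoding, ASCII headers only.
    public static boolean isComplete(byte[] buf, int len) {
        String s = new String(buf, 0, len);
        int headerEnd = s.indexOf("\r\n\r\n");
        if (headerEnd < 0) {
            return false; // headers themselves are not fully read yet
        }
        int bodyStart = headerEnd + 4;
        int contentLength = 0; // no header means no body expected
        for (String line : s.substring(0, headerEnd).split("\r\n")) {
            if (line.toLowerCase().startsWith("content-length:")) {
                contentLength = Integer.parseInt(line.substring(15).trim());
            }
        }
        // Complete only when the whole declared body has arrived.
        return len - bodyStart >= contentLength;
    }
}
```

A real connector would keep the parse position between invocations of `read()` rather than re-scanning, which is exactly why a state machine pays off.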
Having to support several strategies and several IO modes, I designed Grizzly in such a way that it is easy to integrate a new strategy into the framework. A task-based architecture is used, with each task representing an operation performed while handling the request:
+ AcceptTask for managing OP_ACCEPT
+ ReadTask for managing OP_READ
+ ReadBlockingTask for managing blocking IO operations.
+ ProcessorTask for parsing/processing the byte stream.
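The hand-off between tasks can be sketched like this (the names are only approximations of the ones above, and the bodies are stand-ins; this is the shape of the idea, not Grizzly's code). Each stage does its piece of work and then delegates to the next stage, which in the real connector could run on a different thread pool:

```java
import java.util.ArrayList;
import java.util.List;

public class TaskChainDemo {
    // Each stage of request handling implements this.
    interface Task { void doTask(List<String> trace); }

    static class ReadTask implements Task {
        private final Task next;
        ReadTask(Task next) { this.next = next; }
        public void doTask(List<String> trace) {
            trace.add("read");   // pretend the request bytes were read here
            next.doTask(trace);  // hand off to the processing stage
        }
    }

    static class ProcessorTask implements Task {
        public void doTask(List<String> trace) {
            trace.add("process"); // pretend the request was parsed/serviced
        }
    }

    // Drive one "request" through the chain and return the stages it hit.
    public static List<String> runOnce() {
        List<String> trace = new ArrayList<String>();
        new ReadTask(new ProcessorTask()).doTask(trace);
        return trace;
    }
}
```

Because each stage only knows the Task interface, swapping in a new read strategy means swapping one link in the chain.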
Every task can execute on its own thread pool or use a shared one. By default, one thread handles OP_ACCEPT and the Selector.select() operation, and a thread pool handles all other operations. This is how I'm getting the best performance numbers when running internal benchmarks. Designed to measure scalability and throughput simultaneously, our benchmarks look at how many concurrent clients we can support, with the following criteria:
+ Avg. client think time is 8 seconds
+ 90% response time is 3 seconds
+ Error rate < .1%
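The "one selector thread plus a worker pool" split described above can be sketched with plain NIO (again, this is not Grizzly's code; it connects to itself on an ephemeral port just so the accept loop has something to do, and the worker merely closes the connection):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SelectorLoopDemo {
    // One thread (here, the caller) owns select() and OP_ACCEPT;
    // accepted connections are handed to the worker pool.
    public static boolean acceptOne() throws IOException, InterruptedException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.configureBlocking(false);
        server.socket().bind(new InetSocketAddress(0)); // ephemeral port
        server.register(selector, SelectionKey.OP_ACCEPT);

        ExecutorService workers = Executors.newFixedThreadPool(2);
        // Client side: connect to ourselves so OP_ACCEPT fires.
        SocketChannel client = SocketChannel.open(
            new InetSocketAddress("127.0.0.1", server.socket().getLocalPort()));

        boolean accepted = false;
        while (!accepted && selector.select(2000) > 0) {
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    final SocketChannel ch =
                        ((ServerSocketChannel) key.channel()).accept();
                    if (ch == null) continue; // non-blocking accept may be empty
                    // Everything past accept runs on the pool, not this thread.
                    Future<?> f = workers.submit(new Runnable() {
                        public void run() {
                            try { ch.close(); } catch (IOException ignored) {}
                        }
                    });
                    try { f.get(); } catch (ExecutionException ignored) {}
                    accepted = true;
                }
            }
        }
        client.close();
        server.close();
        selector.close();
        workers.shutdown();
        return accepted;
    }
}
```

Keeping select() on a single thread avoids contention on the Selector, while the pool absorbs the per-request work; that division is what the benchmark numbers above are exercising.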
Plug: But I will not discuss the numbers here, because I'm keeping them for my JavaOne session on June 28 🙂
Well, that’s it for today. Next time I will give more details about the strategies I have explored (with code, not words, so fewer typos) and the characteristics of each thread pool available in Grizzly. And sorry if it was boring… at least I’m exercising my English writing skills 🙂
P.S. BTW, even if I use “I”, I’ve developed Grizzly with Charlie Hunt (NetBeans team), Scott Oaks (Performance team) and Harold Carr (Corba team).
P.P.S Grizzly is far from perfect, so patches and strategies are more than welcome!