Recently I was working on a bug where Grizzly was leaking memory. From the code, I was convinced the leak wasn’t in Grizzly (of course :-))! The problem occurred when GlassFish was stressed for three days. After approximately 16 hours, some components in GlassFish weren’t able to get a file descriptor, so I suspected a memory leak somewhere in someone else’s code ;-). The test was using JDK 1.5ur7, and it was very hard to find what caused the problem:
ERROR /export1/as90pe/domains/domain1/imq/instances/imqbroker/etc/accesscontrol.properties (Too many open files)
Uh… OK, I admit I wasn’t able to debug this using strace, pfiles, etc., and I was ready to blame the IMQ team :-). Then I decided to switch to Mustang (JDK 6) to see if I could get a better error message. Thanks to Mustang, I got:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
I decided to ping Alan to make sure he wasn’t aware of any socket leak, and he recommended I try the improved jhat. I was a little scared of using this tool, and I wasn’t sure if I should use strace/truss instead… but I was very, very surprised by the new jhat in Mustang. Since then, I can’t live without it :-). What I did was:
% jmap -dump:file=heap.bin <pid>
% jhat -J-mx512m heap.bin
This started a web server, so I pointed my browser at it (jhat listens on port 7000 by default).
Wow! From that page, I was able to browse the heap and find which object was created by whom. The histogram is a very good starting point because it tells you the number of allocated objects at the time jmap was executed. From there, you can click through and see which object holds a reference to what. The instance count is also helpful because it points you to the candidates for the memory leak.
Finally, the query page is amazing. I was looking for all the currently active HTTP requests, so I ran:
select s from com.sun.enterprise.web.connector.grizzly.ReadTask s where s.byteBuffer.position > 0
and got them!!! I don’t think you can do this with any available profilers right now.
For Grizzly, the number of java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask instances was extremely high, because cancelled Futures stay in the executor’s queue, and so can’t be garbage collected, until their delay expires or you explicitly purge() them, which is something I didn’t expect (and I’m not the only one, I’m sure). Thus I decided not to use java.util.concurrent.ScheduledThreadPoolExecutor at all and instead implemented a better strategy for keeping connections alive… and now the leak has disappeared (not to mention that performance is much better!).
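To make the purge() behaviour concrete, here is a minimal sketch using only the JDK. It is not the Grizzly keep-alive code; the class name PurgeDemo and the one-hour delay are just assumptions for illustration. It schedules tasks far in the future, cancels them right away, and compares the size of the executor’s queue before and after purge():

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative only: PurgeDemo is a made-up name, not part of Grizzly.
final class PurgeDemo {

    /** Returns {queue size after cancelling, queue size after purge()}. */
    static int[] queueSizes(int tasks) {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        try {
            for (int i = 0; i < tasks; i++) {
                // Think of each task as a keep-alive timeout for one connection.
                ScheduledFuture<?> timeout =
                        executor.schedule(() -> { }, 1, TimeUnit.HOURS);
                // The connection closed early, so the timeout is cancelled...
                timeout.cancel(false);
            }
            // ...but with the default settings the cancelled tasks are still queued.
            int afterCancel = executor.getQueue().size();
            executor.purge(); // drops cancelled tasks from the work queue
            int afterPurge = executor.getQueue().size();
            return new int[] {afterCancel, afterPurge};
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        int[] sizes = queueSizes(1000);
        System.out.println("queued after cancel: " + sizes[0]
                + ", after purge: " + sizes[1]);
    }
}
```

On Java 7 and later, calling setRemoveOnCancelPolicy(true) on the executor is another way to get cancelled tasks out of the queue without calling purge() yourself; that option didn’t exist in the JDKs discussed here.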
Of course profilers like the NetBeans Profiler can always be used… but I’ve found jmap/jhat so fast and simple to use, not to mention that I really liked the Object Query Language (OQL) query page, where I can get the exact instance I’m looking for… good work, J2SE team!!
As described in part II, it is possible to extend the Grizzly HTTP engine by writing a Pipeline, Algorithm, Handler, or AsyncHandler. By default, every HTTP connection is executed synchronously, i.e. the request is parsed, the servlet executed, and the response flushed to the browser. There are situations where this model doesn’t work, e.g. when a business process is calling out to another one over a slow protocol, or if there is a workflow interruption, like the “manager approval email” case. Fortunately, it is possible in Grizzly to implement asynchronous request processing (ARP) using the AsyncHandler interface. But not only in Grizzly 😉
The next couple of paragraphs will discuss the set of interfaces available, describe the default implementation, and conclude with a Servlet which executes only when a new message is available inside a Gmail account. The Servlet can be seen as an HTTP Gmail notifier. But first, what’s the goal of supporting ARP?
The goal is to be able to build, on top of Grizzly and NIO, a scalable ARP implementation that doesn’t hold one thread per connection and comes as close as possible to the performance of synchronous request processing (SRP).
Grizzly currently exposes three interfaces that can be used to implement ARP. They are:
AsyncHandler: This interface is the main entry point. When ARP is enabled, Grizzly delegates request processing to this interface: instead of executing a Task directly, it hands the Task to the AsyncHandler implementation. This interface is mandatory.
AsyncExecutor: An implementation of this interface will usually decide when a Task needs to be interrupted, under which conditions (using AsyncFilter), and when the Task needs to be resumed. A Task is interrupted when the conditions required for its execution aren’t met. An AsyncExecutor should have one or more AsyncFilters; if no AsyncFilter is defined, requests are processed as in the SRP model. This interface isn’t mandatory, but it is recommended.
AsyncFilter: An implementation of this interface determines whether the current request meets its execution conditions, i.e. whether the request should be executed or interrupted. An AsyncFilter that allows all requests to be executed without being interrupted simulates the SRP mode.
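To see how the three roles fit together, here is a simplified model in plain Java. Be careful: the method names (conditionsMet, mustInterrupt, and so on) and the SimpleExecutor/SimpleHandler classes are my own assumptions for this sketch, not Grizzly’s real signatures, so check the actual interfaces before writing any code against them.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of the three ARP roles; NOT Grizzly's real signatures. */
final class ArpInterfacesSketch {

    /** Hypothetical stand-in for a Grizzly Task. */
    interface Task { void execute(); }

    /** Decides whether a request meets its execution conditions. */
    interface AsyncFilter { boolean conditionsMet(Task task); }

    /** Runs the filters to decide whether a Task must be interrupted. */
    interface AsyncExecutor {
        void addAsyncFilter(AsyncFilter filter);
        boolean mustInterrupt(Task task);
    }

    /** Main entry point: request processing is delegated here. */
    interface AsyncHandler { void handle(Task task); }

    static final class SimpleExecutor implements AsyncExecutor {
        private final List<AsyncFilter> filters = new ArrayList<>();

        @Override public void addAsyncFilter(AsyncFilter filter) {
            filters.add(filter);
        }

        @Override public boolean mustInterrupt(Task task) {
            for (AsyncFilter filter : filters) {
                if (!filter.conditionsMet(task)) {
                    return true; // one unmet condition is enough to interrupt
                }
            }
            return false; // no filter, or all filters passed: same as SRP
        }
    }

    static final class SimpleHandler implements AsyncHandler {
        final AsyncExecutor executor;
        final List<Task> interrupted = new ArrayList<>();

        SimpleHandler(AsyncExecutor executor) { this.executor = executor; }

        @Override public void handle(Task task) {
            if (executor.mustInterrupt(task)) {
                interrupted.add(task); // parked; resumed later by a scheduler
            } else {
                task.execute();
            }
        }
    }
}
```

With no filter registered, SimpleHandler runs every Task immediately, which mirrors the "no AsyncFilter means SRP" rule described above.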
The Default implementation
GlassFish ships with a default ARP implementation. The default implementation consists of a DefaultAsyncHandler, a DefaultAsyncExecutor, and a new Task called AsyncProcessorTask. The new AsyncProcessorTask is a wrapper around a ProcessorTask, which contains all the code for executing an HTTP request and is also used in SRP mode. In short, ARP happens as follows:
1. The request execution is delegated to the DefaultAsyncHandler
2. In DefaultAsyncHandler, the Task is wrapped with an AsyncProcessorTask
3. The AsyncProcessorTask is executed. Its execution consists of delegating the request processing to a DefaultAsyncExecutor.
4. AsyncProcessorTask will first invoke DefaultAsyncExecutor.preExecute(). preExecute() will parse the request line and its headers.
5. Second, AsyncProcessorTask will invoke DefaultAsyncExecutor.interrupt(). interrupt() will execute all defined AsyncFilter(s).
6. An AsyncFilter will validate its execution conditions. If the conditions aren’t met, the Task will be interrupted and added to a Scheduler that will re-execute it once the conditions are met. It is important to note here that the thread running the AsyncFilter is neither blocked nor interrupted; it is simply returned to the Pipeline (Grizzly’s thread pool wrapper). This is the beauty of non-blocking NIO, where the one-thread-per-socket paradigm doesn’t apply.
7. Once the AsyncFilter determines that the execution conditions are met, the Task will be removed from the interrupted queue and executed. This is usually where a Servlet or JSP is executed.
8. DefaultAsyncExecutor.postExecute() will flush the response to the client.
Note that DefaultAsyncExecutor.preExecute/interrupt/postExecute always delegate the execution to a set of ProcessorTask methods, thus requests are always executed the same way, independently of ARP or SRP.
It is important to note that the default implementation doesn’t block the thread when a Task is interrupted. The default Scheduler only has one thread, which holds all the SelectionKeys associated with all interrupted requests.
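The lifecycle above can be sketched without any Grizzly classes. In this toy model (the ArpLifecycleSketch and Request names are assumptions, and a BooleanSupplier stands in for the AsyncFilter check), handle() never blocks, and resumeReady() represents one pass of the single scheduler thread over the interrupted queue:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;
import java.util.function.BooleanSupplier;

/** Toy model of the default ARP flow; names are illustrative, not Grizzly's. */
final class ArpLifecycleSketch {

    /** A request that records the phases it went through, in order. */
    static final class Request {
        final BooleanSupplier condition; // stands in for the AsyncFilter check
        final List<String> phases = new ArrayList<>();

        Request(BooleanSupplier condition) { this.condition = condition; }
    }

    private final Queue<Request> interrupted = new ArrayDeque<>();

    /** Steps 1 to 6: parse, run the filter, then execute or park. Never blocks. */
    void handle(Request request) {
        request.phases.add("preExecute"); // parse request line and headers
        if (request.condition.getAsBoolean()) {
            finish(request);
        } else {
            request.phases.add("interrupted");
            interrupted.add(request); // the calling thread returns to the pool
        }
    }

    /** One pass of the scheduler thread over all parked requests (step 7). */
    int resumeReady() {
        int resumed = 0;
        for (Iterator<Request> it = interrupted.iterator(); it.hasNext();) {
            Request request = it.next();
            if (request.condition.getAsBoolean()) {
                it.remove();
                finish(request);
                resumed++;
            }
        }
        return resumed;
    }

    int parkedCount() { return interrupted.size(); }

    private void finish(Request request) {
        request.phases.add("execute");     // the Servlet/JSP would run here
        request.phases.add("postExecute"); // flush the response (step 8)
    }
}
```

The point of the design is that a parked request costs an entry in a queue rather than a blocked thread, so thousands of interrupted requests can wait on a single scheduler thread.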
OK that’s it for the theory. Let’s work on a real application using GlassFish.
The Gmail Servlet Notifier
Wouldn’t it be nice to have something like the Gmail Notifier, but over HTTP? Not to mention that there is no Unix version of the notifier. Of course you can always log in to your account and refresh the page, but I don’t want to refresh every minute or so. You can probably do something similar using AJAX… but I want to do it in Grizzly 🙂
What we need is a Servlet that executes only when new emails are available in a Gmail account, or in fact in any POP3 account. In this case, the execution conditions consist of connecting to the account and looking for new emails. If new emails are available, the Servlet executes and lists all new emails (or displays a flashing flag). Whatever the Servlet does, the goal here is to execute it only when its execution conditions are met.
The solution is simple. We just need to define an AsyncFilter that does the following:
1. Connect to our Gmail account
2. Look for new emails. If emails are available, execute the Task.
3. If there are no new emails, interrupt the Task (ARP), wait a couple of seconds (here 10 seconds), then re-evaluate whether new emails are in. If new emails have come in, execute the Servlet.
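The decision in steps 2 and 3 boils down to "did the message count go up since the last check?". Here is a minimal sketch of that check. To keep it runnable without JavaMail, the mailbox is reduced to an IntSupplier returning the current message count; that supplier, the NewMailCheck name, and the RETRY_DELAY_MILLIS constant are all assumptions for illustration, and a real filter would presumably ask a JavaMail Folder instead.

```java
import java.util.function.IntSupplier;

/** Sketch of the "are there new emails?" condition; names are hypothetical. */
final class NewMailCheck {

    /** Delay between two evaluations: the 10 seconds mentioned in step 3. */
    static final long RETRY_DELAY_MILLIS = 10_000L;

    private final IntSupplier messageCount; // hypothetical mailbox probe
    private int lastSeen;

    NewMailCheck(IntSupplier messageCount) {
        this.messageCount = messageCount;
        // Mail already in the mailbox when we start doesn't count as new.
        this.lastSeen = messageCount.getAsInt();
    }

    /**
     * Returns true when mail arrived since the previous check (execute the
     * Task), false otherwise (interrupt it and check again after the delay).
     */
    boolean newMailArrived() {
        int current = messageCount.getAsInt();
        boolean arrived = current > lastSeen;
        lastSeen = current; // also covers messages being deleted in between
        return arrived;
    }
}
```

A scheduler would call newMailArrived() every RETRY_DELAY_MILLIS for each interrupted request and resume the Task the first time it returns true.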
Fortunately we can use JavaMail for (1) and Grizzly’s default ARP implementation for (2) and (3), simply by adding a new AsyncFilter implementation. Then, we just have to:
1. Start Grizzly in ARP mode.
2. Create a JavaMail Session for Gmail to be used by the AsyncFilter.
3. Deploy the Gmail war file. All the account information (username, password, server URL, port, etc.) is included inside web.xml (I agree the password shouldn’t be there).
4. Use a browser to invoke the Servlet. The browser will keep the connection open until you receive a new email.
5. When new emails are detected, the Servlet will be executed and your email information (headers, text) will be displayed.
The implementation just consists of:
+ JavaMailAsyncFilter: this is the AsyncFilter that will determine if ARP or SRP is used.
+ JavaMailAsyncFilterHandler: this interface needs to be implemented by the Servlet. Via this interface, the JavaMailAsyncFilter will be able to determine if the execution conditions are met or not.
+ JavaMailAsyncFilterEvent: this object contains context information about the current execution conditions. This object is shared between the JavaMailAsyncFilter and the Servlet implementation.
+ EmailNotifierServlet: the Servlet that will be interrupted unless new emails are available.
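How these four pieces talk to each other can be sketched as follows. This is not the downloadable code: the nested Event, Handler, and NotifierServlet types and their methods are simplified stand-ins I made up for JavaMailAsyncFilterEvent, JavaMailAsyncFilterHandler, and EmailNotifierServlet, and the real classes certainly carry more information.

```java
import java.util.Optional;

/** Sketch of the filter/handler/event contract; all names are hypothetical. */
final class MailNotifierSketch {

    /** Context shared between the filter and the Servlet. */
    static final class Event {
        final int newMessages;

        Event(int newMessages) { this.newMessages = newMessages; }
    }

    /** Implemented by the Servlet so the filter can ask it what to do. */
    interface Handler {
        boolean executionConditionsMet(Event event);
    }

    /** Stand-in for the Servlet: it only wants to run when mail is waiting. */
    static final class NotifierServlet implements Handler {
        @Override public boolean executionConditionsMet(Event event) {
            return event.newMessages > 0;
        }

        String render(Event event) {
            return "You have " + event.newMessages + " new message(s)";
        }
    }

    /**
     * Stand-in for the filter: returns the page when the Servlet may run,
     * or an empty Optional when the request has to stay interrupted.
     */
    static Optional<String> filter(NotifierServlet servlet, Event event) {
        if (servlet.executionConditionsMet(event)) {
            return Optional.of(servlet.render(event));
        }
        return Optional.empty();
    }
}
```

Letting the Servlet implement the handler keeps the "when should I run?" rule next to the code that renders the page, while the filter stays generic.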
You can find the basic implementation here. Make sure you read the README.TXT, which explains how to build and deploy in GlassFish. Since ARP is not a default feature, it is a little complicated to set up (feel free to post here for help). I’ve tested with my local POP3 provider and my Gmail account and it worked fine.
When testing with Gmail, I selected Settings > Forwarding and POP. Under POP Download, I picked “Enable POP only for mail that arrives from now on”. Thus my first request to the Servlet will not meet the execution conditions, and the Servlet will be interrupted until I send an email to myself.
Of course the Gmail implementation is far from perfect and can be greatly improved (improvements are more than welcome!). My goal here was to demonstrate how you can extend Grizzly and start supporting ARP. As always, contributions are welcome.
That’s it. Next time I will try to give some NIO tips and tricks (especially when it’s time to deal with OP_WRITE).
P.S Thanks to Andrea Egloff for his useful feedback on the default ARP implementation.