Archive for July, 2010

Friday’s Trick #3: Preventing memory leak with websocket/comet applications

Both WebSocket and Comet applications can easily crash your web server if the resources associated with upgraded or suspended connections aren’t managed adequately. Today I will explain two pitfalls to avoid when writing asynchronous applications, independently of the transport used (WebSocket, Comet, or both).

An asynchronous application can crash with an out of memory (OOM) error under the following conditions:

  • Too many suspended connections: for every suspended connection (or upgraded connection, for WebSocket), the web server holds some resources such as byte arrays and buffers. If you suspend or upgrade too many connections, you can easily run into an OOM, because the garbage collector cannot reclaim those resources while the connections stay open.
  • Disconnected connections: web servers that don’t support Comet natively, like Tomcat 5.5 or all Jetty versions (unfortunately!), don’t detect when a connection gets closed, either by a browser or a proxy. In that case, the resources associated with those connections will never be reclaimed by the garbage collector.

One point to note here is that most if not all Comet APIs support a timeout when suspending a connection:

@Suspend(30, TimeUnit.SECONDS);

The above means the connection will be suspended for 30 seconds if no activity happens, and then resumed. This is used most of the time when the browser is using the long-polling technique, e.g. if a server-side event occurs, you resume and clean up the resources associated with that connection. If no event occurs, the resources are released after 30 seconds, so the probability of an OOM is reduced. You can apply the same solution to WebSocket or HTTP streaming by suspending or upgrading for a longer period of time, like:

@Suspend(60, TimeUnit.MINUTES);

Of course, setting a higher timeout increases the OOM probability, so you need to be extra careful when setting that value. One solution is to make sure you aren’t suspending too many connections per server by clustering your application and distributing the load amongst your nodes. Another solution is to monitor the number of suspended connections and resume them using some policy (like FIFO) when a threshold is reached. For example, with Atmosphere, you configure the policy by just doing:

broadcaster.setSuspendPolicy(threshold, POLICY.RESUME); // or POLICY.REJECT

Once the threshold is reached, Atmosphere will either start resuming connections or indicate that the limit has been reached, depending on the policy. If you aren’t using Atmosphere, you must implement a similar mechanism yourself, which can be painful. 🙂
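If you do have to roll your own, here is a minimal sketch of the idea in plain Java. It is not Atmosphere code: the class name, the String connection ids and the method names are all made up for illustration, and a real implementation would hold the actual connection objects and resume them instead of just recording their ids.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, not part of Atmosphere: caps the number of suspended
// connections and applies either a RESUME (FIFO) or a REJECT policy at the limit.
class SuspendedConnectionLimiter {
    enum Policy { RESUME, REJECT }

    private final int threshold;
    private final Policy policy;
    private final Deque<String> suspended = new ArrayDeque<>(); // oldest first
    private final List<String> resumed = new ArrayList<>();

    SuspendedConnectionLimiter(int threshold, Policy policy) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        this.threshold = threshold;
        this.policy = policy;
    }

    // Returns true if the connection was suspended, false if it was rejected.
    synchronized boolean suspend(String connectionId) {
        if (suspended.size() >= threshold) {
            if (policy == Policy.REJECT) {
                return false;
            }
            // RESUME policy: release the oldest connection to make room.
            resumed.add(suspended.pollFirst());
        }
        suspended.addLast(connectionId);
        return true;
    }

    synchronized int suspendedCount() {
        return suspended.size();
    }

    // Connections that were resumed early, in the order they were released.
    synchronized List<String> resumedSoFar() {
        return new ArrayList<>(resumed);
    }
}
```

With a threshold of 2 and the RESUME policy, suspending a third connection releases the first one, so the number of suspended connections never exceeds the threshold.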

The second issue, which I consider more serious, is when the web server isn’t able to detect that a connection got closed by the browser or a proxy. This can happen if you are using Jetty (all versions) or a web server that doesn’t support Comet natively. The effect is extremely bad, as the resources of all those suspended connections will be locked in memory forever and never reclaimed by the garbage collector. You can clean some of them up by using the @Suspend timeout, but that complicates your application logic, e.g. was it a timeout or a disconnection? Worse, for WebSocket and HTTP streaming, which usually never time out (or time out after a long period), you are at high risk of an OOM. With Atmosphere, all you need to do is tell the framework to clean up idle resources for you after a certain delay:

<init-param>
    <param-name>org.atmosphere.cpr.CometSupport.maxInactiveActivity</param-name>
    <param-value>30000</param-value>
</init-param>

Using that mechanism, you are guaranteed that if the web server doesn’t detect the closed connection, Atmosphere will emulate the detection and tell your application that a connection has been closed (not resumed, which usually isn’t handled by the same logic).
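To give an idea of what such a cleanup involves, here is a rough sketch in plain Java. It is an illustration of the technique, not how Atmosphere implements it: the class, its method names and the String connection ids are assumptions, and the current time is passed in explicitly to keep the example easy to follow.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not Atmosphere's implementation: tracks the last
// activity time per connection and evicts the ones that stayed idle too long.
class IdleConnectionReaper {
    private final long maxInactiveMillis;
    private final Map<String, Long> lastActivity = new ConcurrentHashMap<>();

    IdleConnectionReaper(long maxInactiveMillis) {
        this.maxInactiveMillis = maxInactiveMillis;
    }

    // Call whenever a connection is suspended or data is written to it.
    void touch(String connectionId, long nowMillis) {
        lastActivity.put(connectionId, nowMillis);
    }

    // Call when the server does report a close, so the entry is dropped early.
    void closed(String connectionId) {
        lastActivity.remove(connectionId);
    }

    // Removes and returns every connection idle for longer than the limit.
    // The caller should treat each one as disconnected and release its resources.
    List<String> reap(long nowMillis) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastActivity.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > maxInactiveMillis) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        return expired;
    }

    int tracked() {
        return lastActivity.size();
    }
}
```

In a real application you would probably call reap(System.currentTimeMillis()) periodically from a scheduled task and notify the application with a "closed" event, not a "resumed" one, for every id it returns.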

For any questions or to download Atmosphere Client and Server Framework, go to our main site and use our Nabble forum, or follow the team or me and tweet your questions there! You can also check out the code on Github.

Categories: Atmosphere, Comet, Websocket

Friday’s Trick #2: Websocket/Comet survival guide to Proxy, Firewall and Network Outage

Independently of the transport you are using (WebSocket, Comet, or both), a connection can always be closed by a proxy or firewall, or by an unexpected network outage. Why is that a problem? When a disconnection happens, you may lose server-side events if you don’t architect your application correctly. This week I will describe how you can avoid losing server-side events using Atmosphere.

You can lose server-side events under the following conditions:

  • long-polling: between reconnections, server-side events may happen, and if they aren’t persisted, those events will never reach your client.
  • websocket: WebSockets are new, and most if not all firewalls will close them after some idle period. Again, all server-side events sent in the meantime will be lost.
  • http-streaming: Some proxies really don’t like the http-streaming technique and will close the connection right away. Again, you may lose server-side events.
  • Unexpected network outage: the connection can also be closed by something between your browser and the server.

For some applications, like chat, it may not be an issue if you lose some server-side events, but for the majority of them it is critical to never lose any server-side event.

BroadcasterCache

The problem is easily solved with Atmosphere, either by using one of the built-in caches or by implementing your own BroadcasterCache, which is as simple as:

void addToCache(AtmosphereResource<V, W> r, Object serverSideEvent);

List<Object> retrieveFromCache(AtmosphereResource<V, W> r);

You can easily implement that interface and persist any server-side events using memcached, your favorite database, etc. As soon as you tell Atmosphere which BroadcasterCache to use, the magic will happen. If you aren’t familiar with Atmosphere, the AtmosphereResource represents a request from which you can decide to suspend, resume or broadcast server-side events. If you implement your own cache, you can persist events using this object as a key, or retrieve some information from it in order to construct the list of missed server-side events.
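As an illustration, here is a minimal in-memory sketch of that two-method contract. To keep it compilable without Atmosphere on the classpath, it does not implement the real BroadcasterCache interface: a plain String client id stands in for the AtmosphereResource key, which is an assumption of this example. A real implementation would derive such a key from the AtmosphereResource and would most likely persist to an external store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified stand-in for the two-method contract shown above. A String client
// id replaces AtmosphereResource so the sketch compiles without Atmosphere.
class InMemoryEventCache {
    private final Map<String, List<Object>> pending = new ConcurrentHashMap<>();

    // Remember an event that could not be delivered to this client.
    void addToCache(String clientId, Object serverSideEvent) {
        pending.compute(clientId, (id, events) -> {
            List<Object> updated = (events == null) ? new ArrayList<Object>() : events;
            updated.add(serverSideEvent);
            return updated;
        });
    }

    // Hand back the missed events in arrival order and forget them.
    List<Object> retrieveFromCache(String clientId) {
        List<Object> events = pending.remove(clientId);
        return (events == null) ? new ArrayList<Object>() : events;
    }
}
```

Removing the entries on retrieval matters here: a cache that only ever grows would bring back exactly the kind of memory leak described in Trick #3.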

Atmosphere ships with two built-in BroadcasterCache implementations:

  • HeaderBroadcasterCache: uses a special header to tell the server when you last received a server-side event.
  • SessionBroadcasterCache: uses the HttpSession/Cookie to tell the server when you last received a server-side event.

The HeaderBroadcasterCache uses the following header:

X-Cache-Date: Fri Jul 02 2010 10:34:21 GMT-0400 (EST).

So if a disconnection happens, independently of the transport used, you are guaranteed to retrieve the server-side events that occurred while you were disconnected. Hence your application is shielded from losing data. By default, the Atmosphere JQuery Plugin always adds the X-Cache-Date header, so all you need to do is configure your BroadcasterCache in either web.xml, atmosphere.xml or programmatically:

<init-param>
    <param-name>org.atmosphere.cpr.broadcasterCacheClass</param-name>
    <param-value>org.atmosphere.cache.HeaderBroadcasterCache</param-value>
</init-param>

That’s it!
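If you are curious how a date-based replay like the X-Cache-Date one can work in principle, here is a small sketch. It is not the HeaderBroadcasterCache source: the class and method names are hypothetical, and timestamps are plain milliseconds where the real cache works with a parsed header value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of date-based replay, not Atmosphere's HeaderBroadcasterCache:
// every broadcast is stored with a timestamp, and a reconnecting client asks for
// everything newer than the time it last reported.
class TimestampedEventLog {
    private static final class Entry {
        final long timeMillis;
        final Object event;

        Entry(long timeMillis, Object event) {
            this.timeMillis = timeMillis;
            this.event = event;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    synchronized void record(long timeMillis, Object event) {
        entries.add(new Entry(timeMillis, event));
    }

    // Events strictly newer than the client's last-seen time, in recording order.
    synchronized List<Object> eventsSince(long lastSeenMillis) {
        List<Object> missed = new ArrayList<>();
        for (Entry entry : entries) {
            if (entry.timeMillis > lastSeenMillis) {
                missed.add(entry.event);
            }
        }
        return missed;
    }

    // Drops entries older than the cutoff so the log cannot grow without bound.
    // Returns how many entries were removed.
    synchronized int evictOlderThan(long cutoffMillis) {
        int before = entries.size();
        entries.removeIf(entry -> entry.timeMillis < cutoffMillis);
        return before - entries.size();
    }
}
```

On reconnection, the server would parse the date sent by the client, call eventsSince with it, and write the returned events before suspending the connection again.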

Categories: Atmosphere, Comet, Websocket

Leaving Ning

Today I’ve officially resigned from Ning. Ning was a wonderful place to work, but I wanted to spend more time with my three little monsters and avoid traveling to California so often. I’ve never worked in a team of skilled architects like that and I will miss all the learning I was doing every day. Thanks to all of you and I’m sure Ning will be a great success!! I hope we all keep in touch!

What am I gonna do? My quest for my next company starts this week … things may move quite fast as you all know 🙂 For sure I will take a couple of weeks off, as I left Sun on a Friday and the next Monday I was at Ning. Bad idea, but I was so thrilled to join a great place like Ning!

I will not disappear completely, as I can’t stop improving my Atmosphere Framework and supporting our growwwwwing community, so I will not be 100% on vacation. I also want to explore Akka, as this project is so interesting and the community there is just awesome, and I can be dangerous as I have commit access :-)! I will also continue to actively work on the Async Http Client…actually I will spend more time on it now!!

So, just follow me on Twitter for summer news :-).

Categories: Uncategorized