Monday, August 8, 2011

Linux guest on VMware problem solved...

So we've had this intermittent issue over the last year+ where sometimes, when we add memory to a Linux VM in our VMware environment, the box acts really weird with the new memory. It looks like there's a memory leak or something, because as soon as we start applications and begin using the box, we run out of memory, it slows down as it starts swapping, etc. I've dug into it before, but could never identify where the memory was going. I could tell that there was no more available memory, but even once we stopped the applications, we wouldn't have as much free memory as we should have ("free" here including RAM currently being used for fs cache/buffers). I think the ultimate solution was usually to rebuild the box =( This also sometimes happened when building a new VM cloned from a different one. I always suspected it was something in VMware, maybe a memory leak, maybe something in the memory-add and cloning process, but I could never nail it down.
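To be concrete about what "free" means in that last bit, here is a minimal sketch in Java, assuming the standard /proc/meminfo fields MemFree, Buffers and Cached. The MemInfo class and the sample numbers are made up for illustration; they are not taken from the affected box.

```java
import java.util.HashMap;
import java.util.Map;

final class MemInfo {
    /** Parses lines of the form "Key:   12345 kB" into a map of key -> value in kB. */
    static Map<String, Long> parse(String meminfo) {
        Map<String, Long> values = new HashMap<>();
        for (String line : meminfo.split("\n")) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length >= 2 && parts[0].endsWith(":")) {
                String key = parts[0].substring(0, parts[0].length() - 1);
                values.put(key, Long.parseLong(parts[1]));
            }
        }
        return values;
    }

    /** "Free" in the sense used above: truly free RAM plus fs buffers and page cache. */
    static long effectiveFreeKb(Map<String, Long> m) {
        return m.getOrDefault("MemFree", 0L)
             + m.getOrDefault("Buffers", 0L)
             + m.getOrDefault("Cached", 0L);
    }

    public static void main(String[] args) {
        // Hypothetical sample; on a real guest, read the contents of /proc/meminfo instead.
        String sample = "MemTotal: 6291456 kB\n"
                      + "MemFree:   102400 kB\n"
                      + "Buffers:    20480 kB\n"
                      + "Cached:    204800 kB\n";
        Map<String, Long> m = parse(sample);
        System.out.println("effective free kB: " + effectiveFreeKb(m)); // prints 327680
    }
}
```

If that number stays far below the configured RAM even with the applications stopped, you are looking at the same symptom described above.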

So it happened again recently: we added memory and the box started acting up, and additional reboots resulted in the same issue. I jumped on the box right away and actually saw that the vmmemctl process in the guest was one of the top processes by CPU usage. I remember that last time this happened I got down the path of suspecting it was something with vmmemctl and/or memory ballooning, but not knowing VMware I didn't put two and two together and stopped digging before things clicked for me. This time, after looking at what was happening on the box and doing some quick googling, I found this forum post: http://communities.vmware.com/thread/133290. I then went into the vSphere client and checked that guest's memory limit... sure enough, the limit was set to 1 GB of RAM even though there was 6 GB of RAM allocated to the guest!! I had the sysadmins change it to unlimited, but the box didn't start responding right away, so we did one more reboot... and problem solved.

Looking back, it all makes sense now. I talked to the sysadmin who is most familiar with these VMware controls (not the one I had worked with on this in the past), and I now see how adding memory and cloning could definitely cause this misconfiguration to suddenly become a problem. I'm not sure how the limit got set in the first place, since it's not something our sysadmins use... we are wondering if upgrading the virtual hardware or VMware Tools sets it sometimes??

The other thing that still bugs me is how vmmemctl "uses" memory when vmware is doing memory ballooning or enforcing memory limits to make the guest think there's no more available memory... Did I miss something as far as detecting that's the process that was using memory? Or is there no way from the guest point of view to be able to tell that ballooning/limit enforcement is kicking in at the time?

Saturday, July 23, 2011

Using a HornetQ bridge with JMS order by group

Left this as is for reference, but it doesn't work as designed. See bottom of the post for how to actually get this to work.
Some of our jms processes (several message producers sending to HornetQ jms bridges that push messages to Jboss Messaging queues living in Jboss EAP 5) use message groups for controlling when messages get processed. In this particular environment there are a couple extra steps to make sure it works correctly:

1. Add a new connection factory to the jboss instance where JBM is hosted - edit file deploy/messaging/connection-factories-service.xml and add:

<mbean code="org.jboss.jms.server.connectionfactory.ConnectionFactory"
       name="jboss.messaging.connectionfactory:service=GroupedConnectionFactory"
       xmbean-dd="xmdesc/ConnectionFactory-xmbean.xml">
    <depends optional-attribute-name="ServerPeer">jboss.messaging:service=ServerPeer</depends>
    <depends optional-attribute-name="Connector">jboss.messaging:service=Connector,transport=bisocket</depends>
    <depends>jboss.messaging:service=PostOffice</depends>

    <attribute name="JNDIBindings">
        <bindings>
            <binding>/GroupedConnectionFactory</binding>
            <binding>/GroupedXAConnectionFactory</binding>
            <binding>java:/GroupedConnectionFactory</binding>
            <binding>java:/GroupedXAConnectionFactory</binding>
        </bindings>
    </attribute>

    <attribute name="EnableOrderingGroup">true</attribute>
</mbean>

You could, of course, edit the existing ConnectionFactory and add the EnableOrderingGroup attribute if you wanted to enable ordering groups on your existing CF, rather than adding a new one and leaving the existing one alone.

2. Change hornetq config (this assumes you already have the other hornetq config done) - edit hornetq-beans.xml, change the target connection factory definition to look like this, making sure you specify the JNDI name of the CF you created in step 1 (GroupedXAConnectionFactory in this case)


<!-- TargetCFF describes the ConnectionFactory used to connect to the target destination -->
<bean name="TargetCFF" class="org.hornetq.jms.bridge.impl.JNDIConnectionFactoryFactory">
    <constructor>
        <parameter>
            <inject bean="TargetJNDI" />
        </parameter>
        <parameter>GroupedXAConnectionFactory</parameter>
    </constructor>
</bean>


3. When creating the message to send, make sure you do this:
msg.setStringProperty("JMSXGroupID", myGroupID);

Depending on how you have them configured, you will probably have to restart hornetq and jboss for the changes to take effect. Once you complete all these steps, you will be able to send messages through the bridge to the jbm queue and still have the messages grouped/ordered correctly.

Update: This does NOT work since Jboss Messaging doesn't seem to follow the jms spec for message groups. What happens is all the messages that go through this ConnectionFactory end up getting the same group id. I will update the post with how to actually accomplish this.

How to actually do this
1. I think it would work if both messaging servers were JBM, but I'm not sure about the bridging capabilities of jbm, so it may not work for a different reason
2. Have something pull the messages off the hornetq queue where that message grouping works, and submit them to the jbm queues using their setMessageGroup method. This means a lot of extra work and the bridge concept becomes way more complicated, even if it would work.
3. Switch the message provider of the destination (jboss in our case) to hornetq, meaning that both the bridge and the destination have the same message grouping semantics.

We ended up doing #3, and after getting hornetq to work in jboss eap 5 (make sure you read the redhat docs on how to configure this!) it has worked flawlessly.
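For anyone unfamiliar with what message grouping is supposed to guarantee, here is a toy model in plain Java (no JMS or HornetQ classes involved). It assumes the usual JMSXGroupID semantics: every message carrying the same group ID is pinned to one consumer, so messages within a group are processed in send order. The GroupRouter class, the round-robin pinning, and the sample group IDs are all hypothetical, for illustration only; this is not how HornetQ implements it internally.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupRouter {
    private final int consumerCount;
    private final Map<String, Integer> groupToConsumer = new LinkedHashMap<>();
    private int next = 0;

    GroupRouter(int consumerCount) {
        this.consumerCount = consumerCount;
    }

    /** Returns the consumer index for a group ID, pinning unseen groups round-robin. */
    int consumerFor(String groupId) {
        Integer pinned = groupToConsumer.get(groupId);
        if (pinned == null) {
            pinned = next;
            next = (next + 1) % consumerCount;
            groupToConsumer.put(groupId, pinned);
        }
        return pinned;
    }

    /** Routes {groupId, body} pairs into per-consumer lists, preserving send order. */
    static List<List<String>> route(int consumers, String[][] messages) {
        GroupRouter router = new GroupRouter(consumers);
        List<List<String>> delivered = new ArrayList<>();
        for (int i = 0; i < consumers; i++) {
            delivered.add(new ArrayList<>());
        }
        for (String[] m : messages) {
            delivered.get(router.consumerFor(m[0])).add(m[1]);
        }
        return delivered;
    }

    public static void main(String[] args) {
        String[][] msgs = {
            {"binA", "A1"}, {"binB", "B1"}, {"binA", "A2"}, {"binB", "B2"}, {"binA", "A3"}
        };
        System.out.println(route(2, msgs)); // prints [[A1, A2, A3], [B1, B2]]
    }
}
```

The failure mode from the update above corresponds to every message arriving with the same group ID: in this model they would all land on consumer 0, and the second consumer would sit idle.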

Tuesday, June 21, 2011

JBoss Drools in a warehouse management system..

So the last year I've been working on an application at work that serves as a module of our warehouse management system. The short version of the problem description is that we were automating the process by which fork lift drivers figured out what pallets needed to be lowered from the storage location up in the racks down to the pick bin (floor level) where the order selectors grab the cases when filling an order. The previous process relied upon the lift drivers to just know when a bin needed to be replenished, and if a bin was empty, the order selector had to find the lift driver and yell to them that bin XYZ needed to be replenished. Not terribly efficient...

The application we built to solve this problem runs on top of Jboss 5 and PostgreSQL 8.4, uses JPA, Jboss Drools, EJB, and JMS. Our first version actually used JBPM as well, but we went away from that as it wasn't a good fit for what we needed. We also started with GWT on the front end but ditched that a few months into the project in favor of Grails + Jquery.

To make all this happen we had to integrate with our legacy COBOL homegrown WMS and several different pieces of the voice picking system used by the order selectors. Obviously all the data is coming from the COBOL wms system, but we added hooks into the voice picking system so the selectors could both "call in" a replenishment if the bin was empty, and hear when the bin they called in was replenished.

The early versions made decisions of what task to give a replenisher (fork lift driver) based almost completely on the number of cases in the bin. The only other variables were what region he was authorized to work in and his last known location, which we used to try to give him a replenishment task to do that was close to him. We have added more and more data to those calculations, which have become pretty significant.

The calculations of what the priority of the replenishment tasks are, and the calculations for which of the tasks someone should actually do are all done using Drools. I had never used a rules engine before this project, and it took a little time to get the hang of it, but I have to say now, I really enjoy using Drools! Our last major change included shifting the priority calculation from stateless rules sessions to a stateful rules session, which brought on another whole set of challenges. The task priorities are now set using not just the number of cases in the bin, but also all the various kinds of demand information in the system and how far along they are in their life-cycle. By demand information I mean orders that are being worked on currently by selectors, orders that haven't been processed yet, etc. The task dispensing logic is still done using stateless rules sessions, but includes logic for special situations, what to do in the event of a tie between the score of two tasks, etc in addition to the proximity logic.
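To give a rough feel for the dispensing step, here is a minimal sketch in plain Java rather than Drools. It assumes each task already has a priority score, keeps only tasks in the region the driver is authorized for, picks the highest score, and breaks ties by distance from the driver's last known location. The Task fields, the numbers, and the nearest-wins tie-break are assumptions made for illustration; the real logic lives in rules and handles more cases.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class TaskPicker {
    /** Hypothetical replenishment task; field names are illustrative only. */
    static final class Task {
        final String bin;
        final String region;
        final int score;     // higher means more urgent
        final int distance;  // distance from the driver's last known location

        Task(String bin, String region, int score, int distance) {
            this.bin = bin;
            this.region = region;
            this.score = score;
            this.distance = distance;
        }
    }

    /** Picks the highest-scoring task in the authorized region; ties go to the nearest task. */
    static Optional<Task> pick(List<Task> tasks, String authorizedRegion) {
        Comparator<Task> bestFirst = Comparator
                .comparingInt((Task t) -> t.score).reversed()
                .thenComparingInt(t -> t.distance);
        return tasks.stream()
                .filter(t -> t.region.equals(authorizedRegion))
                .min(bestFirst);
    }

    public static void main(String[] args) {
        List<Task> tasks = Arrays.asList(
                new Task("A1", "R1", 50, 30),
                new Task("B2", "R1", 50, 10),  // same score as A1, but closer
                new Task("C3", "R2", 90, 5));  // highest score, but a different region
        System.out.println(pick(tasks, "R1").map(t -> t.bin).orElse("none")); // prints B2
    }
}
```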

I could go on and on about lessons learned along the way, which I plan on doing in upcoming posts... There are things with jvm tuning, jms features, drools features, etc. that I think would be useful for others developing applications.

Friday, February 11, 2011

JVM GC Algorithms

Not really much unique content today... but if you're a java person here are a couple of very helpful links. I've often tried to figure out exactly which gc algorithms are being used, how that corresponds to which jvm params you use, and if certain young/tenured garbage collectors have to be used together. I've known for a long time that jconsole will show you which ones are being used, but the names it uses don't always seem to correlate to other documentation, jvm params, etc.

So I got on this hunt the other day of figuring out the specifics and found these two resources which in my mind, spell out all the details!

A Sun engineer's blog post
Neo4j tuning information - see the table a couple paragraphs down from the section title.
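If you'd rather not open jconsole, the collector names can also be read from inside the JVM through the standard java.lang.management API. A minimal sketch follows; the GcNames class is just a name picked for this example. The names printed depend on the JVM version and flags (with the parallel collector on HotSpot you would typically see "PS Scavenge" and "PS MarkSweep"), so no fixed output is shown.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcNames {
    /** Returns the names of the garbage collectors active in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Print each collector together with the memory pools it manages,
        // which helps tell the young-generation collector from the tenured one.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + " -> " + String.join(", ", gc.getMemoryPoolNames()));
        }
    }
}
```

Running it with different GC flags is a quick way to see which young/tenured pairing a given set of params actually produces.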

Tuesday, February 8, 2011

Fun with HP-UX and glance

Wow it's been a while since I posted... A lot of fun things going on though and I've found plenty of random useful pieces of technical info along the way that I've neglected to share. Here's a small tidbit of info I learned yesterday.

We have had a couple hard to diagnose problems with our primary HP-UX 11.11 server at work recently, so I've been trying to get more acclimated to hp-ux's diagnostic tools. Makes me really miss the plethora of options in linux... Anyway yesterday's tool of the day was gpm, the graphical version of Glance. Our hp-ux admin mentioned I should check it out, and that he had it running fine from his Fedora 14 laptop using ssh + x forwarding. So I tried to fire it up as well (on my F14 laptop) and was greeted with an error something like this:
***** FATAL ERROR *******
Module: ../../../present/cw/ddx.c Line: 2919
Message: FontVal called on unallocated fontset
***** FATAL ERROR *******

I don't remember the exact line (that was just one of the hits I found on google) but that was the basic message. Some of the hits mentioned having to mess with a font server on hp-ux but I was able to start gpm from my colleague's laptop using my login to the hp-ux server, so I was sure it was something on my machine.

A quick query of the installed rpms whose package name included "font" showed my laptop with 70-some packages and his with 200-some packages... I certainly didn't want to wade through all of them... but I took a quick look, then fired up the graphical packagekit frontend to install some of the font packages that sounded applicable.

I installed the following packages:
iso8859-2-fonts-common-1.0-24.fc14.noarch
gnu-free-fonts-common-20100919-1.fc14.noarch
gnu-free-serif-fonts-20100919-1.fc14.noarch
gnu-free-mono-fonts-20100919-1.fc14.noarch
gnu-free-sans-fonts-20100919-1.fc14.noarch
gnu-free-fonts-compat-20100919-1.fc14.noarch
iso8859-2-100dpi-fonts-1.0-24.fc14.noarch
iso8859-2-misc-fonts-1.0-24.fc14.noarch
iso8859-2-75dpi-fonts-1.0-24.fc14.noarch
xorg-x11-fonts-ISO8859-1-75dpi-7.2-12.fc14.noarch
xorg-x11-fonts-ISO8859-15-100dpi-7.2-12.fc14.noarch
xorg-x11-fonts-ISO8859-14-100dpi-7.2-12.fc14.noarch
xorg-x11-fonts-ISO8859-2-100dpi-7.2-12.fc14.noarch
xorg-x11-fonts-ISO8859-9-100dpi-7.2-12.fc14.noarch
xorg-x11-fonts-ISO8859-9-75dpi-7.2-12.fc14.noarch
xorg-x11-fonts-ISO8859-15-75dpi-7.2-12.fc14.noarch
xorg-x11-fonts-ISO8859-14-75dpi-7.2-12.fc14.noarch
xorg-x11-fonts-ISO8859-2-75dpi-7.2-12.fc14.noarch
ttmkfdir-3.0.9-32.fc12.i686
xorg-x11-fonts-Type1-7.2-12.fc14.noarch

That did the trick! Maybe some rainy day I will figure out just which one(s) are really needed, but I'm all set for now.

Update: Primoz tracked down the two packages actually needed instead of installing every single one of the ones I did:
xorg-x11-fonts-ISO8859-1-100dpi-7.2-12.fc14.noarch
xorg-x11-fonts-ISO8859-1-75dpi-7.2-12.fc14.noarch