Tuesday, January 3, 2012

Tracing unix/linux processes

Unix and Linux provide similar tools for doing system traces. These system traces show the interaction between user space programs and the OS kernel - i.e. the low level operations such as opening files, reading data, etc. Doing system traces can be useful when troubleshooting application and OS issues. The typical use case for these tools assumes that you know the process id of the process you want to trace. You can find the process id using commands like "ps -ef | grep something", where something is part of the signature of the process that you want to trace.

You will probably need sudo/root in order to use the tracing tools. Once you launch the tool, you use Ctrl-C to end the tracing session. Tracing is something that can be done against processes without being overly disruptive, but it will definitely slow the process down. You should always exercise caution in what you trace and when/how long you trace.

The commands that you see displayed in tusc/strace and their corresponding parameters/return values can typically be tracked down with a little time on Google. There is normally pretty decent documentation out there on the various syscalls you will see in tusc/strace.

Unix - tusc
The system tracing tool on HP-UX is called tusc. Here are some example usages:
  • tusc 1234 Does a standard trace of process id 1234 and sends the output to stdout. Most useful for getting a quick glance at what a process is doing, why it's hung, why it's using lots of processor, etc.
  • tusc -o outfile.dat 1234 Does a standard trace of process id 1234 and sends the output to a file called outfile.dat - this file can easily grow very large, so be careful when using this option. Don't run it any longer than necessary!
  • tusc -f -o outfile.dat 1234 Does a trace, redirects output to outfile.dat, and also traces any child processes of process 1234
  • tusc -c 1234 Gives a count/aggregation of the system calls the process made during your tracing session. Useful for seeing a higher level picture of where the process was spending its time.
  • man tusc Find out more information about tusc's options

Linux - strace
Strace provides similar capabilities, but the commands are slightly different. Below are the Linux commands that accomplish the same thing as the corresponding unix commands:

  • strace -p 1234 Does a standard trace of process id 1234 and sends the output to stdout. Most useful for getting a quick glance at what a process is doing, why it's hung, why it's using lots of processor, etc.
  • strace -o outfile.dat -p 1234 Does a standard trace of process id 1234 and sends the output to a file called outfile.dat - this file can easily grow very large, so be careful when using this option. Don't run it any longer than necessary!
  • strace -f -o outfile.dat -p 1234 Does a trace, redirects output to outfile.dat, and also traces any child processes of process 1234
  • strace -c -p 1234 Gives a count/aggregation of the system calls the process made during your tracing session. Useful for seeing a higher level picture of where the process was spending its time.
  • man strace Find out more information about strace's options
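The aggregation the -c flag performs is essentially just counting syscall names. As a rough illustration (a toy Python sketch, not the real tools - it counts syscall names in raw trace lines like the samples later in this post):

```python
import re
from collections import Counter

# A trace line starts with an optional pid, then the syscall name and "(".
# The sample lines below are taken from the strace output shown later in this post.
SYSCALL_RE = re.compile(r'^(?:\d+\s+)?([a-z_][a-z0-9_]*)\(')

def count_syscalls(trace_lines):
    """Aggregate raw trace output into {syscall_name: call_count}."""
    counts = Counter()
    for line in trace_lines:
        m = SYSCALL_RE.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts

sample = [
    '9447 fcntl(30, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=1838582860, len=1}) = -1 EAGAIN',
    '9447 fcntl(30, F_GETLK, {type=F_WRLCK, whence=SEEK_SET, start=1838582860, len=1, pid=28897}) = 0',
    '9447 pread(30, "...", 512, 0) = 512',
]
print(count_syscalls(sample))  # Counter({'fcntl': 2, 'pread': 1})
```

The real tools also track time spent and error counts per syscall, but the idea is the same.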

Translating a process file descriptor into the actual filename
There are times when using the tusc or strace tools that you will see a syscall that involves a file descriptor. Whenever a file gets opened by a process, the process gets a file descriptor that it uses internally to refer to that file. When doing troubleshooting, it is helpful to translate this file descriptor to the actual physical file.

Here are some sample lines from an actual tusc tracing session:

fcntl(46, F_SETLK, 0x7eff19c8) ................................................ ERR#13 EACCES
fcntl(46, F_GETLK, 0x7eff19c8) ................................................ = 0
pread(46, "10121419\001\001\0\0\001\0\0\001".., 512, 0) ....................... = 512

In this case the process is calling fcntl (file control operations) and pread (reading from a given offset in a file) with a first parameter of 46. This is the process' file descriptor for some file. To determine the physical file it corresponds to (this assumes you have some unix admin experience):

1. Open glance
2. Hit F for the Open Files report
3. It will prompt you for the process id, key in the process id you are troubleshooting
4. The list of open files in ascending file descriptor order is displayed, record the file system and inode number that corresponds to your file descriptor (FD)
5. Exit glance
6. Run this command: find /filesystem -inum 4567 replacing /filesystem and 4567 with your filesystem and inode number respectively. This will show the file that corresponds to this information
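The find-by-inode step in 6 can also be scripted. A minimal sketch of the same idea in Python (find_by_inode is a made-up helper name; like find -inum, the result is only meaningful within a single filesystem, since inode numbers are only unique per filesystem):

```python
import os

def find_by_inode(root, inum):
    """Walk a directory tree and return the paths whose inode number
    matches, roughly what `find /filesystem -inum N` does. A real
    search should stay within one mount point."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_ino == inum:
                    matches.append(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return matches

# e.g. find_by_inode('/filesystem', 4567) with the values you noted in glance
```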

Here are some sample lines from an actual strace tracing session:

9447 fcntl(30, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=1838582860, len=1}) = -1 EAGAIN (Resource temporarily unavailable)
9447 fcntl(30, F_GETLK, {type=F_WRLCK, whence=SEEK_SET, start=1838582860, len=1, pid=28897}) = 0
9447 pread(30, "\1\1\24\31\0\4\0\1\0\1\0\1\2\275\309\0\0\1\2\176\100\0\0\1\2\176\231\0\1\0\0"..., 512, 0) = 512

In this case the process is calling fcntl (file control operations) and pread (reading from a given offset in a file) with a first parameter of 30. This is the process' file descriptor for some file. To determine the physical file it corresponds to:

You must be the owner of the process or have sudo access.
Run this command: ls -ltr /proc/9447/fd/30, where 9447 is your process id and 30 is the file descriptor number you observed during tracing. It will display something like this, which tells you what file the descriptor represents:
l-wx------ 1 root root 64 Dec 13 14:38 /proc/9447/fd/30 -> /data/somefile
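On Linux this translation is easy to script, too, since it's just reading the /proc symlink. A small sketch (Linux-specific; fd_target is a made-up helper name, and the same ownership/root requirement applies as with ls):

```python
import os

def fd_target(pid, fd):
    """Resolve a process's file descriptor to the file it refers to by
    reading the /proc/<pid>/fd/<fd> symlink (Linux only)."""
    return os.readlink('/proc/%d/fd/%d' % (pid, fd))

# Demonstrate on our own process: open a file and resolve its descriptor.
with open('/dev/null') as f:
    print(fd_target(os.getpid(), f.fileno()))  # -> /dev/null
```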

Monday, August 8, 2011

Linux guest on Vmware problem solved...

So we've had this intermittent issue over the last year+ where sometimes when we add memory to a linux vm in our vmware environment, the box will act really weird with the new memory. It looks like there's a memory leak or something, because as soon as we start applications and begin using the box, we run out of memory and it becomes slow as it starts swapping, etc. I've dug into it before, but could never identify where the memory was going. I could tell that there was no more available memory, but even once we stopped the applications, we wouldn't have as much free memory as we should have (free meaning including RAM that was currently being used for fs cache/buffer). I think the ultimate solution was usually to rebuild the box =( This also happened sometimes when building a new vm that was cloned from a different one. I always suspected it was something in vmware - maybe a memory leak, maybe something in the memory add and cloning process - but I could never nail it down.

So it happened again recently: we added memory and the box started acting up, and additional reboots resulted in the same issue. I jumped on the box right away and actually saw that the vmmemctl process in the guest was one of the top processes by cpu usage. I remember last time this happened I got down the path of suspecting it was something with vmmemctl and/or memory ballooning, but not knowing vmware I didn't put two and two together and stopped digging before things clicked for me. This time, after looking at what was happening on the box and doing some quick googling, I found this forum post - http://communities.vmware.com/thread/133290 - and then went into the vsphere client and checked out that guest's limit... sure enough, the limit was set to 1GB of RAM even though 6GB of RAM was allocated to the guest!! I had the sysadmins change it to unlimited, but the box didn't start responding right away, so we did one more reboot... and problem solved.

Looking back, it all makes sense now. I talked to our sysadmin who was most familiar with these vmware controls (not the one I had worked with on this in the past) and saw how adding memory and cloning could definitely cause this misconfiguration to all of a sudden become a problem. I'm not sure how the limit got set in the first place, since it's not something our sysadmins use... we are wondering if upgrading the virtual hardware or vmware tools sets it sometimes??

The other thing that still bugs me is how vmmemctl "uses" memory when vmware is doing memory ballooning or enforcing memory limits to make the guest think there's no more available memory... Did I miss something as far as detecting that's the process that was using memory? Or is there no way from the guest point of view to be able to tell that ballooning/limit enforcement is kicking in at the time?

Saturday, July 23, 2011

Using a HornetQ bridge with JMS order by group

Left this as-is for reference, but it doesn't work as designed. See the bottom of the post for how to actually get this to work.
Some of our jms processes (several message producers sending to HornetQ jms bridges that push messages to Jboss Messaging queues living in Jboss EAP 5) use message groups for controlling when messages get processed. In this particular environment there are a couple extra steps to make sure it works correctly:

1. Add a new connection factory to the jboss instance where JBM is hosted - edit file deploy/messaging/connection-factories-service.xml and add:

<mbean code="org.jboss.jms.server.connectionfactory.ConnectionFactory"
       name="jboss.messaging.connectionfactory:service=GroupedXAConnectionFactory"
       xmbean-dd="xmdesc/ConnectionFactory-xmbean.xml">
   <depends optional-attribute-name="ServerPeer">jboss.messaging:service=ServerPeer</depends>
   <depends optional-attribute-name="Connector">jboss.messaging:service=Connector,transport=bisocket</depends>
   <attribute name="JNDIBindings">
      <bindings>
         <binding>/GroupedXAConnectionFactory</binding>
      </bindings>
   </attribute>
   <attribute name="EnableOrderingGroup">true</attribute>
</mbean>

You could, of course, edit the existing ConnectionFactory to add the EnableOrderingGroup parameter if you wanted to enable ordering groups for your existing CF, rather than adding a new one and leaving the existing one alone.

2. Change hornetq config (this assumes you already have the other hornetq config done) - edit hornetq-beans.xml, change the target connection factory definition to look like this, making sure you specify the JNDI name of the CF you created in step 1 (GroupedXAConnectionFactory in this case)

<!-- TargetCFF describes the ConnectionFactory used to connect to the target destination -->
<bean name="TargetCFF" class="org.hornetq.jms.bridge.impl.JNDIConnectionFactoryFactory">
   <constructor>
      <parameter>
         <inject bean="TargetJNDI" />
      </parameter>
      <parameter>/GroupedXAConnectionFactory</parameter>
   </constructor>
</bean>

3. When creating the message to send, make sure you do this:
msg.setStringProperty("JMSXGroupID", myGroupID);

Depending on how you have them configured, you will probably have to restart hornetq and jboss for the changes to take effect. Once you've completed all these steps, you will be able to send messages through the bridge to the jbm queue and still have the messages grouped/ordered correctly.
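For reference, the behavior that JMSXGroupID is supposed to buy you can be sketched in a few lines: every message carrying the same group id gets pinned to one consumer, in order. This is a toy model only - the real pinning happens inside the broker, and all names here are made up:

```python
class GroupedDispatcher:
    """Toy model of JMS message-group semantics: the first message seen
    for a group pins that group to a consumer, and every later message
    with the same JMSXGroupID goes to the same consumer, preserving
    ordering within the group."""

    def __init__(self, consumers):
        self.consumers = consumers  # list of consumer names
        self.pinned = {}            # group id -> consumer name
        self.next_consumer = 0

    def dispatch(self, group_id):
        if group_id not in self.pinned:
            # round-robin assignment on first sight of a group
            choice = self.consumers[self.next_consumer % len(self.consumers)]
            self.pinned[group_id] = choice
            self.next_consumer += 1
        return self.pinned[group_id]

d = GroupedDispatcher(['consumer-A', 'consumer-B'])
print([d.dispatch(g) for g in ['orders', 'invoices', 'orders', 'orders']])
# ['consumer-A', 'consumer-B', 'consumer-A', 'consumer-A']
```

The bug described in the update below amounts to every message getting the same group id, which pins the entire queue to a single consumer.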

Update: This does NOT work, since Jboss Messaging doesn't seem to follow the jms spec for message groups. What happens is that all the messages that go through this ConnectionFactory end up getting the same group id. I will update the post with how to actually accomplish this.

How to actually do this
1. I think it would work if both messaging servers were JBM, but I'm not sure about the bridging capabilities of jbm, so it may not work for a different reason
2. Have something pull the messages off the hornetq queue where that message grouping works, and submit them to the jbm queues using their setMessageGroup method. This means a lot of extra work and the bridge concept becomes way more complicated, even if it would work.
3. Switch the message provider of the destination (jboss in our case) to hornetq, meaning that both the bridge and the destination have the same message grouping semantics.

We ended up doing #3, and after getting hornetq to work in jboss eap 5 (make sure you read the redhat docs on how to configure this!) it has worked flawlessly.

Tuesday, June 21, 2011

JBoss Drools in a warehouse management system..

So the last year I've been working on an application at work that serves as a module of our warehouse management system. The short version of the problem description is that we were automating the process by which fork lift drivers figured out what pallets needed to be lowered from the storage location up in the racks down to the pick bin (floor level) where the order selectors grab the cases when filling an order. The previous process relied upon the lift drivers to just know when a bin needed to be replenished, and if a bin was empty, the order selector had to find the lift driver and yell to them that bin XYZ needed to be replenished. Not terribly efficient...

The application we built to solve this problem runs on top of Jboss 5 and PostgreSQL 8.4, uses JPA, Jboss Drools, EJB, and JMS. Our first version actually used JBPM as well, but we went away from that as it wasn't a good fit for what we needed. We also started with GWT on the front end but ditched that a few months into the project in favor of Grails + Jquery.

To make all this happen we had to integrate with our legacy COBOL homegrown WMS and several different pieces of the voice picking system used by the order selectors. Obviously all the data is coming from the COBOL wms system, but we added hooks into the voice picking system so the selectors could both "call in" a replenishment if the bin was empty, and hear when the bin they called in was replenished.

The early versions made decisions about what task to give a replenisher (fork lift driver) based almost completely on the number of cases in the bin. The only other variables were what region he was authorized to work in and his last known location, which we used to try to give him a replenishment task that was close to him. We have added more and more data to those calculations, which have become pretty significant.

The calculations of what the priority of each replenishment task is, and the calculations of which task someone should actually do next, are all done using Drools. I had never used a rules engine before this project, and it took a little time to get the hang of it, but I have to say now, I really enjoy using Drools! Our last major change included shifting the priority calculation from stateless rules sessions to a stateful rules session, which brought on a whole other set of challenges. The task priorities are now set using not just the number of cases in the bin, but also all the various kinds of demand information in the system and how far along they are in their life-cycle. By demand information I mean orders that are being worked on currently by selectors, orders that haven't been processed yet, etc. The task dispensing logic is still done using stateless rules sessions, but in addition to the proximity logic it includes logic for special situations, what to do in the event of a tie between the scores of two tasks, etc.
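To make the scoring idea concrete, here's a toy sketch of that kind of priority calculation. The weights, inputs, and function name are entirely made up - the real logic lives in Drools rules and uses many more variables - but it captures the shape: the emptier the bin and the more unfilled demand against it, the higher the priority.

```python
def replenishment_priority(cases_in_bin, open_demand_cases, bin_capacity):
    """Toy priority score on a 0-100 scale (made-up weights, for
    illustration only). Combines how empty the pick bin is with how
    much open order demand is pointed at it."""
    emptiness = 1.0 - (cases_in_bin / bin_capacity)
    demand_pressure = min(open_demand_cases / bin_capacity, 1.0)
    return round(100 * (0.6 * emptiness + 0.4 * demand_pressure))

# An empty bin with heavy demand outranks a half-full bin with light demand.
print(replenishment_priority(0, 80, 50))   # 100
print(replenishment_priority(25, 5, 50))   # 34
```

A rules engine earns its keep once you layer on the special cases (regions, tie-breaking, proximity) that would otherwise turn a function like this into a tangle of ifs.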

I could go on and on about lessons learned along the way, which I plan on doing in upcoming posts... There are things with jvm tuning, jms features, drools features, etc. that I think would be useful for others developing applications.

Friday, February 11, 2011

JVM GC Algorithms

Not really much unique content today... but if you're a java person, here are a couple very helpful links. I've often tried to figure out exactly which gc algorithms are being used, how that corresponds to which jvm params you use, and whether certain young/tenured garbage collectors have to be used together. I've known for a long time that jconsole will show you which ones are being used, but the names it uses don't always seem to correlate to other documentation, jvm params, etc.

So I got on this hunt the other day of figuring out the specifics and found these two resources which, in my mind, spell out all the details!

A Sun engineer's blog post
Neo4j tuning information - see the table a couple paragraphs down from the section title.

Tuesday, February 8, 2011

Fun with HP-UX and glance

Wow it's been a while since I posted... A lot of fun things going on though and I've found plenty of random useful pieces of technical info along the way that I've neglected to share. Here's a small tidbit of info I learned yesterday.

We have had a couple hard to diagnose problems with our primary HP-UX 11.11 server at work recently, so I've been trying to get more acclimated to hp-ux's diagnostic tools. Makes me really miss the plethora of options in linux... Anyway yesterday's tool of the day was gpm, the graphical version of Glance. Our hp-ux admin mentioned I should check it out, and that he had it running fine from his Fedora 14 laptop using ssh + x forwarding. So I tried to fire it up as well (on my F14 laptop) and was greeted with an error something like this:
***** FATAL ERROR *******
Module: ../../../present/cw/ddx.c Line: 2919
Message: FontVal called on unallocated fontset
***** FATAL ERROR *******

I don't remember the exact line (that was just one of the hits I found on google) but that was the basic message. Some of the hits mentioned having to mess with a font server on hp-ux but I was able to start gpm from my colleague's laptop using my login to the hp-ux server, so I was sure it was something on my machine.

A quick query of the rpms with "font" in the package name showed me with 70-some packages installed and him with 200-some... I certainly didn't want to wade through all of them, but I took a quick look, then fired up the graphical packagekit frontend to install some of the font packages that sounded applicable.

I installed the following packages:

That did the trick! Maybe some rainy day I will figure out just which one(s) are really needed, but I'm all set for now.

Update: Primoz tracked down the two packages actually needed instead of installing every single one of the ones I did:

Monday, April 26, 2010

Thoughts on Google Docs

I had a number of conversations with people about the new version of g'docs and thought I'd put some thoughts on here about it. I'm a little late in translating my back of the envelope thoughts to the blog, but I still thought I would share...

I think Google's announcement of the new version of g'docs was overhyped by the media. That doesn't change the fact that it was a significant announcement and a very useful set of changes though. The actual g'docs upgrade couldn't be simpler - go into settings and click an option to try the new version... repeat the process to roll back. Nice and simple, and you get all the useful features that you can read about in every tech magazine and website. Now, let's compare the upgrade process for the last MS Office release. Once you buy, download, or otherwise obtain the media, you go through the install process (who knows how long that takes and how much it costs) and what do you get? A new toolbar and a couple other things that most people won't really use. Oh wait, you get compatibility with the newest version of the office format. So I have to upgrade to a new version just so I can create/use documents that use the new format? And this benefits me how?

Google docs does not have as much traditional office suite functionality as MS office... there I said it. But think about that... how much traditional office suite functionality do people really need? If there has been so much critical functionality added in the last few releases of MS office, why are so many people still using 2000 or 2003 versions (if not older... heh)?

What's the biggest glaring omission from MS Office? It's not the ability to do some crazy formatting thing in a mail merge that 1% of users do once a year. That's very nice to have if you are that person. For the rest of us, however, the missing functionality is collaboration. And that is one of the things Google improved with the new version of g'docs. They took something they already do great and improved upon it, while MS office is still in the stone ages. I'm sure you can do something like that using some crazy combination of SharePoint, expensive Windows-only office connectors, etc. But it's not collaboration for the masses!

And oh by the way, Google's new version is a rewrite under the covers which allows faster innovation in the future.

This sounds like a pretty good deal to me. So yes, you could dismiss the Google docs upgrade as not important... but if you do I think you're missing it. I mean honestly, I could dismiss any office suite upgrade as trivial - who gets excited about office suites after all? But in the office suite grand scheme of things, I think this is something to get excited about.

Since scratching down my original thoughts, the ms/fb announcement has come out... that certainly looks interesting and it's always good to see more competition in any space. But let's be honest people - do you really think MS would be doing that if it weren't for Google Docs??