You will probably need sudo/root in order to use the tracing tools. Once you launch the tool, you use Ctrl-C to end the tracing session. Tracing is something that can be done against processes without being overly disruptive, but it will definitely slow the process down. You should always exercise caution in what you trace and when/how long you trace.
The commands that you see displayed in tusc/strace and their corresponding parameters/return values can typically be tracked down with a little time on google. There is normally pretty decent documentation somewhere out there on the various syscalls you will see in tusc/strace.
Unix - tusc
The system tracing tool on unix is called tusc, here are some example usages:
- tusc 1234 Does a standard trace of process id 1234 and send the output to stdout. Most useful for getting a quick glance at what a process is doing, why it's hung, why it's using lots of processor, etc.
- tusc -o outfile.dat 1234 Does a standard trace of process id 1234 and send the output to the file called outfile.dat - this file can easily grow very large so be careful when using this option. Don't run it any longer than necessary!
- tusc -f -o outfile.dat 1234 Does a trace, redirects output to outfile.dat, and also traces any children processes of process 1234
- tusc -c 1234 Gives a count/aggregation of the system calls the process did during the duration of your tracing session. Useful for seeing a higher level picture of where the process was spending its time.
- man tusc Find out more information about tusc's options
Linux - strace
Strace provides similar capabilities, but the commands can be slightly different. Below are the commands used in linux to accomplish the same thing as the corresponding command in unix:
- strace -p 1234 Does a standard trace of process id 1234 and send the output to stdout. Most useful for getting a quick glance at what a process is doing, why it's hung, why it's using lots of processor, etc.
- strace -o outfile.dat -p 1234 Does a standard trace of process id 1234 and send the output to the file called outfile.dat - this file can easily grow very large so be careful when using this option. Don't run it any longer than necessary!
- strace -f -o outfile.dat -p 1234 Does a trace, redirects output to outfile.dat, and also traces any children processes of process 1234
- strace -c -p 1234 Gives a count/aggregation of the system calls the process did during the duration of your tracing session. Useful for seeing a higher level picture of where the process was spending its time.
- man strace Find out more information about strace's options
Translating a process file descriptor into the actual filename
There are times when using the tusc or strace tools that you will see a syscall that involves a file descriptor. When ever a file gets opened by a process, that process has a file descriptor that it uses internally to refer to that process. When doing troubleshooting, it is helpful to translate this file descriptor to the actual physical file.
Unix
Here are some sample lines from an actual tusc tracing session:
fcntl(46, F_SETLK, 0x7eff19c8) ................................................ ERR#13 EACCES
fcntl(46, F_GETLK, 0x7eff19c8) ................................................ = 0
pread(46, "10121419\001\001\0\0\001\0\0\001".., 512, 0) ....................... = 512
In this case the process is calling fcntl (file control operations) and pread (reading/writing from/to a certain part of the file) with the first parameter of 46. This is the process' file descriptor for some file. To determine the physical file that corresponds to this... (this assumes you have some unix admin experience)
1. Open glance
2. Hit F for the Open Files report
3. It will prompt you for the process id, key in the process id you are troubleshooting
4. The list of open files in ascending file descriptor order is displayed, record the file system and inode number that corresponds to your file descriptor (FD)
5. Exit glance
6. Run this command: find /filesystem -inum 4567 replacing /filesystem and 4567 with your filesystem and inode number respectively. This will show the file that corresponds to this information
Linux
Here are some sample lines from an actual strace tracing session:
9447 fcntl(30, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=1838582860, len=1}) = -1 EAGAIN (Resource temporarily unavailable)
9447 fcntl(30, F_GETLK, {type=F_WRLCK, whence=SEEK_SET, start=1838582860, len=1, pid=28897}) = 0
9447 pread(30, "\1\1\24\31\0\4\0\1\0\1\0\1\2\275\309\0\0\1\2\176\100\0\0\1\2\176\231\0\1\0\0"..., 512, 0) = 512
In this case the process is calling fcntl (file control operations) and pread (reading/writing from/to a certain part of the file) with the first parameter of 30. This is the process' file descriptor for some file. To determine the physical file that corresponds to this...
You must be the owner of the process or have sudo access
Run this command: ls -ltr /proc/9447/fd/30 where 9447 is your process id and 30 is the file descriptor number you observed during tracing. It will display something like this that tells you what file this represents:
l-wx------ 1 root root 64 Dec 13 14:38 /proc/9447/fd/30 -> /data/somefile