16 Linux server monitoring commands you really need to know
Want to know what’s really going on with your server? Then you need to know these essential commands. Once you’ve mastered them, you’ll be well on your way to being an expert Linux system administrator.
Sure, you can use a GUI program to pull up much of the information that these shell commands can give you, depending on the Linux distribution. SUSE Linux Enterprise Server and openSUSE, for example, have an excellent graphical configuration and management tool, YaST. And there are universal tools, such as Webmin and cPanel, which can be used on any Linux server.
However, it’s a Linux administrator truism that you should run a GUI on a server only when you absolutely must. That’s because Linux GUIs take up system resources that could be better used elsewhere. So, while using a GUI program is fine for basic server health checkups, if you want to know what’s really happening, turn off the GUI and use these tools from the Linux command shell.
This also means you should start a GUI on a server only when it’s required; don’t leave it running. For optimum performance, a Linux server should run at runlevel 3, which fully supports networking and multiple users but doesn’t start the GUI when the machine boots. If you really need a graphical desktop, you can always get one by running startx from a shell prompt.
If your server starts by booting into a graphical desktop, you need to change this. To do so, head to a terminal window, su to the root user, and use your favorite editor on /etc/inittab.
Once there, find the initdefault
line and change it from:
id:5:initdefault:
to
id:3:initdefault:
If there is no inittab file, create it and add the id:3 line. Save and exit. The next time you boot into your server, it will boot into runlevel 3. If you don’t want to reboot after this change, you can also set your server’s run level immediately with the command:
init 3
Once your server is running at init 3, you can start using the following shell programs to see what’s happening inside your server.
iostat
The iostat
command shows in detail what your storage subsystem is up to. You usually use iostat to monitor how well your storage subsystems are working in general and to spot slow I/O problems before your clients notice that the server is running slowly. Trust me, you want to spot these problems before your users do!
meminfo and free
Meminfo
gives you a detailed list of what’s going on in memory. Typically you access meminfo
’s data by using another program such as cat
or grep
. For example,
cat /proc/meminfo
gives you the details of what’s going on in your server’s memory at any given moment.
For a quick “just the facts” look at memory, you can use the free
command. In short, free
gives you the overview; meminfo
gives you the details.
mpstat
The mpstat
command reports on the activities of each of the available CPUs on a multi-processor server. These days, thanks to multi-core processors, that’s almost all servers. Mpstat also reports on the average activities of all your server’s CPUs. It enables you to display overall CPU statistics per system or per processor. This overview can alert you to possible application problems before they get to the point of annoying users.
Keep up with the latest in IT-driven business innovation with our weekly newsletter.
netstat
Netstat
, like ps
, is a Linux tool that administrators use every day. It displays a lot of network-related information, such as socket usage, routing, interface, protocol, network statistics, and more. Some of the most commonly used options are:
-a Show all socket information -r Show routing information -i Show network interface statistics
-s Show network protocol statistics
nmon
Nmon
, short for Nigel’s Monitor, is a popular open-source tool to monitor Linux system performance. Nmon
watches the performance information for several subsystems, such as processor utilization, memory utilization, run queue information, disk I/O statistics, network I/O statistics, paging activity, and process metrics. You can then view nmon
’s real-time system measurements via its curses
“graphical” interface.
To run nmon
, you start the tool from the shell. Once up, you select the subsystems to monitor by typing in its one-key commands. For example, to get CPU, memory, and disk statistics, you type c
, m
, and d
. You can also use nmon
with the -f
flag to save performance statistics to a CSV file for later analysis.
For day-to-day server monitoring, I find nmon
to be the single most useful program in my Linux system management toolkit.
pmap
The https://linux.die.net/man/1/pmap
command reports the amount of memory that your server’s processes are using. You can use this tool to determine which processes on the server are being allocated memory and whether any of these processes are being piggy with RAM.
ps and pstree
The ps
and pstree
commands are two of the Linux administrator’s best friends. They both provide a list of all currently running processes. The ps
command tells you how much memory and processor time the server’s programs are using; pstree
shows less information but highlights which processes are the children of other processes. Armed with this information, you can spot out-of-control processes and kill them off with Linux’s “take no prisoners” kill
command.
sar
The sar
program is the Swiss Army knife of system monitoring tools. The sar
command is actually made up of three separate programs: sar
, which displays the data, and sa1
and sa2
, which collect and store it. Once installed, sar
creates a detailed overview of CPU utilization, memory paging, network I/O and transfer statistics, process creation activity, and storage device activity. The big difference between sar
and nmon
is that the former is better at long-term system monitoring; I find nmon
to be better at giving me a quick read on my server’s status.
strace
Strace
is often thought of as a programmer’s debugging tool, but it’s more than that. It intercepts and records the system calls that are called by a process. This makes it a useful diagnostic, instructional, and debugging tool. For example, you can use strace
to find out which configuration file a program is actually using when it starts up.
Strace
does have one flaw, though. When it’s checking out a specific process, that process’ performance falls through the floor. Thus, I use strace
only when I already have a darned good reason to think that that program is causing trouble.
tcpdump
Tcpdump
is a simple, robust network monitoring utility. Its basic protocol analyzing capability enables you to get a rough view of what is happening on your network. To really dig into what’s going on with your network, however, you want to use Wireshark
(see below).
top
The top
command shows what’s going on with your active processes. By default, it displays the most CPU-intensive tasks running on the server and updates the list every five seconds. You can sort the processes by PID (Process ID); age, newest first; time, by cumulative time; and resident memory usage and total time it’s been using the CPU since startup. I find this a fast and easy way to see if any process is starting to lurch out of control and into trouble.
uptime
Use uptime
to see how long the server has been running and how many users are logged on. It also gives you an overview of the average server load. The optimal value of the load is 1 or less, which means that each process has immediate access to the CPU and there are no CPU cycles lost.
vmstat
For the most part, you use vmstat
to monitor what’s going on with virtual memory. Linux constantly uses virtual memory to get the best possible storage performance.
If your applications are taking up too much memory, you get excessive page-outs—programs moving from RAM to your system’s swap space, which is on the hard drive. Your server can reach a point where it’s spending more time managing memory paging than running your applications, a condition called thrashing. When your computer is thrashing, its performance falls through the floor. Vmstat
, which can display either average data or actual samples, can help you spot memory pig programs and processes before they bring your server to a crawl.
Wireshark
Wireshark
, formerly known as Ethereal (and still often referred to that way), is tcpdump
’s big brother, though more sophisticated and with far more advanced protocol analysis and reporting. Wireshark
has both a GUI interface and a shell interface. If you do any serious network administration, you must use ethereal
. And, if you’re using Wireshark/ethereal
, I highly recommend Chris Sander’s Practical Packet Analysis, a great book on how to get the most out of this useful program.