
Combining Log Files


Merging System Log Files into a Single Chronological Record

When a Linux system runs for weeks or months, the /var/log directory fills with dozens of files: syslog, kern.log, auth.log, and more. Each file is already sorted by the time the logger wrote the entry, but if you want an end‑to‑end view of everything that happened across the entire system, you have to stitch those files together in strict chronological order. The trick is to keep every line in the correct place, even when the logger compresses repeated messages into a single line that says “…repeated N times”. A single Bash one‑liner, combined with a careful sort, does the job with minimal fuss.

The first thing to understand is how the system logger formats its output. A normal line looks like this:

Feb 21 12:34:56 buster kernel: device eth0 entered promiscuous mode

Three fields form the timestamp: the month abbreviation, the day of the month, and the time down to the second. The rest of the line is the actual message. Because the logger only writes one second of resolution and never records the year, you have to be careful when your log files span a month‑end, year‑end, or daylight‑saving transition.
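To see how those fields line up, here is a quick shell sketch that word-splits the sample line from above (the variable names are just for illustration):

```shell
# Split the sample log line on whitespace; the first three fields are the timestamp
line='Feb 21 12:34:56 buster kernel: device eth0 entered promiscuous mode'
set -- $line                  # word-split into positional parameters
month=$1; day=$2; hms=$3      # month abbreviation, day of month, HH:MM:SS
echo "month=$month day=$day time=$hms"
```

Everything from the fourth field onward is the message body, which the sort keys in the next section deliberately ignore.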

Because every line already starts with a timestamp, you can sort the combined output by the first three fields, and the sort command does that very well. You tell it to treat the month, day, and time as three separate keys: -k 1,1M for the month, -k 2,2n for the day, and -k 3,3 for the time. The M flag makes sort interpret the month names as calendar months, so that Jan sorts before Feb even though “Feb” comes first alphabetically. The n flag tells sort to interpret the day as a number so that “10” comes after “2” correctly. Without those flags the default alphanumeric order would place “Feb 9” after “Feb 10”, which would break the chronology.
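As a quick check, here is that sort invocation on a few deliberately out-of-order sample lines (hostname and messages invented for the illustration; LC_ALL=C pins the English month names):

```shell
# Three log lines in scrambled order, sorted by month, day, then time
sorted=$(printf '%s\n' \
  'Feb 10 08:00:01 buster sshd: session opened' \
  'Feb 9 12:00:00 buster cron: job ran' \
  'Jan 31 23:59:59 buster kernel: eth0 up' |
  LC_ALL=C sort -s -k 1,1M -k 2,2n -k 3,3)
printf '%s\n' "$sorted"
```

The Jan line comes out first and “Feb 9” lands before “Feb 10”, which a plain alphanumeric sort would get wrong.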

There is one wrinkle, however: when the logger sees the same exact message repeated many times, it writes a single line that looks like this:

Feb 21 12:48:16 buster last message repeated 7923 times

The timestamp for that line is usually later than the last time the original message actually appeared. If you sort the files blindly, you might intermix unrelated messages from a different log file that happened between the last original line and the repeated‑message line. That would create a false sense of continuity.

To keep the pair of lines glued together, the script below inserts a null character between the original message and its repetition line. A null character (ASCII 0) never shows up in normal log text, so it acts as a safe delimiter. The script remembers the last line it read and, if it sees a repetition line next, it prints the stored line followed by the null, then the repetition line. If the file ends before a repetition line appears, the script prints the last line as usual and forgets it so that it won't be mistakenly paired with the next file’s first line.

Here is the full shell one‑liner. Copy it to a file, give it execute permission, and run it against as many log files as you like.

#!/bin/sh

perl -ne '
    # Print the previous line, glued to this one with a NUL if this
    # line is a "last message repeated N times" continuation.
    print $last, /last message repeated \d+ times$/ ? "\0" : "\n" if defined $last;
    chomp($last = $_);
    if (eof) {
        # Flush the final line of each file and forget it, so it is
        # never paired with the first line of the next file.
        print $last, "\n" if defined $last;
        undef $last;
    }
' "$@" | sort -s -k 1,1M -k 2,2n -k 3,3 | tr '\0' '\n'

After sorting, the tr command swaps every null back into a real newline, splitting the glued pair into two separate lines again. Because the null was only ever inserted between an original and its repetition, the two lines stay adjacent in the sorted stream.

The -s flag on sort tells the tool to be stable: if all three keys compare equal, it keeps the lines in the order they originally appeared in the input files. That matters when you have multiple messages that share the same timestamp; stability preserves the natural order you’d see if you were reading the logs one file at a time. If you need to remove duplicate messages that appear in multiple log files, replace -s with -u and add a fourth key that covers the message body: -k 4, which spans from the fourth field to the end of the line. This will drop identical lines but can reorder messages that happen within the same second, because the sort is no longer stable. The trade‑off is worth it if you want a clean, duplicate‑free history.
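A small illustration of the -u variant (sample lines invented): two copies of the same entry collapse into one, while a different message at the same second survives because the -k 4 key sees a different body:

```shell
# Three lines sharing one timestamp; the first and third are identical
count=$(printf '%s\n' \
  'Feb 21 12:34:56 buster kernel: eth0 up' \
  'Feb 21 12:34:56 buster cron: job ran' \
  'Feb 21 12:34:56 buster kernel: eth0 up' |
  LC_ALL=C sort -u -k 1,1M -k 2,2n -k 3,3 -k 4 | wc -l)
echo "$count"
```

Without the -k 4 key, -u would treat all three lines as equal (same month, day, and time) and keep only one of them.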

There are a few more edge cases to be aware of. The system logger does not record the year. If your log files cross from December into January, you must merge each year separately and then concatenate the results. Similarly, because timestamps are written in the local time zone, a daylight‑saving transition can cause the clock to jump back an hour. If you try to merge logs that straddle that boundary, you could get duplicate timestamps that sort in the wrong order. The safest practice is to split the logs just before and after the transition, merge them independently, and then join the two sorted outputs.
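Here is a sketch of that split-and-concatenate approach using two throwaway files (names and contents are made up); sorting both files together would put December after January, so each year's slice is sorted on its own and the results joined in calendar order:

```shell
# Create one file per year segment in a scratch directory
dir=$(mktemp -d)
printf '%s\n' 'Dec 31 23:59:58 buster cron: year-end job' > "$dir/dec.log"
printf '%s\n' 'Jan  1 00:00:02 buster cron: new-year job' > "$dir/jan.log"

# Sort each slice independently, then concatenate December before January
LC_ALL=C sort -s -k 1,1M -k 2,2n -k 3,3 "$dir/dec.log" > "$dir/dec.sorted"
LC_ALL=C sort -s -k 1,1M -k 2,2n -k 3,3 "$dir/jan.log" > "$dir/jan.sorted"
merged=$(cat "$dir/dec.sorted" "$dir/jan.sorted")
printf '%s\n' "$merged"
rm -r "$dir"
```

The same pattern works for a daylight-saving boundary: split at the transition, merge each side, and concatenate.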

Remote logging introduces another layer of complexity. If your syslog daemon receives messages from other machines, the timestamps come from the remote host, not from the machine that stores the log. Because each host records its own local time, entries from hosts in different time zones interleave by their recorded local times rather than by the actual moment they occurred; if you need true chronology across zones, configure the hosts to log in a common time zone before merging. If a remote host’s clock is simply off, its lines will land in the wrong place as well; that is a problem for time synchronization (NTP), not for sorting tricks.

Putting all of this together, the workflow is simple:

  • Gather all the log files you want to merge, ensuring they belong to the same time period (same year, same DST segment).
  • Run the script against them: ./merge_logs.sh /var/log/*.log
  • If you want a duplicate‑free output, edit the script so that sort runs with -u and a fourth key (-k 4) in place of -s, as described above.
  • Inspect the resulting file for any obvious gaps or misordered entries. If something looks wrong, check the original files around the problematic timestamp.
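As an end-to-end sketch, here is the sort stage of that workflow run on two tiny hand-made log files (all names and messages hypothetical); the perl pairing step is a no-op here because neither file contains a repetition line:

```shell
# Build two miniature log files in a scratch directory
dir=$(mktemp -d)
printf '%s\n' \
  'Feb 21 12:00:01 buster sshd: session opened' \
  'Feb 21 12:30:00 buster sshd: session closed' > "$dir/auth.log"
printf '%s\n' 'Feb 21 12:15:30 buster kernel: eth0 link up' > "$dir/kern.log"

# Merge the two files into one chronological stream
merged=$(LC_ALL=C sort -s -k 1,1M -k 2,2n -k 3,3 "$dir/auth.log" "$dir/kern.log")
printf '%s\n' "$merged"
rm -r "$dir"
```

The kernel entry lands between the two auth entries, exactly where it belongs in the combined timeline.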

For reference, the GNU sort manual page (man 1 sort) explains all of the key flags and options in detail.
