<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[bjaerris.com]]></title><description><![CDATA[Linux guides, tips and tricks.]]></description><link>https://bjaerris.com/</link><generator>Ghost 0.6</generator><lastBuildDate>Thu, 01 Jan 2026 19:46:37 GMT</lastBuildDate><atom:link href="https://bjaerris.com/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Rename Multiple Filenames With Different Date Formats Using Bash And Regular Expressions (Regex) Captures/Substitutions]]></title><description><![CDATA[<p>A common way of batch renaming filenames on the command-line, is to use the "mv" &amp; "sed" commands with a "for loop", or using "find" with the exec option. <br>
Here I'll show a couple of examples utilizing both options.</p>

<p>Here is a typical rename example converting uppercase "JPG" filename extensions</p>]]></description><link>https://bjaerris.com/rename-multiple-filenames-with-different-date-formats-using-bash-and-regular-expressions-regex-capturessubstitutions-2/</link><guid isPermaLink="false">860e0698-3dde-4eee-b63f-f0d44e426e56</guid><category><![CDATA[mv for loop]]></category><category><![CDATA[find exec mv]]></category><category><![CDATA[date format]]></category><category><![CDATA[rename multiple files]]></category><category><![CDATA[convert MDY date]]></category><category><![CDATA[convert DMY date]]></category><category><![CDATA[batch rename files]]></category><category><![CDATA[for do mv]]></category><dc:creator><![CDATA[Lars Bjaerris]]></dc:creator><pubDate>Sun, 05 Apr 2015 12:14:41 GMT</pubDate><content:encoded><![CDATA[<p>A common way of batch renaming filenames on the command-line, is to use the "mv" &amp; "sed" commands with a "for loop", or using "find" with the exec option. <br>
Here I'll show a couple of examples utilizing both options.</p>

<p>Here is a typical rename example converting uppercase "JPG" filename extensions to lowercase "jpeg" using "mv", "sed" and command substitution.</p>

<p>Create a test folder and populate it with a couple of test files.</p>

<pre><code>$ mkdir test_folder &amp;&amp; cd test_folder/
$ touch {1..3}.JPG
$ ls -1
1.JPG
2.JPG
3.JPG
</code></pre>

<p>Batch rename the files with "JPG" filename extension to "jpeg" filename extension, using sed.</p>

<pre><code>$ for i in *.JPG; do mv "$i" "$(echo "$i" | sed 's/^\(.*\)\.JPG$/\1\.jpeg/')"; done
$ ls -1
1.jpeg
2.jpeg
3.jpeg
</code></pre>

<p>A more elegant and faster solution uses Bash parameter expansion/substitution, shown here shortening the "jpeg" extension to "jpg".</p>

<pre><code>$ for i in *.jpeg; do mv "$i" "${i%.jpeg}.jpg"; done
$ ls -1
1.jpg
2.jpg
3.jpg
</code></pre>
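
<p>The same parameter expansion can be wrapped in a small function. This is only a sketch under assumptions: "rename_ext" is a hypothetical helper name that does not appear in the commands above, and it relies on the "-n" (no-clobber) option of "mv" as found in the GNU and BSD implementations.</p>

```bash
# Hypothetical helper: rename_ext DIR OLD NEW
# Renames every DIR/*.OLD to DIR/*.NEW using only parameter expansion.
rename_ext() {
    local dir=$1 old=$2 new=$3 f
    for f in "$dir"/*."$old"; do
        # If the glob matched nothing, the pattern is left as-is; skip it.
        [ -e "$f" ] || continue
        # ${f%."$old"} strips the trailing ".OLD"; -n refuses to overwrite.
        mv -n -- "$f" "${f%."$old"}.$new"
    done
}
```

<p>With this, "rename_ext . jpeg jpg" would perform the same rename as the loop above, while refusing to overwrite an existing file.</p>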

<p>This works well for a single folder containing files we want to rename. <br>
If we have several nested folders with files we want to rename, using the find command and its exec option is a good choice.</p>

<p>This time we'll add another step and move every file we rename into a folder called "jpeg_folder". <br>
Let's clear out our test folder, create subfolders and populate these with test files.</p>

<pre><code>$ rm *
$ mkdir -p Subfolder.{1..5}
$ ls -1
Subfolder.1
Subfolder.2
Subfolder.3
Subfolder.4
Subfolder.5
</code></pre>

<p>Populate the folders with unique filenames and create the "jpeg_folder".</p>

<pre><code>$ COUNT=1; for i in Subfolder.*; do touch "$i/$COUNT".{1..5}.JPG &amp;&amp; ((COUNT++)); done
$ mkdir jpeg_folder
</code></pre>

<p>That gives us the following test scenario to work with:</p>

<pre><code>$ ls -1R
Subfolder.1
Subfolder.2
Subfolder.3
Subfolder.4
Subfolder.5
jpeg_folder

./Subfolder.1:
1.1.JPG
1.2.JPG
1.3.JPG
1.4.JPG
1.5.JPG

./Subfolder.2:
2.1.JPG
2.2.JPG
2.3.JPG
2.4.JPG
2.5.JPG

./Subfolder.3:
3.1.JPG
3.2.JPG
3.3.JPG
3.4.JPG
3.5.JPG

./Subfolder.4:
4.1.JPG
4.2.JPG
4.3.JPG
4.4.JPG
4.5.JPG

./Subfolder.5:
5.1.JPG
5.2.JPG
5.3.JPG
5.4.JPG
5.5.JPG

./jpeg_folder:
</code></pre>

<p>Let's use the "find" command to recursively find all files with a ".JPG" file extension and rename them with a ".jpeg" file extension, while also moving them into the "jpeg_folder".</p>

<pre><code>$ find . -name "*.JPG" -exec sh -c 'mv "$1" "jpeg_folder/$(basename "${1%.JPG}.jpeg")"' sh {} \;
$ ls -1R
Subfolder.1
Subfolder.2
Subfolder.3
Subfolder.4
Subfolder.5
jpeg_folder

./Subfolder.1:

./Subfolder.2:

./Subfolder.3:

./Subfolder.4:

./Subfolder.5:

./jpeg_folder:
1.1.jpeg
1.2.jpeg
1.3.jpeg
1.4.jpeg
1.5.jpeg
2.1.jpeg
2.2.jpeg
2.3.jpeg
2.4.jpeg
2.5.jpeg
3.1.jpeg
3.2.jpeg
3.3.jpeg
3.4.jpeg
3.5.jpeg
4.1.jpeg
4.2.jpeg
4.3.jpeg
4.4.jpeg
4.5.jpeg
5.1.jpeg
5.2.jpeg
5.3.jpeg
5.4.jpeg
5.5.jpeg
</code></pre>

<p>As you can see from the previous example, the "find" command can be a very effective tool when paired with a bit of Bash command substitution and parameter expansion/substitution.</p>
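
<p>For larger trees, a variation on the same idea is to end "-exec" with "+" instead of "\;", so that "sh" is started once per batch of files rather than once per file. The following is a minimal sketch, not the command used above: "collect_jpgs" is a hypothetical function name, and the source and destination folders are passed as arguments.</p>

```bash
# Hypothetical helper: collect_jpgs SRC DEST
# Moves every *.JPG found under SRC into DEST, renaming it to *.jpeg.
collect_jpgs() {
    local src=$1 dest=$2
    mkdir -p -- "$dest"
    # "sh" fills $0, "$dest" becomes $1, and find appends the file names after it.
    find "$src" -type f -name '*.JPG' -exec sh -c '
        dest=$1; shift
        for f in "$@"; do
            base=$(basename "$f" .JPG)      # strip directory and ".JPG"
            mv -n -- "$f" "$dest/$base.jpeg"
        done
    ' sh "$dest" {} +
}
```

<p>Called as "collect_jpgs . jpeg_folder" from the test folder, this should leave the subfolders empty just like the single-file variant, provided the filenames are unique across subfolders.</p>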

<h5 id="batchrenamingfileswithdifferentdateformats">Batch Renaming Files With Different Date Formats</h5>

<p>That brings us to files containing different date formats in the filename. <br>
We usually have two common formats, depending on which side of the Atlantic we're on. <br>
Both are shown below, formatted with the "date" command, for the 13th of March 2015:</p>

<pre><code># MDY date format (Month, Day, Year)
$ date +"%m-%d-%Y"
03-13-2015

# DMY date format (Day, Month, Year)
$ date +"%d-%m-%Y"
13-03-2015
</code></pre>

<p>Let's convert files with either of these two date formats to your choice of format. <br>
Again let's clear our test folder and populate it with a few test files in which the MDY date format is included in the filename.</p>

<pre><code>$ rm -rf *
$ touch somefilename-03-{13..15}-2015.txt  
$ ls -1
somefilename-03-13-2015.txt
somefilename-03-14-2015.txt
somefilename-03-15-2015.txt
</code></pre>

<p>Let's convert those files to DMY format using sed in a "for loop".</p>

<pre><code>$ for i in *.txt; do mv "$i" "$(echo "$i" | sed -E 's/(somefilename)-([0-9]{2})-([0-9]{2})-([0-9]{4})(\.txt)/\1-\3-\2-\4\5/')"; done
$ ls -1
somefilename-13-03-2015.txt
somefilename-14-03-2015.txt
somefilename-15-03-2015.txt
</code></pre>

<p>A faster version uses Bash regular expressions, converting the names back to MDY format without command substitution, sed or "piping".</p>

<pre><code>$ for i in *.txt; do if [[ ${i} =~ (somefilename)-([0-9]{2})-([0-9]{2})-([0-9]{4})(\.txt) ]] ; then mv "$i" "${BASH_REMATCH[1]}-${BASH_REMATCH[3]}-${BASH_REMATCH[2]}-${BASH_REMATCH[4]}${BASH_REMATCH[5]}"; fi; done
$ ls -1
somefilename-03-13-2015.txt
somefilename-03-14-2015.txt
somefilename-03-15-2015.txt
</code></pre>

<p>We can then convert back and forth between either format by running the same command again.</p>

<pre><code>$ for i in *.txt; do if [[ ${i} =~ (somefilename)-([0-9]{2})-([0-9]{2})-([0-9]{4})(\.txt) ]] ; then mv "$i" "${BASH_REMATCH[1]}-${BASH_REMATCH[3]}-${BASH_REMATCH[2]}-${BASH_REMATCH[4]}${BASH_REMATCH[5]}"; fi; done
$ ls -1
somefilename-13-03-2015.txt
somefilename-14-03-2015.txt
somefilename-15-03-2015.txt
</code></pre>
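
<p>The same capture groups can be rearranged into any other order. As an illustration only, the sketch below rewrites an MDY date as year-month-day (YYYY-MM-DD), a format that also sorts chronologically. The function name "mdy_to_iso" and the more general pattern (any filename prefix and any extension) are assumptions made for this example.</p>

```bash
# Hypothetical helper: print NAME with an embedded MM-DD-YYYY date
# rewritten as YYYY-MM-DD. Returns 1 if NAME contains no such date.
mdy_to_iso() {
    local name=$1
    # Groups: 1=prefix, 2=month, 3=day, 4=year, 5=extension
    local re='^(.*)-([0-9]{2})-([0-9]{2})-([0-9]{4})(\.[^.]+)$'
    if [[ $name =~ $re ]]; then
        printf '%s-%s-%s-%s%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[4]}" \
            "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]}" "${BASH_REMATCH[5]}"
    else
        return 1
    fi
}
```

<p>For "somefilename-03-13-2015.txt" this prints "somefilename-2015-03-13.txt"; the printed name can then be handed to "mv" inside the same kind of "for loop" as above.</p>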

<p>Voila! Hope you enjoyed it! <br>
Lars Bjaerris</p>]]></content:encoded></item><item><title><![CDATA[Testing Destination Mailserver With Telnet, Netcat And Bash Scripting SMTP/ESMTP]]></title><description><![CDATA[<p><strong>Testing email delivery against a destination/receiving mailserver, using telnet, netcat (ncat, nc) and bash scripting.</strong></p>

<p>After installing a border/destination/incoming MTA, or for that matter any "public" facing server, testing connectivity and functionality/operation from the viewpoint of a remote user/client should be your number one priority.</p>]]></description><link>https://bjaerris.com/testing-destination-mailserver-with-telnet-netcat-and-bash-scripting-smtpesmtp/</link><guid isPermaLink="false">65d48475-104c-433a-ab62-95a7a5422dc2</guid><category><![CDATA[test smtp]]></category><category><![CDATA[telnet smtp]]></category><category><![CDATA[nc smtp]]></category><category><![CDATA[smtp script]]></category><category><![CDATA[email script]]></category><category><![CDATA[smtp relay]]></category><category><![CDATA[ncat smtp]]></category><dc:creator><![CDATA[Lars Bjaerris]]></dc:creator><pubDate>Thu, 05 Mar 2015 14:09:31 GMT</pubDate><content:encoded><![CDATA[<p><strong>Testing email delivery against a destination/receiving mailserver, using telnet, netcat (ncat, nc) and bash scripting.</strong></p>

<p>After installing a border/destination/incoming MTA, or for that matter any "public" facing server, testing connectivity and functionality/operation from the viewpoint of a remote user/client should be your number one priority. <br>
Most TCP/IP servers can be tested with commonly available command-line tools such as "nc" and "telnet".</p>

<p>For the sake of this post I will demonstrate SMTP communication against border MTA(s) using telnet, nc (netcat) and a bash shell script.  </p>

<p>Generally the purpose of a border/destination/incoming MTA is to perform checks against a set of rules during the SMTP conversation/connection, before handing over the email to a local delivery agent or relay to another MTA for further delivery.</p>

<p><strong>Checks usually performed on a destination MTA:</strong></p>

<ul>
<li>Connecting IP address reputation checks (DNSBL lookups and return codes).</li>
<li>Scan for viruses or other malicious payloads.</li>
<li>Decide if the message is UBE (Spam).</li>
<li>Check if destination domain is one we accept email or relay for.</li>
<li>Check if user in the domain or local user is valid for delivery.</li>
</ul>

<p>One thing to keep in mind when configuring a destination MTA: once a destination MTA accepts an email with a 2.x.x SMTP code (Success), that MTA is responsible for further delivery of that message. <br>
Email messages containing viruses, spam etc. are almost always sent with spoofed/fake return email addresses, so closing the SMTP conversation with the connecting MTA and later trying to bounce the email with an NDN (Non-Delivery Notification) will in the majority of cases end up as email backscatter. The risk of a malicious sender using your MTA to bounce spam/virus emails to other systems should also be considered. <br>
Degradation of your IP/domain reputation on DNSBLs and an entry in <a href="http://www.backscatterer.org">http://www.backscatterer.org</a> would likely be the eventual outcome of such a configuration.</p>

<p>Always test your MTA(s) for proper rejection of invalid users and domains during the SMTP conversation, and make sure you do not allow a configuration that accepts and then bounces emails.  </p>

<hr>

<p><strong>Let's test a good configuration</strong></p>

<p>Let us test against the MX server handling email for my domain "bjaerris.com". <br>
I have set up a valid email address for this purpose: "valid.email@bjaerris.com".</p>

<p>Let's get an MX record for the domain:</p>

<pre><code>$ dig +short MX bjaerris.com
10 gw4.node25.com.
</code></pre>
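
<p>This domain publishes a single MX record. When a domain publishes several, the record with the lowest preference number is the one to try first. The following is a minimal sketch for picking it from "dig +short MX" output; "lowest_mx" is a hypothetical helper name, and it assumes the two-column "preference hostname" format shown above arrives on stdin.</p>

```bash
# Hypothetical helper: read "preference hostname." lines on stdin and
# print the hostname with the lowest preference, without the trailing dot.
lowest_mx() {
    sort -n | awk 'NR == 1 { sub(/\.$/, "", $2); print $2 }'
}
```

<p>It would be used as "dig +short MX bjaerris.com | lowest_mx".</p>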

<p>Let's connect to "gw4.node25.com" and talk some SMTP:</p>

<pre><code>$ echo $HOSTNAME                               &lt;---- Command
gw3.node25.com
$ telnet gw4.node25.com 25                     &lt;---- Command
Trying 2a01:7e00::f03c:91ff:fe50:fb1a...
Connected to gw4.node25.com.
Escape character is '^]'.
220 gw4.node25.com ESMTP Postfix
HELO gw3.node25.com                            &lt;---- Command
250 gw4.node25.com
MAIL FROM: &lt;smtp.test@gw3.node25.com&gt;          &lt;---- Command
250 2.1.0 Ok
RCPT TO: &lt;invalid@bjaerris.com&gt;                &lt;---- Command
550 5.1.1 &lt;invalid@bjaerris.com&gt;: Recipient address rejected: User unknown in virtual mailbox table
RCPT TO: &lt;test@example.com&gt;                    &lt;---- Command
554 5.7.1 &lt;test@example.com&gt;: Relay access denied
RCPT TO: &lt;valid.email@bjaerris.com&gt;            &lt;---- Command
250 2.1.5 Ok
DATA                                           &lt;---- Command
354 End data with &lt;CR&gt;&lt;LF&gt;.&lt;CR&gt;&lt;LF&gt;
To: valid.email@bjaerris.com                   &lt;---- Data
Subject: Test message                          &lt;---- Data
Hello!                                         &lt;---- Data
.                                              &lt;---- "Data" (Will be interpreted as end of message by the SMTP server)
250 2.0.0 Ok: queued as F3B40498E
QUIT                                           &lt;---- Command
221 2.0.0 Bye
Connection closed by foreign host.
</code></pre>

<p>To summarize, the "good configuration" above was tested for an invalid user, relaying, and a valid user:</p>

<pre><code>RCPT TO: &lt;invalid@bjaerris.com&gt;
</code></pre>

<p><strong>"550 5.1.1 &lt;invalid@bjaerris.com&gt;: Recipient address rejected: User unknown in virtual mailbox table"</strong></p>

<pre><code>RCPT TO: &lt;test@example.com&gt;
</code></pre>

<p><strong>"554 5.7.1 &lt;test@example.com&gt;: Relay access denied"</strong></p>

<pre><code>RCPT TO: &lt;valid.email@bjaerris.com&gt;
</code></pre>

<p><strong>"250 2.1.5 Ok"</strong></p>

<pre><code>.
</code></pre>

<p><strong>"250 2.0.0 Ok: queued as F3B40498E"</strong>    </p>
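
<p>The first digit of each reply code tells us which class the reply belongs to: 2 for success, 3 for "send more", 4 for a temporary failure and 5 for a permanent rejection. A small sketch that sorts reply lines like the ones above into these classes follows; the function name "smtp_reply_class" and the labels it prints are made up for this example.</p>

```bash
# Hypothetical helper: classify an SMTP reply line by its first digit.
smtp_reply_class() {
    case $1 in
        2[0-9][0-9]*) echo "accepted" ;;           # e.g. 250 2.1.5 Ok
        3[0-9][0-9]*) echo "intermediate" ;;       # e.g. 354 End data with ...
        4[0-9][0-9]*) echo "temporary-failure" ;;  # sender should retry later
        5[0-9][0-9]*) echo "permanent-failure" ;;  # e.g. 550 5.1.1 / 554 5.7.1
        *)            echo "unknown" ;;
    esac
}
```

<p>For a well-configured destination MTA we would expect "permanent-failure" for the invalid user and the relay attempt, and "accepted" only for the valid recipient.</p>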

<hr>

<p><strong>Testing with netcat (ncat, nc)</strong></p>

<p>We could use Netcat the same way as the previous telnet example, but for this demonstration we're going to prepare a file with the input and redirect to stdin of the netcat process. <br>
First let's create a file with the commands/data we want to send to the SMTP server.</p>

<pre><code>$ cat &gt; SMTP_TALK &lt;&lt;EOF
&gt; HELO gw3.node25.com
&gt; MAIL FROM:&lt;smtp.test@gw3.node25.com&gt;
&gt; RCPT TO:&lt;invalid@bjaerris.com&gt;
&gt; RCPT TO:&lt;test@example.com&gt;
&gt; RCPT TO:&lt;valid.email@bjaerris.com&gt;
&gt; DATA
&gt; From: [SMTP TEST] &lt;smtp.test@gw3.node25.com&gt;
&gt; To: &lt;valid.email@bjaerris.com&gt;
&gt; Subject: Test message.
&gt; Hello!
&gt; .
&gt; QUIT
&gt; EOF
</code></pre>

<p>Use netcat to connect to the MX server, redirecting our newly created file to stdin of the netcat process.</p>

<pre><code># Note: on some systems Netcat is called "ncat"

$ nc gw4.node25.com 25 &lt; SMTP_TALK
220 gw4.node25.com ESMTP Postfix
250 gw4.node25.com
250 2.1.0 Ok
550 5.1.1 &lt;invalid@bjaerris.com&gt;: Recipient address rejected: User unknown in virtual mailbox table
554 5.7.1 &lt;test@example.com&gt;: Relay access denied
250 2.1.5 Ok
354 End data with &lt;CR&gt;&lt;LF&gt;.&lt;CR&gt;&lt;LF&gt;
250 2.0.0 Ok: queued as 0BEF949CC
221 2.0.0 Bye
</code></pre>

<p>Having the commands ready in a file makes things a bit easier if testing more than one server.  </p>
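
<p>One caveat when reusing such a file: SMTP expects every line to end with CR LF, while a file written with a here-document on Linux ends its lines with a bare LF. The Postfix server above accepted the file as it was, but a stricter server might not. A minimal sketch for converting the line endings on the way out follows; "to_crlf" is a hypothetical helper name.</p>

```bash
# Hypothetical helper: copy stdin to stdout with every line ending in CR LF.
# An existing trailing CR is removed first, so running it twice is harmless.
to_crlf() {
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}
```

<p>The file would then be sent with "to_crlf &lt; SMTP_TALK | nc gw4.node25.com 25".</p>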

<hr>

<p><strong>Let's make a bash script for more permanent SMTP/ESMTP testing purposes</strong></p>

<pre><code>#!/bin/bash
# smtp_test.sh
# Lars Bjaerris &lt;lars at bjaerris.com&gt;
# Version 0.2

# Argument checks
if [[ $# -ne 1 || $1 == "--help" || $1 == "-h" || ! $1 =~ ^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$ ]]; then
    echo -e "\nThis script takes an email address as argument." 
    echo -e "Usage:\n$0 user@example.com \n" 
    exit 1
fi
# Define variables
LOCAL_HOSTNAME=$(uname -n)
EMAIL_ADDR="$1"
REMOTE_LOCAL_ADDR="${EMAIL_ADDR%@*}"
REMOTE_DOMAIN="${EMAIL_ADDR#*@}"
REMOTE_MX=$(dig +short MX "$REMOTE_DOMAIN" | sort -n | head -n1 | cut -d ' ' -f 2)
SMTP_PORT="25"
SENDER="smtp.test@$LOCAL_HOSTNAME"
MESSAGE="Hello!"
INVALID_RECIPIENT="invalid@$REMOTE_DOMAIN"
RELAY_TEST="test@example.com"
DATE=$(date '+%a, %d %b %Y %H:%M:%S %z')

# Check if an MX record was returned for the domain
echo 
[ -z "$REMOTE_MX" ] &amp;&amp; echo "Failed to get MX record from \"$REMOTE_DOMAIN\"" &amp;&amp; exit 1 || echo "Got MX:\"$REMOTE_MX\" for DOMAIN:\"$REMOTE_DOMAIN\""

# Populate array with SMTP commands to run
SMTP_commands=( \
    "HELO $LOCAL_HOSTNAME" \
    "MAIL FROM: &lt;$SENDER&gt;" \
    "RCPT TO: &lt;$INVALID_RECIPIENT&gt;" \
    "RCPT TO: &lt;$RELAY_TEST&gt;" \
    "RCPT TO: &lt;$EMAIL_ADDR&gt;" \
    "DATA" \
    "To: &lt;$EMAIL_ADDR&gt;\r\nFrom: [SMTP TEST] &lt;$SENDER&gt;\r\nSubject: Test Message.\r\nDate: $DATE\r\n$MESSAGE\r\n." \
    "quit"
)
# Define sending function
email_smtp () {
    echo "Trying connect to MX:\"$REMOTE_MX\""
    echo
    exec 3&lt;&gt;/dev/tcp/$REMOTE_MX/$SMTP_PORT
    read -r -u 3 reply
    echo "$reply"
    # Loop over the "SMTP_commands" array; the quoted expansion keeps each element intact
    for i in "${SMTP_commands[@]}"
    do
        echo "----Sending Data:"
        echo "$i"
        echo -en "$i\r\n" &gt;&amp;3
        read -r -u 3 reply
        echo "----Server Reply:"
        echo "$reply"
    done
    # Close the connection
    exec 3&gt;&amp;-
}
# Call sending function
email_smtp
</code></pre>
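
<p>The script reads exactly one line per reply, which fits the HELO dialogue used here. Replies to other commands, EHLO for example, can span several lines: every line except the last has a hyphen directly after the three-digit code. A possible way to read such a reply is sketched below; "read_smtp_reply" is a hypothetical function that is not part of the script above, and it reads from stdin.</p>

```bash
# Hypothetical helper: print one complete (possibly multi-line) SMTP reply
# read from stdin. Returns 1 if the input ends before the reply is complete.
read_smtp_reply() {
    local cr line
    cr=$(printf '\r')
    while IFS= read -r line; do
        line=${line%"$cr"}          # drop the CR of the CR LF pair
        printf '%s\n' "$line"
        case $line in
            [0-9][0-9][0-9]-*) ;;   # "250-...": more lines follow
            *) return 0 ;;          # "250 ...": last line of this reply
        esac
    done
    return 1
}
```

<p>Inside the script it could replace "read -u 3 reply" by being called as "read_smtp_reply &lt;&amp;3".</p>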

<hr>

<p><strong>Let's test it!</strong></p>

<pre><code>$ ./smtp_test.sh valid.email@bjaerris.com

Got MX:"gw4.node25.com." for DOMAIN:"bjaerris.com"
Trying connect to MX:"gw4.node25.com."

220 gw4.node25.com ESMTP Postfix
----Sending Data:
HELO gw3.node25.com
----Server Reply:
250 gw4.node25.com
----Sending Data:
MAIL FROM: &lt;smtp.test@gw3.node25.com&gt;
----Server Reply:
250 2.1.0 Ok
----Sending Data:
RCPT TO: &lt;invalid@bjaerris.com&gt;
----Server Reply:
550 5.1.1 &lt;invalid@bjaerris.com&gt;: Recipient address rejected: User unknown in virtual mailbox table
----Sending Data:
RCPT TO: &lt;test@example.com&gt;
----Server Reply:
554 5.7.1 &lt;test@example.com&gt;: Relay access denied
----Sending Data:
RCPT TO: &lt;valid.email@bjaerris.com&gt;
----Server Reply:
250 2.1.5 Ok
----Sending Data:
DATA
----Server Reply:
354 End data with &lt;CR&gt;&lt;LF&gt;.&lt;CR&gt;&lt;LF&gt;
----Sending Data:
To: &lt;valid.email@bjaerris.com&gt;\r\nFrom: [SMTP TEST] &lt;smtp.test@gw3.node25.com&gt;\r\nSubject: Test message.\r\nDate: Thu, 05 Mar 2015 14:05:10 +0000\r\nHello!\r\n.
----Server Reply:
250 2.0.0 Ok: queued as F0EBD498E
----Sending Data:
quit
----Server Reply:
221 2.0.0 Bye
</code></pre>

<p><strong>Voila!</strong></p>

<p>Hope you enjoyed it! <br>
Lars Bjaerris</p>]]></content:encoded></item><item><title><![CDATA[Parsing Plaintext Logfiles On The Commandline Using Perl.]]></title><description><![CDATA[<p><strong>Perl, "the Swiss Army chainsaw of scripting languages"</strong></p>

<p>Using Perl, regular expressions and sort to parse and format a plaintext log-file. <br>
And finally comma separate the output (CSV format) for import into e.g. Excel. <br>
The approach here can easily be modified to parse other plaintext files in the same</p>]]></description><link>https://bjaerris.com/parsing-plaintext-logfiles-on-the-commandline-using-perl/</link><guid isPermaLink="false">65fd22d5-ae92-4b52-9a90-175d4e297e8f</guid><category><![CDATA[bash]]></category><category><![CDATA[Perl]]></category><category><![CDATA[regex]]></category><category><![CDATA[parsing]]></category><category><![CDATA[parsing logfile]]></category><category><![CDATA[PCRE]]></category><category><![CDATA[one-liner]]></category><dc:creator><![CDATA[Lars Bjaerris]]></dc:creator><pubDate>Mon, 02 Feb 2015 14:01:27 GMT</pubDate><content:encoded><![CDATA[<p><strong>Perl, "the Swiss Army chainsaw of scripting languages"</strong></p>

<p>We'll use Perl, regular expressions and sort to parse and format a plaintext log file, <br>
and finally comma-separate the output (CSV format) for import into e.g. Excel. <br>
The approach here can easily be modified to parse other plaintext files in the same manner.</p>

<p>Here is a snippet of the Nginx access.log file we are going to parse:</p>

<p><code>$ cat ./access.log</code></p>

<pre><code>216.218.206.66 - - [01/Feb/2015:05:35:11 +0000] "GET / HTTP/1.1" 403 162 "-" "-"
199.180.112.34 - - [01/Feb/2015:05:59:48 +0000] "GET //MyAdmin/scripts/setup.php HTTP/1.1" 404 56 "-" "-"
199.180.112.34 - - [01/Feb/2015:05:59:48 +0000] "GET //phpMyAdmin/scripts/setup.php HTTP/1.1" 404 56 "-" "-"
199.180.112.34 - - [01/Feb/2015:05:59:48 +0000] "GET //pma/scripts/setup.php HTTP/1.1" 404 56 "-" "-"
199.180.112.34 - - [01/Feb/2015:05:59:48 +0000] "GET //phpmyadmin/scripts/setup.php HTTP/1.1" 404 56 "-" "-"
199.180.112.34 - - [01/Feb/2015:05:59:48 +0000] "GET //myadmin/scripts/setup.php HTTP/1.1" 404 56 "-" "-"
199.180.112.34 - - [01/Feb/2015:05:59:48 +0000] "GET /muieblackcat HTTP/1.1" 404 162 "-" "-"
111.251.50.236 - - [01/Feb/2015:07:17:47 +0000] "CONNECT mx0.mail2000.com.tw:25 HTTP/1.0" 400 166 "-" "-"
23.239.196.71 - - [01/Feb/2015:07:45:48 +0000] "GET / HTTP/1.1" 301 178 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)"
69.171.237.116 - - [01/Feb/2015:11:06:01 +0000] "GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1" 301 178 "-" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
69.171.237.116 - - [01/Feb/2015:11:06:02 +0000] "GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1" 200 57525 "-" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
77.37.231.208 - - [01/Feb/2015:11:26:03 +0000] "GET / HTTP/1.1" 301 178 "-" "Mozilla/5.0 (X11; Linux i586; rv:31.0) Gecko/20100101 Firefox/31.0"
173.252.110.115 - - [01/Feb/2015:11:31:20 +0000] "GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1" 301 178 "-" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
173.252.110.119 - - [01/Feb/2015:11:31:21 +0000] "GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1" 200 57525 "-" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
61.240.144.66 - - [01/Feb/2015:12:42:47 +0000] "GET / HTTP/1.0" 200 612 "-" "masscan/1.0 (https://github.com/robertdavidgraham/masscan)"
108.166.85.126 - - [01/Feb/2015:16:05:43 +0000] "GET /admin/config.php HTTP/1.0" 499 0 "-" "-"
23.23.38.251 - - [01/Feb/2015:16:24:00 +0000] "GET / HTTP/1.1" 301 178 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 GTB6 (.NET CLR 3.5.30729)"
23.23.38.251 - - [01/Feb/2015:16:24:01 +0000] "GET / HTTP/1.1" 200 7105 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 GTB6 (.NET CLR 3.5.30729)"
185.5.51.50 - - [01/Feb/2015:16:56:48 +0000] "GET / HTTP/1.1" 301 178 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53"
185.5.51.50 - - [01/Feb/2015:16:56:50 +0000] "GET / HTTP/1.1" 200 7105 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53"
185.5.51.50 - - [01/Feb/2015:16:56:50 +0000] "GET /assets/css/screen.css?v=cfc2d462d6 HTTP/1.1" 200 9977 "https://bjaerris.com/" "Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53"
185.5.51.50 - - [01/Feb/2015:16:56:51 +0000] "GET /public/jquery.min.js?v=cfc2d462d6 HTTP/1.1" 200 39386 "https://bjaerris.com/" "Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53"
185.5.51.50 - - [01/Feb/2015:16:56:51 +0000] "GET /assets/js/jquery.fitvids.js?v=cfc2d462d6 HTTP/1.1" 200 2698 "https://bjaerris.com/" "Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53"
185.5.51.50 - - [01/Feb/2015:16:56:51 +0000] "GET /assets/js/index.js?v=cfc2d462d6 HTTP/1.1" 200 3075 "https://bjaerris.com/" "Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53"
185.5.51.50 - - [01/Feb/2015:16:56:51 +0000] "GET /content/images/2015/01/lars.jpg HTTP/1.1" 200 47097 "https://bjaerris.com/" "Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53"
185.5.51.50 - - [01/Feb/2015:16:56:51 +0000] "GET /assets/fonts/casper-icons.woff HTTP/1.1" 200 2260 "https://bjaerris.com/" "Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53"
185.5.51.50 - - [01/Feb/2015:16:57:20 +0000] "GET /identifying-services-needing-restart-after-updating-linux-packages/ HTTP/1.1" 200 6642 "https://bjaerris.com/" "Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53"
207.34.25.76 - - [01/Feb/2015:18:01:13 +0000] "GET /robots.txt HTTP/1.1" 301 178 "-" "R6_CommentReader(www.radian6.com/crawler)"
71.11.195.254 - - [01/Feb/2015:21:37:39 +0000] "GET /tmUnblock.cgi HTTP/1.1" 400 166 "-" "-"
175.44.8.98 - - [01/Feb/2015:22:12:06 +0000] "POST /ghost_linux_init_script/ HTTP/1.1" 301 178 "-" "-"
78.133.20.10 - - [01/Feb/2015:22:42:33 +0000] "GET / HTTP/1.1" 200 72 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.3.18 (KHTML, like Gecko) Version/7.1.3 Safari/537.85.12"
78.133.20.10 - - [01/Feb/2015:22:42:33 +0000] "GET /favicon.ico HTTP/1.1" 404 162 "https://gw4.node25.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.3.18 (KHTML, like Gecko) Version/7.1.3 Safari/537.85.12"
31.202.241.242 - - [02/Feb/2015:01:15:13 +0000] "GET / HTTP/1.0" 301 178 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
207.145.97.131 - - [02/Feb/2015:04:00:25 +0000] "GET / HTTP/1.1" 400 166 "-" "-"
207.145.97.131 - - [02/Feb/2015:04:00:26 +0000] "GET //Net_work.xml HTTP/1.1" 400 166 "-" "-"
198.20.69.74 - - [02/Feb/2015:06:18:10 +0000] "GET / HTTP/1.1" 403 162 "-" "-"
198.20.69.74 - - [02/Feb/2015:06:18:10 +0000] "GET /robots.txt HTTP/1.1" 403 162 "-" "-"
198.20.69.74 - - [02/Feb/2015:06:18:12 +0000] "" 400 0 "-" "-"
198.20.69.74 - - [02/Feb/2015:06:18:13 +0000] "" 400 0 "-" "-"
198.20.69.74 - - [02/Feb/2015:06:18:13 +0000] "" 400 0 "-" "-"
198.20.69.74 - - [02/Feb/2015:06:18:17 +0000] "quit" 400 166 "-" "-"
54.89.61.205 - - [02/Feb/2015:08:25:02 +0000] "GET /robots.txt HTTP/1.1" 200 48 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:28.0) Gecko/20100101 Firefox/28.0 (FlipboardProxy/1.1; +http://flipboard.com/browserproxy)"
54.197.168.201 - - [02/Feb/2015:08:25:02 +0000] "GET /robots.txt HTTP/1.1" 200 48 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:28.0) Gecko/20100101 Firefox/28.0 (FlipboardProxy/1.1; +http://flipboard.com/browserproxy)"
54.89.61.205 - - [02/Feb/2015:08:25:02 +0000] "GET /identifying-services-needing-restart-after-updating-linux-packages/ HTTP/1.1" 200 6638 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:28.0) Gecko/20100101 Firefox/28.0 (FlipboardProxy/1.1; +http://flipboard.com/browserproxy)"
54.197.168.201 - - [02/Feb/2015:08:25:02 +0000] "GET /host-your-own-blog-with-ghost-nginx-on-linux/ HTTP/1.1" 200 5682 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:28.0) Gecko/20100101 Firefox/28.0 (FlipboardProxy/1.1; +http://flipboard.com/browserproxy)"
54.160.249.160 - - [02/Feb/2015:08:25:03 +0000] "GET /robots.txt HTTP/1.1" 301 178 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:28.0) Gecko/20100101 Firefox/28.0 (FlipboardProxy/1.1; +http://flipboard.com/browserproxy)"
54.160.249.160 - - [02/Feb/2015:08:25:03 +0000] "GET /host-your-own-blog-with-ghost-nginx-on-linux/ HTTP/1.1" 200 5682 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:28.0) Gecko/20100101 Firefox/28.0 (FlipboardProxy/1.1; +http://flipboard.com/browserproxy)"
54.82.74.230 - - [02/Feb/2015:08:25:03 +0000] "GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1" 301 178 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:28.0) Gecko/20100101 Firefox/28.0 (FlipboardProxy/1.6; +http://flipboard.com/browserproxy)"
54.82.74.230 - - [02/Feb/2015:08:25:04 +0000] "GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1" 200 57525 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:28.0) Gecko/20100101 Firefox/28.0 (FlipboardProxy/1.6; +http://flipboard.com/browserproxy)"
54.160.249.160 - - [02/Feb/2015:08:25:08 +0000] "GET /host-your-own-blog-with-ghost-nginx-on-linux/ HTTP/1.1" 200 5682 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:28.0) Gecko/20100101 Firefox/28.0 (FlipboardProxy/1.1; +http://flipboard.com/browserproxy)"
186.15.3.50 - - [02/Feb/2015:11:13:54 +0000] "GET /tmUnblock.cgi HTTP/1.1" 400 166 "-" "-"
176.58.116.39 - - [02/Feb/2015:12:01:46 +0000] "GET / HTTP/1.1" 200 7107 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.3.18 (KHTML, like Gecko) Version/7.1.3 Safari/537.85.12"
176.58.116.39 - - [02/Feb/2015:12:01:46 +0000] "GET /assets/css/screen.css?v=cfc2d462d6 HTTP/1.1" 304 0 "https://bjaerris.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.3.18 (KHTML, like Gecko) Version/7.1.3 Safari/537.85.12"
176.58.116.39 - - [02/Feb/2015:12:01:47 +0000] "GET /assets/js/jquery.fitvids.js?v=cfc2d462d6 HTTP/1.1" 304 0 "https://bjaerris.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.3.18 (KHTML, like Gecko) Version/7.1.3 Safari/537.85.12"
176.58.116.39 - - [02/Feb/2015:12:01:47 +0000] "GET /public/jquery.min.js?v=cfc2d462d6 HTTP/1.1" 304 0 "https://bjaerris.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.3.18 (KHTML, like Gecko) Version/7.1.3 Safari/537.85.12"
176.58.116.39 - - [02/Feb/2015:12:01:47 +0000] "GET /assets/js/index.js?v=cfc2d462d6 HTTP/1.1" 304 0 "https://bjaerris.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.3.18 (KHTML, like Gecko) Version/7.1.3 Safari/537.85.12"
176.58.116.39 - - [02/Feb/2015:12:01:48 +0000] "GET /content/images/2015/01/lars.jpg HTTP/1.1" 304 0 "https://bjaerris.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.3.18 (KHTML, like Gecko) Version/7.1.3 Safari/537.85.12"
176.58.116.39 - - [02/Feb/2015:12:01:54 +0000] "GET / HTTP/1.1" 200 7107 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.3.18 (KHTML, like Gecko) Version/7.1.3 Safari/537.85.12"
176.58.116.39 - - [02/Feb/2015:12:01:54 +0000] "GET /assets/css/screen.css?v=cfc2d462d6 HTTP/1.1" 304 0 "https://bjaerris.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.3.18 (KHTML, like Gecko) Version/7.1.3 Safari/537.85.12"
176.58.116.39 - - [02/Feb/2015:12:01:55 +0000] "GET /public/jquery.min.js?v=cfc2d462d6 HTTP/1.1" 304 0 "https://bjaerris.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.3.18 (KHTML, like Gecko) Version/7.1.3 Safari/537.85.12"
176.58.116.39 - - [02/Feb/2015:12:01:55 +0000] "GET /assets/js/jquery.fitvids.js?v=cfc2d462d6 HTTP/1.1" 304 0 "https://bjaerris.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.3.18 (KHTML, like Gecko) Version/7.1.3 Safari/537.85.12"
176.58.116.39 - - [02/Feb/2015:12:01:55 +0000] "GET /assets/js/index.js?v=cfc2d462d6 HTTP/1.1" 304 0 "https://bjaerris.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.3.18 (KHTML, like Gecko) Version/7.1.3 Safari/537.85.12"
176.58.116.39 - - [02/Feb/2015:12:01:55 +0000] "GET /content/images/2015/01/lars.jpg HTTP/1.1" 304 0 "https://bjaerris.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.3.18 (KHTML, like Gecko) Version/7.1.3 Safari/537.85.12"
</code></pre>

<p><strong>Let's say we want to:</strong></p>

<ol>
<li>Extract the IP address, date, request, URN and HTTP Status Code.  </li>
<li>Ignore duplicates from the same day. </li>
</ol>

<p>One might think that "awk" &amp; "sort" would do, but because the timestamp includes "hour:minute:seconds", requests repeated at different times on the same day are not treated as duplicates, and working around that quickly becomes convoluted:</p>

<p><code>$ awk '{print $1, $4, $6, $7}' access.log |sort -u</code></p>

<pre><code>108.166.85.126 [01/Feb/2015:16:05:43 "GET /admin/config.php
111.251.50.236 [01/Feb/2015:07:17:47 "CONNECT mx0.mail2000.com.tw:25
173.252.110.115 [01/Feb/2015:11:31:20 "GET /content/images/2015/01/ghost_login_owner.png
173.252.110.119 [01/Feb/2015:11:31:21 "GET /content/images/2015/01/ghost_login_owner.png
175.44.8.98 [01/Feb/2015:22:12:06 "POST /ghost_linux_init_script/
176.58.116.39 [02/Feb/2015:12:01:46 "GET /
176.58.116.39 [02/Feb/2015:12:01:46 "GET /assets/css/screen.css?v=cfc2d462d6
176.58.116.39 [02/Feb/2015:12:01:47 "GET /assets/js/index.js?v=cfc2d462d6
176.58.116.39 [02/Feb/2015:12:01:47 "GET /assets/js/jquery.fitvids.js?v=cfc2d462d6
176.58.116.39 [02/Feb/2015:12:01:47 "GET /public/jquery.min.js?v=cfc2d462d6
176.58.116.39 [02/Feb/2015:12:01:48 "GET /content/images/2015/01/lars.jpg
176.58.116.39 [02/Feb/2015:12:01:54 "GET /                                              &lt;---Duplicate
176.58.116.39 [02/Feb/2015:12:01:54 "GET /assets/css/screen.css?v=cfc2d462d6            &lt;---Duplicate
176.58.116.39 [02/Feb/2015:12:01:55 "GET /assets/js/index.js?v=cfc2d462d6               &lt;---Duplicate
176.58.116.39 [02/Feb/2015:12:01:55 "GET /assets/js/jquery.fitvids.js?v=cfc2d462d6      &lt;---Duplicate
176.58.116.39 [02/Feb/2015:12:01:55 "GET /content/images/2015/01/lars.jpg               &lt;---Duplicate
176.58.116.39 [02/Feb/2015:12:01:55 "GET /public/jquery.min.js?v=cfc2d462d6             &lt;---Duplicate
185.5.51.50 [01/Feb/2015:16:56:48 "GET /
185.5.51.50 [01/Feb/2015:16:56:50 "GET /                                                &lt;---Duplicate
185.5.51.50 [01/Feb/2015:16:56:50 "GET /assets/css/screen.css?v=cfc2d462d6
185.5.51.50 [01/Feb/2015:16:56:51 "GET /assets/fonts/casper-icons.woff
185.5.51.50 [01/Feb/2015:16:56:51 "GET /assets/js/index.js?v=cfc2d462d6
185.5.51.50 [01/Feb/2015:16:56:51 "GET /assets/js/jquery.fitvids.js?v=cfc2d462d6
185.5.51.50 [01/Feb/2015:16:56:51 "GET /content/images/2015/01/lars.jpg
185.5.51.50 [01/Feb/2015:16:56:51 "GET /public/jquery.min.js?v=cfc2d462d6
185.5.51.50 [01/Feb/2015:16:57:20 "GET /identifying-services-needing-restart-after-updating-linux-packages/
186.15.3.50 [02/Feb/2015:11:13:54 "GET /tmUnblock.cgi
198.20.69.74 [02/Feb/2015:06:18:10 "GET /
198.20.69.74 [02/Feb/2015:06:18:10 "GET /robots.txt
198.20.69.74 [02/Feb/2015:06:18:12 "" 400
198.20.69.74 [02/Feb/2015:06:18:13 "" 400                                               &lt;---Duplicate
198.20.69.74 [02/Feb/2015:06:18:17 "quit" 400
199.180.112.34 [01/Feb/2015:05:59:48 "GET /muieblackcat
199.180.112.34 [01/Feb/2015:05:59:48 "GET //myadmin/scripts/setup.php
199.180.112.34 [01/Feb/2015:05:59:48 "GET //MyAdmin/scripts/setup.php
199.180.112.34 [01/Feb/2015:05:59:48 "GET //phpmyadmin/scripts/setup.php
199.180.112.34 [01/Feb/2015:05:59:48 "GET //phpMyAdmin/scripts/setup.php
199.180.112.34 [01/Feb/2015:05:59:48 "GET //pma/scripts/setup.php
207.145.97.131 [02/Feb/2015:04:00:25 "GET /
207.145.97.131 [02/Feb/2015:04:00:26 "GET //Net_work.xml
207.34.25.76 [01/Feb/2015:18:01:13 "GET /robots.txt
216.218.206.66 [01/Feb/2015:05:35:11 "GET /
23.23.38.251 [01/Feb/2015:16:24:00 "GET /
23.23.38.251 [01/Feb/2015:16:24:01 "GET /                                               &lt;---Duplicate
23.239.196.71 [01/Feb/2015:07:45:48 "GET / 
31.202.241.242 [02/Feb/2015:01:15:13 "GET /
54.160.249.160 [02/Feb/2015:08:25:03 "GET /host-your-own-blog-with-ghost-nginx-on-linux/
54.160.249.160 [02/Feb/2015:08:25:03 "GET /robots.txt
54.160.249.160 [02/Feb/2015:08:25:08 "GET /host-your-own-blog-with-ghost-nginx-on-linux/
54.197.168.201 [02/Feb/2015:08:25:02 "GET /host-your-own-blog-with-ghost-nginx-on-linux/
54.197.168.201 [02/Feb/2015:08:25:02 "GET /robots.txt
54.82.74.230 [02/Feb/2015:08:25:03 "GET /content/images/2015/01/ghost_login_owner.png
54.82.74.230 [02/Feb/2015:08:25:04 "GET /content/images/2015/01/ghost_login_owner.png   &lt;---Duplicate
54.89.61.205 [02/Feb/2015:08:25:02 "GET /identifying-services-needing-restart-after-updating-linux-packages/
54.89.61.205 [02/Feb/2015:08:25:02 "GET /robots.txt
61.240.144.66 [01/Feb/2015:12:42:47 "GET /
69.171.237.116 [01/Feb/2015:11:06:01 "GET /content/images/2015/01/ghost_login_owner.png
69.171.237.116 [01/Feb/2015:11:06:02 "GET /content/images/2015/01/ghost_login_owner.png &lt;---Duplicate
71.11.195.254 [01/Feb/2015:21:37:39 "GET /tmUnblock.cgi
77.37.231.208 [01/Feb/2015:11:26:03 "GET /
78.133.20.10 [01/Feb/2015:22:42:33 "GET /
78.133.20.10 [01/Feb/2015:22:42:33 "GET /favicon.ico
</code></pre>

<p>Clearly another approach is needed if we're to easily ignore duplicates from the same day. <br>
Stripping the "hour:minute:seconds" part of the date stamp before the output reaches "sort -u" will fix that. <br>
So let's get rid of the duplicate lines, this time using Perl and regular expressions to parse the log lines.</p>

<p><strong>Perl, the right tool for the job.</strong></p>

<p><code>$ perl -lane 'print "$1 $2 $3 $4 " if /^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s\-\s\-\s\[([^:]+)[^"]+"([^"]+)"\s(\d{3})/' ./access.log</code></p>

<pre><code>216.218.206.66 01/Feb/2015 GET / HTTP/1.1 403 
199.180.112.34 01/Feb/2015 GET //MyAdmin/scripts/setup.php HTTP/1.1 404 
199.180.112.34 01/Feb/2015 GET //phpMyAdmin/scripts/setup.php HTTP/1.1 404 
199.180.112.34 01/Feb/2015 GET //pma/scripts/setup.php HTTP/1.1 404 
199.180.112.34 01/Feb/2015 GET //phpmyadmin/scripts/setup.php HTTP/1.1 404 
199.180.112.34 01/Feb/2015 GET //myadmin/scripts/setup.php HTTP/1.1 404 
199.180.112.34 01/Feb/2015 GET /muieblackcat HTTP/1.1 404 
111.251.50.236 01/Feb/2015 CONNECT mx0.mail2000.com.tw:25 HTTP/1.0 400 
23.239.196.71 01/Feb/2015 GET / HTTP/1.1 301 
69.171.237.116 01/Feb/2015 GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1 301 
69.171.237.116 01/Feb/2015 GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1 200 
77.37.231.208 01/Feb/2015 GET / HTTP/1.1 301 
173.252.110.115 01/Feb/2015 GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1 301 
173.252.110.119 01/Feb/2015 GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1 200 
61.240.144.66 01/Feb/2015 GET / HTTP/1.0 200 
108.166.85.126 01/Feb/2015 GET /admin/config.php HTTP/1.0 499 
23.23.38.251 01/Feb/2015 GET / HTTP/1.1 301 
23.23.38.251 01/Feb/2015 GET / HTTP/1.1 200 
185.5.51.50 01/Feb/2015 GET / HTTP/1.1 301 
185.5.51.50 01/Feb/2015 GET / HTTP/1.1 200 
185.5.51.50 01/Feb/2015 GET /assets/css/screen.css?v=cfc2d462d6 HTTP/1.1 200 
185.5.51.50 01/Feb/2015 GET /public/jquery.min.js?v=cfc2d462d6 HTTP/1.1 200 
185.5.51.50 01/Feb/2015 GET /assets/js/jquery.fitvids.js?v=cfc2d462d6 HTTP/1.1 200 
185.5.51.50 01/Feb/2015 GET /assets/js/index.js?v=cfc2d462d6 HTTP/1.1 200 
185.5.51.50 01/Feb/2015 GET /content/images/2015/01/lars.jpg HTTP/1.1 200 
185.5.51.50 01/Feb/2015 GET /assets/fonts/casper-icons.woff HTTP/1.1 200 
185.5.51.50 01/Feb/2015 GET /identifying-services-needing-restart-after-updating-linux-packages/ HTTP/1.1 200 
207.34.25.76 01/Feb/2015 GET /robots.txt HTTP/1.1 301 
71.11.195.254 01/Feb/2015 GET /tmUnblock.cgi HTTP/1.1 400 
175.44.8.98 01/Feb/2015 POST /ghost_linux_init_script/ HTTP/1.1 301 
78.133.20.10 01/Feb/2015 GET / HTTP/1.1 200 
78.133.20.10 01/Feb/2015 GET /favicon.ico HTTP/1.1 404 
31.202.241.242 02/Feb/2015 GET / HTTP/1.0 301 
207.145.97.131 02/Feb/2015 GET / HTTP/1.1 400 
207.145.97.131 02/Feb/2015 GET //Net_work.xml HTTP/1.1 400 
198.20.69.74 02/Feb/2015 GET / HTTP/1.1 403 
198.20.69.74 02/Feb/2015 GET /robots.txt HTTP/1.1 403 
198.20.69.74 02/Feb/2015 quit 400 
54.89.61.205 02/Feb/2015 GET /robots.txt HTTP/1.1 200 
54.197.168.201 02/Feb/2015 GET /robots.txt HTTP/1.1 200 
54.89.61.205 02/Feb/2015 GET /identifying-services-needing-restart-after-updating-linux-packages/ HTTP/1.1 200 
54.197.168.201 02/Feb/2015 GET /host-your-own-blog-with-ghost-nginx-on-linux/ HTTP/1.1 200 
54.160.249.160 02/Feb/2015 GET /robots.txt HTTP/1.1 301 
54.160.249.160 02/Feb/2015 GET /host-your-own-blog-with-ghost-nginx-on-linux/ HTTP/1.1 200 
54.82.74.230 02/Feb/2015 GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1 301 
54.82.74.230 02/Feb/2015 GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1 200 
54.160.249.160 02/Feb/2015 GET /host-your-own-blog-with-ghost-nginx-on-linux/ HTTP/1.1 200 
186.15.3.50 02/Feb/2015 GET /tmUnblock.cgi HTTP/1.1 400 
176.58.116.39 02/Feb/2015 GET / HTTP/1.1 200 
176.58.116.39 02/Feb/2015 GET /assets/css/screen.css?v=cfc2d462d6 HTTP/1.1 304 
176.58.116.39 02/Feb/2015 GET /assets/js/jquery.fitvids.js?v=cfc2d462d6 HTTP/1.1 304 
176.58.116.39 02/Feb/2015 GET /public/jquery.min.js?v=cfc2d462d6 HTTP/1.1 304 
176.58.116.39 02/Feb/2015 GET /assets/js/index.js?v=cfc2d462d6 HTTP/1.1 304 
176.58.116.39 02/Feb/2015 GET /content/images/2015/01/lars.jpg HTTP/1.1 304 
176.58.116.39 02/Feb/2015 GET / HTTP/1.1 200 
176.58.116.39 02/Feb/2015 GET /assets/css/screen.css?v=cfc2d462d6 HTTP/1.1 304 
176.58.116.39 02/Feb/2015 GET /public/jquery.min.js?v=cfc2d462d6 HTTP/1.1 304 
176.58.116.39 02/Feb/2015 GET /assets/js/jquery.fitvids.js?v=cfc2d462d6 HTTP/1.1 304 
176.58.116.39 02/Feb/2015 GET /assets/js/index.js?v=cfc2d462d6 HTTP/1.1 304 
176.58.116.39 02/Feb/2015 GET /content/images/2015/01/lars.jpg HTTP/1.1 304
</code></pre>

<p><strong>Before we go further, let me explain the regular expression used in the previous command.</strong></p>

<pre><code>^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s\-\s\-\s\[([^:]+)[^"]+"([^"]+)"\s(\d{3})

^            # Match start of string.
###
# 1st Capturing group.
(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})
\d{1,3}      # Match a digit [0-9] between 1 and 3 times (greedy).
\.           # Match the character "." literally.
###
\s           # Match any white space character [ \t\r\n\f].
\-           # Match the character "-" literally.
\s           # Match any white space character [ \t\r\n\f].
\-           # Match the character "-" literally.
\s           # Match any white space character [ \t\r\n\f].
\[           # Match the character "[" literally.
###
# 2nd Capturing group.
([^:]+)
[^:]+        # Match anything not a ":", one or more times (greedy).
###
[^"]+        # Match anything not a double quote, one or more times (greedy).
"            # Match a double quote literally (outside the group, so not captured).
###
# 3rd Capturing group.
([^"]+)
[^"]+        # Match anything not a double quote, one or more times (greedy).
###
"            # Match a double quote literally (outside the group, so not captured).
\s           # Match any white space character [ \t\r\n\f].
###
# 4th Capturing group.
(\d{3})
\d{3}        # Match a digit [0-9] exactly 3 times.
</code></pre>
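
<p>To see the four captures in isolation, the expression can be run against a single line. The following is a minimal sketch: the helper name "show_captures" is made up for illustration, and the sample line is taken from the log above with the user agent shortened.</p>

<pre><code>#!/bin/sh
# show_captures is a hypothetical helper name, not part of the original commands.
# It reads log lines on stdin and prints each capture group on its own line.
show_captures() {
    perl -ne 'print "ip=$1\ndate=$2\nrequest=$3\nstatus=$4\n" if /^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s\-\s\-\s\[([^:]+)[^"]+"([^"]+)"\s(\d{3})/'
}

# Sample line from the log above (user agent shortened to keep it readable).
line='176.58.116.39 - - [02/Feb/2015:12:01:54 +0000] "GET / HTTP/1.1" 200 7107 "-" "Mozilla/5.0"'
printf '%s\n' "$line" | show_captures
</code></pre>

<p>This should print the IP address, the date without the time, the request and the status code, one per line.</p>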

<p>For more on regular expressions: <a href="http://www.regular-expressions.info/tutorial.html">http://www.regular-expressions.info/tutorial.html</a></p>

<p><strong>Let's make it "pretty".</strong></p>

<p>Let's move the HTTP status code to the 3rd column and use the printf function to line up the columns to make the output more legible.</p>

<p><code>$ perl -lane 'printf ("%-16s %-12s %-4s %s\n", $1, $2, $4, $3) if /^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s\-\s\-\s\[([^:]+)[^"]+"([^"]+)"\s(\d{3})/' ./access.log</code></p>

<pre><code>216.218.206.66   01/Feb/2015  403  GET / HTTP/1.1
199.180.112.34   01/Feb/2015  404  GET //MyAdmin/scripts/setup.php HTTP/1.1
199.180.112.34   01/Feb/2015  404  GET //phpMyAdmin/scripts/setup.php HTTP/1.1
199.180.112.34   01/Feb/2015  404  GET //pma/scripts/setup.php HTTP/1.1
199.180.112.34   01/Feb/2015  404  GET //phpmyadmin/scripts/setup.php HTTP/1.1
199.180.112.34   01/Feb/2015  404  GET //myadmin/scripts/setup.php HTTP/1.1
199.180.112.34   01/Feb/2015  404  GET /muieblackcat HTTP/1.1
111.251.50.236   01/Feb/2015  400  CONNECT mx0.mail2000.com.tw:25 HTTP/1.0
23.239.196.71    01/Feb/2015  301  GET / HTTP/1.1
69.171.237.116   01/Feb/2015  301  GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1
69.171.237.116   01/Feb/2015  200  GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1
77.37.231.208    01/Feb/2015  301  GET / HTTP/1.1
173.252.110.115  01/Feb/2015  301  GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1
173.252.110.119  01/Feb/2015  200  GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1
61.240.144.66    01/Feb/2015  200  GET / HTTP/1.0
108.166.85.126   01/Feb/2015  499  GET /admin/config.php HTTP/1.0
23.23.38.251     01/Feb/2015  301  GET / HTTP/1.1
23.23.38.251     01/Feb/2015  200  GET / HTTP/1.1
185.5.51.50      01/Feb/2015  301  GET / HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET / HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /assets/css/screen.css?v=cfc2d462d6 HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /public/jquery.min.js?v=cfc2d462d6 HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /assets/js/jquery.fitvids.js?v=cfc2d462d6 HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /assets/js/index.js?v=cfc2d462d6 HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /content/images/2015/01/lars.jpg HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /assets/fonts/casper-icons.woff HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /identifying-services-needing-restart-after-updating-linux-packages/ HTTP/1.1
207.34.25.76     01/Feb/2015  301  GET /robots.txt HTTP/1.1
71.11.195.254    01/Feb/2015  400  GET /tmUnblock.cgi HTTP/1.1
175.44.8.98      01/Feb/2015  301  POST /ghost_linux_init_script/ HTTP/1.1
78.133.20.10     01/Feb/2015  200  GET / HTTP/1.1
78.133.20.10     01/Feb/2015  404  GET /favicon.ico HTTP/1.1
31.202.241.242   02/Feb/2015  301  GET / HTTP/1.0
207.145.97.131   02/Feb/2015  400  GET / HTTP/1.1
207.145.97.131   02/Feb/2015  400  GET //Net_work.xml HTTP/1.1
198.20.69.74     02/Feb/2015  403  GET / HTTP/1.1
198.20.69.74     02/Feb/2015  403  GET /robots.txt HTTP/1.1
198.20.69.74     02/Feb/2015  400  quit
54.89.61.205     02/Feb/2015  200  GET /robots.txt HTTP/1.1
54.197.168.201   02/Feb/2015  200  GET /robots.txt HTTP/1.1
54.89.61.205     02/Feb/2015  200  GET /identifying-services-needing-restart-after-updating-linux-packages/ HTTP/1.1
54.197.168.201   02/Feb/2015  200  GET /host-your-own-blog-with-ghost-nginx-on-linux/ HTTP/1.1
54.160.249.160   02/Feb/2015  301  GET /robots.txt HTTP/1.1
54.160.249.160   02/Feb/2015  200  GET /host-your-own-blog-with-ghost-nginx-on-linux/ HTTP/1.1
54.82.74.230     02/Feb/2015  301  GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1
54.82.74.230     02/Feb/2015  200  GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1
54.160.249.160   02/Feb/2015  200  GET /host-your-own-blog-with-ghost-nginx-on-linux/ HTTP/1.1
186.15.3.50      02/Feb/2015  400  GET /tmUnblock.cgi HTTP/1.1
176.58.116.39    02/Feb/2015  200  GET / HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /assets/css/screen.css?v=cfc2d462d6 HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /assets/js/jquery.fitvids.js?v=cfc2d462d6 HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /public/jquery.min.js?v=cfc2d462d6 HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /assets/js/index.js?v=cfc2d462d6 HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /content/images/2015/01/lars.jpg HTTP/1.1
176.58.116.39    02/Feb/2015  200  GET / HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /assets/css/screen.css?v=cfc2d462d6 HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /public/jquery.min.js?v=cfc2d462d6 HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /assets/js/jquery.fitvids.js?v=cfc2d462d6 HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /assets/js/index.js?v=cfc2d462d6 HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /content/images/2015/01/lars.jpg HTTP/1.1
</code></pre>

<p>Much better!</p>
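
<p>The widths in the format string do the aligning: "%-16s" left-justifies a value and pads it with spaces to 16 characters. A small sketch, using "|" in place of the separating spaces (my choice, so the padding is visible) and the first row from the output above. The helper name "show_padding" is made up.</p>

<pre><code>#!/bin/sh
# show_padding is a hypothetical helper; it takes ip, date, status and request
# as arguments and prints them with the same field widths as the command above.
show_padding() {
    perl -e 'printf("%-16s|%-12s|%-4s|%s\n", @ARGV)' -- "$@"
}

show_padding 216.218.206.66 01/Feb/2015 403 'GET / HTTP/1.1'
</code></pre>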

<p><strong>Removing duplicate lines.</strong></p>

<p><code>$ perl -lane 'printf ("%-16s %-12s %-4s %s\n", $1, $2, $4, $3) if /^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s\-\s\-\s\[([^:]+)[^"]+"([^"]+)"\s(\d{3})/' ./access.log |sort -u</code></p>

<pre><code>108.166.85.126   01/Feb/2015  499  GET /admin/config.php HTTP/1.0
111.251.50.236   01/Feb/2015  400  CONNECT mx0.mail2000.com.tw:25 HTTP/1.0
173.252.110.115  01/Feb/2015  301  GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1
173.252.110.119  01/Feb/2015  200  GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1
175.44.8.98      01/Feb/2015  301  POST /ghost_linux_init_script/ HTTP/1.1
176.58.116.39    02/Feb/2015  200  GET / HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /assets/css/screen.css?v=cfc2d462d6 HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /assets/js/index.js?v=cfc2d462d6 HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /assets/js/jquery.fitvids.js?v=cfc2d462d6 HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /content/images/2015/01/lars.jpg HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /public/jquery.min.js?v=cfc2d462d6 HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /assets/css/screen.css?v=cfc2d462d6 HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /assets/fonts/casper-icons.woff HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /assets/js/index.js?v=cfc2d462d6 HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /assets/js/jquery.fitvids.js?v=cfc2d462d6 HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /content/images/2015/01/lars.jpg HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET / HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /identifying-services-needing-restart-after-updating-linux-packages/ HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /public/jquery.min.js?v=cfc2d462d6 HTTP/1.1
185.5.51.50      01/Feb/2015  301  GET / HTTP/1.1
186.15.3.50      02/Feb/2015  400  GET /tmUnblock.cgi HTTP/1.1
198.20.69.74     02/Feb/2015  400  quit
198.20.69.74     02/Feb/2015  403  GET / HTTP/1.1
198.20.69.74     02/Feb/2015  403  GET /robots.txt HTTP/1.1
199.180.112.34   01/Feb/2015  404  GET /muieblackcat HTTP/1.1
199.180.112.34   01/Feb/2015  404  GET //myadmin/scripts/setup.php HTTP/1.1
199.180.112.34   01/Feb/2015  404  GET //MyAdmin/scripts/setup.php HTTP/1.1
199.180.112.34   01/Feb/2015  404  GET //phpmyadmin/scripts/setup.php HTTP/1.1
199.180.112.34   01/Feb/2015  404  GET //phpMyAdmin/scripts/setup.php HTTP/1.1
199.180.112.34   01/Feb/2015  404  GET //pma/scripts/setup.php HTTP/1.1
207.145.97.131   02/Feb/2015  400  GET / HTTP/1.1
207.145.97.131   02/Feb/2015  400  GET //Net_work.xml HTTP/1.1
207.34.25.76     01/Feb/2015  301  GET /robots.txt HTTP/1.1
216.218.206.66   01/Feb/2015  403  GET / HTTP/1.1
23.23.38.251     01/Feb/2015  200  GET / HTTP/1.1
23.23.38.251     01/Feb/2015  301  GET / HTTP/1.1
23.239.196.71    01/Feb/2015  301  GET / HTTP/1.1
31.202.241.242   02/Feb/2015  301  GET / HTTP/1.0
54.160.249.160   02/Feb/2015  200  GET /host-your-own-blog-with-ghost-nginx-on-linux/ HTTP/1.1
54.160.249.160   02/Feb/2015  301  GET /robots.txt HTTP/1.1
54.197.168.201   02/Feb/2015  200  GET /host-your-own-blog-with-ghost-nginx-on-linux/ HTTP/1.1
54.197.168.201   02/Feb/2015  200  GET /robots.txt HTTP/1.1
54.82.74.230     02/Feb/2015  200  GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1
54.82.74.230     02/Feb/2015  301  GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1
54.89.61.205     02/Feb/2015  200  GET /identifying-services-needing-restart-after-updating-linux-packages/ HTTP/1.1
54.89.61.205     02/Feb/2015  200  GET /robots.txt HTTP/1.1
61.240.144.66    01/Feb/2015  200  GET / HTTP/1.0
69.171.237.116   01/Feb/2015  200  GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1
69.171.237.116   01/Feb/2015  301  GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1
71.11.195.254    01/Feb/2015  400  GET /tmUnblock.cgi HTTP/1.1
77.37.231.208    01/Feb/2015  301  GET / HTTP/1.1
78.133.20.10     01/Feb/2015  200  GET / HTTP/1.1
78.133.20.10     01/Feb/2015  404  GET /favicon.ico HTTP/1.1
</code></pre>
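
<p>"sort -u" removes the duplicates but also reorders the lines. If the original log order matters, the duplicates can be dropped inside Perl with a hash instead. This is a sketch of an alternative, not the command used in this post; the helper name "dedupe_keep_order" is made up.</p>

<pre><code>#!/bin/sh
# dedupe_keep_order is a hypothetical helper: it formats each matching log line
# like the command above and prints it only the first time that exact row is seen.
# It reads the files named as arguments, or stdin when none are given.
dedupe_keep_order() {
    perl -ne 'if (/^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s\-\s\-\s\[([^:]+)[^"]+"([^"]+)"\s(\d{3})/) { $row = sprintf("%-16s %-12s %-4s %s", $1, $2, $4, $3); print "$row\n" unless $seen{$row}++ }' "$@"
}

# Usage:
#   dedupe_keep_order ./access.log
</code></pre>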

<p><strong>Want to sort it based on URN/file-path?</strong></p>

<p><code>$ perl -lane 'printf ("%-16s %-12s %-4s %s\n", $1, $2, $4, $3) if /^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s\-\s\-\s\[([^:]+)[^"]+"([^"]+)"\s(\d{3})/' ./access.log |sort -u|sort -k5,5</code></p>

<pre><code>198.20.69.74     02/Feb/2015  400  quit
176.58.116.39    02/Feb/2015  200  GET / HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET / HTTP/1.1
185.5.51.50      01/Feb/2015  301  GET / HTTP/1.1
198.20.69.74     02/Feb/2015  403  GET / HTTP/1.1
207.145.97.131   02/Feb/2015  400  GET / HTTP/1.1
216.218.206.66   01/Feb/2015  403  GET / HTTP/1.1
23.23.38.251     01/Feb/2015  200  GET / HTTP/1.1
23.23.38.251     01/Feb/2015  301  GET / HTTP/1.1
23.239.196.71    01/Feb/2015  301  GET / HTTP/1.1
31.202.241.242   02/Feb/2015  301  GET / HTTP/1.0
61.240.144.66    01/Feb/2015  200  GET / HTTP/1.0
77.37.231.208    01/Feb/2015  301  GET / HTTP/1.1
78.133.20.10     01/Feb/2015  200  GET / HTTP/1.1
108.166.85.126   01/Feb/2015  499  GET /admin/config.php HTTP/1.0
176.58.116.39    02/Feb/2015  304  GET /assets/css/screen.css?v=cfc2d462d6 HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /assets/css/screen.css?v=cfc2d462d6 HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /assets/fonts/casper-icons.woff HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /assets/js/index.js?v=cfc2d462d6 HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /assets/js/index.js?v=cfc2d462d6 HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /assets/js/jquery.fitvids.js?v=cfc2d462d6 HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /assets/js/jquery.fitvids.js?v=cfc2d462d6 HTTP/1.1
173.252.110.115  01/Feb/2015  301  GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1
173.252.110.119  01/Feb/2015  200  GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1
54.82.74.230     02/Feb/2015  200  GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1
54.82.74.230     02/Feb/2015  301  GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1
69.171.237.116   01/Feb/2015  200  GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1
69.171.237.116   01/Feb/2015  301  GET /content/images/2015/01/ghost_login_owner.png HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /content/images/2015/01/lars.jpg HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /content/images/2015/01/lars.jpg HTTP/1.1
78.133.20.10     01/Feb/2015  404  GET /favicon.ico HTTP/1.1
175.44.8.98      01/Feb/2015  301  POST /ghost_linux_init_script/ HTTP/1.1
54.160.249.160   02/Feb/2015  200  GET /host-your-own-blog-with-ghost-nginx-on-linux/ HTTP/1.1
54.197.168.201   02/Feb/2015  200  GET /host-your-own-blog-with-ghost-nginx-on-linux/ HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /identifying-services-needing-restart-after-updating-linux-packages/ HTTP/1.1
54.89.61.205     02/Feb/2015  200  GET /identifying-services-needing-restart-after-updating-linux-packages/ HTTP/1.1
199.180.112.34   01/Feb/2015  404  GET /muieblackcat HTTP/1.1
111.251.50.236   01/Feb/2015  400  CONNECT mx0.mail2000.com.tw:25 HTTP/1.0
199.180.112.34   01/Feb/2015  404  GET //myadmin/scripts/setup.php HTTP/1.1
199.180.112.34   01/Feb/2015  404  GET //MyAdmin/scripts/setup.php HTTP/1.1
207.145.97.131   02/Feb/2015  400  GET //Net_work.xml HTTP/1.1
199.180.112.34   01/Feb/2015  404  GET //phpmyadmin/scripts/setup.php HTTP/1.1
199.180.112.34   01/Feb/2015  404  GET //phpMyAdmin/scripts/setup.php HTTP/1.1
199.180.112.34   01/Feb/2015  404  GET //pma/scripts/setup.php HTTP/1.1
176.58.116.39    02/Feb/2015  304  GET /public/jquery.min.js?v=cfc2d462d6 HTTP/1.1
185.5.51.50      01/Feb/2015  200  GET /public/jquery.min.js?v=cfc2d462d6 HTTP/1.1
198.20.69.74     02/Feb/2015  403  GET /robots.txt HTTP/1.1
207.34.25.76     01/Feb/2015  301  GET /robots.txt HTTP/1.1
54.160.249.160   02/Feb/2015  301  GET /robots.txt HTTP/1.1
54.197.168.201   02/Feb/2015  200  GET /robots.txt HTTP/1.1
54.89.61.205     02/Feb/2015  200  GET /robots.txt HTTP/1.1
186.15.3.50      02/Feb/2015  400  GET /tmUnblock.cgi HTTP/1.1
71.11.195.254    01/Feb/2015  400  GET /tmUnblock.cgi HTTP/1.1
</code></pre>

<p><strong>Output in CSV format, for import into a spreadsheet application like Excel.</strong></p>

<p><code>$ perl -lane 'print "$1,$2,$4,$3" if /^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s\-\s\-\s\[([^:]+)[^"]+"([^"]+)"\s(\d{3})/' ./access.log |sort -u &gt; access.log.csv</code></p>
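
<p>One caveat with CSV output: a request that contains a comma would be split across two spreadsheet columns. Because the regular expression only captures requests without double quotes, wrapping that field in double quotes is a simple guard. A sketch, with the made-up helper name "log_to_csv":</p>

<pre><code>#!/bin/sh
# log_to_csv is a hypothetical helper. It prints ip,date,status,"request"
# for each matching log line read from the named files, or from stdin.
log_to_csv() {
    perl -lne 'print qq($1,$2,$4,"$3") if /^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s\-\s\-\s\[([^:]+)[^"]+"([^"]+)"\s(\d{3})/' "$@"
}

# Usage:
#   log_to_csv ./access.log | sort -u
</code></pre>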

<p><strong>That's it!</strong></p>

<p>Hope you enjoyed it!</p>

<p>Lars Bjaerris</p>]]></content:encoded></item><item><title><![CDATA[Needy-Restart; Service Restart After Linux Package Update]]></title><description><![CDATA[<h5 id="whentorestartservicesafterpackageupdating">When to restart services after package updating.</h5>

<p>When updating Linux packages with a package manager it is occasionally necessary to identify services running, having file(s) open that have been unlinked from the directory tree, i.e. deleted. <br>
This is most commonly caused by a process having a shared library(</p>]]></description><link>https://bjaerris.com/identifying-services-needing-restart-after-updating-linux-packages/</link><guid isPermaLink="false">848c2b56-5ca3-454c-9a74-84cad15a20d7</guid><category><![CDATA[needy-restart]]></category><category><![CDATA[updating packages]]></category><category><![CDATA[yum]]></category><category><![CDATA[bash scripting]]></category><category><![CDATA[needs-restarting]]></category><category><![CDATA[unlinked files]]></category><category><![CDATA[deleted files]]></category><dc:creator><![CDATA[Lars Bjaerris]]></dc:creator><pubDate>Wed, 28 Jan 2015 11:47:12 GMT</pubDate><content:encoded><![CDATA[<h5 id="whentorestartservicesafterpackageupdating">When to restart services after package updating.</h5>

<p>When updating Linux packages with a package manager, it is occasionally necessary to identify running services that hold open files which have been unlinked from the directory tree, i.e. deleted. <br>
This is most commonly caused by a process having a shared library (.so) file open that has since been updated to a newer version.</p>

<p>Consider the case of the recent OpenSSL Heartbleed vulnerability. <a href="http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-0160">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-0160</a> <br>
Updating OpenSSL via your package manager without restarting the running services that rely on “libssl.so.*” will most likely leave the system vulnerable to attacks.</p>

<p>In this post I’ll demonstrate a couple of ways to identify running services with open and unlinked files and automatically restart them.</p>

<p>Finally I will share my application/script "needy-restart", for automatically identifying and restarting these services.</p>

<p>The following should work as is, on <br>
RHEL, Oracle Linux, SL, CentOS Linux distributions.</p>

<p>For other Linux distributions, minor adjustments should be expected.</p>

<h5 id="heresfromarecentsystemupdate">Here’s an example from a recent system update</h5>

<p>The “yum-utils” package includes a Python script called “needs-restarting” that lists processes that need restarting. <br>
It is very useful for determining which processes will need to be restarted.</p>

<p>This is what the script outputs on this system:</p>

<pre><code># needs-restarting 
1869 : /sbin/dhclient-1-q-cf/etc/dhcp/dhclient-eth0.conf-lf/var/lib/dhclient/dhclient-eth0.leases-pf/var/run/dhclient-eth0.pideth0
2125 : /usr/bin/mimedefang-multiplexor-p/var/spool/MIMEDefang/mimedefang-multiplexor.pid-m2-x10-y0-Udefang-b600-l-s/var/spool/MIMEDefang/mimedefang-multiplexor.sock
2214 : crond
</code></pre>

<p>Unfortunately the script will not restart the services for us, so if we want that functionality we have to roll our own script.</p>

<p>In the following we are only interested in matching and restarting these services:</p>

<pre><code>2125 : /usr/bin/mimedefang-multiplexor-p/var/spool/MIMEDefang/mimedefang-multiplexor.pid-m2-x10-y0-Udefang-b600-l-s/var/spool/MIMEDefang/mimedefang-multiplexor.sock 
2214 : crond
</code></pre>

<p>The following should be restarted and verified by doing a system reboot.</p>

<pre><code>1869 : /sbin/dhclient-1-q-cf/etc/dhcp/dhclient-eth0.conf-lf/var/lib/dhclient/dhclient-eth0.leases-pf/var/run/dhclient-eth0.pideth0
</code></pre>

<p><strong># If you’re feeling brave, issuing “service network restart” will do, but is *not* recommended on a production system without scheduled downtime.</strong></p>

<h5 id="letsmakeourownscript">Let’s make our own script.</h5>

<p>Check for processes with open files that have been unlinked/deleted using “lsof” </p>

<p>We want to identify “DEL” in the 4th column (FD) or “(deleted)” at the end of the line in the output from lsof, and filter out “/dev/zero” &amp; “async I/O (AIO)”</p>

<pre><code># lsof |grep "DEL\|deleted" | grep -v "/dev/zero\|\[aio\]"

dhclient   1869       root  txt       REG              202,0   572256      18270 /sbin/dhclient (deleted)
dovecot    2006       root  122u      REG              202,0        0      41056 /var/run/dovecot/login-master-notifye2e4a04bb9b2d719 (deleted)
dovecot    2006       root  125u      REG              202,0        0      42553 /var/run/dovecot/login-master-notify4804d9d068f2d611 (deleted)
mimedefan  2125     defang  DEL       REG              202,0               24600 /lib64/libfreebl3.so
crond      2214       root  DEL       REG              202,0               24600 /lib64/libfreebl3.so
imap-logi 17666   dovenull    4u      REG              202,0        0      42553 /var/run/dovecot/login-master-notify4804d9d068f2d611 (deleted)
imap-logi 17671   dovenull    4u      REG              202,0        0      42553 /var/run/dovecot/login-master-notify4804d9d068f2d611 (deleted)
imap-logi 17708   dovenull    4u      REG              202,0        0      42553 /var/run/dovecot/login-master-notify4804d9d068f2d611 (deleted)
imap-logi 18376   dovenull    4u      REG              202,0        0      42553 /var/run/dovecot/login-master-notify4804d9d068f2d611 (deleted)
imap-logi 18383   dovenull    4u      REG              202,0        0      42553 /var/run/dovecot/login-master-notify4804d9d068f2d611 (deleted)
imap-logi 18385   dovenull    4u      REG              202,0        0      42553 /var/run/dovecot/login-master-notify4804d9d068f2d611 (deleted)
imap-logi 18387   dovenull    4u      REG              202,0        0      42553 /var/run/dovecot/login-master-notify4804d9d068f2d611 (deleted)
</code></pre>

<p>We are interested in the two services having an unlinked shared library file open. In this case:</p>

<pre><code>mimedefan  2125     defang  DEL       REG              202,0               24600 /lib64/libfreebl3.so
crond      2214       root  DEL       REG              202,0               24600 /lib64/libfreebl3.so
</code></pre>

<p>Let’s check what package was updated.</p>

<pre><code># yum whatprovides /lib64/libfreebl3.so |grep -B3 -A3 "installed"

nss-softokn-freebl-3.14.3-19.el6_6.x86_64 : Freebl library for the Network
                                          : Security Services
Repo        : installed
Matched from:
Other       : Provides-match: /lib64/libfreebl3.so
</code></pre>

<p>So the package “nss-softokn-freebl-3.14.3-19.el6_6.x86_64” was updated and two services need to be restarted to use the new shared library provided by this package.</p>

<p>But since the goal is a script that identifies and restarts these services, we’ll hold off on restarting them for now.</p>

<p>Let’s tighten the output so we can use it in a script.</p>

<pre><code># lsof |grep -v "/dev/zero\|\[aio\]" | perl -lane 'print $F[0] if ($F[3] =~ /DEL|deleted/ &amp;&amp; !$seen{$F[0]}++)'
mimedefan        &lt;--- OUTPUT
crond            &lt;--- OUTPUT 
</code></pre>

<p>Here we still filter out “/dev/zero” &amp; “async I/O (AIO)” with the grep command, pipe the output to perl and print the first column if column “4” matches “DEL” or “deleted”. The last part is a perl emulation of “sort -uk1,1” showing only unique first column matches.</p>
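
<p>The "seen" hash can be tried on its own with a few lines of input. In this sketch the helper name "unique_deleted" is made up, and the input is three lines from the lsof output above (column spacing condensed) with the crond line repeated to show that a second match is suppressed.</p>

<pre><code>#!/bin/sh
# unique_deleted is a hypothetical helper: it prints the first column of every
# line whose 4th column matches DEL or deleted, once per distinct name.
unique_deleted() {
    perl -lane 'if ($F[3] =~ /DEL|deleted/) { print $F[0] unless $seen{$F[0]}++ }'
}

printf '%s\n' \
    'dovecot    2006       root  122u      REG  202,0        0      41056 /var/run/dovecot/login-master-notifye2e4a04bb9b2d719 (deleted)' \
    'mimedefan  2125     defang  DEL       REG  202,0               24600 /lib64/libfreebl3.so' \
    'crond      2214       root  DEL       REG  202,0               24600 /lib64/libfreebl3.so' \
    'crond      2214       root  DEL       REG  202,0               24600 /lib64/libfreebl3.so' |
    unique_deleted
</code></pre>

<p>Only "mimedefan" and "crond" should be printed, each once. The dovecot line is skipped because its 4th column is "122u".</p>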

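<p>Much cleaner output and more suitable for use in a script. <br>
Unfortunately the output cannot be fed directly to a for loop that restarts the services: lsof truncates command names to nine characters by default, so the process “mimedefang-multiplexor” shows up as “mimedefan”, whereas the service is restarted with “service mimedefang restart”.</p>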

<p>Let’s grab the running service names from /var/lock/subsys/ and compare them to our previous output.</p>

<pre><code># proc=$(lsof |grep -v "/dev/zero\|\[aio\]" | perl -lane 'print $F[0] if ($F[3] =~ /DEL|deleted/ &amp;&amp; ! $seen{$F[0]}++)') ; \
if ! [ -z "$proc" ]; then ls /var/lock/subsys/ |grep "$proc"; else echo "No service needs restart."; fi

crond             &lt;--- OUTPUT
mimedefang        &lt;--- OUTPUT
</code></pre>
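
<p>Note that grep treats every line in "$proc" as a separate pattern and matches it anywhere in a lock file name. If that turns out to be too loose on your system, a stricter prefix match could look like the sketch below. The function name "match_services" is hypothetical, and the lock directory is passed in as an argument rather than hard-coded.</p>

<pre><code># Hypothetical helper: read (possibly truncated) command names on stdin
# and print the lock files in the given directory that start with them.
match_services() {
    lockdir=$1
    while read -r name; do
        [ -n "$name" ] || continue
        for f in "$lockdir"/"$name"*; do
            [ -e "$f" ] || continue    # glob did not match anything
            basename "$f"
        done
    done
}

# Usage: printf 'crond\nmimedefan\n' | match_services /var/lock/subsys
</code></pre>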

<p>Now we have process names we can use to restart with the service command. <br>
Let’s test it by issuing “service &lt;servicename&gt; status” on these processes first.</p>

<pre><code># proc=$(lsof |grep -v "/dev/zero\|\[aio\]" | perl -lane 'print $F[0] if ($F[3] =~ /DEL|deleted/ &amp;&amp; ! $seen{$F[0]}++)') ; \
 if ! [ -z "$proc" ]; then ls /var/lock/subsys/ |grep "$proc"| xargs -I{} service {} status; else echo "No service needs restart."; fi

crond (pid  2214) is running...                     &lt;--- OUTPUT
mimedefang (pid  2142) is running...                &lt;--- OUTPUT
mimedefang-multiplexor (pid  2125) is running...    &lt;--- OUTPUT
  0/10 .......... 0                                 &lt;--- OUTPUT
</code></pre>

<p>Now we have a working one-liner that is clearly begging to be put in a script. <br>
Let’s construct a bash script to identify and restart services, if needed, after packages have been updated.</p>

<h5 id="needyrestartscript">needy-restart script</h5>

<p>Below is the finished script.</p>

<pre><code>#!/bin/bash
# Version 0.6
# needy-restart: Identifying and restarting running services
# with open files unlinked/deleted from the directory tree.
#
# Copyright (C) 2015 Lars Bjaerris &lt;lars@bjaerris.com&gt;
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see &lt;http://www.gnu.org/licenses/&gt;.
# -------------------------------------------------------------------

proc=$(lsof |grep -v "/dev/zero\|\[aio\]" | perl -lane 'print $F[0] if ($F[3] =~ /DEL|deleted/ &amp;&amp; ! $seen{$F[0]}++)')
fini=0
while [ "$#" -gt 0 ]
do
    case "$1" in
        -r)
           if ! [ -z "$proc" ] # Check that we're not grep'ing with an empty string
           then
               echo "Restarting needy restart(s):"; ls /var/lock/subsys/ |grep "$proc" |xargs -I{} service {} restart
           else    
               echo "No needy restart(s)"
               fini=1 #
           fi
           ;;&amp;    # Continue with the last check of "-v" unless $fini -eq 1.
        -v|-r)
           if [ "$fini" -eq 1 ]
           then
               exit
           elif ! [ -z "$proc" ] 
           then
               echo "Checking for needy restart(s):"; ls /var/lock/subsys/ |grep "$proc"
               echo "If processes are still listed below after running needy-restart -r,"
               echo "you should evaluate them and consider if a reboot is necessary."
               echo "-------------------------------------------------------------------------------"
               lsof |grep -v "/dev/zero\|\[aio\]" | perl -lane 'print $F[0] if ($F[3] =~ /DEL|deleted/ &amp;&amp; !$seen{$F[0]}++)'
           else 
               echo "No needy restart(s)"
           fi
           ;;
        -*)
           echo &gt;&amp;2 "usage: $0 [-v] [-r] -v: Check for needy restart(s) or -r: Restart needy restart(s)"
           exit 1
           ;;
        *)
           echo &gt;&amp;2 "usage: $0 [-v] [-r] -v: Check for needy restart(s) or -r: Restart needy restart(s)"
           break    # terminate while loop
           ;;
    esac
    shift
done
</code></pre>
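
<p>The script leans on the ";;&amp;" terminator, which is a bash feature (version 4.0 or newer) and not plain POSIX sh. Instead of leaving the case statement, it keeps testing the remaining patterns, which is how "-r" ends up running both branches. A stripped-down illustration, with a made-up function name and echo standing in for the real work:</p>

<pre><code>#!/bin/bash
# Minimal illustration of ";;&amp;": "classify" is a hypothetical name.
classify() {
    case "$1" in
        -r)    echo "restart" ;;&amp;   # keep testing the patterns below
        -v|-r) echo "check"   ;;      # ordinary terminator, stop here
        *)     echo "usage"   ;;
    esac
}

classify -r    # runs the first branch, then the second
classify -v    # runs the second branch only
</code></pre>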

<p><strong>Let's test it</strong></p>

<pre><code># vi needy-restart
</code></pre>

<p>Paste!</p>

<p>Set the execute bit.</p>

<pre><code># chmod +x needy-restart
</code></pre>

<p>Execute.</p>

<pre><code># ./needy-restart -v
Checking for needy-restart(s):    &lt;--- OUTPUT
crond                             &lt;--- OUTPUT
mimedefang                        &lt;--- OUTPUT
Checking for needy-restart(s):
If processes are still listed below after running needy-restart -r,
you should evaluate them and consider if a reboot is necessary.
-------------------------------------------------------------------------------
</code></pre>

<p>Now run it with the restart option.</p>

<pre><code># ./needy-restart -r
Restarting needy-restart(s):
Stopping crond: [  OK  ]
Starting crond: [  OK  ]
Shutting down mimedefang: [  OK  ]
Shutting down mimedefang-multiplexor: [  OK  ]
Waiting for daemons to exit
Checking filter syntax: OK
Starting mimedefang-multiplexor: [  OK  ]
Starting mimedefang: [  OK  ]
Checking for needy-restart(s):
If processes are still listed below after running needy-restart -r,
you should evaluate them and consider if a reboot is necessary.
-------------------------------------------------------------------------------
</code></pre>

<p>Final check.</p>

<pre><code># ./needy-restart -v
Checking for needy-restart(s):
If processes are still listed below after running needy-restart -r,
you should evaluate them and consider if a reboot is necessary.
-------------------------------------------------------------------------------
</code></pre>

<h5 id="edit">EDIT:</h5>

<p><strong>Here is an example of using "needy-restart" after updating glibc-* &amp; kernel-headers.</strong></p>

<pre><code># ./needy-restart -r
Restarting needy-restart(s):
Stopping auditd: [  OK  ]
Starting auditd: [  OK  ]
Shutting down system logger: [  OK  ]
Starting system logger: [  OK  ]
Stopping Dovecot Imap: [  OK  ]
Starting Dovecot Imap: [  OK  ]
Stopping crond: [  OK  ]
Starting crond: [  OK  ]
Stopping sshd: [  OK  ]
Starting sshd: [  OK  ]
Stopping nginx: [  OK  ]
Starting nginx: [  OK  ]
Stopping mysqld:  [  OK  ]
Starting mysqld:  [  OK  ]
Stopping mysqld:  [  OK  ]
Starting mysqld:  [  OK  ]
Stopping php-fpm: [  OK  ]
Starting php-fpm: [  OK  ]
Shutting down ntpd: [  OK  ]
Starting ntpd: [  OK  ]
Stopping fail2ban: [  OK  ]
Starting fail2ban: [  OK  ]
Checking for needy-restart(s):
auditd
sshd
If processes are still listed below after running needy-restart -r,
you should evaluate them and consider if a reboot is necessary.
-------------------------------------------------------------------------------
init
udevd
dhclient
auditd
portreser
mimedefan
agetty
mingetty
npm
node
sshd
bash
pickup
master
qmgr
tlsmgr
</code></pre>

<p>In this case we have to reboot the server.</p>
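
<p>As a rough triage aid, the names left in such a list can be split into those that have a lock file under /var/lock/subsys/ and those that do not (init, bash and friends), since the latter cannot be handled with the service command at all. The sketch below is an assumption of how that could be done; "without_service" is a hypothetical name, and a process that does have a service (sshd above) may still keep old files open after a restart.</p>

<pre><code># Hypothetical helper: read command names on stdin and print those that
# have no lock file starting with that name in the given directory.
without_service() {
    lockdir=$1
    while read -r name; do
        [ -n "$name" ] || continue
        found=no
        for f in "$lockdir"/"$name"*; do
            if [ -e "$f" ]; then found=yes; fi
        done
        if [ "$found" = no ]; then echo "$name"; fi
    done
}

# Usage: printf 'init\nsshd\nbash\n' | without_service /var/lock/subsys
</code></pre>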

<p><strong>After Reboot testing</strong></p>

<pre><code># ./needy-restart -v
Checking for needy-restart(s):
If processes are still listed below after running needy-restart -r,
you should evaluate them and consider if a reboot is necessary.
-------------------------------------------------------------------------------
</code></pre>

<p>Hope you enjoyed it! <br>
Lars Bjaerris</p>]]></content:encoded></item><item><title><![CDATA[Ghost Blog Sys-V init Linux Startup Script]]></title><description><![CDATA[Ghost blog Sys-V init start script for RHEL, CentOS, Oracle Linux, Fedora Linux . Node JavaScript 
nodejs init script]]></description><link>https://bjaerris.com/ghost_linux_init_script/</link><guid isPermaLink="false">63ad3734-f110-4490-8d40-3b19b23fca97</guid><category><![CDATA[nodejs]]></category><category><![CDATA[SysV init]]></category><category><![CDATA[ghost blog]]></category><category><![CDATA[Linux]]></category><category><![CDATA[init script]]></category><category><![CDATA[bash]]></category><dc:creator><![CDATA[Lars Bjaerris]]></dc:creator><pubDate>Sun, 25 Jan 2015 11:39:22 GMT</pubDate><content:encoded><![CDATA[<h5 id="installingnodejsinitscriptforghostblog">Installing nodejs init script for Ghost Blog.</h5>

<p>If you followed my guide on <a href="https://bjaerris.com/host-your-own-blog-with-ghost-nginx-on-linux">Hosting your own blog with Ghost &amp; Nginx on Linux</a>, the following init script should work as is on RHEL, Oracle Linux, Scientific Linux and CentOS.</p>

<p>For other Linux distributions, minor adjustments should be expected.</p>

<p><strong>Installing Ghost Blog init script</strong></p>

<pre><code>$ sudo vi /etc/init.d/ghostblog
</code></pre>

<p>^Add the below script to the file and save.  </p>

<p>Ghost Blog Linux init script:</p>

<pre><code>#!/bin/sh
#
# ghostblog   Startup script for nodejs Ghost Blog
#
# Copyright (C) 2015 Lars Bjaerris lars&lt;at&gt;bjaerris.com
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see &lt;http://www.gnu.org/licenses/&gt;.
# -------------------------------------------------------------------
# chkconfig: 2345 80 10
#
### BEGIN INIT INFO
# Provides: ghostblog
# Required-Start: $local_fs $remote_fs $network
# Required-Stop: $local_fs $remote_fs $network
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: start and stop ghostblog
### END INIT INFO

# Source function library.
. /etc/rc.d/init.d/functions

user="ghostblog"
prog="ghostblog"
exec="/usr/bin/npm"
run_env="NODE_ENV=production"
option="start"
daemon_dir="/home/ghostblog/ghost/"
piddir="/home/$user/ghost/run/"
pidfile="$piddir$prog.pid"
prockill='node index'

[ -d "$piddir" ] || su $user -c "mkdir -p $piddir"

[ -e /etc/sysconfig/$prog ] &amp;&amp; . /etc/sysconfig/$prog

lockfile=/var/lock/subsys/$prog

start() {
    [ -x $exec ] || exit 5
    echo -n $"Starting $prog: "
    su $user -c "cd $daemon_dir; export $run_env; nohup $exec $option &gt; /dev/null 2&gt;&amp;1 &amp;"  
    retval=$?
    echo
    if [ $retval -eq 0 ]; then
        touch $lockfile
        sleep 2    # give node a moment to start before recording its PID
        echo $(/usr/bin/pgrep -u $user -f "$prockill") &gt; $pidfile
    fi
    return $retval
}

stop() {
    echo -n $"Stopping $prog: "
    pkill -u $user -f "$prockill"
    retval=$?
    echo
    [ $retval -eq 0 ] &amp;&amp; rm -f $lockfile &amp;&amp; rm -f $pidfile 
    return $retval
}

restart() {
    stop
    start
}

reload() {
    restart
}

force_reload() {
    restart
}

rh_status() {
    status $prockill
}

rh_status_q() {
    rh_status &gt;/dev/null 2&gt;&amp;1
}

case "$1" in
    start)
        rh_status_q &amp;&amp; exit 0
        $1
        ;;
    stop)
        rh_status_q || exit 0
        $1
        ;;
    restart)
        $1
        ;;
    reload)
        rh_status_q || exit 7
        $1
        ;;
    force-reload)
        force_reload
        ;;
    status)
        rh_status
        ;;
    condrestart|try-restart)
        rh_status_q || exit 0
        restart
        ;;
    *)
        echo $"Usage: $0 {start|stop|status|restart|condrestart|try-restart|reload|force-reload}"
        exit 2
esac
exit $?
</code></pre>
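
<p>Because the script sources "/etc/sysconfig/ghostblog" when that file exists, some settings can be changed there without touching the init script itself. A minimal sketch, assuming you only override variables that are read after the sourcing line; the values shown simply repeat the defaults from the script.</p>

<pre><code># /etc/sysconfig/ghostblog (optional, sourced by the init script if present)
# Assumption: piddir and pidfile are computed before this file is read,
# so only settings used later, such as these two, are worth overriding.
run_env="NODE_ENV=production"
option="start"
</code></pre>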

<p>-</p>

<pre><code>$ sudo chmod 0750 /etc/init.d/ghostblog
</code></pre>

<p>^Make it executable and set permissions.</p>

<pre><code>$ sudo chkconfig --add /etc/init.d/ghostblog
</code></pre>

<p>^Add the init script to services.</p>

<pre><code>$ sudo chkconfig ghostblog on
</code></pre>

<p>^Set the startup flag to on.</p>

<p>-</p>

<p><strong>Testing:</strong></p>

<pre><code>$ sudo service ghostblog start
Starting ghostblog: 
$ sudo service ghostblog status
node (pid 24909) is running...
$ sudo service ghostblog stop
Stopping ghostblog: 
$ sudo service ghostblog start
Starting ghostblog: 
$ sudo service ghostblog status
node (pid 24963) is running...
$
</code></pre>

<p>ghostblog is configured to start and stop as a service.</p>

<p>Hope you enjoyed it! <br>
Lars Bjaerris</p>]]></content:encoded></item><item><title><![CDATA[Host Your Own Blog With Ghost & Nginx On Linux]]></title><description><![CDATA[This is a Setup Guide/How-to, getting the Ghost Blog platform up and running on a RHEL/Oracle Linux/CentOS/Scientific Linux Linux server.]]></description><link>https://bjaerris.com/host-your-own-blog-with-ghost-nginx-on-linux/</link><guid isPermaLink="false">f2b01ba5-444d-4907-b454-73d49e3132f4</guid><dc:creator><![CDATA[Lars Bjaerris]]></dc:creator><pubDate>Fri, 23 Jan 2015 16:54:04 GMT</pubDate><content:encoded><![CDATA[<h2 id="selfhostingyourghostblogonlinux">Self hosting your Ghost blog on Linux.</h2>

<p>This Blog runs on the excellent Ghost blog platform. <br>
Official website: <a href="http://ghost.org/">http://ghost.org/</a></p>

<p>This is a Setup Guide/How-to, getting the Ghost Blog platform up and running on a RHEL/Oracle Linux/CentOS/Scientific Linux server. <br>
(For other Linux flavors, minor adjustments to this guide should be expected.) </p>

<p>We will serve it through a reverse proxy using another excellent open source application: the <a href="http://nginx.org/">NGINX</a> Web-Server/Reverse-Proxy/Load-Balancer/Http-Cache.  </p>

<hr>

<h5 id="prerequisites">Prerequisites</h5>

<ol>
<li>Linux server. (RHEL, Scientific Linux, Oracle Linux, CentOS)</li>
<li>System user account with sudoers privilege or root access.</li>
<li>Domain name with DNS A record pointing to the public IP of the server.</li>
<li>A console or if remote, terminal and SSH.</li>
<li>If you have "2." I'll presume you're comfortable with the command line and your editor of choice. :)</li>
</ol>

<hr>

<h5 id="initialinstallation">Initial Installation</h5>

<p>Let's add the blog system user/process owner and install the blog software, along with the required software dependencies.</p>

<pre><code> $ sudo yum list installed epel-release || sudo yum -y install epel-release
 $ sudo yum -y install nodejs npm
 $ sudo useradd ghostblog
 $ sudo su - ghostblog
</code></pre>

<p><strong>Above commands explained:</strong></p>

<ol>
<li>Check if we have the "Extra Packages for Enterprise Linux repository" (epel-release) and if not, install it.  </li>
<li>Install "nodejs", "npm" and dependencies.  </li>
<li>Add a system user that will own the core blog files and processes.  </li>
<li>Switch to our new system user and environment.  </li>
</ol>

<hr>

<h5 id="ghostblogsoftwareinstallation">Ghost blog software installation</h5>

<pre><code>$ curl -LO https://ghost.org/zip/ghost-latest.zip 
$ unzip -uo ghost-latest.zip -d ghost &amp;&amp; rm -f ghost-latest.zip
$ cd /home/ghostblog/ghost/
$ npm install --production
$ nohup npm start &gt; ghost.log 2&gt;&amp;1 &amp;
</code></pre>

<p><strong>Above commands explained:</strong></p>

<ol>
<li>Get the latest Ghost source files.  </li>
<li>Unzip the files into the "ghost" directory (Create it if it doesn’t exist) and remove the zip file if the previous command succeeded.  </li>
<li>Change into directory "/home/ghostblog/ghost/".  </li>
<li>Install the Ghost platform.  </li>
<li>Start it and detach it from current terminal. (We will construct a proper SysV-init script later for proper system and application startup/shutdown.)</li>
</ol>

<p><strong>The last command will generate output similar to this:</strong></p>

<blockquote>
  <p>[ghostblog@gw4 ghost]$ nohup npm start > ghost.log 2>&amp;1 &amp;</p>
  
  <p>[1] 10430</p>
</blockquote>

<p>Issue the "jobs" command to check that we are up and running. </p>

<pre><code>$ jobs
</code></pre>

<p><strong>Above command explained:</strong></p>

<ol>
<li>List running and suspended jobs in the current bash environment.</li>
</ol>

<p><strong>Output:</strong></p>

<blockquote>
  <p>[1]+  Running                 nohup npm start > ghost.log 2>&amp;1 &amp;</p>
  
  <p>[ghostblog@gw4 ghost]$</p>
</blockquote>

<hr>

<h5 id="ghostconfiguration">Ghost configuration</h5>

<p>Configure "/home/ghostblog/ghost/config.js"</p>

<pre><code>$ vi /home/ghostblog/ghost/config.js
</code></pre>

<p><strong>Change "url:" to your domain:</strong></p>

<pre><code>// When running Ghost in the wild, use the production environment
// Configure your URL and mail settings here
production: {
    url: 'http://example.com', // #&lt;--- Your domain here
</code></pre>

<p><strong>And here:</strong></p>

<pre><code>    // ### Development **(default)**
development: {
    // The url to use when providing links to the site, E.g. in RSS and email.
    // Change this to your Ghost blogs published URL.
    url: 'http://example.com', // #&lt;--- Your domain here
</code></pre>
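
<p>If you would rather not edit the file by hand, the same change can be sketched with GNU sed. Treat this as an assumption-laden shortcut: "set_ghost_url" is a made-up helper, and it rewrites every line of the form url: 'http://...' in the file, so check the result before restarting Ghost.</p>

<pre><code># Hypothetical helper: point every "url: 'http://...'" line in a Ghost
# config.js at the given address (GNU sed, edits the file in place).
set_ghost_url() {
    config=$1
    url=$2
    sed -i "s|url: 'http://[^']*'|url: '$url'|" "$config"
}

# Usage: set_ghost_url /home/ghostblog/ghost/config.js http://example.com
</code></pre>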

<p><strong>Restart Ghost into production environment:</strong></p>

<pre><code>$ pkill -u ghostblog -f 'node index'
$ NODE_ENV=production nohup npm start &gt; ghost.log 2&gt;&amp;1 &amp;
</code></pre>

<p><strong>Above commands explained:</strong></p>

<ol>
<li>Kill process owned by "ghostblog" matching "node index"  </li>
<li>Start Ghost in the production environment, in the background and detached from the terminal.  </li>
</ol>

<hr>

<h5 id="nginxinstallation">Nginx Installation</h5>

<p>Install and configure Nginx as a reverse proxy in front of the locally running Ghost instance.</p>

<pre><code>$ exit
$ sudo yum list installed nginx || sudo yum -y install nginx
$ sudo /etc/init.d/nginx start &amp;&amp; sudo chkconfig nginx on
$ sudo vi /etc/nginx/nginx.conf
</code></pre>

<p><strong>Example nginx.conf file</strong></p>

<pre><code>user  nginx;
worker_processes  2;
error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

events {
    worker_connections  1024;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log off;
    sendfile        on;
    keepalive_timeout  65;

    include /etc/nginx/conf.d/*.conf;
}
</code></pre>

<p><strong>Creating the Ghost Blog Nginx configuration file</strong></p>

<pre><code>$ sudo vi /etc/nginx/conf.d/ghostblog.conf 
</code></pre>

<p><strong>Add the following, changing the lines marked with "#&lt;---" to your own domain and IP address.</strong></p>

<pre><code># Reverse proxy server
upstream ghost  {
    server 127.0.0.1:2368;
}

# The cache zone must be declared at the http level, outside any server block.
proxy_cache_path /tmp/cache levels=1:2 keys_zone=cache:60m max_size=1G;

server {

       listen       80;
       server_name  example.com www.example.com; #&lt;--- Your domain goes here.

       ## send all traffic to the back-end
       location / {
            proxy_pass        http://ghost;
            proxy_set_header Host      $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_redirect    off;
            proxy_hide_header X-Powered-By;

            location ~* \.(html?|css|jpg|gif|ico|js|woff)$ {
                    proxy_cache          cache;
                    proxy_cache_key      $host$uri$is_args$args;
                    proxy_cache_valid    200 301 302 30m;
                    proxy_cache_valid    404 1m;
                    expires              30m;
                    ### proxy-buffers ###
                    proxy_buffering         on; # Has to be on for cache to work
                    proxy_buffer_size       8k;
                    proxy_buffers           256 8k;
                    proxy_busy_buffers_size    64k;
                    proxy_temp_file_write_size 64k;
                    proxy_pass  http://ghost;
            }

            # Temporary access control below until we have set up an owner/admin for the Ghost Blog.
            allow 127.0.0.1; # &lt;--- Your own public ip address goes here (search Google for "my ip").
            deny    all;
       }
}
</code></pre>

<hr>

<h5 id="testnginxconfigurationandreload">Test Nginx Configuration and Reload</h5>

<pre><code>$ sudo nginx -t &amp;&amp; sudo /etc/init.d/nginx reload
</code></pre>

<p><strong>Above command explained:</strong></p>

<ol>
<li>Test nginx configuration and reload if ok.</li>
</ol>

<p><strong>Output:</strong></p>

<blockquote>
  <p>nginx: the configuration file /etc/nginx/nginx.conf syntax is ok</p>
  
  <p>nginx: configuration file /etc/nginx/nginx.conf test is successful</p>
  
  <p>Reloading nginx:</p>
</blockquote>

<hr>

<h5 id="creatingtheblogowneryou">Creating the Blog Owner (You!)</h5>

<p>Start your favorite browser and connect to your new blog on:</p>

<p><a href="http://example.com/ghost/">http://example.com/ghost/</a> #&lt;-- Substitute for your own domain.</p>

<p><strong>Output:</strong> (Hopefully! ;)
<img src="https://bjaerris.com/content/images/2015/01/ghost_login_owner.png" alt="Ghost Initial Login"></p>

<hr>

<h5 id="removingtheaccessrestriction">Removing the access restriction.</h5>

<p>Once you are ready to let the world read your blog, you will have to edit "/etc/nginx/conf.d/ghostblog.conf" once more.</p>

<pre><code>$ sudo vi /etc/nginx/conf.d/ghostblog.conf
</code></pre>

<p><strong>Find this part again:</strong></p>

<pre><code># Get your ip address from Google: Search for "my ip"
 allow 127.0.0.1; #&lt;--- Your ip address from above search goes here.
 deny    all;
</code></pre>

<p><strong>Change it to:</strong></p>

<pre><code># Get your ip address from Google: Search for "my ip"
# allow 127.0.0.1; #&lt;--- Your ip address from above search goes here.
# deny    all;
</code></pre>
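
<p>The same edit can also be scripted. The sketch below is one possible way to do it with GNU sed, not something the guide depends on: "open_to_world" is a hypothetical helper that comments out the "allow ...;" and "deny all;" lines in the file it is given. Lines that are already commented out are left alone, so running it twice does no harm.</p>

<pre><code># Hypothetical helper: comment out the temporary "allow ...;" and
# "deny all;" lines in an nginx config file (GNU sed, in place).
open_to_world() {
    conf=$1
    sed -i -e 's/^\([[:space:]]*\)\(allow [^;]*;\)/\1# \2/' \
           -e 's/^\([[:space:]]*\)\(deny[[:space:]]\{1,\}all;\)/\1# \2/' "$conf"
}

# Usage (as root): open_to_world /etc/nginx/conf.d/ghostblog.conf
</code></pre>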

<p><strong>Test and reload the Nginx configuration:</strong></p>

<pre><code>$ sudo nginx -t &amp;&amp; sudo /etc/init.d/nginx reload
</code></pre>

<p><strong>Above command explained:</strong></p>

<ol>
<li>Test nginx configuration and reload if OK.  </li>
</ol>

<hr>

<h5 id="yourblogisreadyandworldreadable">Your blog is ready and world readable.</h5>

<hr>

<p>Next: <a href="https://bjaerris.com/ghost_linux_init_script">Installing a Ghost Blog Sys-V init Linux Startup Script</a></p>

<p>Hope you enjoyed it! <br>
Lars Bjaerris    </p>]]></content:encoded></item><item><title><![CDATA[What's This?]]></title><description><![CDATA[Linux system configurations, tips and tricks. Learn about Linux application administration and configurations. 

Lars Bjaerris - System Engineer.]]></description><link>https://bjaerris.com/lars_bjaerris/</link><guid isPermaLink="false">fc79acc9-c9f6-4839-a669-0ba2cc8b8174</guid><category><![CDATA[Linux howTo's]]></category><category><![CDATA[Linux Engineering]]></category><dc:creator><![CDATA[Lars Bjaerris]]></dc:creator><pubDate>Thu, 22 Jan 2015 07:10:00 GMT</pubDate><content:encoded><![CDATA[<h5 id="linuxguidestipsandtricks">Linux guides, tips and tricks.</h5>

<p>This is a collection of <strong>Linux/*nix</strong> Engineering how-to's, tips and tricks.</p>

<p>I hope you find them useful! <br>
<strong>Lars Bjaerris</strong> - System Engineer</p>]]></content:encoded></item></channel></rss>