Lighttpd Access Log Parsing
Posted by: Eric Hansen
If you’re like me, you’re always wanting to parse those pesky log files to make analyzing them easier. Thankfully, with my constant desire to make my jobs easier (read: make me lazier), and passion for programming, I’ve written up a quick Bash script to do just this.
Now, I’ll say this right off the bat: this isn’t pretty (i.e., it’s straightforward and quite bland for now), but I’m going to write a better one in PHP soon, using regex and all that other fancy stuff. Still, this is something to get the engines roaring while I work on the script of the week tomorrow.
This is quite (read: completely) dependent on the format of your access log. My access log looks like this:
*ip* *domain* - [27/Sep/2011:22:20:11 -0400] "HEAD / HTTP/1.1" 200 0 "-" "Mozilla/5.0+(compatible;)"
But with the “*…*” placeholders filled in properly. Basically, what this calls for is cat’ing (or tail’ing) the log file and printing the fields out using awk. You can also put a grep in between the two (or at the end) if you want, but I decided not to.
Here’s the single line of code to use:
tail /var/log/lighttpd/access.log | awk '{print "IP: ",$1," Domain: ",$2," Date: ",$4,$5,"Request: ",$6,$7,$8," Result: ",$9}'
And the output looks like this:
IP: ### Domain: domains.are.cool Date: [27/Sep/2011:21:44:18 -0400] Request: "HEAD / HTTP/1.1" Result: 200
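If you do want that grep in between, here’s a rough sketch of how it could look. The function name `format_matching` is made up for illustration, the ` 404 ` pattern is only an example, and the whole thing assumes the same log format shown above.

```shell
# Hypothetical helper (not part of the original one-liner):
# reads access log lines on stdin, keeps only those matching the
# grep pattern given as $1, then formats them with the same awk print.
format_matching() {
    grep -- "$1" | awk '{print "IP: ",$1," Domain: ",$2," Date: ",$4,$5,"Request: ",$6,$7,$8," Result: ",$9}'
}

# Example usage (assumes the log path used earlier in this post):
# tail /var/log/lighttpd/access.log | format_matching ' 404 '
```

Putting the grep before awk means awk only has to format the lines you care about; putting it at the end would let you filter on the formatted output instead.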
This isn’t perfect by any means, but if you look at it, you can pretty much do in one line what AWStats and similar tools take 100+ lines to do. The only difference is that mine isn’t fancy…yet.
Lastly, for those wondering, this is what my Lighttpd access log entry in the config file looks like:
accesslog.format = "%h %V %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\""
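One thing to watch with that format: the request, referer, and user agent are the quoted fields, and real user agents usually contain spaces, which throws off awk’s default whitespace splitting. A minimal sketch, assuming no field contains an embedded double quote, is to split on the quote character instead. The name `quoted_fields` is hypothetical, not something from Lighttpd.

```shell
# Hypothetical helper: reads access log lines on stdin and prints the
# three quoted fields. Assumes the accesslog.format shown above and
# that no field contains an embedded double quote.
quoted_fields() {
    # With '"' as the separator: $2 = request, $4 = referer, $6 = user agent.
    awk -F'"' '{print "Request: " $2 " | Referer: " $4 " | Agent: " $6}'
}

# Example usage (assumes the log path used earlier in this post):
# tail /var/log/lighttpd/access.log | quoted_fields
```

The unquoted parts (IP, domain, date, status, size) still land in `$1` and `$3` here, so they would need a second split on spaces if you want them too.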
Will this work for Apache and others? Maybe, provided the access log uses the same field layout. Personally, Apache isn’t my favorite web server, so your mileage may vary.




