After reading an article documenting the use case for RSS, I've decided that my own blog should have an RSS feed. I've wanted to offer one for a while now, because I think RSS is useful for staying up to date on things, and it really is perfect for blogs. But this post is not about the merits of RSS; it is about how I, with my plain-text blog, managed to implement it.
RSS is a very simple format. I'm sure it gets a lot more complicated, but in its basic form, it simply provides some metadata on a blog or other content provider, and then it lists entries of content. It is XML, which is easy to parse and easy to generate, and that's rather helpful for me. I already have a blog-publish.sh script in my blog directory that allows me to quickly push up blog posts, so whatever solution I find, I'll end up adding it on to that script so that when I push a post to my server, the RSS feed is updated.
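To make that basic form concrete, here is a rough sketch of the outer structure of a feed, written as shell functions since a shell script is where it would end up. It assumes RSS 2.0 element names, and BASE_URL and FEED_TITLE are placeholder values invented for the example, not the real blog's settings.

```shell
#!/bin/sh
# Placeholder values for illustration only.
BASE_URL="https://example.com/blog"
FEED_TITLE="Example Blog"

# Print the metadata that describes the blog as a whole.
feed_header() {
    cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
  <title>$FEED_TITLE</title>
  <link>$BASE_URL</link>
  <description>Posts from $FEED_TITLE</description>
EOF
}

# Close the elements opened by feed_header.
feed_footer() {
    cat <<EOF
</channel>
</rss>
EOF
}

feed_header
# One <item> element per post would be printed here.
feed_footer
```

Everything between the header and the footer is a list of entries, which is why the format is easy to generate with a loop.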
The trickiest part is figuring out what programming language to use for this project. Being an OpenBSD user, my options are pretty much C, Shell, or Perl. I've never touched Perl in my life up to this point, so that leaves me with C or Shell. To decide which I should use for any given project, I always ask myself this question: can it be done in Shell? If the answer is yes, then I use Shell. If the answer is no, then I use C. Simple.
I can see this going both ways though. I feel like this would be a task best suited for a shell script, but at the same time, the problem is a trivial one to solve in C. However, for the sake of speed and simplicity, I'm opting to write this one in shell script. Maybe at some point I'll port it to C. I'm a lot quicker with the shell, since I use it daily. I am also very comfortable in C, but there's no question that it takes longer to write, because a lot more has to be done manually. I understand that there's something to be said about rapid development. In the business world, there has to be a lot of velocity: things always have to be moving along, otherwise you don't make money. Of course, this blog isn't really making me money, but the concept still applies because the more time I spend writing code, the less time I have for other, possibly more important things.
So, 37 lines of shell script later, I have a working RSS feed. And, though not perfect, the solution is relatively elegant. Using the heredoc syntax, I can generate massive blocks of text, and embed variables in them. With find(1), I can list all of my posts and then iterate over them with a simple while loop. Generating the metadata is easy because all of my posts follow a standard format: the date is in the file name, and can be easily extracted with cut(1), and the title is the very first line of the file, which is easily extracted with head(1).
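The listing and metadata steps could be sketched roughly like this. It is not the actual 37-line script: the posts directory, the .txt extension, the YYYY-MM-DD prefix on file names, and the example.com BASE_URL are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical layout: posts live in $POST_DIR and are named
# YYYY-MM-DD-slug.txt. Neither detail comes from the real script.
BASE_URL="https://example.com/blog"
POST_DIR="${POST_DIR:-posts}"

# Print "date|id|link|title" for a single post file.
post_meta() {
    file=$1
    # The ID is just the file name.
    id=$(basename "$file")
    # Assumed: the first ten characters of the name are the date.
    post_date=$(printf '%s\n' "$id" | cut -c 1-10)
    # The title is the first line of the file.
    title=$(head -n 1 "$file")
    printf '%s|%s|%s|%s\n' "$post_date" "$id" "$BASE_URL/$id" "$title"
}

# List every post with find(1), newest first, and loop over the result.
list_posts() {
    find "$POST_DIR" -type f -name '*.txt' | sort -r |
    while IFS= read -r file; do
        post_meta "$file"
    done
}

# Only run the listing when the assumed directory exists.
if [ -d "$POST_DIR" ]; then
    list_posts
fi
```

Sorting the names in reverse puts the newest post first only because the assumed file names start with an ISO-style date.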
The ID is just the file name, and the link is easy to generate, because I just declare a BASE_URL at the top of my script. The next part is the actual post content. That's pretty easy. At first I just used cat(1) to dump the post contents into the XML stream. But most RSS readers render entries as HTML, so it squashed all my formatting. The next thing I did was wrap the post in 'pre' tags. That was really easy, but it felt like a hack, and it also looked horrible on my phone's RSS reader. So I had to convert these plain text files into HTML somehow. I settled on using sed(1) to inject paragraph tags on each blank line. That seems to allow me to keep basic formatting of my posts, while still looking good in a variety of readers.
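One way the sed(1) step and the per-post heredoc might fit together is sketched below. The exact sed expression, the RSS 2.0 element names, the CDATA wrapper around the HTML, and the example.com BASE_URL are assumptions for the sake of the example, not the real script.

```shell
#!/bin/sh
# Placeholder value for illustration only.
BASE_URL="https://example.com/blog"

# Turn a plain text file into rough HTML: every blank line closes one
# paragraph and opens the next. Runs of blank lines yield empty paragraphs.
text_to_html() {
    printf '<p>\n'
    sed 's|^[[:space:]]*$|</p><p>|' "$1"
    printf '</p>\n'
}

# Print one feed entry for a post file using a heredoc.
# Note: the title is not XML-escaped, and a post containing "]]>"
# would break the CDATA section. A real script would need to handle both.
emit_item() {
    file=$1
    id=$(basename "$file")
    title=$(head -n 1 "$file")
    cat <<EOF
<item>
  <title>$title</title>
  <link>$BASE_URL/$id</link>
  <guid isPermaLink="false">$id</guid>
  <description><![CDATA[
$(text_to_html "$file")
]]></description>
</item>
EOF
}
```

Calling emit_item with the path of a post prints a complete item element, so it can be run inside the loop that walks over the posts.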
This little experiment goes to show just how valuable plain text is. When I started this blog, I deliberately kept it stupidly simple, and so far that is paying off, because I can easily add things like an RSS feed with just a few lines of shell. And I still have zero dependencies outside of my operating system's built-in tools, which is the greatest part. I'd highly encourage anyone who writes on a computer to do so in plain text because it is very easy to manipulate, and also very efficient to store and read. It is also much less distracting. Sometimes computers have so many shiny graphical elements that it is easy to get distracted, but with plain text, you can type on a virtual terminal that feels like it came right out of the early 80s. Literally all my laptop does right now is process plain text.
Back in the days of my previous blog, I wrote about how plain text is the superior format for storing information. Even now, well after I had that realization, I'm still finding it to be true. Sure, there aren't many GUI tools to help you process it, but I don't use the GUI for much anyway. The UNIX-style tools provided with any POSIX-like operating system are more than enough for basic text processing. They are definitely worth learning if you are serious about plain text. And you should be serious about plain text. I was able to implement an RSS feed from scratch in less than 10 minutes, on my blog that I've built from the ground up. I'm becoming rather proud of this little blog.
At the moment, I don't have an internet connection, but I should move these blog posts into a CVS repository, or at least make the scripts public. I'm not doing anything particularly revolutionary, but maybe I'll build out a little text-based CMS managed by some shell scripts.