BLOT: (06 Sep 2010 - 10:18:59 PM)
I have been playing around with Google Reader today. I know that some of my friends use it a lot, and it offers me an alternative feed reader when I am on the road or at work, so it interests me. However, it had one big flaw that I could see. I have a small handful of friends, say about 10, who post semi-regular, protected updates to Livejournal, and Google Reader cannot see those because it fetches feeds without logging in as me. The obvious, and probably expected, answer was simply to check my Livejournal Friends page while on the road or at work, and that works; but part of me balked at the limitation and wanted to find a way around it. After an hour or two of tinkering, I came up with this idea (it shouldn't have taken that long, but I'll explain the process).
My first thought, once I discovered the issue, was to try to find some way to call the feed that allowed it to authenticate "on the fly", one of the old http://user:password@example.org sort of solutions. That did not work. I looked around online for a solution, and the first thing I found was Yahoo! Pipes, which probably would work if it weren't so odd and strange a program. You can set up fairly complex inputs and outputs and connect them to various mash-up systems. Theoretically, I could set up an authenticated feed reader on Pipes, and then have it output the feed's contents as another feed, which could be read by Google Reader. My first and only attempt at working that through did not work and, for all I know, published my password to the internet.
This got me thinking, though: why not chew up the feed with another feed reader and then spit out a parsed XML file that I could do something with? I use Akregator, so I first went into its archives and found them in the mk4 file format, which is a Metakit type. I was able to use the viewer to see the contents of the archive, and even read them if I wanted, but I did not quite feel like compiling some sort of command-line system that would directly interact with the mk4 files. It seemed possible, maybe even preferable, but eh. Not quite what I was looking to do. The next idea was a continuation of this: I thought about getting another feed reader. Something simple, maybe, like Rawdog, which had the added benefit of converting the feeds it reads into an HTML file via Python, which was about one step away from what I needed to do with them. Reading through the config files, though, it was missing a certain finesse. Not Rawdog itself, but the solution. Installing a new application just so that I could write something to fit on top of it that would do something else with its outputs? Finally, my brain went, "Oh, yeah, CURL [however you want to uppercase that]!" Well, it went "WGet!" and then I remembered that CURL was better at that sort of thing, and...anyway.
If you want to use my solution to this problem, you'll need the following bits:
Also note the following caveats, which I would rank of fair importance, considering.
Ok, now that that is taken care of, this is how I set it up.
First, I created a bash script that had a series of cURL commands. Each one reads something like /usr/bin/curl --silent --digest -u username:password "http://lj-user.livejournal.com/data/atom?auth=digest" -o /path/to/hosted/file/lj-user.rss. One per line. Note that the user flag is -u (or --user) with a single dash, and the URL is quoted so the shell leaves the "?" alone. The --silent seemed to help, but may not be necessary. You'll need to toss the "--digest" in, or it doesn't seem to work (I have to admit I am not 100% sure of the details, but it tells cURL to use HTTP Digest authentication, which matches the auth=digest in the URL). The name of the file is whatever you want it to be, but keep in mind the first caveat, above. If you make it too obvious, then others might be able to access it and that's no good. Since it doesn't matter what the rss is named, you might could aim for something like a random sixteen-digit code that you change from time to time.
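If you would rather not make up the sixteen digits by hand, here is a rough sketch of one way to generate such a name. The function name random_feed_name is just something made up for this example, and it assumes your system has /dev/urandom and a head that understands -c (most Linux boxes do).

```shell
#!/bin/bash
# Sketch only: prints a random sixteen-digit file name ending in .rss.
# random_feed_name is a hypothetical helper, not part of any tool.
random_feed_name() {
    # Throw away everything except digits, then keep the first 16 of them.
    digits="$(LC_ALL=C tr -dc '0-9' < /dev/urandom | head -c 16)"
    echo "${digits}.rss"
}

random_feed_name
```

Run it once per friend and paste the result into the script as the output file name. Don't generate a fresh name on every run, though: the address has to stay put long enough for Google Reader to keep finding it, so when you do change it you'll also need to update the subscription.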
Once you get that file up and running (and don't, like me, forget the #!/bin/bash at the beginning), you can stick it just about anywhere you please. Then you'll want to crontab -e and add a line that reads something like 0 * * * * /path/to/script, which runs the script at the top of every hour. If you are not doing this directly on the server, you'll then need to set up a second stage which automatically uploads the files to a server. I'll avoid going into that, but there are various options out there.
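Put together, the script might look something like the sketch below. This is not my exact file: the login, the password, the journal names, and the output path are all placeholders, and the DRY_RUN switch is an extra I have added here so you can see the commands before anything touches the network. It loops over a list instead of having one cURL line per friend, which comes to the same thing.

```shell
#!/bin/bash
# Sketch only: every value below is a placeholder to replace with your own.
LJ_USER="username"                # your Livejournal login (placeholder)
LJ_PASS="password"                # placeholder; anyone who can read this file can read it
OUT_DIR="/path/to/hosted/file"    # where the web server will pick the files up
JOURNALS="lj-user-one lj-user-two"  # hypothetical journal names, space separated
DRY_RUN="${DRY_RUN:-1}"           # 1 = only print the commands, 0 = really fetch

fetch_feed() {
    # $1 is the journal name; build its feed URL and output file from it.
    feed_url="http://$1.livejournal.com/data/atom?auth=digest"
    feed_out="${OUT_DIR}/$1.rss"
    if [ "$DRY_RUN" = "1" ]; then
        # Show what would run, with the password masked.
        echo "curl --silent --digest -u ${LJ_USER}:*** '${feed_url}' -o '${feed_out}'"
    else
        /usr/bin/curl --silent --digest -u "${LJ_USER}:${LJ_PASS}" "$feed_url" -o "$feed_out"
    fi
}

for journal in $JOURNALS; do
    fetch_feed "$journal"
done
```

Run it once as is to check the printed commands, then run it with DRY_RUN=0 (or change the default) to actually write the files. Remember to chmod +x the script before pointing the crontab line at it, and swap the lj-user.rss style names for something less guessable, per the caveat above.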
The next step, and essentially the last, is to go into Google Reader and tell it the location of the files. Then you do all the various things you want to do, like tag them or sort them. Google Reader seems to have its own little schedule dictating when it updates. I can get it to show entries this way, and it's kind of cool because it brings over the journal title and such, but it takes a few minutes before it checks again. I bet there is some sort of Google algorithm that won't check sites that aren't super popular super often, so expect a *gasp* 10 or so minute lag time on the hour (or however often you have it set up). That's what I've been getting so far, but let me give it a day or two and I'll update to let you know if it is actually worse than this over a longer scale.
TAGS: Linux Tricks
Written by Doug Bolden
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
The longer, fuller version of this text can be found on my FAQ: "Can I Use Something I Found on the Site?".
"The hidden is greater than the seen."