New bot: @78_sampler, serving up old records

The Internet Archive hosts an incredible collection of over 25,000 professionally digitized 78rpm records. The great thing about a catalog that large is that, if you know what you want, you’re likely to find it. On the other hand, if you just want to browse, it can be overwhelming and even intimidating. Any item could be a delight, but it’s difficult to even think about individual records in the face of such a huge archive.

In that sense, would-be browsers face similar challenges with the Great 78 Project as they do with the Pomological Watercolor Collection—an archive I’ve worked with a lot. Sensing that similarity, I decided to build a tool like @pomological to help surface individual records.

@78_sampler tweets every two hours with a randomly selected record from the Archive’s collection. It was important to me that the audio fit smoothly and natively into a Twitter timeline, so I decided to render each tune into a video file using the Archive’s still image of the record as the visual. Twitter limits videos to 2:20—exactly 140 seconds, cute—which is shorter than most 78 tunes, so while rendering the video I truncate the clip at that point with a short audio fade at the end.

The code to do all this is a short Python script which I’ve posted online. It relies on ffmpeg to do the video encoding. Crafting ffmpeg commands is famously convoluted, and it’s a little frustrating to format those commands to be called from Python. Maybe that’s something I’ll do differently in the future but, for now, this works and I can dip my cup into the deep Archive well with a little more ease than before.
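As a rough illustration of the rendering step described above, here is a minimal sketch of how such an ffmpeg call might be assembled from Python. It is not the posted script: the function names, the three-second fade length, and the codec choices are assumptions made for the example, and only the 140-second limit comes from the post.

```python
import subprocess

MAX_SECONDS = 140  # Twitter's 2:20 video limit, as described in the post
FADE_SECONDS = 3   # assumed fade length; the post only says "short"


def build_ffmpeg_command(image_path, audio_path, output_path,
                         max_seconds=MAX_SECONDS, fade_seconds=FADE_SECONDS):
    """Return an ffmpeg argument list that renders a still image plus audio to video."""
    fade_start = max_seconds - fade_seconds
    return [
        "ffmpeg", "-y",
        "-loop", "1", "-i", image_path,  # repeat the still image as the video track
        "-i", audio_path,
        # Fade the audio out over the last few seconds before the cutoff.
        "-af", f"afade=t=out:st={fade_start}:d={fade_seconds}",
        "-c:v", "libx264", "-tune", "stillimage", "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        "-t", str(max_seconds),  # hard stop at the time limit
        "-shortest",             # or stop earlier if the audio runs out first
        output_path,
    ]


def render(image_path, audio_path, output_path):
    """Run the command; requires ffmpeg to be installed and on PATH."""
    subprocess.run(build_ffmpeg_command(image_path, audio_path, output_path), check=True)
```

Passing the arguments as a list, rather than one formatted string, sidesteps most of the shell-quoting headaches. Two caveats for this sketch: a tune shorter than 140 seconds would end before the fade starts, and libx264 with yuv420p expects even image dimensions, so an oddly sized image may need a scale filter.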

He did the monster mosh: automated datamoshing with tweaked GOP lengths

After I posted yesterday about my automated datamoshing experiment, I got a nice message on Twitter from developer and botmaker Ryan Bauman who was able to run my script on his own videos. We talked for a bit about how it could be improved, and I had a major realization that I had to test immediately.

In yesterday’s post I mentioned that the relative prevalence of P-Frames precluded my preferred effect of stills from one video and movement from another. Too much image data was being drawn by the P-Frames for it to really get unrecognizable. I didn’t and still don’t know how to reduce the number of P-Frames per se, but the big realization was I could change the ratio of I-Frames to P-Frames by ratcheting the “Group Of Pictures” size way down. That variable determines how often a video includes an I-Frame, and I could set it easily in ffmpeg while re-encoding the source videos.

A normal default GOP size might be 250 frames, which is to say that, at 25 frames per second, no more than 10 seconds of video go by without a full I-Frame render. Poking around, I tried changing that number to a few different values to see what worked best. Experimentally, a maximum GOP size of 48 frames (one I-Frame every two seconds at 24 frames per second) seems to do the trick for two video sources. An excerpt of that video looks like this:

What are the constraints on the GOP size? Here, as in many other moments of this project, my considerations are very different from most people’s. In most cases, you want I-Frames to happen frequently enough that cutting and seeking through the video can be relatively precise, but rare enough that the filesize can stay low. Since every I-Frame has to contain a full frame of image data, they can be large.
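For reference, the GOP setting and the frames-to-seconds arithmetic can be sketched in a few lines of Python. This is an illustrative example rather than the script from the post: the function names are hypothetical, and the only ffmpeg option it leans on is `-g`, which sets the maximum number of frames between I-Frames.

```python
import subprocess


def gop_seconds(gop_frames, fps):
    """Longest stretch of video, in seconds, between two I-Frames."""
    return gop_frames / fps


def build_reencode_command(source_path, output_path, gop_frames):
    """Return an ffmpeg argument list that re-encodes to H.264 with a maximum GOP size."""
    return [
        "ffmpeg", "-y",
        "-i", source_path,
        "-c:v", "libx264",
        "-g", str(gop_frames),  # maximum number of frames between I-Frames
        "-c:a", "copy",         # leave the audio stream untouched
        output_path,
    ]


def reencode(source_path, output_path, gop_frames):
    """Run the re-encode; requires ffmpeg to be installed and on PATH."""
    subprocess.run(build_reencode_command(source_path, output_path, gop_frames), check=True)
```

With these helpers, `gop_seconds(250, 25)` gives 10.0 and `gop_seconds(48, 24)` gives 2.0. Because `-g` is an upper bound, the encoder may still insert extra I-Frames, for instance at scene changes.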

Because I’m swapping I-Frames at the byte level, and truncating or padding each frame to fit, every I-Frame insertion is a potential source of trouble. You can see in the video above, my encoder struggles in certain places. And given that I’m doing substitution, the closer my GOP gets to 1, the more my mosh-up starts to just look like a slideshow.

For single source videos, where I-Frames are likely to be more similar to each other, I was able to use an even shorter GOP length. Here’s a datamoshed version of the Countdown video with a GOP length of 25 frames.

So, in conclusion: messing with GOP sizes means a different number of I-Frames which means more opportunities for hijinks in mangling the videos.

Making mosh-ups: automated datamoshing from multiple video sources

Datamoshing is a glitch art technique applied to videos to intentionally create “pixel bleeding” and other digital motion artifacts. It became popular several years ago when it was used in near-simultaneous music videos by Chairlift and Kanye West. In those cases, and in the tutorials and techniques documented since then, the glitches are typically introduced to a single edited video, and done manually in a visual editing program.

My goal for this project was to use two separate video sources — to make a “mosh-up,” har har — and to completely automate the merger. The holy grail would be to use all the motion from one video over all the stills of another, to make sort of an animated Magic Eye effect, but without the eye focus requirement. (Side note: it’s possible and awesome to actually create animated Magic Eyes, but that’s beside the point.)

As you’ll see, I fell a little short of that stretch goal, but still managed to make something that looks pretty cool.

Moshed-up vids

The script I wrote can create two kinds of datamoshes. The first takes a single video as a source and rearranges some key frames to glitch it out. Here’s an example of one such video, glitching up Beyoncé’s incredible Countdown video with itself. I’ve muted the audio in this upload, but as output from the script it still sounds pretty good.

The second (and more exciting) kind of datamosh takes a certain kind of key frame from one video and uses it to replace the same kind of frame in another video. All of the motion and some of the “re-drawings” of subsequent frames are pulled out of context, creating an effect that is a little surreal and unsettling. It’s not as precise as what the pros do with their manual edits, but it can also automatically combine two sources in a way I’ve never seen before. (Here, I used Countdown again, and moshed it together with Formation.)

Again, I’ve muted the audio, but in this case it would normally play back Formation without any noticeable flaws.

How it works

This moshing technique relies on some facts about how H.264 compresses and stores video data. It really is a remarkable standard, and if you’re not familiar with it, it’s absolutely worth reading the tribute that is “H.264 is Magic”. The gist is that only a very small portion of frames, dubbed I-Frames, contain all the image data necessary to draw a full screen. The other kinds of frames, P- and B-Frames, have partial screens and “motion” data.

Other popular datamoshing techniques include removing I-Frames altogether or duplicating P-Frames so the same motion is re-applied to an image. In this example, instead, I’m replacing I-Frames with other I-Frames, and I’m doing it simply by copying and pasting the bytes from one into the place of another, truncating or padding out the data so it fits exactly.
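The truncate-or-pad substitution can be sketched as below. This is a simplified illustration, not the script itself: it assumes the byte offset and length of each I-Frame are already known, the helper names are hypothetical, and padding with zero bytes is an assumption, since the post doesn't say what the padding consists of.

```python
def fit_to_length(data, length, pad_byte=b"\x00"):
    """Truncate or pad `data` so that it is exactly `length` bytes long."""
    if len(data) >= length:
        return data[:length]
    # Zero-byte padding is an assumption made for this sketch.
    return data + pad_byte * (length - len(data))


def replace_span(buffer, offset, length, replacement):
    """Overwrite `length` bytes at `offset` with `replacement`, resized to fit.

    The result is the same size as `buffer`, so every later frame
    stays at its original byte position in the file.
    """
    patched = bytearray(buffer)
    patched[offset:offset + length] = fit_to_length(replacement, length)
    return bytes(patched)
```

Keeping the overall length unchanged is the point of the truncating and padding: nothing after the swapped frame has to move.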

The reason I can’t get only the motion from one video and only the “textures” from another is that the P-Frames blend those two types of data together into a single frame. If I could figure out some way to isolate the image data in the frame, or to re-encode a video to consist entirely of I- and B-Frames, I could probably get a wilder effect.

As it stands, the output of this script is a video that plays, but is pretty badly mangled. If you try to play it back in a client that shows you errors, you’ll see a lot of complaints. For compatibility’s sake, I’ve manually transcoded these videos into another format and back.

LinkArchiver, a new bot to back up tweeted links

Twitter users who want to ensure that the Wayback Machine has stored a copy of the pages they link to can now sign up with @LinkArchiver to make it happen automatically. @LinkArchiver is the first project I’ve worked on in my 12-week stay at Recurse Center, where I’m learning to be a better programmer.

The idea for @LinkArchiver was suggested by my friend Jacob. I liked it because it was useful, relatively simple, and combined things I knew (Python wrappers for the Twitter API) with things I didn’t (event-based programming, making a process run constantly in the background, and more). I did not expect it to get as enthusiastic a reaction as it has, but that’s also nice.

The entire bot is one short Python script that uses the Twython library to listen to the Twitter User stream API. This is the first of my Twitter bots that is at all “interactive”: every previous bot used the REST APIs to post, but could not engage with anything in its timeline or with tweets directed at it.

That change meant I had to use a slightly different architecture than I’ve used before. Each of my previous bots was a small, self-contained script that produced a tweet or two each time it ran. That design meant I could trigger them with a cron job that runs at regular intervals. By contrast, @LinkArchiver runs all the time, listening to its timeline and acting when it needs to. It doesn’t have much interactive behavior (when you tweet at it directly, it can reply with a Wayback link, but that’s it), but learning this kind of structure will enable me to build much more interactive bots in the future.
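To make the event-driven shape concrete, here is a minimal sketch of what a per-tweet handler might look like. It is not the bot's actual code: the function names and the User-Agent string are hypothetical, and it uses only the standard library so that it stands alone. It assumes each incoming tweet arrives as a dict with the usual `entities.urls[].expanded_url` structure, and that pages are saved by requesting `https://web.archive.org/save/` followed by the URL.

```python
import urllib.request

WAYBACK_SAVE_PREFIX = "https://web.archive.org/save/"
USER_AGENT = "example-link-archiver-bot/0.1"  # hypothetical placeholder value


def extract_links(tweet):
    """Return the expanded URLs found in a tweet payload dict."""
    urls = tweet.get("entities", {}).get("urls", [])
    return [u["expanded_url"] for u in urls if u.get("expanded_url")]


def build_save_request(url):
    """Build (but do not send) a request asking the Wayback Machine to save `url`."""
    return urllib.request.Request(
        WAYBACK_SAVE_PREFIX + url,
        headers={"User-Agent": USER_AGENT},
    )


def handle_tweet(tweet, send=urllib.request.urlopen):
    """Called once per incoming tweet; submits every linked page for archiving."""
    links = extract_links(tweet)
    for link in links:
        send(build_save_request(link))
    return links
```

In a streaming setup, a function like `handle_tweet` would be called from the stream listener's callback each time a tweet arrives, instead of the whole script being launched on a schedule. The `send` parameter exists so the handler can be exercised without touching the network.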

It also required that I figure out how to “daemonize” the script, so that it could run in the background when I wasn’t connected and restart if it crashed (or when I restarted the computer). I found this aspect surprisingly difficult; it seems like a really basic need, but the documentation for how to do this was not especially easy to find. I host my bots on a Digital Ocean box running Ubuntu, so this script is running as a systemd service. The Digital Ocean documentation and this Reddit tutorial were both very helpful as I figured it out.
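For anyone facing the same problem, a systemd unit for a long-running script might look roughly like the fragment below. This is a generic sketch, not the bot's actual configuration: the description, file paths, and restart delay are all placeholders.

```ini
# /etc/systemd/system/linkarchiver.service  (hypothetical file name)
[Unit]
Description=Example long-running Twitter bot
After=network-online.target

[Service]
# Placeholder paths; point these at the real interpreter and script.
ExecStart=/usr/bin/python3 /path/to/bot.py
# Restart the process whenever it exits, after a short pause.
Restart=always
RestartSec=10

[Install]
# Start the service at boot.
WantedBy=multi-user.target
```

After saving a unit like this, `systemctl enable` followed by the unit name sets it to start at boot, and `systemctl start` launches it immediately.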

Since launching the bot, I’ve gotten in touch with the folks at the Wayback Machine, and at their request added a custom user-agent. I was worried that the bot would get on their nerves, but they seem to really appreciate it—what a relief. After its first four days online, it’s tracking some 3,400 users and has sent about 25,000 links to the Internet Archive.