(If you aren't already familiar with recurrent neural networks, why not see Andrej Karpathy's excellent blog?)
These days, thanks to The Wonders of Science[TM], we can train neural networks to imitate different styles of text by showing them some examples. Often the results are gibberish, but occasionally in this gibberish there is a nugget of... less gibberish. There are many fine Python libraries out there to let one run RNN experiments: I am using textgenrnn, and fine-tuning its stock model on data of my own whimsical fancy. Here is a selection of the most interesting, perplexing, or otherwise notable outputs.
In the past, we've seen examples of char-rnns learning to produce structured text by picking up repeated patterns in the training set, like a network trained on Sherlock Holmes titles churning out "The Adventure of the ..." time after time, or the Doctor Who network repeating "... of the Daleks". I was curious how well a network would pick up on an even stronger regularity in the training set. I also just happened to know of one that I've always wanted to play with: Randall Munroe's XKCD color name survey, where he isolated almost a thousand idiosyncratic color names and their corresponding RGB hex values based on his readers' feedback. I wondered, if I fed a network highly structured examples like his color names and hex values, how well would it learn to reproduce that structure? Would it learn that every example needs to end with a pound character, followed by exactly six hex characters?
Surprisingly, yes! Every example I generated based on that training set ended in a pound sign followed by several hex characters, and most of them had exactly six terminal hex characters, producing a valid RGB color. Now let us take a quiet moment to enjoy them together. (In the rare cases when there were fewer than six hex characters, I padded the hex string out with zeros so I could display it. If there were extra characters, I truncated them. If there were invalid characters, I replaced them with zeros. The original machine-generated RGB code is preserved in the heading.)
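For the curious, the clean-up rules in that parenthetical can be sketched in a few lines of Python. This is only an illustration, not the code actually used for this post: the function names `normalize_hex` and `is_valid_rgb` are made up for the example, and padding on the right (rather than the left) is an assumption, since the rule above only says the string was padded with zeros.

```python
import re

HEX_DIGITS = set("0123456789abcdefABCDEF")


def is_valid_rgb(code: str) -> bool:
    """True if code is '#' followed by exactly six hex characters."""
    return re.fullmatch(r"#[0-9a-fA-F]{6}", code) is not None


def normalize_hex(raw: str) -> str:
    """Coerce a generated hex string into a displayable '#rrggbb' code.

    Hypothetical helper; right-padding is an assumption.
    """
    digits = raw.lstrip("#")
    # Replace any invalid (non-hex) character with a zero.
    digits = "".join(c if c in HEX_DIGITS else "0" for c in digits)
    # Truncate extra characters, then pad short strings out with zeros.
    return "#" + digits[:6].ljust(6, "0")


if __name__ == "__main__":
    for raw in ["#a1b2", "#a1b2c3d4", "#12g45z", "#ff00ff"]:
        print(raw, "->", normalize_hex(raw), is_valid_rgb(raw))
```

Replacing invalid characters works one character at a time, so it makes no difference whether it happens before or after truncation. The untouched string stays available for the heading, and only the normalized copy is used for display.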