Monday, January 16, 2017

Haiku Twitter Bot

Over summer 2016, I decided to make a Twitter bot just for fun. Here, I'll describe the bot's programming at a high level. But first, a little background...

I started undergraduate as an english major... then graduated with a degree in neuroscience. Although I have pretty firmly switched my career plans toward math and science, I still appreciate and respect art. As a reflection on my artistic pipedream, I designed my bot to write haikus. If you don't remember haikus from high school, hit that link in the previous sentence.

For connecting to Twitter, I used the tweepy library. I simply wrote a subclass of StreamListener to process incoming tweets and write them to a file called tweets.txt. You can read more about this process in the tweepy docs (linked above).

Each time I start my bot, most of the high-level functionality is located in one function, do_tweet().

For counting syllables and determining parts of speech, I used NLTK.

Counting syllables:


Choosing words for the haiku:

As you can see, the algorithm randomly chooses a word of syllable length sw, which is generated by the pick_syl() function. If this is hard to interpret, don't worry. It took me a while to come up with an algorithm that would randomly choose words but still adhere to the 5/7/5 haiku format. The conditions like if ix == 0 and line == 0 are there to determine which line of the haiku is being written. In this case, the first word of the first line is being chosen. Then the lines:

sdict1 = dict([(sk, sv) for sk, sv in sdict.items() if pos_tag(word_tokenize(sk),
tagset='universal')[0][1]=='NOUN'])
# Chooses a noun
gsyls = syl_of_size(sdict1, sw) # Chooses a noun of syllable length sw

Then, setting up the haiku in order and writing to the file haiku.txt:

So currently this bot would write haikus with randomly chosen words, which is cool... but we can do better! ;^)
Eventually when I have a little more time, I'll give meta ai ai haiku more smarts, but for now the bot uses a simple algorithm for determining the "best" tweets, then recycles words from these tweets. Dumb, I know... it's a work in progress.


The function get_best() chooses tweets with the highest "weight" to include in the bot's corpus for writing future haikus.

Here's a link to my Twitter bot meta ai ai haiku: robcapps.com/docs/haiku. Please feel free to ask questions in the comments section!