Unicode Poetry Bot
TL;DR: I made a tiny bot that generates "poetry" in 5 lines of Python:
import random as r, unicodedata
coolwords = [unicodedata.name(chr(119556+i)).split()[-1] for i in range(83)]
for w in r.sample(coolwords, r.randrange(9, 15)):
    print(w, end=r.choice('  \n'))
print()
This blog post explains how the bot works and how I got it hosted for cheap.
0 - The Unicode database
In order to explain how the bot works, I need to take a brief detour and tell you about the Unicode
database, and in particular about Python's unicodedata standard library module.
I'm simplifying a lot (if you want more details, the Wikipedia page about Unicode is a pretty good starter), but basically the Unicode standard is a giant list that assigns a unique id number to all known letters and symbols.
For example, the character B has the id 66, the symbol € is 8364, and the emoji 🐍 is 128013.
In Python you can use the builtin function ord() to get a character's id:
>>> ord("B")
66
>>> ord("€")
8364
>>> ord("🐍")
128013
To get a character when you have its id, you use the chr() builtin function:
>>> chr(66)
'B'
>>> chr(8364)
'€'
>>> chr(128013)
'🐍'
But characters don't just have a standard id in Unicode, they also have a standard name, and that's
where things start to get a little more interesting. B is "LATIN CAPITAL LETTER B",
€ is "EURO SIGN", and 🐍 is ... "SNAKE". Neat, but probably not a huge surprise.
Looking up those names is what the unicodedata module is for, and in particular unicodedata.name().
It's part of the standard library and doesn't need to be installed, so you can import it anywhere you
have Python installed and try it out:
>>> import unicodedata
>>> unicodedata.name("B")
'LATIN CAPITAL LETTER B'
>>> unicodedata.name("€")
'EURO SIGN'
>>> unicodedata.name("🐍")
'SNAKE'
For some extra fun, you can try the following (some of the examples are sneaky so make sure you copy/paste):
unicodedata.name("&")
unicodedata.name("А")
unicodedata.name(" ")
unicodedata.name("😤")
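Two more things about unicodedata are handy when poking around. name() accepts a fallback value as its second argument (some characters, like the newline, have no name, and name() raises ValueError for them otherwise), and lookup() goes the other way, from a name back to the character. Here is a minimal sketch; describe() is just a helper name made up for this example:

```python
import unicodedata


def describe(char):
    """Return (id, name) for a single character."""
    # The second argument to name() is a fallback for characters
    # that have no name, such as control characters.
    return ord(char), unicodedata.name(char, "<no name>")


for char in "B€🐍\n":
    print(describe(char))

# lookup() is the inverse of name(): it maps a name back to the character.
snake = unicodedata.lookup("SNAKE")
print(snake, snake == chr(128013))
```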
1 - The poetry generator
While I was working on a silly side-project one day (which might come up in a future post), I ended up with
a text file listing all the characters known to unicodedata (their ids and their names). While
scrolling rapidly through it, something caught my eye around index 119552, and that's how I learned about the
Taixuanjing. It's an ancient text written in a special alphabet consisting of about 90 symbols.
The interesting bit is that those symbols have some pretty
evocative names. In the Unicode database, those symbols are all grouped together, and with some basic Python
code we can quickly see a list of them:
>>> for index in range(119552, 119639):
...     print(index, unicodedata.name(chr(index)))
...
119552 MONOGRAM FOR EARTH
119553 DIGRAM FOR HEAVENLY EARTH
119554 DIGRAM FOR HUMAN EARTH
119555 DIGRAM FOR EARTHLY HEAVEN
119556 DIGRAM FOR EARTHLY HUMAN
119557 DIGRAM FOR EARTH
119558 TETRAGRAM FOR CENTRE
119559 TETRAGRAM FOR FULL CIRCLE
119560 TETRAGRAM FOR MIRED
119561 TETRAGRAM FOR BARRIER
119562 TETRAGRAM FOR KEEPING SMALL
119563 TETRAGRAM FOR CONTRARIETY
119564 TETRAGRAM FOR ASCENT
119565 TETRAGRAM FOR OPPOSITION
119566 TETRAGRAM FOR BRANCHING OUT
119567 TETRAGRAM FOR DEFECTIVENESS OR DISTORTION
119568 TETRAGRAM FOR DIVERGENCE
119569 TETRAGRAM FOR YOUTHFULNESS
# ...
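A listing like the text file mentioned above can be rebuilt with the standard library alone. This is only a sketch of one way to do it (the original file isn't shown in this post, and named_characters() is a made-up helper name). It walks every possible id, keeps the characters that have a name, and then filters on the "TETRAGRAM FOR" prefix instead of relying on hard-coded indexes:

```python
import sys
import unicodedata


def named_characters():
    """Yield (id, name) for every character that has a name."""
    for code_point in range(sys.maxunicode + 1):
        # Passing None as fallback avoids a ValueError for unnamed characters.
        name = unicodedata.name(chr(code_point), None)
        if name is not None:
            yield code_point, name


tetragrams = [
    (code_point, name)
    for code_point, name in named_characters()
    if name.startswith("TETRAGRAM FOR ")
]
print(len(tetragrams), tetragrams[0], tetragrams[-1])
```

The scan visits more than a million ids, so it can take a moment to finish.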
Once I had all these cool names, the rest came rather quickly.
I used a list comprehension to go through that list, selecting only the last word
(.split()[-1]).
Then I used random.sample() to pick between 9 and 14 unique words out of that list (randrange(9, 15) never returns 15).
And finally I printed them out, using the end=r.choice('  \n') trick (the string holds two spaces and one newline) to print either a space
after the word (about 67% chance) or a new line (about 33% chance).
The results are interesting and often have a mystical and/or threatening vibe to them:
DARKENING HUMAN RESPONSE LABOURING SINKING PURITY LEGION STRENGTH ENCOUNTERS ACCUMULATION GATHERING CONTENTION MODEL GUARDEDNESS
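For anyone who finds the five-line version a bit dense, here is a longer sketch of the same three steps. It is not the bot's actual code: the names FIRST, COUNT, cool_words() and make_poem() are invented for this example, the poem is returned as a string instead of being printed word by word, and the separator string is assumed to be two spaces plus a newline, which is what the 2-to-1 odds described above imply.

```python
import random
import unicodedata

FIRST = 119556  # same starting id as the five-line version
COUNT = 83      # and the same number of symbols


def cool_words():
    """Last word of each symbol name, e.g. 'TETRAGRAM FOR ASCENT' -> 'ASCENT'."""
    return [unicodedata.name(chr(FIRST + i)).split()[-1] for i in range(COUNT)]


def make_poem(rng=random):
    """Build one poem as a single string."""
    # randrange(9, 15) returns a value from 9 to 14 inclusive.
    words = rng.sample(cool_words(), rng.randrange(9, 15))
    # Two spaces and one newline: a space is twice as likely as a line break.
    return "".join(word + rng.choice("  \n") for word in words).rstrip()


# Passing a seeded random.Random makes the output repeatable.
print(make_poem(random.Random(2024)))
```

Accepting the random source as a parameter is a small design choice that makes the generator easy to test: the same seed always gives the same poem.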
2 - The bot
Some weeks ago I learned about the botsin.space Mastodon instance which is dedicated to hosting bots (who could have guessed?). I thought this little "poetry" generator would be a perfect excuse to try my hand at building one.
Because I already use them for other projects, I decided to host the bot on Digital Ocean (heads up: that's an affiliate link) and their "serverless" infrastructure.
This "serverless" setup lets me execute some Python at regular intervals (I chose daily but it supports a cron-like syntax) and it's pretty cheap (my current usage is well within the free tier).
After an evening reading a mix of official documentation, blog posts, and code examples, I managed to cobble together a working bot.
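The bot's own code isn't included in this post, but the posting step can be sketched with the standard library alone. Mastodon accepts new posts as a POST to /api/v1/statuses with a bearer token and a "status" form field. Everything else below is an assumption made for illustration: the function names, the example instance URL and the token are placeholders, and hooking this into a scheduled serverless function is left out.

```python
import urllib.parse
import urllib.request


def build_status_request(instance_url, access_token, text):
    """Prepare (but do not send) a request that would post `text` as a status."""
    data = urllib.parse.urlencode({"status": text}).encode("utf-8")
    return urllib.request.Request(
        f"{instance_url}/api/v1/statuses",
        data=data,
        headers={"Authorization": f"Bearer {access_token}"},
        method="POST",
    )


def post_status(instance_url, access_token, text):
    """Send the request and return the HTTP status code."""
    request = build_status_request(instance_url, access_token, text)
    with urllib.request.urlopen(request) as response:
        return response.status


# Placeholder values: nothing is sent until post_status() is called.
request = build_status_request(
    "https://mastodon.example", "not-a-real-token", "PURITY LEGION"
)
print(request.get_method(), request.full_url)
```

Keeping the request building separate from the sending makes it possible to check what would be posted without touching the network.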
May I present: @[email protected]. If you follow it, you'll get a random "poem" every day.