Authors who write in 19th century Swedish, pre-1906 spelling reform and all, are regrettably few and far between these days. But fear not: Stroberock’s latest project, DeepTopelius, is here to save the day!
DeepTopelius is a text generator that produces text in Swedish in the style of the 19th century Finnish Swedish-language author Zachris (Zacharias) Topelius, specifically based on his works Fältskärns berättelser and Stjärnornas kungabarn.
The text generator is a neural network built with the TensorFlow framework; its architecture combines character embeddings with bidirectional LSTM layers. It was trained on the aforementioned works of Topelius as a character-based text predictor: given an input sequence of characters, it predicts the next character in the sequence. The DeepTopelius website itself was implemented using the Vue framework.
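To make the idea of a character-based text predictor concrete, here is a minimal sketch in plain Python of how training examples and generation could be set up. It is not the DeepTopelius code: the function names (build_vocab, make_training_pairs, generate), the window size and the stand-in predictor are all assumptions made for illustration, and the trained network is replaced by any callable that maps a context string to one character.

```python
from typing import Callable, Dict, List, Tuple


def build_vocab(text: str) -> Dict[str, int]:
    """Map each distinct character to an integer index (sorted for reproducibility)."""
    return {ch: i for i, ch in enumerate(sorted(set(text)))}


def make_training_pairs(
    text: str, vocab: Dict[str, int], window: int
) -> Tuple[List[List[int]], List[int]]:
    """Slide a fixed-size window over the text.

    Each input is `window` consecutive character indices; the target is the
    index of the character that immediately follows that window.
    """
    inputs: List[List[int]] = []
    targets: List[int] = []
    for start in range(len(text) - window):
        chunk = text[start:start + window]
        next_char = text[start + window]
        inputs.append([vocab[c] for c in chunk])
        targets.append(vocab[next_char])
    return inputs, targets


def generate(
    seed: str, predict_next: Callable[[str], str], length: int, window: int
) -> str:
    """Extend `seed` one character at a time.

    `predict_next` is a hypothetical stand-in for a trained model: it receives
    the last `window` characters and returns the next character.
    """
    out = seed
    for _ in range(length):
        out += predict_next(out[-window:])
    return out


if __name__ == "__main__":
    sample = "Pantsätt Öland!"  # the proclamation quoted in this post
    vocab = build_vocab(sample)
    inputs, targets = make_training_pairs(sample, vocab, window=4)
    print(len(vocab), "distinct characters,", len(inputs), "training pairs")
```

In a real setup, the integer sequences would be fed to an embedding layer followed by the recurrent layers, and predict_next would pick a character from the network's output distribution rather than from a fixed rule.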
The project was originally conceived as an experiment in text generation for Swedish, mainly to see how text generation methods frequently applied to English and other larger languages would work for Swedish. While working on the project and looking for source material, we became increasingly intrigued by the prospect of also trying it out on older Swedish texts. We therefore looked for suitable sources on Project Runeberg, which hosts a large number of public domain works by Nordic authors and where several of Topelius’ works can be found. And so we set out to create DeepTopelius.
The texts produced by this first version of DeepTopelius are semantically nonsensical, as would be expected: the primary goal was to mimic style, not substance, after all. They also fairly frequently contain grammatical errors and words of the generator’s own invention (we’ll call it artistic license on behalf of DeepTopelius). Even so, they mimic the style of the source material quite well, bringing in characters from the source stories and sometimes inventing new characters altogether. Not surprisingly, the text generator generally does better with shorter sentences. A fair number of the produced texts are quite amusing, such as the proclamation “Pantsätt Öland!” (“Pawn Öland!”; Öland is an island in Sweden).
The project was purely experimental in nature, and a key component was the DeepTopelius website itself and its presentation of the text generator’s output. Future iterations could involve, for example, training machine translation models, building conversational user interfaces or, more generally, developing natural language understanding applications (accounting for the aforementioned spelling reform, semantic drift, etc.).
To experience DeepTopelius yourself, visit the DeepTopelius website (in Swedish).