The Wilhelmus Challenge
Finding the Author of the Dutch National Anthem
The Dutch national anthem, Wilhelmus, dates from the late sixteenth century and is the oldest national anthem in the world. The song stands out for its content. It is not about a country and does not juxtapose ‘us’ and ‘them’. Instead, it is written in the first person: it gives voice to William of Orange, who calls for a revolt against Spanish rule.
The song is still shrouded in mystery. For centuries, researchers have tried to establish when it was written, if someone commissioned it, and who wrote it. In recent years, debate about its authorship has resurfaced thanks to computational stylometric research.
A handful of sixteenth-century people have been identified as candidates for the authorship of the Wilhelmus. These include Fruytiers, Houwaert, Coornhert and Marnix of St. Aldegonde. Traditionally, however, Marnix of St. Aldegonde, writer, diplomat and confidant of William of Orange, has taken pride of place on this list.
Recent computational stylometric research has resulted in another suspect: Petrus Datheen, who was added to the test as a control rather than as a plausible contender. Datheen, a Calvinist theologian, had never really been considered as the creator of Wilhelmus. His image was hardly that of a skilled poet. Moreover, he fell out with William of Orange in 1578.
Map: Datheen’s travels
This stylometric result raises a new and exciting question. Is there, among all the persons who traditionally were not thought of as potential Wilhelmus authors, someone with an even higher stylometric score than Petrus Datheen? Is there someone we have overlooked? And is there a computational method other than stylometry that could help us find the author of Wilhelmus?
At the start of DH2019, a meeting with the contenders will be held.
An award ceremony will be held towards the end of the conference to pronounce the winner of the Wilhelmus Challenge.
Who is the author of Wilhelmus?
To address this question, we have compiled a large dataset of historical Dutch songs. The dataset is available here. The data consists of two subsets:
- The folder `train` contains approximately 22k songs with known and unknown (“Anonymous”) authors that can be used to ‘train’ a model for authorship classification or verification. All files are in XML. In each file, a tag holds the author information for the song.
- After training, the goal is to apply the authorship model to the songs in the folder `test`. The test folder contains two versions of Wilhelmus (5135.xml and 27063.xml). We will take the performance of your model on the remaining test files as a proxy for the reliability of your prediction for the author of Wilhelmus. To keep it a real challenge, we do not provide the author names for these approximately 5k songs. The output of your system should be a CSV file with two columns (please use a comma as the separator):
1. The first column (‘filename’) should provide the filename of a particular song.
2. The second column (‘author’) should provide the name of the predicted author.
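The input and output formats described above can be sketched in a few lines of Python. This is a minimal illustration, not an official submission template. It assumes that the author tag is named `author` (the actual tag name is not given in this text), that the CSV starts with a header row containing `filename` and `author`, and it uses a made-up XML snippet and the placeholder label `Author A` in place of real predictions. The function names `read_author` and `write_predictions` are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumption: the real tag name is not stated in this text; adjust as needed.
AUTHOR_TAG = "author"


def read_author(xml_text, tag=AUTHOR_TAG):
    """Return the text of the first `tag` element in the XML, or None if absent."""
    root = ET.fromstring(xml_text)
    node = next(root.iter(tag), None)  # iter() also considers the root element
    if node is None or node.text is None:
        return None
    return node.text.strip()


def write_predictions(predictions, out):
    """Write filename/author pairs as comma-separated CSV with a header row."""
    writer = csv.writer(out, delimiter=",", lineterminator="\n")
    writer.writerow(["filename", "author"])  # assumed header row
    for filename, author in sorted(predictions.items()):
        writer.writerow([filename, author])


# Made-up XML snippet standing in for one training file.
sample = "<song><author>Anonymous</author><text>...</text></song>"
sample_author = read_author(sample)

# Placeholder predictions; "Author A" is not a real result.
buffer = io.StringIO()
write_predictions({"5135.xml": "Author A", "27063.xml": "Author A"}, buffer)
csv_text = buffer.getvalue()
```

To produce an actual submission file, the same `write_predictions` call could be given a file opened with `open("predictions.csv", "w", newline="", encoding="utf-8")` instead of the in-memory buffer. Using the `csv` module rather than manual string joining means author names that contain commas are quoted correctly.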
Enrollment: If you expect to participate in this challenge, we kindly ask you to send an e-mail to email@example.com.
- https://dh2017.adho.org/abstracts/079/079.pdf (abstract of Mike Kestemont’s DH2017 contribution on the authorship of Wilhelmus)
- https://computationalstylistics.github.io/ (site about stylometry)
- J.D. Vargas Quiros, ‘Information-theoretic anomaly detection and authorship attribution in literature’. MA thesis, Universiteit Utrecht, https://dspace.library.uu.nl/handle/1874/359587
Recent authorship attribution papers: