“We are not minority languages, we are minoritised. And we are the global majority” – Tura Arutura, Social Justice activist, creative artist and dancer.
At the end of September, I had the great good fortune to be invited to the Celtic Knot conference in Waterford, Ireland, hosted by Wikimedia Ireland and Wikimedia UK. This conference focuses on the minority language Wikipedias (not all of the 345 language Wikipedias are as well supported or well developed as English Wikipedia, see the list of Wikipedias here) and brings together participants from all kinds of backgrounds, ages and experiences as a community of ‘language activists’ to showcase, discuss and advocate for how best to support minoritised languages around the world.
We held the first ever Celtic Knot conference at the University of Edinburgh back in July 2017 as a way to demonstrate our support for the Scots Gaelic Wikipedia residency at the National Library of Scotland (watch the video presentation here). We also wanted to see where we could add some significant value by shining a light on some incredibly worthwhile language projects that could do with the space and time to outline the particular challenges (and opportunities) that regional and minority languages face, whether technical, socio-economic or political. Initially, the conference focused on bringing the Celtic languages together (Scots Gaelic, Irish Gaelic, Welsh, Breton, Cornish, Manx) to help form a strong bond or ‘Celtic knot’ through working together and sharing experiences, but we quickly realised that there was much to be gained from expanding to include Basque, Catalan, Saami, the Romance languages and more. I hosted the first event at the University as a one-day experiment with 50-60 attendees to see if there was value in such knowledge and cultural exchange, and I was ecstatic to see in Waterford that the need and desire for the conference had not diminished. Indeed, despite the upheaval of the past few years (Brexit, Covid, the Ukraine War, war in the Middle East, the cost of living crisis and more) since I was last able to attend the conference in 2018, the Celtic Knot under the auspices of Wikimedia UK had expanded to a three-day event and included more minoritised languages from around the world than ever before, including Dagbani, Indonesian, Amazigh and Tashelhit from Morocco. Many more had wanted to attend and present but were unfortunately denied visas owing to bureaucratic red tape.
It is heartening to see that the ‘strength in unity’ between the Celtic language participants and our original conference participants is still there and stronger than ever, and that welcoming arms were extended both by the conference organisers and, importantly, by the Wikimedia Foundation to exploring a larger, more inclusive conference to support minoritised languages across the globe. It was also heartening to see members of the Wikimedia Research team attend and present on efforts to make it much easier for a new language Wikipedia to move from incubation to graduation, in much less than the c. 9-18 years it has historically taken.
It was fitting also that the conference was held in Waterford, Ireland’s oldest city. The place was described to me as somewhere that had perhaps lost its way and/or fallen upon hard times in the latter part of the 20th century and early 21st, a rather depressed port area ignored by industry, retail and tourism and needing some love and support, but whose city officials had in recent times successfully rebranded and rejuvenated it by embracing its rich Viking and medieval history and Waterford’s treasures. It also was not lost on me that Scotland-based Irish artist, Aoife Cawley, had created a special linoprint design depicting Aoife MacMurrough (c. 1145 – 1188), a Princess of Leinster, being forced (against Irish law and tradition) into marriage with the English lord ‘Strongbow’, earl of Pembroke, in Christchurch Cathedral in Waterford. The marriage was part of a pact between Strongbow (also known as Richard fitz Gilbert and Richard de Clare) and the King of Leinster, Dermot MacMurrough (c. 1110 – 1171), to help the latter reclaim his lands. This marriage on 25 August 1170 marked the first significant involvement of the English people (and the English language) in Irish politics, history and culture, with all that has ensued since.
Jason Evans, National Library of Wales Wikimedian and Open Data Manager, gave the conference’s opening keynote address and expounded on generating Welsh Wicipedia articles using AI generated summaries (checked by two humans for grammar and factual accuracy) to help create more knowledge shared in the Welsh language online. He outlined his work in public outreach at the National Library of Wales, and his work with schools and universities in particular, where he found translation tasks were exceedingly popular with students – they felt very motivated to share knowledge and address knowledge gaps online. Maristella Gatto further reinforced the motivation of students for translation work in a presentation sharing details of a University of Bari translation project, where students used computational analysis of the vocabulary to choose their words carefully when translating articles about Irish historical events, such as Bloody Sunday, into Italian. They implicitly realised that AI tools make use of Wikipedia, so problems in the representation of topics can be replicated if the language and vocabulary used in articles are not chosen correctly. Words have meaning and they matter. Representation matters.
“Aithníonn ciaróg ciaróg eile” translates as “One beetle recognises another”.
This Irish saying (above) is a nod to the notion of comradeship, community and solidarity between people(s). I believe this is certainly true of the conference participants, who recognised, despite their different languages, that there was true commonality in their shared language activism. That activism could sometimes lead to becoming a political prisoner, as in the case of Martial Menard, namechecked in the talk by Dr. Tristan Loarer, “Opening Sources in the Breton language: Offering the ‘Minoritised’ Language to the Majority”.
“We must take what we are entitled to, not hold out our hands.” – Breton activist and political prisoner Martial Menard (1951-2016)
Loarer discussed the availability (or lack thereof) of pragmatic tools for the Breton language and the need to feed the A.I. ‘beast’ with quality assured Breton text, whether from Breton transcriptions in Wikisource, the free and open wiki hyper library, or from the new DEVRI tool, which offers free access to a diachronic dictionary of the Breton language.
Two particularly affecting sessions, for me, were on the Irish language. Nóirín Ní Bhraoin, a psychologist from Dublin, noted that when she walked the streets of Dublin she hardly ever heard Gaeilge, which she thought was astounding for the Republic of Ireland’s capital city. She wanted to see if the problem was down to “one Irish speaker not being able to recognise another” so wore a badge that said “Speak Irish to me” and invited shop staff at ten Dublin shops to wear these badges and record how often customers spoke to them in Irish each day. The results showed that on average 3.6 people spoke Gaeilge to the staff each day across the ten stores. This encouraged Nóirín Ní Bhraoin to work with a developer to create a mobile app called “Gaelgoer” (Gael as in Irish Gaeilge speaker and ‘go-er’ as in the English for someone to get up and go!) which would allow app users to view (1) upcoming Irish events happening near them or all around the world, (2) businesses that had speakers happy to speak Irish to you, and (3) even geolocate Irish speakers on the map so you could start an online/SMS chat with them, if both were happy to do so. NB: an extra ‘Tinder’ style dating function was considered and requested by surveyed Irish speakers but Nóirín Ní Bhraoin and her developer shelved that idea for now.
“This project underscores a powerful truth: knowledge belongs to everyone” – Joe Kelly, Mayor of Waterford, speaking on the Wiki Women Erasmus+ Project.
The key event of the Conference was the Wiki Women Erasmus+ panel introduced by the Mayor of Waterford, Joe Kelly, who spoke of how genuinely impressed he had been by the initiative and the potential it had for expansion. He was followed by four impressive Irish high school students who took turns to present (both in Irish and with an English translation) on their experiences of the Wiki Women Erasmus+ project. This EU-funded scheme allowed the students to travel to the Basque Country as part of a cross-cultural language exchange with Basque and Friesland students and teachers. Its ultimate goal is to highlight the gender gap in content online and empower students in minority language communities (Gaeltacht regions, Basque, Friesland) to write Wikipedia articles about underrepresented women in their languages. Another goal of the project has been to produce a ‘teacher’s toolkit’ that could be translated and used in any language to support further work in other regional and minoritised languages.
“While working on this project, we also learned a lot about the history of women from our own country […] by the end we had a wealth of information […] we improved lots of skills during this project” – a student who participated in the Wiki Women Erasmus+ Project
Keynote speaker, and Irish Gaeilge Wikipedia editor, Dr. Kevin Scannell is a leading mind in tech for under-resourced languages and has revolutionised how Gaelic languages interact with modern tech. Scannell outlined some of the very real problems in the use of AI and the difference in distribution of knowledge (and power) between hegemonic languages like English and minoritised languages like Irish. If every word in Irish were committed to paper or computer and fed into a large language model, this would amount to 1 billion words or less, a knowledge base 30,000 times smaller than that of the Llama 3.1 LLM. Further, the Irish data included in standard LLMs is of low quality: Wikipedias are used as standard to train LLMs but minoritised language Wikipedias vary wildly in quality, and other sources, such as Common Crawl, are heavily polluted with machine translation. The problem, Scannell asserted, was that big tech companies with non-Irish-speaking researchers don’t care about the training data being ‘garbage in’ and thereby don’t care that this produces ‘garbage out’. In response, Scannell has started an Irish language corpus building project called Fiontar at Dublin City University, where the 150 million words in it are being quality assured.
Further talks by Dresden University student researchers, Hannah Yule Heetmann and Joanna Dieckmann, on Unpacking Power Dynamics in Language Policy showed again how words and intentions matter through the analysis they had conducted of the language used in the Irish Government’s 20-Year Strategy for the Irish language. Their fascinating findings highlighted how the words “going to” were entirely absent from the policy document, that timescales were almost never included, and that there was also a lack of specific actions and of specific labelling of which government or non-government actors were actually to undertake those actions. They concluded with a series of recommendations to combat this for use in future policy documents so that any future Irish language strategy is truly fit for purpose, actionable, accountable and with specific tasks and timescales detailed.
When a language stops having the vocabulary to speak about the modern politics, socio-economics and technologies that affect and influence our daily lives, that language ceases to be useful and risks dying out. So watching talks showing a range of initiatives, open education resources & toolkits, new ways of thinking about language activism (combining your passions to write about forensic science in Scots Gaelic for instance) and even ensuring that the word for a Wikipedia ‘edit-a-thon’ is now in Irish Gaeilge, gave me great hope that breathing new life into languages is possible and that new safe, open spaces (following the demise of Twitter) can be made to work to support language communities.
This pragmatic and inspiring ‘can do’ spirit, and the strength of feeling behind it coupled with the sheer pride being taken in every speaker’s linguistic heritage and its potential for the future in a global digital world, was the thing that impressed me most during the conference. So too was the recognition that government policies can be advocated for and shaped, and that A.I. and other digital tools and initiatives can be harnessed and made to work to help and massively support languages, cultures, and histories being shared for the betterment of knowledge & cultural exchange and understanding across the world. As Nóirín Ní Bhraoin concluded (and I’m paraphrasing here) it’s about caring, and getting up off your backside to actually do something if you do care about your language, to say “Here we are”.
And if I may add, in a nod to the future of the Celtic Knot, “and here we remain.”
Onwards and upwards… and outwards! And here’s to a bigger, more inclusive Wikipedia language conference next time!
Thanks and Sláinte to Amy and Sophie, our wonderful Wikimedia Ireland hosts, and conference co-organisers, Lea, Richard and Daria, from Wikimedia UK. Thanks also to Tura for a wonderful display of traditional Irish dance.