Supporting the University of Edinburgh's commitments to digital skills, information literacy, and sharing knowledge openly

Tag: survey of scottish witchcraft

Teaching data literacy with real world (witchy) datasets

Our Digital Humanities Award-winning interactive map (witches.is.ed.ac.uk) caught the public’s attention when it launched in September 2019 and has helped to change the way the stories of these accused women and men are told. The campaign group Witches of Scotland went on to lobby the Scottish Government successfully, securing a formal apology from the then First Minister, Nicola Sturgeon, for the grave wrong done to these persecuted women (BBC News, 2022).

The Survey of Scottish Witchcraft

The map is built upon the landmark Survey of Scottish Witchcraft database project. Led by Professor Julian Goodare, the database collates historical records about Scotland’s accused witches (1563-1736) in one place. This fabulous resource began life in the 1990s before being completed in 2001-2003. It is a dataset that has the power to fascinate.

However, since 2003 the Survey data had remained static in an MS Access database. At the course’s annual “Data Fair” in October 2017, I therefore invited groups of students on the University of Edinburgh’s Design Informatics MFA/MA to consider what could be done if the data were exported into Wikipedia’s sister project, Wikidata, as machine-readable linked open data. Beyond this, what new insights & visualisations could be achieved if groups of students worked with this real-world dataset, with myself as their mentor, over a 6-7 week project?

Design Informatics students at the Suffer the Witch symposium at the Patrick Geddes Centre, displaying the laser-cut 3D map of accused witches in Scotland. CC-BY-SA, Ewan McAndrew

The implementation of Wikidata in the curriculum presents a huge opportunity for students, educators, researchers and data scientists alike, especially when there is a pressing need for universities to meet the demands of our digital economy for a data-literate workforce.

“A common critique of data science classes is that examples are static and student group work is embedded in an ‘artificial’ and ‘academic’ context. We look at how we can make teaching data science classes more relevant to real-world problems. Student engagement with real problems…has the potential to stimulate learning, exchange, and serendipity on all sides.” (Corneli, Murray-Rust and Bach, 2018)

The success of the ‘Data Fair’ model, year on year, prompted questions as to what more could be done over an even more extended project. So I lobbied senior managers for a new internship dedicated to geographically locating the places recorded in the database as linked open data, as the next logical step.

Recruiting the ‘Witchfinder General’

Geography student Emma Carroll worked closely under my mentorship and supervision for three months in Summer 2019; her detective work geolocating historic place-names drew in colleagues from the National Library of Scotland, the Scottish Studies Archive and the Scottish Place-Name Society. Creating the website itself involved working with the creativity and expertise of the university’s e-learning developers.

Geography undergraduate student, Emma Carroll, our first ‘Witchfinder General’ intern in Summer 2019.

Since its launch, the map has gained media coverage across Scotland and around the world, allowing users to explore, for the first time, where these accused women and men resided, often local to the users themselves, and to learn about their stories in a tremendously powerful way. It also shows the potential of engaging with linked open data to help teach data science and to fuel discovery through exploring the direct and indirect relationships at play in this semantic web of knowledge, enabling new insights. There is always more to do, and we have worked with another four student interns on this project since 2022.

Our latest intern, Ruby Imrie, will return on 15th July, following her exams and a Summer break, to continue quality-assuring the vast amount of Scottish witchcraft data in Wikidata, creating new features and visualisations, fixing bugs and generally making our Map of Accused Witches in Scotland website as useful, engaging and user-friendly as possible. When it is ready for relaunch in Autumn/Winter 2024, we want something that truly does justice to all the work that has gone before and to all the individual women and men persecuted during the Scottish witch trials.

Ruby Imrie and Professor Julian Goodare, Project Director of the Survey of Scottish Witchcraft, at the University of Edinburgh Library, 23 August 2023

Almost five years on – the legacy of the project

The legacy of the project is that our students, year on year, are highly engaged and motivated to learn both important histories from Scotland’s dark past AND the important data skills required for Scotland’s future digital economy. Many of our colleagues at the University (and beyond) also seek our advice on how to meet research grant stipulations that they make their research outcomes open, both by producing open access papers and by releasing their data as open data. Lukas Engelmann (History of Medicine) is using Wikidata to document the history of 20th-century epidemiology. Dr. Chris Langley and Assistant Professor Mikki Brock have worked with me to create a similar website, Mapping the Scottish Reformation (a proof-of-concept Project B to our Project A), and have shared their experiences with other similar projects, including the Argyll and Sutherland Highlanders Military Museum in Stirling, the Faversham Local History Group, the Places of Worship in Scotland database team and more.

 

References

1. “Nicola Sturgeon apologises to people accused of witchcraft”. BBC News, 8 March 2022.
2. Corneli, J., Murray-Rust, D. and Bach, B. (2018). Towards Open-World Scenarios: Teaching the Social Side of Data Science.

Media Responses

Student Ruby Imrie onstage at the McEwan Hall, University of Edinburgh, receiving her award for Student Staff Member of the Year 2023

Some wicked wiki news for Halloween

Third-year Computer Science student Ruby Imrie has just won Student Staff Member of the Year at the University of Edinburgh’s ISG (Information Services Group) Staff Recognition Awards, held on Tuesday 24th October 2023. ISG is a central support service for the University and one of the largest tech employers in Scotland, employing over a hundred student workers each year, so this is a real celebration of the work Ruby has been doing in opening up research datasets and helping people around the world understand what happened in the Scottish witch trials.

Ruby worked for us full-time from Monday 5th June until Friday 25th August 2023 in the role of Witchfinder General: Data Visualisation intern, and has happily agreed to continue working one day a week alongside her studies from 14th September 2023 until May 2024 to complete her exemplary work, as there is always so much more to do.

Ruby has worked incredibly hard to help illuminate what happened in the Scottish witch hunts of 1563 to 1736 by focusing on opening up the rich historical data in the University’s landmark Survey of Scottish Witchcraft database (an MS Access 97 database created in the late 1990s and completed in 2003) and turning it into structured, machine-readable linked open data in Wikipedia’s sister project, Wikidata.

Importantly, Ruby has been quality-checking and consistency-checking the data using newly developed quality assurance methods in R Studio, created by another student intern, Claire Panella, earlier this year, which can be reused for many years to come. Ruby’s other focus has been on taking the extremely rich data on all 3,816 full witchcraft investigations (encompassing the initial denunciation, the arrest, the interrogation, the trial and the recorded trial outcomes) recorded in Scotland from 1563-1736 and embedding new interactive visualisations of them on our Map of Accused Witches in Scotland website using JavaScript and the Vue.js framework. By also identifying and fixing bugs, conducting rigorous user-testing sessions and using the feedback received to plan site improvements and new features, she has helped to show how the data, and the individual human stories behind the data, can be visualised, explored and interrogated as never before.
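The interns’ actual quality assurance scripts were written in R Studio; purely as an illustration of the kind of consistency check involved, here is a minimal Python/pandas sketch. The column names (trial_year, accused_id) and the two checks are hypothetical assumptions for illustration, not the project’s real schema or code.

```python
import pandas as pd

def check_accused_records(csv_path: str) -> pd.DataFrame:
    """Return rows of a (hypothetical) CSV export that fail two simple sanity checks."""
    df = pd.read_csv(csv_path)

    # Trial years should fall within the period covered by the Survey.
    out_of_range = ~df["trial_year"].between(1563, 1736)
    # Each accused person should appear under a single identifier.
    duplicated_id = df["accused_id"].duplicated(keep=False)

    df = df.assign(
        issue=out_of_range.map({True: "year outside 1563-1736; ", False: ""})
        + duplicated_id.map({True: "duplicate accused_id", False: ""})
    )
    return df[out_of_range | duplicated_id]

# Example usage (file name is illustrative):
# problems = check_accused_records("accused_witches_export.csv")
# problems.to_csv("qa_issues.csv", index=False)
```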

Ruby has also taken a personal interest in learning more about the Scottish witchcraft panics by attending talks about Scottish witches at the Edinburgh Book Festival 2023, reading Jenni Fagan’s book, Hex, and attending the play, Prick, at the Edinburgh Fringe. She has also contributed blog articles documenting her work, so that anyone can understand the variety of skills she has had to learn and build on.

Ruby Imrie and Professor Julian Goodare, Project Director of the Survey of Scottish Witchcraft, at the University of Edinburgh Library, 23 August 2023

The work above shows the sheer variety of technical and analytical skills she had to employ. Attention to detail has been paramount so that the data was never misunderstood or misrepresented. Ruby analysed the original MS Access 97 database (2003); exported its contents via bespoke Access queries and .csv files; manipulated and processed the data using SPARQL, Python and R Studio, which she first had to learn; created a video about her work; learned new web development skills using JavaScript frameworks she was hitherto unfamiliar with; and collaborated with developer colleagues, Wikidata experts, historian colleagues and other interns.

She has been enthusiastic about learning technical skills and really motivated to learn about the Scottish witch hunts (even devouring books about them in her own time). Ruby took real ownership of and responsibility for representing the data correctly and helping the public understand how these women were persecuted. She is a seriously hard worker who takes her work seriously, but also an incredibly sunny and personable team player who always asks the right, most pertinent questions of the work in order to progress it in the right way, which in and of itself is really impressive to see in such an early-career colleague. Ultimately, she has analysed, curated and quality-assured a VAST amount of data in Wikidata about the Scottish witch hunts in a short Summer internship, working independently much of the time to do so. She has been an extremely dedicated, constructive and collaborative colleague and has also contributed vastly to discussions about the future of the site and what that should be. So having this opportunity to celebrate and acknowledge her work, and the work of all our past student interns and developer colleagues, is a wonderful thing to see.

Ruby in closeup holding her trophy for Student Staff Member of the Year

Ruby holding her trophy for Student Staff Member of the Year

Especially because we are working hard on, and are not far away at all from, launching version 2.0 of our Map of Accused Witches in Scotland website. The new elements will include:

  • Rich historical data on all the 3,816 witchcraft investigations in Scotland, including:
    • Dates of investigations (filterable into panic and non-panic periods of time).
    • The primary and secondary characteristics of investigations.
    • Who the people investigating were – judges, expert witnesses, prosecutors etc.
    • Accusations of shape-shifting: “the magical transformation of a human into an animal. This was mainly a popular belief, but educated demonologists accepted it. In the Scottish witch trials, some accused witches confessed to having taken animal form, presumably through coercive interrogation. Less often, neighbours or victims testified that they had seen the witch in animal form. The animal was most often a cat, but we also find transformations into a dog, a ‘corbie’ (raven or crow), or other creatures. For more information about the shape-shifting terms mentioned below please refer to the Survey’s glossary of terms here: https://witches.hca.ed.ac.uk/glossary/”
    • Accusations concerning the ritual objects supposedly used by the accused – “Two different types of rituals appear in accused witches’ records. First, there were real rituals, mostly carried out by magical practitioners, for healing and other beneficial purposes. Second, there were imaginary rituals, which the accusers thought that witches carried out when they met the Devil; accused witches were forced to confess to these under torture. Each type of ritual could use magical objects. Thus, a ‘belt’ or a ‘sword’ could be used in healing rituals, whereas ‘corpse powder’ appeared in confessions to demonic rituals. For more information about the ritual objects mentioned below please refer to the Survey’s glossary of terms here: https://witches.hca.ed.ac.uk/glossary/”
    • A customisable timeline with a slider option and a calendar-style layout option.
    • Select all/deselect all filter options.
    • A modern map layer and a 1750 historic Dorret map layer from the National Library of Scotland.
    • A ‘name search’ option in the Histropedia timeline, so you can type in an accused person’s name and read their Wikipedia page and Survey of Scottish Witchcraft page.
    • An age-of-accused filter (where age is recorded) in our new Histropedia timeline feature.
    • All the unnamed accused witches on the Histropedia timeline.
    • Who named/denounced whom, and a network analysis of this over time.
    • Details of supposed witches’ meetings (locations, how the meetings were characterised and what supposedly happened at them).
    • Types of pact with the Devil – “Descriptions of meeting the Devil and entering a pact with him feature in the majority of records that have detailed information. This relationship with the Devil was crucial to the church and the law in proving someone was guilty. 90% of those whose records show demonic features were women. Many people were tortured into confessing to Devil-worship. For more information about the types of pact please refer to the Survey’s glossary of terms here: https://witches.hca.ed.ac.uk/glossary/”
    • Types of property damage supposedly caused by the accused.
    • A new contact form so anomalies and suggestions can be reviewed and addressed.
    • A new map of witch memorials and sites of interest (coming soon) – being collated and illustrated with images and links.
    • A Curious Edinburgh tour of Edinburgh locations associated with the Scottish witch trials (coming soon) – more research to be done.

Histropedia timeline with Survey pages embedded in each entry

Year on year, we are building on the work that has gone before and striving to respect and honour its legacy, with the human stories of the accused women always at the heart of what we do. Our ‘Witchfinder General’ student interns in 2022 (Maggie Lin and Josep Garcia-Reyero) added a great deal of data on the witchcraft investigations, and in 2023 Ruby Imrie has helped to turn this data into something quality-assured, parseable and implementable on our website: adding dates of witchcraft investigations so we can explore timelines; creating filters to explore individual aspects of the investigations; creating network analyses of who named/denounced whom; and demonstrating how the hysteria of the witch trials spread across Scotland in space and time during panic and non-panic periods, so we can better understand and illuminate this dark period of Scottish history.

If you are curious about how this all started, you can watch our very first ‘Witchfinder General’ student intern, Emma Carroll, talking about her 2019 work hunting for the places of residence of all the accused witches in Scotland, so they could be geolocated on a map and their individual stories discovered, remembered and brought home to their local communities.

Witchy Wikidata – a 6th birthday celebration event for Halloween

Wikidata, “the source for open structured data on the web and for facts within Wikipedia”, is turning 6 years old at the end of October 2018, so we are hosting a birthday celebration on Wednesday 31st October 2018, in time for Halloween, in Teaching Studio LG.07, David Hume Tower, University of Edinburgh.

Wikidata is a free and open data repository of the world’s knowledge that anyone can read & edit. Wikidata’s linked database acts as central storage for the structured data of its Wikimedia sister projects.

Using Wikidata, information on Wikipedia can be queried & visualised as never before. The sheer versatility of how this data can be used is only just beginning to be understood & explored.

In this session we will explain why Wikidata is so special, why its users are so excited by the possibilities it offers, why it may overtake Wikipedia in years to come as the project to watch and how it is quietly on course to change the world.

Pumpkinpedia

What will the session include?

  • An introduction to Wikidata: what it is, why it is useful and all the amazing things that can be done with structured, linked, machine-readable open data.
  • A practical activity using the Survey of Scottish Witchcraft database where you will learn the ‘nuts & bolts’ of how to use and edit Wikidata (manually and in bulk) and help shape the future of open knowledge!
  • A practical guide to querying Wikidata using the SPARQL Query Service.
  • Cake and Wikidata swag to take home.

Who should attend?

Absolutely anyone can use Wikidata for something, so people of all disciplines and walks of life are encouraged to attend this session. Basic knowledge of using the internet will be needed for the practical activity, but there are no other pre-requisites.

Anyone interested in open knowledge, academic research, application development or data visualisation should come away buzzing with exciting new ideas!

NB: Please bring a laptop with you OR email ewan.mcandrew@ed.ac.uk at least 24 hours ahead of the event if you need to borrow one.

Please also create a Wikidata account ahead of the event.

Programme

  • 10:45 – 11:00: Welcome, Tea/Coffee, Registration
  • 11:00 – 11:30: Introduction to Wikidata – what is it, and why is it useful? – Dr. Sara Thomas, Scotland Programme Co-ordinator for Wikimedia UK.
  • 11:30 – 12:30: Introduction to SPARQL queries – Delphine Dallison (Wikimedian at the Scottish Library and Information Council).
  • 12:30 – 13:00: Break for lunch
  • 13:00 – 14:30: Witchy data session – Ewan McAndrew (Wikimedian in Residence at the University of Edinburgh).
    • Manual edits practical – adding data from the Survey of Scottish Witchcraft database to Wikidata.
    • Mass edits practical – adding data in bulk from the Survey of Scottish Witchcraft database to Wikidata.
    • Visualising the results
  • 14:30 – 14:45: Close and thanks.

Book here to attend.

If coming from outside the University of Edinburgh then book your place via Eventbrite here.

North Berwick witches – the logo for the Survey of Scottish Witchcraft database (Public Domain, via Wikimedia Commons)

Wikidata in the Classroom and the WikiCite project

The following post was presented by Wikimedian in Residence, Ewan McAndrew, at the Repository Fringe Conference 2018 held on 2nd & 3rd July 2018 at the Royal Society of Edinburgh.

 

Hi, my name’s Ewan McAndrew and I work at the University of Edinburgh as the Wikimedian in Residence.

My talk’s in two parts:

The first part is on teaching data literacy with the Survey of Scottish Witchcraft database and Wikidata.

Contention #1: since the City Region deal, there is a pressing need to implement data literacy in the curriculum to produce a workforce equipped with the data skills necessary to meet the needs of Scotland’s growing digital economy, and this therefore presents a massive opportunity for educators, researchers, data scientists and repository managers alike.

Wikidata is the sister project of Wikipedia and is the backbone of all the Wikimedia projects: a centralised hub of structured, machine-readable, multilingual linked open data. An introduction to Wikidata can be found here.

I was invited, along with 13 other ‘problem holders’, to a ‘Data Fair’ on 26 October 2017 hosted by the course leaders on the Data Science for Design MSc. We were each afforded just five minutes to pitch a dataset for the 45 students on the course to work on in groups as a five-week-long project.

The ‘Data Fair’ held on 26 October 2017 for Data Science for Design MSc students. CC-BY-SA, own work.

Two groups of students volunteered to help surface the data from the Survey of Scottish Witchcraft database, a fabulous piece of work at the University of Edinburgh from 2001-2003 chronicling information about accused witches in Scotland in the period 1563-1736, their trials and the individuals involved in those trials (lairds, sheriffs, prosecutors etc.), which had remained somewhat static and unloved in a Microsoft Access database since the project concluded in 2003. Students at the 2017 Data Fair were therefore invited to consider what could be done if the data were exported into Wikidata with attribution, linking back to the source database to provide verifiable provenance, given multilingual labels and linked to other complementary datasets. Beyond this, what new insights & visualisations of the data could be achieved?

There were several areas of interest: course leaders on the Data Science for Design MSc were keen for the students to work with ‘real world’ datasets in order to give them practical experience ahead of their dissertation projects.

 “A common critique of data science classes is that examples are static and student group work is embedded in an ‘artificial’ and ‘academic’ context. We look at how we can make teaching data science classes more relevant to real-world problems. Student engagement with real problems—and not just ‘real-world data sets’—has the potential to stimulate learning, exchange, and serendipity on all sides, and on different levels: noticing unexpected things in the data, developing surprising skills, finding new ways to communicate, and, lastly, in the development of new strategies for teaching, learning and practice.”

Towards Open-World Scenarios: Teaching the Social Side of Data Science by Dave Murray-Rust, Joe Corneli and Benjamin Bach.

Beyond this, there were other benefits to the exercise. Tim Berners-Lee, the inventor of the Web, has suggested a 5-star deployment scheme for Open Data (illustrated in the picture and table below). Importing data into Wikidata makes it 5 star data!

By Michael Hausenblas, James G. Kim, five-star Linked Open Data rating system developed by Tim Berners-Lee. (http://5stardata.info/en/) [CC0], via Wikimedia Commons

Number of stars, description, properties and example format:

  • ★ – make your data available on the Web (whatever format) under an open license. Properties: open license. Example format: PDF.
  • ★★ – make it available as structured data (e.g., Excel instead of image scan of a table). Properties: open license; machine readable. Example format: XLS.
  • ★★★ – make it available in a non-proprietary open format (e.g., CSV instead of Excel). Properties: open license; machine readable; open format. Example format: CSV.
  • ★★★★ – use URIs to denote things, so that people can point at your stuff. Properties: open license; machine readable; open format; data has URIs. Example format: RDF.
  • ★★★★★ – link your data to other data to provide context. Properties: open license; machine readable; open format; data has URIs; linked data. Example format: LOD.

Importing data into Wikidata makes it 5 star data!

Open data producers can use Wikidata IDs as identifiers in datasets to make their data 5 star linked open data. As of June 2018, Wikidata featured in the latest Linked Open Data cloud diagram on lod-cloud.net as a dataset published in the linked data format containing over 5,100,000,000 triples.

Over a series of workshops, the Wikidata assignment also afforded the students the opportunity to develop their understanding of, and engagement with, issues such as: data completeness; data ethics; digital provenance; data analysis; data processing; as well as making practical use of a raft of tools and data visualisations. It also motivated student volunteers to surface a much-loved repository of information as linked open data to enable further insights and research; a project that the students felt proud to take part in and found “very meaningful”. (The students even took the opportunity to consult with professors of History at the university in order to gain a deeper understanding of the period in which these witch trials took place, such was their interest in the subject.)

Feedback from students at the conclusion of the project included:

  • “After we analysed the data, we found we learned the stories of the witches and we learned about European culture especially in the witchhunts.”
  • “We had wanted to do a happy project but finally we learned much more about these cultures so it was very meaningful for us.”
  • “In my opinion, it’s quite useful to put learning practice into the real world so that we can see the outcome and feel proud of ourselves… we learned a lot.”
  • “Thank you for inviting us and appreciating our video. It’s an unforgettable experience in my life. Thank you so much.”

As a result of the students’ efforts, we now have 3,219 items of data on the accused witches in Wikidata (spanning 1563 to 1736). We also now have data on 2,356 individuals involved in trying these accused witches. Finally, we have 3,210 items for the witch trials themselves. This means we can link and enrich the data further by adding location data, dates, occupations, places of residence, social class, marriages, and penalties arising from the trials.
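Because the accused, the trials and the people involved are now ordinary Wikidata items, they can be queried programmatically. Below is a minimal sketch of such a query run from Python against the Wikidata Query Service; it assumes P4478 is the Survey of Scottish Witchcraft accused-witch identifier property and P551 is the ‘residence’ property, so verify both on Wikidata before relying on it.

```python
import requests

ENDPOINT = "https://query.wikidata.org/sparql"

QUERY = """
SELECT ?accused ?accusedLabel ?residenceLabel WHERE {
  ?accused wdt:P4478 ?surveyId .                 # linked to the Survey database (assumed property)
  OPTIONAL { ?accused wdt:P551 ?residence . }    # place of residence, if recorded
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 20
"""

response = requests.get(
    ENDPOINT,
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "witchcraft-lod-example/0.1 (educational sketch)"},
)
response.raise_for_status()
for row in response.json()["results"]["bindings"]:
    name = row["accusedLabel"]["value"]
    residence = row.get("residenceLabel", {}).get("value", "no residence recorded")
    print(f"{name}: {residence}")
```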

The fact that Wikidata is also linked open data means that students can help connect to and leverage from a variety of other datasets in multiple languages; helping to fuel discovery through exploring the direct and indirect relationships at play in this semantic web of knowledge.

 

Descendants of King James VI and I, king during the union of the English and Scottish crowns

And we can see an example of this semantic web of related entities, or historical individuals in this case, here in this visualisation of the descendants of King James VI of Scotland and I of England (as shown in the picture above, but do click on the link for a live rendering).

We can also see the semantic web at play in the class-level overview below of gene ontologies (505,000 objects) loaded into Wikidata, linking these genes to items of data on related proteins and related diseases, which in turn have related chemical compounds and pharmaceutical products used to treat these diseases. Many of these datasets have been loaded into Wikidata by, or are maintained by, the GeneWiki initiative (around a million Wikidata items of biomedical data) but, importantly, they also leverage other datasets imported from the Centers for Disease Control and Prevention (CDC) among other sources. This allows researchers to add to and explore the direct and, perhaps more importantly, the indirect relationships at play in this semantic web of knowledge to help identify areas for future research.

 

Using Wikidata as an open, community-maintained database of biomedical knowledge – CC-BY: Andrew Su, Professor at The Scripps Research Institute.

Which brings me onto…

Contention #2 – Building a bibliographical repository: the sum of all citations

Sharing your data with Wikidata, a linking hub for the internet, is also the most cost-effective way to surface your repository’s data and make it 5-star linked open data. As a centralised hub for linked open data on the internet, it enables you to leverage many other datasets while still building your own read/write applications on top of Wikidata. (This is exactly what the GeneWiki project did to encourage domain experts to fill knowledge gaps on Wikidata, by providing a user-friendly read/write interface, the WikiGenome web application, to enable the “consumption and curation” of gene annotation data.)

Within Wikidata, we have biographical data, geographical data, biomedical data, taxonomic data and, importantly, bibliographic data.

The WikiCite project is building a bibliographic repository of sources within Wikidata.

“How does the Wikimedia movement empower individuals to assess reliable sources and arm them with quality information so they can make decisions based in facts? This question is relevant not only to Wikipedia users​ but to consumers of media around the globe. Over the past decade, the Wikimedia movement has come together to answer that question. Efforts to design better ways to support sourcing have begun to coalesce around Wikidata – the free knowledgebase that anyone can edit. With the creation of a rich, human-curated, and machine-readable knowledgebase of sources, the WikiCite initiative is crowdsourcing the process of vetting information​ and its provenance.” – WikiCite Report 2017

Wikidata tools can be used to create Wikidata items on scholarly papers automatically from scraping source metadata from:

  • DOIs,
  • PMIDs,
  • PMCIDs
  • ORCIDs (NB: Multiple items of data can be created simultaneously to represent multiple scholarly papers using one ORCID identifier input in the Orcidator tool).

Indeed, 1 out of 4 items of data in Wikidata represents a creative work. Wikidata currently includes 10 million entries about citable sources, such as books, scholarly papers and news articles, with over 75 million author string statements and 84 million citation links in Wikidata between these authors and sources. There are 17 million items with a PubMed ID and 12.4 million items with a DOI.

Mike Bennett, our Digital Scholarship Developer at the University of Edinburgh, is working to develop a tool to translate the Edinburgh Research Archive’s thesis collection data from ALMA into a format that Wikidata can accept. There are also ready-made tools, developed by Wikidatans, that will automatically create a Wikidata item for a scholarly paper by scraping the source metadata from DOIs, PubMed IDs and ORCID identifiers, allowing a bibliographic record of scholarly papers and their authors to be generated as structured, machine-readable, multilingual linked open data.

Why does this matter?

Well… the Initiative for Open Citations (I4OC) is a new collaboration between scholarly publishers, researchers, and other interested parties to promote the unrestricted availability of scholarly citation data. Over 150 publishers have now chosen to deposit and open up citation data. As a result, the fraction of publications with open references has grown from 1% to more than 50% out of 38 million articles with references deposited with Crossref.

“Citations are the links that knit together our scientific and cultural knowledge. They are primary data that provide both provenance and an explanation for how we know facts. They allow us to attribute and credit scientific contributions, and they enable the evaluation of research and its impacts. In sum, citations are the most important vehicle for the discovery, dissemination, and evaluation of all scholarly knowledge.”

Once made open, the references for individual scholarly publications may be accessed within a few days through the Crossref REST API. Open citations are also available from the OpenCitations Corpus, which is progressively and systematically harvesting citation data from Crossref and other sources. An advantage of accessing citation data from the OpenCitations Corpus is that it is available in machine-readable RDF format, which is systematically being added to Wikidata.

Because this data on scholars, scholarly papers and citations is stored as linked data on Wikidata, the citation data can be linked to, and leverage, other complementary datasets, enabling the direct and indirect relationships in this semantic web of knowledge to be explored.

This means we can parse the data to answer a range of queries (the first of which is sketched in code below), such as:

  • Show me all works which cite a New York Times article/Washington Post article/Daily Telegraph article etc. (delete as appropriate).
  • Show me the most popular journals cited by statements of any item that is a subclass of economics/archaeology/mathematics etc. (delete as appropriate).
  • Show me all statements citing the works of Joseph Stiglitz/Melissa Terras/James Loxley/Karen Gregory etc. (delete as appropriate).
  • Show me all statements citing journal articles by physicists at Oxford University in 1960s/1970s/1980s etc. (delete as appropriate).
  • Show me all statements citing a journal article that was retracted.

And much more besides.
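As an illustration, here is a sketch of how the first query in the list above might be expressed and run from Python against the Wikidata Query Service. The property and item numbers used (P2860 “cites work”, P1433 “published in”, Q9684 for The New York Times) are assumptions to double-check on Wikidata rather than values taken from this post.

```python
import requests

QUERY = """
SELECT ?work ?workLabel ?cited ?citedLabel WHERE {
  ?work wdt:P2860 ?cited .        # ?work cites ?cited
  ?cited wdt:P1433 wd:Q9684 .     # ?cited was published in The New York Times (assumed item)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 25
"""

r = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "wikicite-example/0.1 (educational sketch)"},
)
r.raise_for_status()
for b in r.json()["results"]["bindings"]:
    print(f'{b["workLabel"]["value"]}  cites  {b["citedLabel"]["value"]}')
```

The other queries in the list follow the same pattern, swapping in the relevant properties and items.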

Screengrab of the Scholia profile for the developmental psychologist, Uta Frith, generated from the structured linked data in Wikidata.

 

Like the WikiGenome web application already mentioned, other third-party applications can be built with user-friendly UIs to read from and write to Wikidata. For instance, the Scholia web service creates on-the-fly scholarly profiles for researchers, organizations, journals, publishers, individual scholarly works and research topics. Leveraging information in Wikidata, Scholia displays total numbers of publications, co-authors and citation statistics in a variety of visualisations: another way of helping to demonstrate the impact and reach of your research.

Citation statistics for developmental psychologist Uta Frith, visualised on the Scholia web service and generated from the citation data in Wikidata.

Co-author graph for Polly Arnold, Crum Brown Chair of Chemistry in the School of Chemistry at the University of Edinburgh, visualised in the Scholia web service and generated from bibliographic data in Wikidata.

To conclude, the power of linked open data to aid the teaching of data literacy and to help share knowledge between different institutions and repositories, between geographically and culturally separated societies, and between languages is a beautiful, empowering thing. Here’s to more of it and to entering a brave new world of linked open data. Thank you.

By way of closing, I’d like to show you the video presentations that the students on the Data Science for Design MSc came up with as the final outcome of their project to import the Survey of Scottish Witchcraft database into Wikidata.

Here are two data visualisation videos they produced:

Further reading

 3 steps to better demonstrate your institution’s commitment to Open Knowledge and Open Science.

  1. Allocate time/buy out time for academics & postdoctoral researchers to add university research (backed up with citations) to Wikipedia in existing or new pages. Establishing relevance is the most important aspect of adding university research, so an understanding of the subject matter is important, along with ensuring that the balance of edits meets the ethos of Wikipedia and that any possible suggestion of promotion or academic boosterism is outweighed by the benefit of subject experts paying knowledge forward for the common good. At least three references are required for a new article on Wikipedia, so citing the work of fellow professionals goes some way to ensuring the article has wider notability and helps pay it forward. Train contributors prior to editing to ensure they are aware of Wikipedia’s policies & guidelines, and monitor their contributions to ensure edits are not reverted.
  2. Identify the most cited works by your university’s researchers which are already on Wikipedia using Altmetric software. Once identified, systematically add in the Open Access links to any existing (paywalled) citations on Wikipedia and complete the edit by adding in the OA symbol (the orange padlock) using the {{open access}} template. Also join WikiProject Open Access.
  3. Help build up a bibliographic repository of structured machine-readable (and multilingual) linked open data on both university researchers AND research papers in Wikidata using the easy-to-use suite of tools available.

Wikidata in the Classroom – Data Literacy for the next generation

Summary:

The University of Edinburgh is looking to support the development of a data-literate workforce over the next ten years to serve Scotland’s growing digital economy. This represents a huge opportunity for educators, researchers and data scientists to support students in this aim. The first Wikidata in the Classroom assignment at the university is taking place this semester on the Data Science for Design MSc course, where two groups of students are working on a project to import the Survey of Scottish Witchcraft database into Wikidata, to see what surfacing this data as structured linked open data can achieve.

Wikidata in the Classroom

The New York Times described this current era as an ‘era of data but no facts’. Data is increasingly valuable as a key driver of the 21st century economy and is certainly abundant with 90% of the world’s data reportedly created in the last two years. Yet, it has never been more difficult to find ‘truth in the numbers’ with over 60 trillion pages to navigate and terabytes of unstructured data to (mis)interpret.

The way forward is clear.

  • “We need to increase the reputational consequences and change the incentives for making false statements… right now, it pays to be outrageous, but not to be truthful.” (Nyhan in the Economist, 2016)
  • “This challenge is not just for school librarians to prepare the next generation to be informed but for all librarians to assist the whole population.” (Abram, 2016)

Issues at the heart of the information age have been exposed: there exists a glut of information & a sea of data to navigate with little formalised guidance as to how to find our way through it. For the beleaguered student, this glut makes it near impossible to find ‘truth in the numbers’. Therefore there are huge areas of convergence in developing information & data literacy in the next generation and developing Wikidata as a linked hub of verifiable data; fueling discovery and surfacing open knowledge through Google’s Knowledge Graph but, importantly, providing the digital provenance so it can be checked.

Meeting the information & data literacy needs of our students

The Edinburgh and South East Scotland City Region has recently secured a £1.1bn City Region deal from the UK and Scottish Governments. Out of this amount, the University of Edinburgh will receive in the region of £300 million towards making Edinburgh the ‘data capital of Europe’ through developing data-driven innovation. Data “has the potential to transform public and private organisations and drive developments that improve lives.” More specifically, the university is being trusted with the responsibility of delivering a data-literate workforce of 100,000 young people over the next ten years; a workforce equipped with the data skills necessary to meet the needs of Scotland’s growing digital economy.

The implementation of Wikidata in the curriculum therefore presents a massive opportunity for educators, researchers and data scientists alike; not least in honouring the university’s commitment to the creation, curation & dissemination of open knowledge. A Wikidata assignment allows students to develop their understanding of, and engagement with, issues such as: data completeness; data ethics; digital provenance; data analysis; data processing; as well as making practical use of a raft of tools and data visualisations. The fact that Wikidata is also linked open data means that students can help connect to & leverage a variety of other datasets in multiple languages; helping to fuel discovery through exploring the direct and indirect relationships at play in this semantic web of knowledge. This real-world application of teaching and learning enables insights in a variety of disciplines; be it in open science, digital humanities, cultural heritage, open government and much more besides. Wikidata is also a community-driven project, so this allows students to work collaboratively and develop the online citizenship skills necessary in today’s digital economy.

Data Science for Design MSc – Importing the Survey of Scottish Witchcraft database into Wikidata

Packed house at the Data Fair for the Data Science for Design MSc course – 26 October 2017 (Own work, CC-BY-SA)


At the University of Edinburgh, we have begun our first Wikidata in the Classroom assignment this semester on the Data Science for Design MSc course. At the course’s Data Fair on 26th October 2017, researchers from across the university presented the 45 masters students in Design Informatics with approximately 13 datasets to choose from to work on in groups of three. Happily, two groups were enthused to import the university’s Survey of Scottish Witchcraft database into Wikidata (the choice of database to propose was suggested by a colleague). This fabulous resource began life in the 1990s before being completed in 2001-2003. Its aim was to collect, collate and record all known information about accused witches and witchcraft belief in early modern Scotland (from 1563 to 1736) in a Microsoft Access database and to create a web-based user interface for the database. Since 2003, the data had remained static in the Access database, and so students at the 2017 Data Fair were invited to consider what could be done if the data were exported into Wikidata, given multilingual labels and linked to other datasets. Beyond this, what new insights & visualisations of the data could be achieved?

The methodology

A similar methodology to managing Wikipedia assignments was employed, making the transition from managing a Wikipedia assignment to managing a Wikidata assignment an easy one. The two groups of students underwent a 1.5-hour practical induction on working with Wikidata and third-party applications such as Histropedia, the timeline of everything, before being introduced to the Access database. They then discussed collaboratively how best to divide the task of analysing and exporting the data, before deciding that one group would work on (1) importing records for the 3,212 accused witches, while the other group would work on (2) importing the witch trial records and (3) the people associated with these trials (lairds, judges, ministers, prosecutors, witnesses etc.).

At this current juncture, the groups have researched and submitted their data models for review. Now that the proposed data models have been checked and agreed upon, the students are ready to process the data from the Access database into a format Wikidata can import (making use of the Wikidata plug-in for Google Spreadsheets). Once this stage is complete, the students can then choose how to visualise the linked data in a number of ways, such as maps, timelines, graphs, bubble charts and more. The students are to complete their project by presenting their insights and data visualisations in an engaging way of their choice on the 30th of November 2017.
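The bulk import itself uses the Wikidata plug-in for Google Spreadsheets. Purely to illustrate the kind of transformation this step involves, here is a hypothetical Python sketch that turns rows of a CSV export into QuickStatements-style bulk-upload commands; the column names, and the use of P4478 as the Survey’s accused-witch identifier property, are illustrative assumptions rather than the course’s actual pipeline.

```python
import csv

def rows_to_quickstatements(csv_path: str) -> str:
    """Convert a (hypothetical) CSV export of accused witches into
    QuickStatements v1 commands for bulk creation of Wikidata items."""
    lines = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            lines.append("CREATE")                        # start a new item
            lines.append(f'LAST\tLen\t"{row["name"]}"')   # English label
            lines.append('LAST\tDen\t"accused witch in early modern Scotland"')
            lines.append("LAST\tP31\tQ5")                 # instance of: human
            lines.append(f'LAST\tP4478\t"{row["accused_id"]}"')  # Survey accused-witch ID (assumed property)
    return "\n".join(lines)

# Example usage (file names are illustrative):
# with open("quickstatements.txt", "w", encoding="utf-8") as out:
#     out.write(rows_to_quickstatements("accused_witches_export.csv"))
```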

North Berwick witches – the logo for the Survey of Scottish Witchcraft database (Public Domain, via Wikimedia Commons)

The way forward

The hope is that this project will aid the students’ understanding of data literacy through the practical application of working with a real-world dataset and help shed new light on a little-understood period of Scottish history. This, in turn, may help fuel discoveries by dint of surfacing this data and linking it with other related datasets across the UK, across Europe and beyond. As the Survey of Scottish Witchcraft website itself states: “Our list of people involved in the prosecution of witchcraft suspects can now be used as the basis for further inquiry and research.”

The power of linked open data to share knowledge between different institutions, between geographically and culturally separated societies, and between languages is a beautiful thing. Here’s to many more Wikidata in the Classroom assignments.
