For Repo-Fringe 2016, Histropedia’s Navino Evans and I will be co-presenting a showcase of two of Wikipedia’s sister projects: Wikisource, the free content library, and Wikidata, the structured data knowledge base. With both projects, it is not so much about what they hold in their repositories as what that knowledge means to the user able to access it: be it communing with the past through Wikisource, for those authentic ‘shiver-inducing’ moments of digital contact with library & archival materials, or manipulating & visualising structured data through Wikidata, actually querying & utilising information on Wikipedia in myriad ways as never before. The possibilities for both projects are endless and highlight the importance of curating & safeguarding repositories of open knowledge such as these.
Hence our showcase event, as part of Repository Fringe 2016 on 2nd August at the John McIntyre Conference Centre in Edinburgh, will focus on this and provide practical demonstrations of how to engage with the past, present & future with these two projects.
Consequently, the English teacher part of me has opted for a title which attempts to sum this up:
“It’s not what you do. It’s what it does to you.”
Wikidata and Wikisource Showcase – 2nd August 2016
Engaging with the past, present & future with Wikipedia’s sister projects.
This is a nod to Simon Armitage’s poem, ‘It ain’t what you do, it’s what it does to you’, a hymn of praise to the experiential.
It ain’t what you do, it’s what it does to you
I have not bummed across America
with only a dollar to spare, one pair
of busted Levi’s and a bowie knife.
I have lived with thieves in Manchester.
I have not padded through the Taj Mahal,
barefoot, listening to the space between
each footfall picking up and putting down
its print against the marble floor. But I
skimmed flat stones across Black Moss on a day
so still I could hear each set of ripples
as they crossed. I felt each stone’s inertia
spend itself against the water; then sink.
I have not toyed with a parachute cord
while perched on the lip of a light-aircraft;
but I held the wobbly head of a boy
at the day centre, and stroked his fat hands.
And I guess that the tightness in the throat
and the tiny cascading sensation
somewhere inside us are both part of that
sense of something else. That feeling, I mean.
The famous aviation poem written in 1941 by 19-year-old Pilot Officer John Gillespie Magee Jr, three months before he was killed.
Oh! I have slipped the surly bonds of earth,
And danced the skies on laughter-silvered wings;
Sunward I’ve climbed, and joined the tumbling mirth
Of sun-split clouds, –and done a hundred things
You have not dreamed of –Wheeled and soared and swung
High in the sunlit silence. Hov’ring there
I’ve chased the shouting wind along, and flung
My eager craft through footless halls of air…
Up, up the long, delirious, burning blue
I’ve topped the wind-swept heights with easy grace
Where never lark or even eagle flew —
And, while with silent lifting mind I’ve trod
The high untrespassed sanctity of space,
Put out my hand, and touched the face of God.
I was pleased we were able to host a week themed on ‘Wikimedia & Open Knowledge’ as part of the University of Edinburgh’s Postgraduate Certificate of Academic Practice.
Participants on the course were invited to think critically about the role of Wikipedia in academia and, in particular, to read, consider, contrast and discuss four articles:
The first, by Dr. Martin Poulter, Wikimedian in Residence at the University of Oxford, is highly recommended for the way it articulates the role of Wikipedia & its sister projects in allowing digital ‘shiver-inducing’ contact with library & archival material.
This was my response to the reading (and some additional reading).
Search failure: the challenges facing information retrieval in an age of information explosion.
This article takes, as its starting point, the news that Wikipedia were reportedly developing a ‘Knowledge Engine’ and focuses on the most dominant web search engine, Google, to examine the “consecrated status” (Hillis, Petit & Jarrett, 2013) it has achieved and its transparency, reliability & trustworthiness for everyday searchers.
The purpose of this article is to examine the pitfalls of modern information retrieval & attempts to circumvent them, with a focus on the main issues surrounding Google as the world’s most dominant search engine.
“Commercial search engines dominate search-engine use of the Internet, and they’re employing proprietary technologies to consolidate channels of access to the Internet’s knowledge and information.” (Cuthbertson, 2016)
On 16th February 2016, Newsweek published a story entitled ‘Wikipedia Takes on Google with New ‘Transparent’ Search Engine’. The figure applied for from, and granted by, the Knight Foundation was a reported $250,000 as part of the Wikimedia Foundation’s $2.5 million programme to build ‘the Internet’s first transparent search engine’.
The sum applied for was relatively insignificant when compared to Google’s reported $75 billion revenue in 2015 (Robinson, 2016). Yet, it posed a significant question; a fundamental one. Just how transparent is Google?
Two further concerns can be identified from the letter to Wikimedia granting the application: “supporting stage one development of the Knowledge Engine by Wikipedia, a system for discovering reliable and trustworthy public information on the Internet.” (Cuthbertson, 2016). This goes to the heart of the current debate on modern information retrieval: transparency, reliability and trustworthiness. How then are we faring in these three measures?
Defining Information Retrieval
Information Retrieval is defined as “a field concerned with the structure, analysis, organisation, storage, searching, and retrieval of information.” (Salton in Croft, Metzler & Strohman, 2010, p.1).
Croft et al. (2010) identify three crucial concepts in information retrieval:
Relevance – Does the returned value satisfy the user searching for it?
Evaluation – Evaluating the ranking algorithm on its precision and recall.
Information Needs – What needs generated the query in the first place?
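As a rough illustration of the ‘Evaluation’ concept above, precision and recall for a single query can be worked out from the set of pages a search engine returned and the set a human judge considers relevant. This is only a minimal sketch: the page identifiers and the function name are made up for the example and are not taken from Croft et al.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = share of the retrieved pages that are relevant
    recall    = share of the relevant pages that were retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: the engine returns four pages, two of which are
# among the five pages a human judge marked as relevant.
returned_pages = ["d1", "d2", "d3", "d4"]
relevant_pages = ["d2", "d4", "d5", "d6", "d7"]
p, r = precision_recall(returned_pages, relevant_pages)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.40
```

High precision with low recall means the searcher sees mostly good pages but misses many others; the reverse means little is missed but much has to be waded through.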
Today, since the advent of the internet, this definition needs to be understood in terms of how pervasive ‘search’ has become. “Search is the way we now live.” (Darnton in Hillis, Petit & Jarrett, 2013, p.5). We are all now ‘searchers’ and the act of ‘searching’ (or ‘googling’) has become intrinsic to our daily lives.
Dominance of one search engine
“When you turn on a tap you expect clean water to come out and when you do a search you expect good information to come out” (Swift in Hillis, Petit & Jarrett, 2013)
With over 60 trillion pages (Fichter and Wisniewski, 2014) and terabytes of unstructured data to navigate, the need for speedy & accurate responses to millions of queries has never been greater.
Navigating the vast sea of information present on the web means that the field of Information Retrieval involves wrestling with, and constantly tweaking, the design of complex computer algorithms (which determine a top 10 list of ‘relevant’ page results using over 200 factors).
Google, powered by its PageRank algorithm, has dominated I.R. since the late 1990s, indexing the web like a “back-of-the-book” index (Chowdhury, 2010, p.5). While this oversimplifies the complexity of the task, modern information retrieval, in searching through increasingly multimedia online resources, has necessitated the addition of newer, more sophisticated models. Utilising ‘artificial intelligence’ & semantic search technology to complement the PageRank algorithm, Google now navigates through the content of pages & generates suggested ‘answers’ to queries as well as the 10 clickable links users commonly expect.
According to 2011 figures in Hillis, Petit & Jarrett (2013), Google processed 91% of searches internationally and 97.4% of the searches made using mobile devices. This undoubted & sustained dominance has led to accusations of abuse of power in two recent instances.
Nicas & Kendall (2016) report that the Federal Trade Commission, along with European regulators, is examining claims that Google has been abusing its position, with smartphone companies feeling they had to give Google Services preferential treatment because of Android’s dominance.
In addition, Robinson (2016) states that the Authors Guild are petitioning the Supreme Court over Google’s alleged copyright infringement, dating back a decade to when over 20 million library books were digitised without compensation or author/publisher permission. The argument is that the content taken has since been utilised by Google for commercial gain to generate more traffic and more advertising money, and thus to confer on it market-leader status. This echoes the New Yorker article’s response to Google’s aspiration to build a digital universal library: “Such messianism cannot obscure the central truth about Google Book Search: it is a business” (Toobin in Hillis, Petit & Jarrett, 2013).
Google’s business, like that of every search engine, is powered by its ranking algorithm. For Cahill et al (2009), Google’s “PageRank is a quantitative rather than qualitative system”. PageRank works by ranking pages in terms of how well linked a page is, how often it is clicked on and the importance of the page(s) that link to it. In this way, PageRank assigns importance to a page.
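To make the link-based part of this concrete, here is a minimal sketch of the textbook PageRank calculation, in which a page’s importance is built up from the importance of the pages linking to it. It is an illustration under stated assumptions, not Google’s production ranking (which is not public): it uses link structure only and ignores clicks, the damping value of 0.85 is the commonly quoted default, and the four-page ‘web’ is invented for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on equally to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Invented toy web: three pages link to "c", and "c" links back to "a".
toy_web = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy web, “c” comes out on top because it is the most linked-to page, and “a” comes second purely because the important page “c” links to it: a quantitative judgement, with nothing said about the quality of either page.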
Other parameters are taken into consideration including, most notably, the anchor text which provides a short descriptive summary of the page it links to. However, the anchor text has been shown to be vulnerable to manipulation, primarily from bloggers, by the process known as ‘Google bombing’. Google bombing is defined as “the activity of designing Internet links that will bias search engine results so as to create an inaccurate impression of the search target” (Price in Bar-Ilan, 2007). Two famous examples include when Microsoft came up as the top result for the query ‘More evil than Satan’ and when President Bush ranked as the first result for ‘miserable failure’. Bar-Ilan (2007) suggests google bombs come about for a variety of reasons: ‘fun’, ‘personal promotion’, ‘commercial’, ‘justice’, ‘ideological’ and ‘political’.
Although Google was reluctant to alter search results, the reputational damage google bombs were causing necessitated a response. In the end, Google altered the algorithm to defuse a number of google bombs. Despite this, “spam or joke sites still float their way to the top.” (Cahill et al, 2009) so there is a clear argument to be had about Google, as a private corporation, continuing to ‘tinker’ with the results delivered by its algorithm and how much its coders should, or should not, arbitrate access to the web in this way. After all, the algorithm will already bear hallmarks of their own assumptions without any transparency on how these decisions are arrived at. Further, Google Bombs, Byrne (2004) argues, empower those web users whom the ranking system, for whatever reason, has disenfranchised.
Just how reliable & trustworthy is Google?
“Easy, efficient, rapid and total access to Truth is the siren song of Google and the culture of search. The price of access: your monetizable information.”(Hillis, Petit & Jarrett, 2013, p.7)
For Cahill et al (2009), Google has made the process of searching too easy and searchers have become lazier as a result, accepting Google’s ranking at face value. Markland in van Dijck (2010) makes the point that students’ favouring of Google means they are dispensing with the services libraries provide. The implication is that, despite library information services delivering a more relevant & higher quality search result, Google’s quick & easy ‘fast food’ approach is hard to compete with.
This seemingly default trust in the neutrality of Google’s ranking algorithm also has a ‘funnelling effect’ according to Beel & Gipp (2009); narrowing the sources clicked upon 90% of the time to just the first page of results with a 42% click through on the first choice alone. This then creates a cosy consensus in terms of the fortunate pages clicked upon which will improve their ranking while “smaller, less affluent, alternative sites are doubly punished by ranking algorithms and lethargic searchers.” (Pan et al. in van Dijck, 2010)
While Google would no doubt argue that all search engines closely guard how their ranking algorithms are calibrated to protect them from aggressive competition, click fraud and SEO marketing, the secrecy is clearly at odds with principles of public librarianship. Further, Van Dijck (2010) argues that this worrying failure to disclose is concealing how knowledge is produced through Google’s network and the commercial nature of Google’s search engine. After all, a search engine’s greatest asset is the metadata each search leaves behind. This data can be aggregated and used by the search engine to create profiles of individual search behaviour and collective profiles which can then be passed on to other commercial companies for profit. That is not to say it always does but there is little legislation to stop it in an area that is largely unregulated. The right to privacy does not, it seems, extend to metadata and ‘in an era in which knowledge is the only bankable commodity, search engines own the exchange floor.’ (Halavais in van Dijck, 2010)
Scholarly knowledge and the reliability of Google Scholar
When considering the reliability, transparency & trustworthiness of Google and Google Scholar it is pertinent to look at its scope and differences with other similar sites. Unlike Pubmed and Web of Science, Google Scholar is not a human-curated database but is instead an internet search engine; therefore its accuracy & content vary greatly depending on what has been submitted to it. Google Scholar does have an advantage in that it searches the full text of articles; therefore users may find searching easier on Scholar compared to WoS or Pubmed, which are limited to searching according to the abstract, citations or tags.
Where Google Scholar could be more transparent is in its coverage as some notable publishers have been known, according to van Dijck (2010), to refuse to give access to their databases. Scholar has also been criticised for the lack of completeness of its citations, as well as its coverage of social science and humanities databases; the latter an area of strength for Wikipedia according to Park (2011). But the searcher utilising Google Scholar would be unaware of these problems of scope when they came to use it.
Further, Beel & Gipp (2009) state that the ranking system on Google Scholar leads to articles with lots of citations receiving higher rankings and, as a result, even more citations. Hence, while the digitization of sources on the internet opens up new avenues for scholarly exploration, ranking systems can be seen to close ranks on a select few to the exclusion of others.
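The feedback loop described here can be sketched with a toy model: rank articles by citation count, hand out new citations according to position in the ranking, and repeat. Everything in it is an assumption made for illustration, including the article names, the starting counts and the split of new citations by position; it is not a description of how Google Scholar actually ranks.

```python
def simulate_feedback(citations, per_position, rounds=10):
    """Rank articles by citation count, then award new citations by rank.

    per_position[i] is the number of new citations (out of every 100)
    that go to whichever article sits at position i in the ranking.
    """
    counts = dict(citations)
    for _ in range(rounds):
        ranked = sorted(counts, key=counts.get, reverse=True)
        for position, article in enumerate(ranked):
            if position < len(per_position):
                counts[article] += per_position[position]
    return counts


# Three hypothetical papers that start out almost level.
start = {"paper_a": 12, "paper_b": 10, "paper_c": 9}
# Illustrative split only: the top position attracts the most attention.
split = [42, 20, 10]
end = simulate_feedback(start, split)
print(end)  # {'paper_a': 432, 'paper_b': 210, 'paper_c': 109}
```

A starting lead of two citations turns into a lead of more than two hundred after ten rounds, simply because the ranking keeps sending attention to whatever is already on top.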
As Van Dijck (2010) points out: “Popularity in the Google-universe has everything to do with quantity and very little with quality or relevance.” In effect, ranking systems determine which sources we can see but conceal how this determination has come about. This means that we are unable to truly establish the scope & relevance of our search results. In this way, search engines cannot be viewed as neutral, passive instruments but are instead active “actor networks” and “co-producers of academic knowledge.” (van Dijck, 2010).
Further, it can be argued that Google decides which sites are included in its top ten results. With so much to gain commercially from being discoverable on Google’s first page of results, the practice of Search Engine Optimisation (SEO), or manipulating the algorithm to get your site in the top ten search results, has become widespread. SEO techniques can be split into ‘white hat’ (legitimate businesses with a relevant product to sell) and ‘black hat’ (sites that just want clicks and tend not to care about the ‘spamming’ techniques they employ to get them). As a result, PageRank has to be constantly adjusted, as with Google bombs, to counteract the effects of increasingly sophisticated ‘black hat’ techniques. Hence, the need for an improved vigilance & critical evaluation of the searches returned by Google has become a crucial skill in modern information retrieval.
The solution: Google’s response to modern information retrieval – Answer Engines
Google is the great innovator and is always seeking newer, better ways of keeping users on its sites and improving its search algorithm. Hence, the arrival of Google Instant in 2010 to autofill suggested keywords to assist searchers. This was followed by Google’s Knowledge Graph (and its Microsoft equivalent Bing Snapshot). These new services seek not just to provide the top ten links to a search query but also to ‘answer’ it by providing a number of the most popular suggested answers on the page results screen (usually showing an excerpt of the related Wikipedia article & images along the side panel), based on, & learning from, previous users’ searches on that topic.
Google’s Knowledge Graph is supported by sources including Wikipedia & Freebase (and the linked data they provide) along with a further innovation, RankBrain, which utilises artificial intelligence to help decipher the 15% of queries Google has not seen before. As Barr (2016) recognises: “A.I. is becoming increasingly important to extract knowledge from Google’s sea of data, particularly when it comes to classifying and recognizing patterns in videos, images, speech and writing.”
Bing Snapshot does much the same. The difference is that Bing provides links to the sources it uses as part of the ‘answers’ it provides. Google provides information but does not attribute it. Without attribution, it is impossible to verify the accuracy of its answers. This seems to be one of the thorniest issues in modern information retrieval: link decay and the disappearing digital provenance of sources. This is in stark contrast to Wikimedia’s efforts in creating Wikidata: “an open-license machine-readable knowledge base” (Dewey, 2016) capable of storing digital provenance & structured bibliographic data. Therefore, while Google Knowledge Panels are a step forward, there are issues again over their transparency, reliability & trustworthiness.
Moreover, the 2014 EU Court ruling on ‘the right to be forgotten’, which Google have stated they will honour, also muddies the waters on issues of transparency & link decay/censorship:
“Accurate search results are vanishing in Europe with no public explanation, no real proof, no judicial review, and no appeals process…the result is an Internet riddled with memory holes — places where inconvenient information simply disappears.”(Fioretti, 2014).
The balance between an individual’s “right to be forgotten” and the freedom of information clearly still has to be found. At the moment, in the name of transparency, both Google and Wikimedia are posting notifications to affected pages that they have received such requests. For those wishing to be ‘forgotten’ this only highlights the matter & fuels speculation unnecessarily.
The solution: Wikipedia’s ‘transparent’ search engine: Discovery
Since the setup of the ‘Discovery’ team in April 2015 and the disclosure of the Knight Foundation grant, there have been mixed noises from Wikimedia with some claiming that there was never any plan to rival Google because a newer ‘internal’ search engine was only ever being developed in order to integrate Wikimedia projects through one search portal.
Ultimately, a lack of consultation between the board and the wider Wikimedia community members reportedly undermined the project & culminated in the resignation of Lila Tretikov, Executive Director of the Wikimedia Foundation, at the end of February and the plans for Discovery were shelved.
However, Sentance (2016) reveals that, in their leaked planning documents for Discovery, the Foundation were indeed looking at the priorities of proprietary search engines, their own reliance on them for traffic and how they could recoup traffic lost to Google (through Google’s Knowledge Graph) at the same time as providing a central hub for information from across all their projects through one search portal. Wikipedia results, after all, regularly featured in the top page of Google results anyway – why not skip the middle man?
Quite how internet searchers may have taken to a completely transparent, non-commercial search engine we’ll possibly never know. However, it remains a tantalizing prospect.
The solution: Alternative Engines
An awareness of the alternative search engines available for use and their different strengths and weaknesses is a key component of the information literacy needed to navigate this sea of information. Bing Snapshot, for instance, makes greater use of providing the digital provenance for its sources than Google at present.
Notess (2016) serves notice that computational searching (e.g. Wolfram Alpha) continues to flourish along with search engines geared towards data & statistics (e.g. Zanran, DataCite.org and Google Public Data Explorer).
However, knowing about the existence of these differing search engines is one thing but knowing how to successfully navigate them is quite another, as Notess (2016) himself concludes: “Finding anything beyond the most basic of statistics requires perseverance and experimenting with a variety of strategies.”
Information literacy, it seems, is key.
The solution: The need for information literacy
Given that electronic library services are maintained by information professionals, “values such as quality assessment, weighed evaluation & transparency” (van Dijck, 2010) are in much greater evidence than in commercial search engines. That is not to say that there aren’t still issues in library OPAC systems: whether it be in terms of the changes in the classification system used over time or the differing levels of adherence by staff to these classification protocols; or the communication to users of best practice in utilising the system.
The use of any search engine requires literacy among the user group. The fundamental problem remains the disconnect between what a user inputs and what they can feasibly expect at the results stage. Understanding the nature of the search engine being used (proprietary or otherwise), a critical awareness of how knowledge is formed through its network and the type of search statement that will maximise your chances of success are all vital. As van Dijck (2010) states: “Knowledge is not simply brokered (‘brought to you’) by Google or other search engines… Students and scholars need to grasp the implications of these mechanisms in order to understand thoroughly the extent of networked power”.
Educating users about this broadens the search landscape and defuses SEO attempts to circumvent our choices. Information literacy cannot be left to academics or information professionals alone, though they can play a large part in its dissemination. As mentioned at the beginning, we are all ‘searchers’. Therefore, it is incumbent on all of us to become literate in the ways of ‘search’ and pass it on, creating our own knowledge networks. Social media offers us a means of doing this, allowing us to filter information as never before, and filtering is “transforming how the web works and how we interact with our world.” (Swanson, 2012)
Google may never become any more transparent. Hence, its reliability & trustworthiness will always be hard to judge. Wikipedia’s Knowledge Engine may have offered a distinctive model more in line with these terms but it is unlikely, at least for now, to be able to compete as a global crawler search engine.
Therefore, it is incumbent on searchers not to presume neutrality or attribute any kind of benign munificence to any one search engine. Rather, by educating themselves as to the merits & drawbacks of Google and other search engines, users will be able to formulate their searches, and their use of search engines, with a degree of information literacy. Only then can they hope the returned results will match their individual needs with any degree of satisfaction or success.
Arnold, A. (2007). Artificial intelligence: The dawn of a new search-engine era. Business Leader, 18(12), p. 22.
In the article, Dewey points out that “people’s ability to verify information….is something I think we really need to study and process as a society.”
Link decay and the lack of digital provenance are real issues following Google’s introduction of ‘Knowledge Panels’, and ones which Wikidata, as an open-license, machine-readable knowledge base, seeks to correct.
Indeed “the primary issue with Google’s knowledge panels is that they aren’t terribly knowledgeable: They provide information but often leave out any context on where that information came from.”
That being said, here are some unattributed facts gleaned from Google searches about the teams competing in Euro 2016, beginning tomorrow (in order of their FIFA ranking)… with no possibility of their accuracy being verified as a result. Thank goodness for Wikipedia!
Citation Needed – “Facts” about teams competing in Euro 2016.
Belgium – produces the greatest variety of bricks in the world. The word “gas” was proposed by Flemish chemist Jan Baptist van Helmont (1577-1644) as a phonetic spelling of his Dutch pronunciation of the Greek word “chaos”.
Germany – In Germany, the soft drink, Fanta, is actually the star ingredient of a popular dessert called Fantakuchen. Which translates to Fanta cake.
Spain – The official name of Spain is the “Kingdom of Spain.” Spain has also given the world the mop and bucket.
Portugal – The largest producer of cork products in the world.
England – In 1886, Sarah Ann Henley threw herself off Bristol’s Clifton Suspension Bridge after a row with her boyfriend, falling 75 metres on to the mud bank below. She was saved by her billowing crinoline petticoats, which helped to slow her fall, and lived on into her eighties. The last ‘witches’ to be hanged in Britain were three women from Bideford in Devon, in 1682. There was no evidence against them, but other villages accused them of sending the devil to their enemies’ houses, in the form of a magpie and a tabby cat.
Austria – The Austrian flag is one of the oldest national flags in the world. It dates from 1191, when Duke Leopold V fought in the Battle of Acre during the Third Crusade. The Austrian funeral industry is said to be the largest per capita in Europe.
Turkey – One way of protecting a newborn baby in Turkey is the placing of a tortoise under a baby’s pillow at night. It is believed the tortoise will protect the child. Turkey is also the largest grower of hazelnuts in the world; responsible for 80% of the world’s hazelnut exports.
Switzerland – In 2007, Switzerland accidentally invaded its neighbor Liechtenstein. Switzerland’s Anti-PowerPoint Party, or APPP, actually works to decrease the number of PowerPoints used in professional presentations, claiming that Microsoft PowerPoint and its other software products are actually economically harmful. The goal of the APPP is to be the fourth largest political party in Switzerland, and their motto is “Finally do something!” Nicknamed the King of the Alps, Ulrich Inderbinen climbed the Matterhorn a staggering 370 times, the last at age 90. The Zermatt-born mountaineer was the oldest active mountain guide in the world when he retired at age 94.
Italy – In 1454, a real human chess game took place in Marostica, Italy. Rather than fight a bloody duel, the winner of the chess game would win the hand of a beautiful girl. To commemorate the event, each September in even-numbered years, the town’s main piazza becomes a life-sized chess board.
Hungary – Mangalitsa pigs are a unique breed of pig that resemble sheep as much as pigs and are a little known Hungarian breed. They are the result of a 19th century Austro-Hungarian experiment breeding wild boar with a pig bred especially for lard. Called the Kobe Beef of pork, they are prized for their well-marbled meat. There are 60,000 Mangalitsas worldwide, with 50,000 being in Hungary.
Romania – Romania has the largest bear population in Europe (with about 60% in Transylvania)
France – In 1386, a pig was hung in France for the murder of a child. There is only one stop sign in the entire French city of Paris. French President Charles de Gaulle is included in the Guinness Book of World Records as surviving more assassination attempts—32—than anyone in the world.
Ukraine – Ukraine is one of the largest grain exporters in the world. Commas are used as decimal points instead of periods.
Croatia – Ivan Vucetic, criminologist and anthropologist, was born on the island of Hvar and was the pioneer of scientific dactyloscopy (identification by fingerprints), and his methods of identification are used worldwide.
Wales – Also known as Cymru, Wales is part of the United Kingdom and well-known for its rich history and culture. Famous people from Wales include Richard Burton, Sir Anthony Hopkins, Tom Jones, Catherine Zeta-Jones, Shirley Bassey, Timothy Dalton and Charlotte Church.
Northern Ireland – Belfast Zoo is home to the only group of purple-faced langurs in Europe.
Poland – In Poland, bananas are peeled from the blossom end not the stem end.
Russia – Russians love cloakrooms – don’t expect to get very far without being asked to put your coat and/or bag in a cloakroom. The best are efficiently run by teams of baboushkas. It’s considered wimpy to lower the ear flaps on your Ushanka (fur hat) unless the temperature drops below -20C. Mikhail Gorbachev recorded an album of romantic ballads. Putin has a judo DVD.
Czech Republic – They don’t go in for turkey in the Czech Republic. During Christmas they eat carp. Usually bought a few days before and left swimming in the bath to keep fresh for the big day.
Ireland – Ireland’s most famous musical export is U2.
Slovakia – Slovakia is not only a member of the European Union but also belongs to the Eurozone. In 2009, the Slovak Koruna (SKK) was retired from circulation after 16 years of use and replaced by a new currency, the Euro (EUR).
Sweden – As of 2004 you can pay your Swedish taxes by sending an SMS message from your cell phone. All employers (as of 2004) are required to provide free massage. In Sweden IKEA is a cheap store, not a trendy store.
Albania – King Zog of Albania (ruled 1928-39) was the only national leader in modern times to return fire during an assassination attempt.
Fun facts… but without any digital provenance, how can one know any of them are true?
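To show what ‘digital provenance’ might look like in practice, here is a small sketch of a fact stored together with its references, so that a reader (or a machine) can tell a sourced claim from an unsourced one. The class and field names are invented for this illustration and are not Wikidata’s actual data model, and the source title and URL are placeholders rather than real citations.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Reference:
    """Where a claim came from (hypothetical structure)."""
    source_title: str
    url: Optional[str] = None


@dataclass
class Claim:
    """A single statement about a subject, with optional references."""
    subject: str
    prop: str
    value: str
    references: List[Reference] = field(default_factory=list)

    def is_verifiable(self) -> bool:
        # A claim can only be checked if it points to at least one source.
        return len(self.references) > 0


# The same 'fact', once as a knowledge panel might show it...
unsourced = Claim("Spain", "official name", "Kingdom of Spain")
# ...and once with a (placeholder) source attached.
sourced = Claim(
    "Spain",
    "official name",
    "Kingdom of Spain",
    references=[Reference("Placeholder source", url="https://example.org/placeholder")],
)

for claim in (unsourced, sourced):
    label = "can be checked" if claim.is_verifiable() else "citation needed"
    print(f"{claim.subject}: {claim.prop} = {claim.value} ({label})")
```

The value is identical in both cases; only the second gives the reader somewhere to go to check it.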
Current Wiki Projects – tidying up source metadata, creating consistent citations and signalling Open Access sources.
Wikipedia is much more straightforward using the new Visual Editor interface, which makes editing Wikipedia now as easy as using Microsoft Word. Students can be taught how to edit in approximately 60 minutes and thereafter can research and write, with academic rigour, brand new Wikipedia articles.
The video interview provided by the University of Edinburgh’s Dr. Chris Harlow illustrates the Wikipedia research session he ran in September 2015.
A practical example of engaging with Wikipedia in teaching and learning – watch Dr. Chris Harlow speak about his recent experiences introducing his 3rd year Honours students to researching & writing a Wikipedia article.
Teaching with Wikipedia – Dr. Chris Harlow (Reproductive Biology research session)
In addition, UCL also ran a Wikipedia session for familiarising Year 1 undergraduates with using sources – making good use of the Wiki Education project dashboard to allow educators to manage & monitor class Wikipedia assignments & communicate with students from a central hub: https://prezi.com/apxnjcabgtdd/when-ucl-students-write-wikipedia.
This one also includes how Wikipedia work complements UCL’s educational strategic aims.
Telling the stories of rural England with Wikipedia – The University of Portsmouth. Dr Humphrey Southall, Reader in Geography at the University of Portsmouth, writing with Dr Martin Poulter, describes a Wikipedia-based assignment given to first-year students in Applied Human Geography and also looks at how academics can inform the widest public about their subject and raise awareness of the reliable sources used in research.
In addition – Wiki Education resources
Wiki Education has a variety of materials which may be helpful.
You can also find two PDF brochures, Case Studies and Theories: the first contains descriptions of classes and learning outcomes, while the second contains class descriptions alongside discussion questions and a bibliography for courses.
On Saturday 12th November 2016, the University’s Information Services team are partnering with the National Library of Scotland to run a Wikipedia edit-a-thon to celebrate Robert Louis Stevenson Day 2016. Full Wikipedia editing training will be given in the morning before a break for lunch. Thereafter the afternoon’s editathon will focus on improving the quality of articles about all things gothic.
Working together with liaison librarians, archivists & academic colleagues we will provide training on how to edit and participate in an open knowledge community. Participants will be supported to develop articles covering areas which could stand to be improved: gothic art, gothic architecture, gothic literature, gothic film, gothic music, gothic history etc.
We also invite participants from around the world with an interest in all things Gothic to join in & contribute remotely; either through supplying ideas for our hitlist of Wikipedia articles to create/improve prior to the event or through remote editing during the event or even arranging your own simultaneous editathon events.
Details to follow but keep the date free and come along to learn about how Wikipedia works and contribute to a greater understanding of Gothic history!
Dario Taraborelli, head of research at Wikimedia, passed a link to http://passingon.natematias.com/ to my colleague, Melissa Highton, Assistant Vice Principal of Online Learning and Director of Learning, Teaching & Web Services at the University of Edinburgh, as the University ran two Wikipedia editathons last year on ‘Women in Science & Scottish History’ and ‘Ada Lovelace Day – celebrating Women in STEM’.
We are endeavouring to keep the momentum going this year and have already run events on Women in Art for International Women’s Day and Women in Espionage for ‘Spy Week 2016’. All of these events are mentioned on my project page.
So the plan is to run an event, or a series of events, in the Autumn which would make use of the brilliant application which scrapes data from the annotated corpus of 25 years’ worth of New York Times articles to help identify missing Wikipedia articles about notable women; utilising these obituary records to help celebrate the lives of those recently passed on and changing Wikipedia’s representation of notable females in the process.
The tricky part will be establishing whether the application could incorporate Scotland/UK based news obituaries, for example those scraped from the Scotsman newspaper or the Guardian newspaper.
This is just a gentle reminder that Ada Lovelace Day 2016 will be coming up on Tuesday 11th October 2016 and we will be looking to reconvene a working group to prepare for an Ada Lovelace day of events; incorporating a Wikipedia editathon celebrating the achievements of women in science, technology, engineering and maths (STEM).
Euro Stem Cell Editathon at Centre for Regenerative Medicine, Edinburgh. Editathon for UoE staff and Eurostemcell partner labs in Europe & at the Wellcome Library.
Wikidata (& Wikisource) Showcase (with Pauline Ward & Histropedia’s Navino Evans) at the John McIntyre Conference Centre (JMCC) – 1st & 2nd August 2016
Reproductive Medicine Edit-a-thon (with Dr. Chris Harlow) – 21 September and 28 September. Partnering with West Virginia University.
Vet School Wikipedia research session – Edit-a-thon event for Royal (Dick) School of Veterinary Studies students to research & create new Wikipedia articles on Veterinary Medicine. Proposed for October 5th 2016.
International Alumni project – Celebrating the international students who studied at Edinburgh University and have gone on to have a huge impact abroad (including simultaneous editathons, hopefully, in Singapore & Hong Kong to create a global edit-a-thon). Mooted for early October 2016 for Black History Month.
Ada Lovelace Day – Tuesday 11th October 2016 – celebrating the achievements of Women in STEM with a particular focus on female mentors given that Mary Somerville will grace the new £10 note. Truly noteworthy.
Day of the Dead editathon – Monday 31st October 2016 – using the obituaries from Scottish & UK newspapers to recognise & celebrate the lives of those sadly passed away.
Edinburgh Gothic (agreed a partnership with the National Library of Scotland) – Saturday 12th November. Marking the day before Robert Louis Stevenson Day, the National Library of Scotland will join us to celebrate the best of Edinburgh Gothic, releasing Robert Louis Stevenson images into the public domain to Wikicommons (wherever possible) and any additional material not yet transcribed onto Wikisource. Looking to see if we can combine efforts in gothic art, gothic history, gothic costume design, gothic music, gothic film, gothic literature etc. to fill any gaps on Wikipedia… in the most macabre way.
The Kelvin Hall relaunch (in Glasgow) – mooted for late November / early December 2016 (again in collaboration with the National Library of Scotland). The idea is to create an edit-a-thon based on the Moving Image Archive by showing participants short films from the archive on the Video Wall there, creating Wikipedia articles for the films & filmmakers, and showing a longer film afterwards at the Hunterian cinema.
Translate-a-thon – Reaching out to bilingual and multi-lingual students to translate articles from English Wikipedia to their own native language Wikipedia (& vice versa) using Wikipedia’s new Content Translation tool.
Festival of Architecture 2016 – An architecture-themed editathon to celebrate the achievements of architects for the Festival of Architecture 2016.
And the whisky? It seems my less than subtle hints following my trip to Skye in April resulted in my getting a fair few bottles for my birthday.
Projects and whisky galore. Lots to be excited about and lots to get on with!