Author: Ewan McAndrew

Wikimedian in Residence at the University of Edinburgh. English & Media Teacher. Film, Travel & Open Knowledge enthusiast.

Wikipedia at 17 – Facts matter.

Wikipedia: the internet’s favourite website for information

As Wikipedia celebrates its 17th birthday this month, we are once again asking our colleagues to help share some fact-checked knowledge to Wikipedia as part of the global #1Lib1Ref campaign (1 Librarian adding 1 Reference) and help assert that facts, not alternative facts, matter.

The campaign runs from January 15th to February 3rd 2018. Everyone is welcome to participate (it is a global open platform after all).

Wikipedia is already the 5th most visited website, the largest reference work on the internet and the single greatest open education resource in existence today. And that’s with only 120,000 regular contributors, of whom only around 3,455 are considered ‘very active’ Wikipedians.

That’s the population of a village like Pitlochry curating the world’s knowledge.

  1. Imagine if the 13,000 staff and 36,000 students at the University of Edinburgh all contributed a little of their time and expertise to improving the free encyclopedia.
  2. Imagine if ALL universities contributed.
  3. Imagine if ALL libraries contributed.

While Pitlochry is near the famous 18ft ‘Soldier’s Leap’ at Killiecrankie (worth a visit), #1Lib1Ref is your invitation to take a small step and find out how everyone can help improve Wikipedia. Simply add 1 citation to 1 fact on Wikipedia that has been tagged with a ‘Citation Needed’ tag as needing verification, between now and February 3rd 2018.

The Citation Hunt tool makes it so easy to help share fact-checked knowledge in 5 mins or less. Watch how you can take part (5 mins).

  1. Read more about the #1Lib1Ref campaign.
  2. Learn about the Citation Hunt tool.
  3. Step-by-step guide to taking part from the Biodiversity Library.

Oh and don’t forget to save your edits with an edit summary of #1Lib1Ref and #1Lib1RefEdUni if you’re participating at the University of Edinburgh so we can track how many edits are being made.

Let’s see if we can’t add 101 citations to Wikipedia by February 3rd!

Own work by Stinglehammer, CC-BY-SA.

Wikipedia at 17 – some facts

  • The world’s biggest encyclopedia turned 17 on January 15th 2018.
  • English Wikipedia has 5.5m articles (full list of all 299 language Wikipedias)
  • 500 million visitors per month.
  • 1.5 billion unique devices per month.
  • 17 billion pageviews per month.
  • More reliable than you think
  • Vandalism removed more quickly than you think (only 7% of edits are considered vandalism).
  • Used in schools & universities to teach information literacy & help combat fake news.
  • Guidelines around use of reliable sources, conflict of interest, verifiability, and neutral point of view.
  • Articles ‘looked after’ (monitored and maintained) by editors from 2000+ WikiProjects.
  • Includes a quality and ratings scale
  • 87.5% of students report using Wikipedia for their academic work.
  • Used by 90% of medical students and 50-75% of physicians.
  • It is the place people turn to orientate themselves on a topic.

Did Media Literacy backfire?

“Too many students I met were being told that Wikipedia was untrustworthy and were, instead, being encouraged to do research. As a result, the message that many had taken home was to turn to Google and use whatever came up first. They heard that Google was trustworthy and Wikipedia was not.” (Boyd, 2017)

“Search is the way we live now” – Google and Wikipedia

  • Google depends on Wikipedia. Click-through rate decreases by 80% if Wikipedia links are removed.
  • Wikipedia depends on Google. 84.5% of visits to Wikipedia are attributable to Google.
  • According to 2011 figures in Hillis, Petit & Jarrett (2013), Google processed 91% of searches internationally and 97.4% of the searches made using mobile devices.
  • Google’s ranking algorithm also has a ‘funnelling effect’ according to Beel & Gipp (2009), narrowing the sources clicked upon 90% of the time to just the first page of results, with a 42% click-through on the first choice alone.
  • This means that addressing knowledge gaps on Wikipedia will surface the knowledge in Google’s top ten results and increase click-through and knowledge-sharing. Wikipedia editing can therefore be seen as a form of activism in the democratisation of access to information.
  • Did you know that you can nominate Wikipedia pages to be included on Wikipedia’s front page (viewed 25 million times a day on average)? We did just that for the Wikipedia page of the noted sociologist Mary Susan McIntosh, which was created for International Women’s Day in March 2017. It went from not existing at all to 7,000 views in a single day.

More Did You Know facts about Wikipedia.

 

Don’t cite Wikipedia, write Wikipedia.

  • Wikipedia does not want you to cite it. It considers itself a tertiary resource; an online encyclopedia built from articles which in turn are based on reliable, published, secondary sources.
  • Wikipedia is relentlessly transparent. Everything on Wikipedia can be checked, challenged and corrected. Cite the sources Wikipedia uses, not Wikipedia itself.

Wikipedia does need more subject specialists to engage with it to improve its coverage, however. More eyes on a page helps address omissions and improves the content.

Feedback from staff and students who have engaged with editing Wikipedia:

Isn’t editing Wikipedia hard?

Maybe it was a little hard once but not now. It’s all dropdown menus now with the Visual Editor interface. So super easy, intuitive and “addictive as hell”!

Do you need a quick overview of what all the buttons and menu options on Wikimedia do? Luckily we have just the very thing for you.

Want to get started?

More reading

Wikidata in the Classroom – Data Literacy for the next generation

Summary:

The University of Edinburgh are looking to support the development of a data-literate workforce over the next ten years to support Scotland’s growing digital economy. This therefore represents a huge opportunity for educators, researchers and data scientists to support students in this aim. The first Wikidata in the Classroom assignment at the university is taking place this semester on the Data Science for Design MSc course and two groups of students are working on a project to import the Survey of Scottish Witchcraft database into Wikidata to see what possibilities surfacing this data as structured linked open data can achieve.

Wikidata in the Classroom

The New York Times described this current era as an ‘era of data but no facts’. Data is increasingly valuable as a key driver of the 21st century economy and is certainly abundant with 90% of the world’s data reportedly created in the last two years. Yet, it has never been more difficult to find ‘truth in the numbers’ with over 60 trillion pages to navigate and terabytes of unstructured data to (mis)interpret.

The way forward is clear.

  • “We need to increase the reputational consequences and change the incentives for making false statements… right now, it pays to be outrageous, but not to be truthful.” (Nyhan in the Economist, 2016)
  • “This challenge is not just for school librarians to prepare the next generation to be informed but for all librarians to assist the whole population.” (Abram, 2016)

Issues at the heart of the information age have been exposed: there exists a glut of information & a sea of data to navigate with little formalised guidance as to how to find our way through it. For the beleaguered student, this glut makes it near impossible to find ‘truth in the numbers’. There are therefore huge areas of convergence between developing information & data literacy in the next generation and developing Wikidata as a linked hub of verifiable data; fuelling discovery and surfacing open knowledge through Google’s Knowledge Graph but, importantly, providing the digital provenance so it can be checked.

Meeting the information & data literacy needs of our students

The Edinburgh and South East Scotland City Region has recently secured a £1.1bn City Region deal from the UK and Scottish Governments. Out of this amount, the University of Edinburgh will receive in the region of £300 million towards making Edinburgh the ‘data capital of Europe’ through developing data-driven innovation. Data “has the potential to transform public and private organisations and drive developments that improve lives.” More specifically, the university is being trusted with the responsibility of delivering a data-literate workforce of 100,000 young people over the next ten years; a workforce equipped with the data skills necessary to meet the needs of Scotland’s growing digital economy.

The implementation of Wikidata in the curriculum therefore presents a massive opportunity for educators, researchers and data scientists alike; not least in honouring the university’s commitment to the creation, curation & dissemination of open knowledge. A Wikidata assignment allows students to develop their understanding of, and engagement with, issues such as data completeness, data ethics, digital provenance, data analysis and data processing, as well as making practical use of a raft of tools and data visualisations. The fact that Wikidata is also linked open data means that students can help connect to & draw on a variety of other datasets in multiple languages; helping to fuel discovery through exploring the direct and indirect relationships at play in this semantic web of knowledge. This real-world application of teaching and learning enables insights in a variety of disciplines, be it in open science, digital humanities, cultural heritage, open government or much more besides. Wikidata is also a community-driven project, so this allows students to work collaboratively and develop the online citizenship skills necessary in today’s digital economy.

Data Science for Design MSc – Importing the Survey of Scottish Witchcraft database into Wikidata

Packed house at the Data Fair for the Data Science for Design MSc course – 26 October 2017 (Own work, CC-BY-SA)

At the University of Edinburgh, we have begun our first Wikidata in the Classroom assignment this semester on the Data Science for Design MSc course. At the course’s Data Fair on 26th October 2017, researchers from across the university presented the 45 masters students in Design Informatics with approximately 13 datasets to choose from to work on in groups of three. Happily, two groups were enthused to import the university’s Survey of Scottish Witchcraft database into Wikidata (the choice of database to propose was suggested by a colleague). This fabulous resource began life in the 1990s before being realised in 2001-2003. Its aim was to collect, collate and record all known information about accused witches and witchcraft belief in early modern Scotland (from 1563 to 1736) in a Microsoft Access database and to create a web-based user interface for the database. Since 2003, the data has remained static in the Access database, so students at the 2017 Data Fair were invited to consider what could be done if the data were exported into Wikidata, given multilingual labels and linked to other datasets. Beyond this, what new insights & visualisations of the data could be achieved?

The methodology

A similar methodology to managing Wikipedia assignments was employed; making the transition from managing a Wikipedia assignment to managing a Wikidata assignment an easy one. The two groups of students underwent a 1.5 hour practical induction on working with Wikidata and third party applications such as Histropedia, the timeline of everything, before being introduced to the Access database. They then discussed collaboratively how best to divide the task of analysing and exporting the data before deciding one group would work on (1) importing records for the 3,212 accused witches while the other group would work on (2) the import of the witch trial records and (3) the people associated with these trials (lairds, judges, ministers, prosecutors, witnesses etc).

At this juncture, the groups have researched and submitted their data models for review. Now that the proposed data models have been checked and agreed upon, the students are ready to process the data from the Access database into a format Wikidata can import (making use of the Wikidata plug-in on Google Spreadsheets). Once this stage is complete, the students can then choose how to visualise the linked data in a number of ways, such as maps, timelines, graphs, bubble charts and more. The students are to complete their project by presenting their insights and data visualisations in an engaging way of their choice on the 30th of November 2017.
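For readers curious about what "a format Wikidata can import" might look like, here is a minimal sketch in Python. It does not reproduce the students' workflow (they used the Wikidata plug-in on Google Spreadsheets); instead it swaps in the QuickStatements V1 text format, a tab-separated batch format that Wikidata's QuickStatements tool accepts. The `name` field, the English description and the sample records are all hypothetical, since the students' agreed data model is not described here.

```python
def to_quickstatements(rows):
    """Turn record dicts into QuickStatements V1 commands (tab-separated).

    Assumes each row has a hypothetical 'name' field; the real export
    from the Access database will have different column names.
    """
    lines = []
    for row in rows:
        name = row.get("name", "").strip()
        if not name:
            continue  # skip records with no usable label
        lines.append("CREATE")  # start a new Wikidata item
        lines.append(f'LAST\tLen\t"{name}"')  # English label
        # Hypothetical English description, for illustration only
        lines.append('LAST\tDen\t"person accused of witchcraft in early modern Scotland"')
        lines.append("LAST\tP31\tQ5")  # instance of (P31): human (Q5)
    return "\n".join(lines)


# Placeholder records, not real entries from the Survey
sample_rows = [{"name": "Example Person A"}, {"name": "  "}]
print(to_quickstatements(sample_rows))
```

A real import would also need to handle names containing double quotes, check for existing Wikidata items before creating new ones, and add references back to the Survey as the source of each statement.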

North Berwick witches – the logo for the Survey of Scottish Witchcraft database (Public Domain, via Wikimedia Commons)

The way forward

The hope is that this project will aid the students’ understanding of data literacy through the practical application of working with a real-world dataset and help shed new light on a little-understood period of Scottish history. This, in turn, may help fuel discoveries by dint of surfacing this data and linking it with other related datasets across the UK, across Europe and beyond. As the Survey of Scottish Witchcraft’s website itself states, “Our list of people involved in the prosecution of witchcraft suspects can now be used as the basis for further inquiry and research.”

The power of linked open data to share knowledge between different institutions, between geographically and culturally separated societies, and between languages is a beautiful thing. Here’s to many more Wikidata in the Classroom assignments.

Wikipedia in the Classroom – the Edinburgh Residency

Wikimedia at the University of Edinburgh
Reasons to engage in the conversation

With about 17 billion page views every month, it’s safe to say that most of us have heard of Wikipedia and maybe even use it on a regular basis. However, most people don’t realise that Wikipedia is the tip of the iceberg. Its sister sites include a media library (Wikimedia Commons), a database (Wikidata), a library of public domain texts (Wikisource), and even a dictionary (Wiktionary) – along with many others, these form the Wikimedia websites.

While the content is all crowd-sourced, the Wikimedia Foundation in the US maintains the hardware and software the websites run on. Wikimedia UK is one of dozens of sister organisations around the globe who support the mission of the Wikimedia websites to share the world’s knowledge.

Today, Wikipedia is the number one information site in the world, visited by 500 million visitors a month; the place that students and staff consult for pre-research on a topic. According to a 2014 YouGov survey, it is trusted more than the Guardian, BBC, Telegraph and Times, perhaps because its commitment to transparency is an implicit promise of trust to its users: everything on it can be checked, challenged and corrected.

The University of Edinburgh and Wikimedia UK – shared missions.

Wikimedia at an ancient university

The Edinburgh residency

In January 2016, the University of Edinburgh and Wikimedia UK partnered to host a Wikimedian in Residence for twelve months. This residency marks something of a paradigm shift as the first in the UK in supporting the whole university as part of its commitment to skills development and open knowledge.

Background to the residency

The University of Edinburgh held its first editathon – a workshop where people learn how to edit Wikipedia and start writing – during the university’s midterm Innovative Learning Week in February 2015. Ally Crockford (Wikimedian in Residence at the National Library of Scotland) and Sara Thomas (Wikimedian in Residence at Museums & Galleries Scotland) came to help deliver the ‘Women, Science and Scottish History’ editathon series which celebrated the Edinburgh Seven; the first group of matriculated undergraduate female students at any British university.

Timeline of the Wikimedia residencies in Scotland to date. The University of Edinburgh residency was the first residency in the UK to have a university-wide remit. Martin Poulter was Wikimedian in Residence at the Bodleian Library before beginning a 2nd residency at the University of Oxford on a university-wide remit.

 

Melissa Highton, Assistant Principal for Online Learning at the University of Edinburgh.

“The striking thing for me was how quickly colleagues within the University took to the idea and began supporting each other in developing their skills and sharing knowledge amongst a multi-professional group. This inspired me to commission some academic research to look at the connections and networking amongst the participants and to explore whether editathons were a good investment in developing workplace digital skills.” – Melissa Highton, Assistant Principal for Online Learning.

This research, conducted by Professor Allison Littlejohn, found clear evidence of both informal & formal learning going on. Further, it found that “all respondents reported that the editathon had a positive influence on their professional role. They were keen to integrate what they learned into their work in some capacity and believed participation had increased their professional capabilities.”

Since successfully making the case for hosting a Wikimedian in Residence, the residency’s remit has been to advocate for knowledge exchange and deliver training events & workshops across the university which further both the quantity & quality of open knowledge and the university’s commitment to embedding information literacy & digital literacy in the curriculum.

Wikimedia UK and the University of Edinburgh – shared missions

Edinburgh was the first university to be founded with a ‘civic’ mission; created not by the church but by the citizens of Edinburgh for the citizens of Edinburgh in 1583. The mission of the University of Edinburgh is “the creation, curation & dissemination of knowledge”. Founded a good deal later, Wikipedia began on January 15th 2001; the free encyclopaedia is now the largest & most popular reference work on the internet.

Wikimedia’s vision is “imagine a world in which every single human being can freely share in the sum of all knowledge”. It is 100% funded by donations and is the only non-profit website in the top ten most popular sites.

Wikipedia – the world’s favourite site for information.

Addressing the knowledge gap

While Wikipedia is the free encyclopaedia that anyone can edit, not everyone does. Of the 80,000 or so monthly contributors to Wikipedia, only around 3,000 are termed very active Wikipedians; meaning the world’s knowledge is often left to be curated by a population the size of a village (roughly the size of Kinghorn in Fife… or half of North Berwick). While English Wikipedia, with 5.4 million articles, is the largest of the 295 active language Wikipedias, it is estimated that there would need to be at least 104 million articles on English Wikipedia alone to cover all the notable subjects in the world. That means that, as of last month, English Wikipedia is missing approximately 99 million articles.
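The arithmetic behind these figures can be checked in a few lines of Python. This is only a worked sketch using the rounded numbers quoted above (104 million estimated notable subjects, 5.4 million existing articles, 80,000 monthly contributors, 3,000 very active editors); the variable names are illustrative.

```python
# Rounded figures quoted in the paragraph above
estimated_needed = 104_000_000   # articles needed to cover all notable subjects
existing_articles = 5_400_000    # articles currently on English Wikipedia
monthly_contributors = 80_000
very_active = 3_000

# The gap between what exists and what is estimated to be needed
missing = estimated_needed - existing_articles
print(f"Missing articles: {missing / 1e6:.1f} million")

# The share of monthly contributors who are 'very active'
share = very_active / monthly_contributors
print(f"Very active share: {share:.2%}")
```

The gap comes to 98.6 million, which rounds to the "approximately 99 million" quoted, and the very active editors make up under 4% of monthly contributors.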

Fewer than 15% of Wikipedia’s editors are women, and this skews the content in much the same way, with only 17.1% of biographies being about notable women. The University of Edinburgh has a commitment to equality and diversity and our Wikimedia residency therefore has a particular emphasis on open practice and engaging colleagues in discussing why some areas of open practice do have a clear gender imbalance. In this way, many of our Wikipedia events focused on addressing the gender gap as part of the university’s commitment to Athena Swan; creating new role models for young and old alike. Role models like Janet Anne Galloway (advocate for higher education for women in Scotland), Helen Archdale (journalist and suffragette) and Mary Susan McIntosh (sociologist and LGBT campaigner), among many more.

Pages created at Women in Red meetings at the University of Edinburgh editing sessions.

That’s why it is enormously pleasing that over the whole year, 65% of attendees at our events were female.

Sharing knowledge

The residency has, at its heart, been about making connections, both across the university’s three teaching colleges and beyond, with the city of Edinburgh itself; demonstrating how staff, students and members of the public can most benefit from and contribute to the development of the huge open knowledge resource that is the Wikimedia projects. And we made some significant connections over the last year in all of these areas.

Inviting staff & students from all different backgrounds and disciplines to contribute their time and expertise to the creation & improvement of Wikipedia articles in a number of events has worked well and engendered opportunities for collaborations and knowledge exchange across the university, with other institutions across the UK; and across Europe in the case of colleagues from the MRC Centre for Regenerative Medicine working with research partner labs.

Wikipedia in the Classroom – 3 assignments in Year One. Doubled in Year Two.

Ultimately, what we wanted attendees to get from the experience was this: the idea that knowledge is most useful when it is used, engaged with and built upon. Contributing to Wikipedia can also help demonstrate research impact, as there is a lot of work going on to ensure that Wikipedia citations to scholarly works use the DOI. The reason is that Wikipedia is already the fifth largest referrer of traffic through the DOI resolver, and this is thought to be an underestimate of its true position.

Not just Wikipedia

Knowledge doesn’t belong in silos. The interlinking of the Wikimedia projects for Robert Louis Stevenson.

Introducing staff and students to the work of the Wikimedia Foundation and the other 11 projects has been a key part of the residency with a Wikidata & Wikisource Showcase held during Repository Fringe in August 2016 which has resulted in some out-of-copyright PhD theses being uploaded to Wikisource, and linked to from Wikipedia, just one click away.

Wikisource is a free digital library which hosts out-of-copyright texts including: novels, short stories, plays, poems, songs, letters, travel writing, non-fiction texts, speeches, news articles, constitutional documents, court rulings, obituaries, and much more besides. The result is an online text library which is free to anyone to read with the added benefits that the text is quality assured, searchable and downloadable.

Sharing content to Wikisource, the free digital library, and linking to Wikipedia one click away.

Wikidata is our most exciting project with many predicting it will overtake Wikipedia in years to come as the dominant project. A free linked database of machine-readable knowledge, Wikidata acts as central storage for the structured data of all 295 different language Wikipedias and all the other Wikimedia sister projects.

Timeline of Female alumni of the University of Edinburgh generated from structured linked open data stored in Wikidata.
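To give a flavour of how a timeline like this can be driven by structured data, here is a hedged sketch that builds a SPARQL query for the Wikidata Query Service. It is not the query behind the timeline above. The properties P69 (educated at), P21 (sex or gender), P569 (date of birth) and the item Q6581072 (female) are standard Wikidata identifiers; the helper name `alumni_query` is hypothetical, and the institution identifier used in the example (Q160302, assumed here to be the University of Edinburgh) should be checked on Wikidata before use.

```python
def alumni_query(institution_qid: str, limit: int = 100) -> str:
    """Build a SPARQL query listing female alumni of an institution by birth date."""
    return f"""
SELECT ?person ?personLabel ?birth WHERE {{
  ?person wdt:P69 wd:{institution_qid} ;   # educated at
          wdt:P21 wd:Q6581072 ;            # sex or gender: female
          wdt:P569 ?birth .                # date of birth
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
ORDER BY ?birth
LIMIT {limit}
""".strip()


# Assumption: Q160302 is the University of Edinburgh item; verify before running
query = alumni_query("Q160302", limit=10)
print(query)
```

The printed query can be pasted into the Wikidata Query Service, where the results can be switched to a timeline view.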

 “How can you trust Wikipedia when anyone can edit it?”

This is the main charge levelled against involvement with Wikipedia and the residency has been making the case for re-evaluating Wikipedia and for engendering a greater critical information literacy in staff & students. And that’s the thing. Wikipedia doesn’t want you to cite it. It is a tertiary source; an aggregator of articles built on citations from reliable published secondary sources. In this way it is reframing itself as the ‘front matter to all research.’

Wikipedia has clear policy guidelines to help ensure its integrity.

Verifiability – every single statement on Wikipedia needs to be backed up with a citation from a reliable published secondary source. So an implicit promise is made to our users that you can go on there and check, challenge and correct the verifiability of any statement made on Wikipedia.

 

No original research – while knowledge is created every day, until it is published by a reliable secondary source, it should not be on Wikipedia. The presence of editorial oversight is a key consideration in source evaluation; therefore, however well-researched, someone’s personal interpretation is not to be included.

 

Neutral point of view – many subjects on Wikipedia are controversial so can we find common truth in fact? The rule of thumb is you can cover controversy but don’t engage in it. Wikipedians therefore present the facts as they exist.

Automated programmes (bots) patrol Wikipedia and can revert unhelpful edits & copyright violations within minutes. The edit history of a page is detailed such that it is very easy to revert a page to its last good state and block IP addresses of users who break the rules.

“What underlies Wikipedia, at its very heart, is this fundamental idea that more people want to do good than harm, more people want to create knowledge than destroy, more people want to share than contain. At its core Wikipedia is about human generosity.” – Katherine Maher, Executive Director of the Wikimedia Foundation in December 2016.

This idea that more people want to do good than harm has also been borne out by researchers who found that only seven percent of edits could be considered vandalism.

 

 

Wikipedia in the Classroom

Developing information literacy, online citizenship and digital research skills.

The residency has met with a great many course leaders across the entire university and the interactions have all been extremely fruitful in terms of understanding what each side needs to ensure a successful assignment and lowering the threshold for engagement.

Translation Studies MSc students completed the translation of a Wikipedia article of at least 4000 words into a different language Wikipedia last semester and are to repeat the assignment this semester, this time translating in the reverse direction so that the knowledge shared is truly a two-way exchange.

 

The Translation MSc assignment

World Christianity MSc students undertook an 11-week Wikipedia assignment as part of the ‘Selected Themes in the Study of World Christianity’ class. This core course offers candidates the opportunity to study in depth Christian history, thought and practice in and from Africa, Asia and Latin America. The assignment consisted of writing a new article, following a literature review, on a World Christianity term hitherto unrepresented on Wikipedia.

“When you hand in an essay the only people that generally read it are you and your lecturer. And then once they both read it, it kind of disappears and you don’t look at it again. No one really benefits from it. With a Wikipedia assignment, other people contribute to it, you put it out there for everyone to read, you can keep coming back to it, keep adding to it, other people can do as well. It becomes more of a community project that everyone can read and access. I really enjoyed it.” – Nuam Hatzaw, World Christianity MSc student.

The World Christianity MSc assignment.

Reproductive Biology Honours students in September 2015 researched, synthesised and developed a first-rate Wikipedia entry of a previously unpublished reproductive medicine term: neuroangiogenesis. The following September, the next iteration was more ambitious. All thirty-eight students were trained to edit Wikipedia and worked collaboratively in groups to research and produce the finished written articles. The assignment developed the students’ research skills, information literacy, digital literacy, collaborative working, academic writing & referencing.

One particularly deadly form of ovarian cancer, high-grade serous carcinoma, was unrepresented on Wikipedia and Reproductive Biology student Áine Kavanagh took great care to thoroughly research and write the article to address this; even developing her own openly-licensed diagrams to help illustrate the article. Her scholarship has now been viewed over sixteen thousand times, adding an important source of health information to the global Open Knowledge community.

“It was a really good exercise in scientific writing and writing for a lay audience. As a student it’s a really good opportunity. It’s a really motivating thing to be able to do; to relay the knowledge you’ve learnt in lectures and exams, which hasn’t really been relevant outside of lectures and exams, but to see how it’s relevant to the real world and to see how you can contribute.” – Áine Kavanagh.

The Reproductive Biology Hons. assignment.

Following a successful multidisciplinary approach, including students and staff all collaborating in the co-creation & sharing of knowledge, the residency has been extended into a third year until January 2019. Twenty members of staff have also now been trained to provide Wikipedia training and advice to colleagues to help with the sustainability of the partnership in tandem with support from Wikimedia UK.

While also ensuring Wikipedia editing is embedded in regular digital skills workshops, the residency has made demystifying how to begin editing Wikipedia a core focus, utilising Wikipedia’s new easy-to-use Visual Editor interface. Over two hundred videos and video tutorials, lesson plans, case studies, booklets and handouts have been created & curated in order to lower the threshold for staff and students to be able to engage with the Wikimedia projects in the years ahead.

The way ahead

Ten years after Wikipedia first launched, the Chronicle of Higher Education published an article by the vice president of Oxford University Press declaring that ‘Wikipedia had come of age’ and that it was time Wikipedia played a vital role in formal education settings. Since that article, the advent of ‘Fake News’ has engendered discussions around how best to equip students with a critical information literacy. For Wikipedia editors this is nothing new, as they have been combatting fake news for years and source evaluation is one of the Wikipedian’s core skills.

In fact, there is increasing synchronicity in that the skills and experiences that universities and PISA are articulating they want to see students endowed with are ones that Wikipedia assignments help develop. The assignments we have run this year have all demonstrated this and are to be repeated as a result. The case for Wikipedia playing a vital role in formal education settings has never been stronger.

Is now the time for Wikipedia to come of age?

If not now, then when?

Course leaders at Edinburgh University

Postscript: All three assignments from 2016/2017 are continuing in 2017/2018 because of the positive feedback from staff and students alike.

These are being augmented with collaborations with:

  • two student societies: the History Society, for Black History Month, and the Translation Society, on a Wikipedia project to give their student members much-needed published translation practice.
  • Library and University Collections, to add source metadata from 27,000 records in the Edinburgh Research Archive to Wikidata and 20+ digitised theses to Wikisource.
  • a further three in-curriculum collaborations in Digital Sociology MSc, Global Health and Anthropology MSc and Data Science for Design MSc.
  • the Fruitmarket Gallery and the university’s Centre for Design Informatics, for a Scottish Contemporary Artists editathon.
  • a Litlong editathon as part of the AHRC ‘Being Human’ festival.
  • the School of Chemistry, for Ada Lovelace Day, to celebrate women in STEM.
  • the University Chaplaincy, to mark the International Storytelling Festival.
  • Teesside University, to run a ‘Regeneration’ themed editathon.

As we have shown, there are huge areas of convergence between the Wikimedia projects and higher education. The Edinburgh residency has demonstrated that collaborations between universities and Wikimedia are mutually beneficial and that Wikipedia plays a vitally important role in the development of information literacy, digital research skills and the dissemination of academic knowledge for the common good.

That all begins with engaging in the conversation. Building an informed understanding of the Wikimedia projects and the huge opportunities that working together creates.

Planting the seed and watching it grow.
Reasons to engage in the conversation

Scotland loves Monuments 2017

Scotland has just been voted the most beautiful country in the world in a Rough Guide readers’ poll.

Perhaps I’m a tad biased but I’d tend to agree. There’s nowhere quite like it.

Yet, we who live and work here can take it for granted that our beautiful locations, listed buildings and monuments will always be there… something that can never be fully guaranteed. Political and economic tides change and forces of nature can have devastating effects as we have seen in recent days.

That’s why it’s so important that we take the opportunity to document our cultural heritage now for future generations before it is too late.

The world’s largest photo competition, Wiki Loves Monuments, takes place for the whole of September. Share your high quality pics of listed buildings and monuments to Wikimedia Commons and help preserve our cultural heritage online. After days out, weekend breaks and holidays at home & abroad, there will be gigabytes of pics taken in recent months and years. These could remain on your memory card or be shared to Commons and help illustrate Wikipedia for the benefit of all. Entry is free and the best pics will win a prize.

Aside from being great fun, Wiki Loves Monuments is a way of capturing a snapshot of our nation’s cultural heritage for future generations and documenting our country’s most important historic sites. See the rules and how to enter.

Ryries near Haymarket Station, Edinburgh. Own work by me via Wikimedia Commons for Wiki Loves Monuments 2017, CC-BY-SA

I used the handy Wiki Loves Monuments UK tool which shows you places near you, indicated with a red dot, that require a pic.

Wiki Loves Monuments – dynamic map of Edinburgh showing listed buildings requiring an image (in red).

You just take a quick look at the map, take a pic and upload. It takes seconds and is the easiest way to take part in this year’s competition. (There is also another WLM map tool if you want to search for addresses, either in UK or further afield).

I was surprised to see that Ryries, a public house near Haymarket Station, was a listed building on the Wiki Loves Monuments map; it is a building I pass every day so it was an easy one to snap and upload.

If each one of us took just 1 pic, we’d have this sewn up in a couple of weeks. Which is when Wiki Loves Monuments closes – end of September 2017. But if you can do more then great.

Don’t wait till it’s too late, do your bit today! Click here to view a map of your local area to get started.

#1picture1person #ScottishHeritage #WLMUK17

ps. Once the new pictures are uploaded then comes the additional fun part of adding those images to relevant Wikipedia pages so that millions around the world can enjoy a picture you have taken. If you fancy helping out with that then we are having a Wiki meetup 2pm to 5pm on Friday 29th September and you can drop-in at any point to add a pic to a Wiki page. Signup here.

If nothing else, let’s give our counterparts in Ireland and Wales a run for their money in terms of how many images we can upload. A little friendly rivalry never hurts, right?

You can check out the images uploaded so far for Wiki Loves Monuments in Scotland here.

 

Wikipedia's front page 11 May 2017

Did you know – Mary Susan McIntosh

Did you know that sociologist, feminist, and campaigner for lesbian and gay rights Mary Susan McIntosh was deported from the U.S. in 1960 for speaking out against the House Un-American Activities Committee?

Mary Susan McIntosh (1936–2013) sociologist, feminist, political activist and campaigner for lesbian and gay rights in the UK. A 1974 colour photograph from her time as a Research Fellow at Nuffield College, Oxford. CC-BY-SA

Yesterday this ‘Did You Know‘ fact was on Wikipedia’s front page. The front page is viewed, on average, 25 million times a day.

Mary’s page was only written in March during our International Women’s Day event here at the University of Edinburgh by one of our attendees, Lorna Campbell (read Lorna’s blog article on Mary here).

While her page has only been live on Wikipedia for two months, Mary’s page has now been viewed in excess of 7000 times because a) editors were motivated to address Wikipedia’s gender gap problem where less than 15% of editors are female and less than 17% of biographies are of notable women and b) we felt Mary’s story was important enough that it should be shared on Wikipedia’s front page and introduced to an audience of up to 25 million.

Did you know you could do that? Nominate a page newly created in the last seven days, or significantly expanded on, to be included on Wikipedia’s front page in this way?

View the guidelines for Did You Know here.

The Wikimedia residency at the University of Edinburgh has been as much about demystifying the largest reference work on the internet as anything else so here are some other things I feel are worth knowing in the spirit of ‘did you know‘?:

 

  • Did you know that Wikipedia works with Turnitin to address issues of plagiarism and copyright violation using the Copyvio tool and that the Dashboard for managing assignments now offers Authorship Highlighting of students’ edits, thereby making it easier to visualize and evaluate student work?
  • Did you know that Wikipedia does not want you to cite it? It is a tertiary source; an aggregator of articles with facts backed up from reliable published secondary sources. You can’t cite Wikipedia but you can cite the references it uses. In this way it is reframed as the digital gateway to further research sources.
  • Did you know that Wikipedia editing teaches source evaluation as a core skill hence Wikipedia education assignments help students combat fake news?
  • Did you know that Dr. Alex Chow at the University of Edinburgh’s School of Divinity has developed a script to help assess the word count of Wikipedia articles for use with student assignments?
  • Did you know that only 7% of edits to Wikipedia are considered vandalism and that research has found that, unlike other parts of the internet, Wikipedia editing actually de-radicalises its editors of partisan political leanings?
  • Did you know you can learn:
  • Did you know that you can upload openly-licensed longer texts to Wikisource (the free content library) which are transcribed into 100% searchable HTML so that works such as Thomas Jehu’s digitised PhD thesis can be linked to, one click away, from his Wikipedia article or out-of-copyright texts such as Robert Louis Stevenson’s book on ‘Edinburgh’ (1914) can be enjoyed by new audiences?
  • Did you know that Wikidata, Wikimedia’s repository of structured open data, now has 3 million linked citations added to it which can be queried using the new Scholia tool – a tool to handle scientific bibliographic information? (The Scholia Web service creates on-the-fly scholarly profiles for researchers, organizations, journals, publishers, individual scholarly works, and for research topics. To collect the data, it queries the SPARQL-based Wikidata Query Service).
  • Did you know that you can now add automatically generated citations to millions of books on Wikipedia? Wikipedia editors can now draw on WorldCat, the world’s largest database of books, to generate citations on Wikipedia thanks to a collaboration between OCLC (Online Computer Library Center) and the Wikimedia Foundation’s Wikipedia Library program.
  • Did you know that the latest estimates by Crossref show that Wikipedia has risen from the 8th most prolific referrer to DOIs to the 5th, and that this is thought to be a gross underestimate of its actual position?
  • Did you know that Altmetric include Wikipedia citations in their impact metrics and that Altmetric automatically picks up on citations through Wikipedia’s citation generator?
  • Did you know that Wikimedia has received a $3 million grant from the Alfred P. Sloan Foundation to make a ‘Structured Commons’ to make freely-licensed images accessible and reusable across the web?
  • Did you know that releasing images through Wikimedia Commons can result in a huge increase in views with detailed metrics about the number of views these images are accruing? E.g. Images released by the Bodleian Library have accrued 218,460,571 views to date.
  • Did you know about the WikiCite initiative? Tidying up the citations on Wikipedia to make a consistent, queryable bibliographic repository enhancing the visibility and impact of research.
  • Did you know that thanks to the new I4OC initiative (April 2017) there exists a collaboration between scholarly publishers, researchers, and other interested parties to promote the unrestricted availability of scholarly citation data? Before I4OC started, publishers releasing references in the open accounted for just 1% of citation metadata collected annually by Crossref. Following discussions over the past months, several subscription-access and open-access publishers have recently made the decision to release reference list metadata publicly. These include: American Geophysical Union, Association for Computing Machinery, BMJ, Cambridge University Press, Cold Spring Harbor Laboratory Press, EMBO Press, Royal Society of Chemistry, SAGE Publishing, Springer Nature, Taylor & Francis, and Wiley. These publishers join other publishers who have been opening their references through Crossref for some time.
  • Did you know that thanks to Wikidata you can now query, analyse & visualise the largest reference work on the internet? You can also add your research data to combine datasets on Wikidata.
  • Did you know that the University of Portsmouth have been running a Wikipedia assignment called Human Geography for the last five years where each student is assigned a different short stub article for a village in England and Wales, and asked to expand it to provide a rounded description of the place and, in particular, an account of its historical development?
  • Did you know that, so far, they have left Scotland untouched and so there will be many villages and towns in Scotland ripe to have articles created and improved?
  • Did you know that Wikivoyage is Wikipedia’s sister project and a Lonely Planet-esque travel guide? Students can write articles about their hometown area with bullet-pointed sections on ‘Things to do’, ‘Things to See’, ‘Things to Buy’, ‘Places to stay’ with Open Street Maps included and images added from Wikimedia Commons.
  • Did you know how students and staff at the University of Edinburgh have reacted to the Wikipedia in the Classroom assignments we have run this year? You can view a compilation of their feedback in this video.
  • Did you know that students can create entire textbooks, or chapters of textbooks, on Wikipedia’s sister project, Wikibooks?
  • Did you know that every September the world’s largest photography competition takes place, Wiki Loves Monuments? Participants are encouraged to photograph and upload images of listed buildings and monuments to document our cultural heritage.
  • Did you know that the WikiShootMe tool helps identify notable buildings in your area that require an image uploading?
  • Did you know that taking part in Wikimedia activities does not always require a heavy time component and that short, fun activities can also help: adding a citation through the Citation Hunt tool (“Whack-a-mole for citations”), playing the Wikidata game, adding images through WikiShootMe and FIST, or taking part in fun Wiki Races (6 degrees of separation for Wiki links between articles)?
  • Did you know that you can become a Wikipedia trainer with our new lesson plan and slide deck (available on Tes.com)?
  • Did you know that you can learn how to edit at our 90 minute training sessions and how to become a trainer at our 3 hour Train the Trainer events?
  • Did you know that I can deliver presentations and training as you require; be it on Wikisource (the free content library), Wikidata (the free and open repository of structured data), Wikimedia Commons (the free media repository), the WikiCite initiative, Wikivoyage (the free travel guide), writing articles for Wikipedia, adding your research to Wikipedia or something else entirely?

If you would like to find out more then feel free to contact me at ewan.mcandrew@ed.ac.uk

 

  • Want to become a Wikipedia editor?
  • Want to become a Wikipedia trainer?
  • Want to run a Wikipedia course assignment?
  • Want to contribute images to Wikimedia Commons?
  • Want to learn more about Wikisource?
  • Want to contribute your research to Wikipedia?
  • Want to contribute your research data to Wikidata?

Sifting fact from fake news – International Workers’ Day

CC BY 2.0-licensed photo by CEA+ | Artist: Nam June Paik, “Electronic Superhighway. Continental US, Alaska & Hawaii” (1995).

Woodward and Bernstein, the eminent investigative journalists involved in uncovering the Watergate Scandal, just felt compelled to assert that the media were not ‘fake news’ at a White House Correspondents Dinner the US President failed to attend. In the same week, Jimmy Wales, co-founder of Wikipedia, felt compelled to create a new site, WikiTribune, to combat fake news.

This is where we are this International Workers’ Day, where the most vital work one can undertake seems to be keeping oneself accurately informed.

“We live in the information age and the aphorism ‘one who possesses information possesses the world’ of course reflects the present-day reality.” – (Vladimir Putin in Interfax, 2016).

 

Sifting fact from fake news

In the run up to the Scottish council elections, French presidential elections and a ‘strong and stable‘ UK General Election, what are we to make of the ‘post-truth’ landscape we supposedly now inhabit; where the traditional mass media appears to be distrusted and waning in its influence over the public sphere (Tufekci in Viner, 2016) while the ‘secret algorithms’ of search engines & social media giants dominate instead?

The new virtual agora (Silverstone in Weichert, 2016) of the internet creates new opportunities for democratic citizen journalism but also has been shown to create chaotic ‘troll’ culture & maelstroms of information overload. Therefore, the new ‘virtual generation’ inhabiting this ‘post-fact’ world must attempt to navigate fake content, sponsored content and content filtered to match their evolving digital identity to somehow arrive safely at a common truth. Should we be worried what this all means in ‘the information age’?

Information Literacy in the Information Age

“Facebook defines who we are, Amazon defines what we want

and Google defines what we think.”

(Broeder, 2016)

The information age is defined as “the shift from traditional industry that the Industrial Revolution brought through industrialization, to an economy based on computerization or digital revolution” (Toffler in Korjus, 2016). There are now 3 billion internet users on our planet, well over a third of humanity (Graham et al., 2015). Global IP traffic is estimated to treble over the next 5 years (Chaudhry, 2016) and to increase a hundredfold over the period 2005 to 2020 overall. This internet age still wrestles with both geographically & demographically uneven coverage, while usage in no way equates to users being able to safely navigate, or indeed to critically evaluate, the information they are presented with via its gatekeepers (Facebook, Google, Yahoo, Microsoft et al.). Tambini (2016) defines these aforementioned digital intermediaries as “software-based institutions that have the potential to influence the flow of online information between providers (publishers) and consumers”. So exactly how conversant are we with the nature of our relationship with these intermediaries & the role they play in the networks that shape our everyday lives?

Digital intermediaries

“Digital intermediaries such as Google and Facebook are seen as the new powerbrokers in online news, controlling access to consumers and with the potential even to suppress and target messages to individuals.” (Tambini, 2016)

 

Facebook’s CEO Mark Zuckerberg may downplay Facebook’s role as “arbiters of truth” (Seethaman, 2016) in much the same way that Google downplay their role as controllers of the library “card catalogue” (Walker in Toobin, 2015) but both represent the pre-eminent gatekeepers in the information age. 62% of Americans get their news from social media (Mint, 2016) with 44% getting their news from Facebook. In addition, a not insubstantial two million voters were encouraged to register to vote by Facebook, while Facebook’s own 2012 study concluded that it “directly influenced political self-expression, information seeking and real-world voting behaviour of millions of people.” (Seethaman, 2016)

 

Figure 1: Bodies of Evidence (The Economist, 2016)

This year has seen assertion after assertion made which, upon closer examination by fact-checking organisations such as PolitiFact (see Figure 1 above), have absolutely no basis in truth. For the virtual generation, the traditional mass media has come to be treated on a par with new, more egalitarian, social media with little differentiation in how Google lists these results. Clickbait journalism has become the order of the day (Viner, 2016); where outlandish claims can be given a platform as long as they are prefixed with “It is claimed that…”

“Now no one even tries proving ‘the truth’. You can just say anything. Create realities.” (Pomerantsev in the Economist, 2016)

The problem of ascertaining truth in the information age can be attributed to three main factors:

  1. The controversial line “people in this country have had enough of experts” (Gove in Viner, 2016) during the EU referendum demonstrated there has been a fundamental eroding of trust in, & undermining of, the institutions & ‘expert’ opinions previously looked up to as subject authorities. “We’ve basically eliminated any of the referees, the gatekeepers…There is nobody: you can’t go to anybody and say: ‘Look, here are the facts’” (Sykes in the Economist, 2016)
  2. The proliferation of social media ‘filter bubbles’ which group like-minded users together & filter content to them according to their ‘likes’. In this way, users can become isolated from viewpoints opposite to their own (Duggan, 2016) and fringe stories can survive longer despite being comprehensively debunked elsewhere. Any contrary view tends to be either filtered out or met with disbelief through what has been termed ‘the backfire effect’ (The Economist, 2016).
  3. The New York Times calls this current era an era of ‘data but no facts’ (Clarke, 2016). Data is certainly abundant; 90% of the world’s data was generated in the last two years (Tuffley, 2016). Yet, it has never been more difficult to find ‘truth in the numbers’ (Clarke, 2016) with over 60 trillion pages (Fichter and Wisniewski, 2014) to navigate and terabytes of unstructured data to (mis)interpret.

The way forward

“We need to increase the reputational consequences and change the incentives for making false statements… right now, it pays to be outrageous, but not to be truthful.”

(Nyhan in the Economist, 2016)

Original image by Doug Coulter, The White House (The White House on Facebook) [Public domain], via Wikimedia Commons. Modified by me.
Since the US election, and President Trump’s continuing assault on the ‘dishonest media’, the need for information to be verified has been articulated as never before with current debates raging on just how large a role Russia, Facebook & fake news played during the US election. Indeed, the inscrutable ‘black boxes’ of Google & Facebook’s algorithms constitute a real dilemma for educators & information professionals.

Reappraising information & media literacy education

The European Commission, the French Conseil d’Etat and the UK Government are all re-examining the role of ‘digital intermediaries’; with Ofcom being asked by the UK government to prepare a new framework for assessing the intermediaries’ news distribution & setting regulatory parameters of ‘public expectation’ in place (Tambini, 2016). Yet, Cohen (2016) asserts that there is a need for greater transparency of the algorithms being used in order to provide better oversight of the digital intermediaries. Further, the current lack of public domain data with which to assess the editorial control of these digital intermediaries means that, until the regulatory environment is strengthened so as to require these ‘behemoths’ (Tambini, 2016) to disclose this data, this pattern of power & influence is likely to remain unchecked.

Somewhere along the line, media literacy does appear to have backfired; our students were told that Google was trustworthy and Wikipedia was not (Boyd, 2016). The question is how clicking on those top five Google results instead of critically engaging with the holistic overview & reliable sources Wikipedia offers is working out?

A lack of privacy combined with a lack of transparency

Further, privacy seems to be the one truly significant casualty of the information age. Broeder (2016) suggests that, as governments focus increasingly on secrecy, at the same time the individual finds it increasingly difficult to retain any notions of privacy. This creates a ‘transparency paradox’ often resulting in a deep suspicion of governments’ having something to hide while the individual is left vulnerable to increasingly invasive legislation such as the UK’s new Investigatory Powers Act – “the most extreme surveillance in the history of Western democracy.” (Snowden in Ashok, 2016). This would be bad enough if their public & private data weren’t already being shared as a “tradeable commodity” (Tuffley, 2016) with companies like Google and Apple, “the feudal overlords of the information society” (Broeder, 2016) and countless other organisations.

The Data Protection Act (1998), Freedom of Information Act (2000) and the Human Rights Act (1998) should give the beleaguered individual succour but FOI requests can be denied if there is a ‘good reason’ to do so, particularly if it conflicts with the Official Secrets Act (1989), and the current government’s stance on the Human Rights Act does not bode well for its long-term survival. The virtual generation will also now all have a digital footprint; a great deal of which can be mined by government & other agencies without our knowing about it or consenting to it. The issue therefore is that a line must be drawn between our public lives and our private lives. However, this line is increasingly unclear because our use of digital intermediaries blurs it. In this area, we do have legitimate cause to worry.

The need for a digital code of ethics

  • “Before I do something with this technology, I ask myself, would it be alright if everyone did it?
  • Is this going to harm or dehumanise anyone, even people I don’t know and will never meet?
  • Do I have the informed consent of those who will be affected?” (Tuffley, 2016)

Educating citizens as to the merits of a digital code of ethics like the one above is one thing, and there are success stories in this regard through initiatives such as StaySafeOnline.org, but a joined-up approach marrying up librarians, educators and instructional technologists to teach students (& adults) information & digital literacy seems to be reaping rewards according to Wine (2016). Meanwhile, recent initiatives exemplifying the relevance of, & need for, information professionals assisting with political literacy during the Scottish referendum (Smith, 2016) have found further expression in other counterparts (Abram, 2016).

“This challenge is not just for school librarians to prepare the next generation to be informed but for all librarians to assist the whole population.” (Abram, 2016)

Trump’s administration may or may not be in ‘chaos’ but recent acts have exposed worrying trends. Trends which reveal an eroding of trust: in the opinions of experts; in the ‘dishonest’ media; in factual evidence; and in the rule of law. Issues at the heart of the information age have been exposed: there exists a glut of information & a sea of data to navigate with little formalised guidance as to how to find our way through it. For the beleaguered individual, this glut makes it near impossible to find ‘truth in the numbers’ while equating one online news source to be just as valid as another, regardless of its credibility, only exacerbates the problem. All this, combined with an increasing lack of privacy and an increasing lack of transparency, makes for a potent combination.

There is a place of refuge you can go, however. A place where facts, not ‘alternate facts’, but actual verifiable facts, are venerated. A place that holds as its central tenets, principles of verifiability, neutral point of view, and transparency above all else. A place where every edit made to a page is recorded, for the life of that page, so you can see what change was made, when & by whom. How many other sites give you that level of transparency where you can check, challenge & correct the information presented if it does not hold to the principles of verifiability?


Now consider that this site is the world’s number one information site; visited by 500 million visitors a month and considered, by British people, to be more trustworthy than the BBC, ITV, the Guardian, the Times and the Telegraph, according to a 2014 YouGov survey.


While Wikipedia is the fifth most popular website in the world, the other internet giants in the top ten cannot compete with it for transparency; an implicit promise of trust with its users. Some 200+ factors go into constructing how Google’s algorithm determines the top ten results for a search term yet we have no inkling what those factors are or how those all-important top ten search results are arrived at. Contrast this opacity, and Facebook’s for that matter, with Wikimedia’s own (albeit abortive) proposal for a Knowledge Engine (Sentance, 2016); envisaged as the world’s first transparent non-commercial search engine and consider what that transparency might have meant for the virtual generation being able to trust the information they are presented with.

Wikidata (Wikimedia’s digital repository of free, openly-licensed structured data) represents another bright hope. It is already used to power, though not exclusively, many of the answers in Google’s Knowledge Graph without ever being attributed as such.


Wikidata is a free linked database of knowledge that can be read and edited by both humans and machines. It acts as central storage for the structured data of its Wikimedia sister projects including Wikipedia, Wikivoyage, Wikisource, and others. The mission behind Wikidata is clear: if ‘to Google’ has come to stand in for ‘to search’ and “search is the way we now live” (Darnton in Hillis, Petit & Jarrett, 2013, p.5) then ‘to Wikidata’ is ‘to check the digital provenance’. And checking the digital provenance of assertions is pivotal to our suddenly bewildered democracy.
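For the curious, here is roughly what ‘read by machines’ can look like in practice. This is only a minimal sketch, not a definitive recipe: it assumes the public Wikidata Query Service endpoint at https://query.wikidata.org/sparql and its standard SPARQL JSON results format. The helper names (`build_query_url`, `fetch_results`, `extract_values`), the example query and the sample response (with its placeholder item ID) are all my own illustrative inventions.

```python
import json
from typing import List
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query: find items whose English label matches a name exactly.
EXAMPLE_QUERY = """
SELECT ?item WHERE {
  ?item rdfs:label "Mary Susan McIntosh"@en .
}
LIMIT 5
"""

# Hypothetical response in the SPARQL JSON results format.
# "Q00000" is a made-up placeholder, not a real Wikidata item.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["item"]},
    "results": {"bindings": [
        {"item": {"type": "uri",
                  "value": "http://www.wikidata.org/entity/Q00000"}}
    ]},
})


def build_query_url(sparql: str) -> str:
    """Return a GET URL asking the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


def fetch_results(sparql: str) -> str:
    """Send the query over the network and return the raw JSON text.

    Not called in this sketch; the User-Agent string is a placeholder
    you should replace with something that identifies your own script.
    """
    request = Request(build_query_url(sparql),
                      headers={"User-Agent": "example-provenance-check/0.1"})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


def extract_values(response_text: str, variable: str) -> List[str]:
    """Pull every value bound to one variable out of a JSON result set."""
    data = json.loads(response_text)
    return [row[variable]["value"]
            for row in data["results"]["bindings"]
            if variable in row]


if __name__ == "__main__":
    print(build_query_url(EXAMPLE_QUERY))
    print(extract_values(SAMPLE_RESPONSE, "item"))
```

Running the file only prints the request URL and parses the made-up sample, so nothing touches the network unless you call `fetch_results` yourself. Keeping the URL building and the parsing separate from the fetch is a deliberate choice: each step can be checked on its own, which is rather the point of the exercise.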

While fact-checking websites exist & more are springing up all the time, Wikipedia is already firmly established as the place where students and staff conduct pre-research on a topic; “to gain context on a topic, to orient themselves, students start with Wikipedia…. In this unique role, it therefore serves as an ideal bridge between the validated and unvalidated Web.” (Grathwohl, 2011)

Therefore, it is vitally important that Wikipedia’s users know how knowledge is constructed & curated and the difference between fact-checked accurate information from reliable sources and information that plainly isn’t.

“Knowledge creates understanding – understanding is sorely lacking in today’s world. Behind every article on Wikipedia is a Talk page, a public forum where editors hash it out; from citations, notability to truth.” (Katherine Maher, Executive Director of the Wikimedia Foundation, December 2016)

The advent of fake news means that people need somewhere they can turn to where the information is accurate, reliable and trustworthy. Wikipedia editors have been evaluating the validity and reliability of sources and removing those facts not attributed to a reliable published source for years. Therefore, engaging staff and students in Wikipedia assignments embeds source evaluation as a core component of the assignment. Recent research by Harvard Business School has also shown that the process of editing Wikipedia has a profound impact on those that participate in it; editors that become involved in the discourse of an article’s creation with a particular slanted viewpoint or bias actually become more moderate over time. This means editing Wikipedia actually de-radicalises its editors as they seek to work towards a common truth. Would that that were true of other much more partisan sectors of the internet.

Further, popular articles and breaking news stories are often covered on Wikipedia extremely thoroughly, where many eyes make light work of constructing detailed, properly cited, accurate articles. And that might just be the best weapon to combat fake news; while one news source in isolation may give one side of a breaking story, Wikipedia often provides a holistic overview of all the news sources available on a given topic.

Wikipedia already has clear policies on transparency, verifiability, and reliable sources. What it doesn’t have is the knowledge that universities have behind closed doors; often separated into silos or in pay-walled repositories. What it doesn’t have is enough willing contributors to meet the demands of the 1.5 billion unique devices that access it each month in ensuring its coverage of the ever-expanding knowledge is kept as accurate, up-to-date & representative of the sum of all knowledge as possible.

This is where you come in.

 

Conclusion

“It’s up to other people to decide whether they give it any credibility or not,” (Oakeshott in Viner, 2016)

The truth is out there. But it is up to us to challenge claims and to help verify them. This is no easy task in the information age and it is prone to, sometimes very deliberate, obfuscation. Infoglut has become the new censorship; a way of controlling the seemingly uncontrollable. Fact-checking sites have sprung up in greater numbers but they depend on people seeking them out when convenience and cognitive ease have proven time and again to be the drivers for the virtual generation.

We know that Wikipedia is the largest and most popular reference work on the internet. We know that it is transparent and built on verifiability and neutral point of view. We know that it has been combating fake news for years. So if the virtual generation are not armed with the information literacy education to enable them to critically evaluate the sources they encounter and the nature of the algorithms that mediate their interactions with the world, how then are they to make the informed decisions necessary to play their part as responsible online citizens?

It is the response of our governments and our Higher Education institutions to this last question that is the worry.


Postscript – Wikimedia at the University of Edinburgh

As the Wikimedia residency at the University of Edinburgh moves further into its second year we are looking to build on the success of the first year and work with other course leaders and students both inside and outside the curriculum. Starting small has proven to be a successful methodology but bold approaches like the University of British Columbia’s WikiProject Murder, Madness & Mayhem can also prove extremely successful. Indeed, bespoke solutions can often be found to individual requirements.

 

Time and motivation are the two most frequently cited barriers to uptake. These are undoubted challenges to academics, students & support staff, but the experience of this year is that the merits of engagement & an understanding of how Wikipedia assignments & edit-a-thons operate overcome any such concerns in practice. Once understood, Wikipedia can be a powerful tool in an educator’s arsenal. Engagement from course leaders and information professionals, together with support from the institution itself, goes a long way to showing that the time & motivation are well placed.

For educators, engaging with Wikipedia:

  • meets the information literacy & digital literacy needs of our students
  • enhances learning & teaching in the curriculum
  • helps develop & share knowledge in their subject discipline
  • raises the visibility & impact of research in their particular field.

In this way, practitioners can swap out existing components of their practice in favour of Wikimedia learning activities which develop:

  • Critical information literacy skills
  • Digital literacy
  • Academic writing & referencing
  • Critical thinking
  • Literature review
  • Writing for different audiences
  • Research skills
  • Community building
  • Online citizenship
  • Collaboration.

This all begins with engaging in the conversation.

Wikipedia turned 16 on January 15th 2017. It has long been the elephant in the room in education circles but it is time to articulate that Wikipedia does indeed belong in education and that it plays an important role in our understanding & disseminating of the world’s knowledge. With Oxford University now also hosting their own Wikimedian in Residence on a university-wide remit, it is time also to articulate that this conversation is not going away. Far from it, the information & digital literacy needs of our students and staff will only intensify. Higher Education institutions must formulate a response. The best thing we can do as educators & information professionals is to be vigilant and to be vocal: articulating both our vision for Open Knowledge & the pressing need for engagement in skills development as a core part of the university’s mission, and giving our senior managers something they can say ‘Yes’ to.

If you would like to find out more then feel free to contact me at ewan.mcandrew@ed.ac.uk

  • Want to become a Wikipedia editor?
  • Want to become a Wikipedia trainer?
  • Want to run a Wikipedia course assignment?
  • Want to contribute images to Wikimedia Commons?
  • Want to contribute your research to Wikipedia?
  • Want to contribute your research data to Wikidata?

References

Abram, S. (2016). Political literacy can be learned! Internet@Schools, 23(4), 8-10. Retrieved from http://search.proquest.com.ezproxy.is.ed.ac.uk/docview/1825888133?accountid=10673

Alcantara, Chris (2016). “Wikipedia editors are essentially writing the election guide millions of voters will read”. Washington Post. Retrieved 2016-12-10.

Ashok, India (2016-11-18). “UK passes Investigatory Powers Bill that gives government sweeping powers to spy”. International Business Times UK. Retrieved 2016-12-11.

Bates, ME 2016, ‘Embracing the Filter Bubble’, Online Searcher, 40, 5, p. 72, Computers & Applied Sciences Complete, EBSCOhost, viewed 10 December 2016.

Blumenthal, Helaine (2016). “How Wikipedia is unlocking scientific knowledge”. Wiki Education Foundation. 2016-11-03. Retrieved 2016-12-10.

Bode, Leticia (2016-07-01). “Pruning the news feed: Unfriending and unfollowing political content on social media”. Research & Politics. 3 (3): 2053168016661873. doi:10.1177/2053168016661873. ISSN 2053-1680.

Bojesen, Emile (2016-02-22). “Inventing the Educational Subject in the ‘Information Age’”. Studies in Philosophy and Education. 35 (3): 267–278. doi:10.1007/s11217-016-9519-2. ISSN 0039-3746.

Boyd, Danah (2017-01-05). “Did Media Literacy Backfire?”. Data & Society: Points. Retrieved 2017-02-01.

Broeders, Dennis (2016-04-14). “The Secret in the Information Society”. Philosophy & Technology. 29 (3): 293–305. doi:10.1007/s13347-016-0217-3. ISSN 2210-5433.

Burton, Jim (2008-05-02). “UK Public Libraries and Social Networking Services”. Library Hi Tech News. 25 (4): 5–7. doi:10.1108/07419050810890602. ISSN 0741-9058.

Cadwalladr, Carole (2016-12-11). “Google is not ‘just’ a platform. It frames, shapes and distorts how we see the world”. The Guardian. ISSN 0261-3077. Retrieved 2016-12-12.

Carlo, Silkie (2016-11-19). “The Government just passed the most extreme surveillance law in history – say goodbye to your privacy”. The Independent. Retrieved 2016-12-11.

Chaudhry, Peggy E. “The looming shadow of illicit trade on the internet”. Business Horizons. doi:10.1016/j.bushor.2016.09.002.

Clarke, C. (2016). Advertising in the post-truth world. Campaign, Retrieved from http://search.proquest.com.ezproxy.is.ed.ac.uk/docview/1830868225?accountid=10673

Cohen, J. E. (2016). The regulatory state in the information age. Theoretical Inquiries in Law, 17(2), 369-414.

Cover, Rob (2016-01-01). Digital Identities. San Diego: Academic Press. pp. 1–27. doi:10.1016/b978-0-12-420083-8.00001-8. ISBN 9780124200838.

Davis, Lianna (2016-11-21). “Why Wiki Ed’s work combats fake news — and how you can help”. Wiki Education Foundation. Retrieved 2016-12-10.

Derrick, J. (2016, Sep 26). Google is ‘the only potential acquirer’ of Twitter as social media boom nears end. Benzinga Newswires Retrieved from http://search.proquest.com.ezproxy.is.ed.ac.uk/docview/1823907139?accountid=10673

DeVito, Michael A. (2016-05-12). “From Editors to Algorithms”. Digital Journalism. 0 (0): 1–21. doi:10.1080/21670811.2016.1178592. ISSN 2167-0811.

Dewey, Caitlin (2016-05-11). “You probably haven’t even noticed Google’s sketchy quest to control the world’s knowledge”. The Washington Post. ISSN 0190-8286. Retrieved 2016-12-10.

Dewey, Caitlin (2015-03-02). “Google has developed a technology to tell whether ‘facts’ on the Internet are true”. The Washington Post. ISSN 0190-8286. Retrieved 2016-12-10.

Duggan, W. (2016, Jul 29). Where social media fails: ‘echo chambers’ versus open information source. Benzinga Newswires Retrieved from http://search.proquest.com.ezproxy.is.ed.ac.uk/docview/1807612858?accountid=10673

Eilperin, Juliet (11 December 2016). “Trump says ‘nobody really knows’ if climate change is real”. Washington Post. Retrieved 2016-12-12.

Evans, Sandra K. (2016-04-01). “Staying Ahead of the Digital Tsunami: The Contributions of an Organizational Communication Approach to Journalism in the Information Age”. Journal of Communication. 66 (2): 280–298. doi:10.1111/jcom.12217. ISSN 1460-2466.

Facts and Facebook. (2016, Nov 14). Mint Retrieved from http://search.proquest.com.ezproxy.is.ed.ac.uk/docview/1838637822?accountid=10673

Flaxman, Seth; Goel, Sharad; Rao, Justin M. (2016-01-01). “Filter Bubbles, Echo Chambers, and Online News Consumption”. Public Opinion Quarterly. 80 (S1): 298–320. doi:10.1093/poq/nfw006. ISSN 0033-362X.

Fu, J. Sophia (2016-04-01). “Leveraging Social Network Analysis for Research on Journalism in the Information Age”. Journal of Communication. 66 (2): 299–313. doi:10.1111/jcom.12212. ISSN 1460-2466.

Graham, Mark; Straumann, Ralph K.; Hogan, Bernie (2015-11-02). “Digital Divisions of Labor and Informational Magnetism: Mapping Participation in Wikipedia”. Annals of the Association of American Geographers. 105 (6): 1158–1178. doi:10.1080/00045608.2015.1072791. ISSN 0004-5608.

Grathwohl, Casper (2011-01-07). “Wikipedia Comes of Age”. The Chronicle of Higher Education. Retrieved 2017-02-20.

Guo, Jeff (2016). “Wikipedia is fixing one of the Internet’s biggest flaws”. Washington Post. Retrieved 2016-12-10.

Hahn, Elisabeth; Reuter, Martin; Spinath, Frank M.; Montag, Christian. “Internet addiction and its facets: The role of genetics and the relation to self-directedness”. Addictive Behaviors. 65: 137–146. doi:10.1016/j.addbeh.2016.10.018.

Heaberlin, Bradi; DeDeo, Simon (2016-04-20). “The Evolution of Wikipedia’s Norm Network”. Future Internet. 8 (2): 14. doi:10.3390/fi8020014.

Helberger, Natali; Kleinen-von Königslöw, Katharina; van der Noll, Rob (2015-08-25). “Regulating the new information intermediaries as gatekeepers of information diversity”. info. 17 (6): 50–71. doi:10.1108/info-05-2015-0034. ISSN 1463-6697.

Hillis, Ken; Petit, Michael; Jarrett, Kylie (2012). Google and the Culture of Search. Routledge. ISBN 9781136933066.

Hinojo, Alex (2015-11-25). “Wikidata: The New Rosetta Stone | CCCB LAB”. CCCB LAB. Retrieved 2016-12-12.

Holone, Harald (2016-12-10). “The filter bubble and its effect on online personal health information”. Croatian Medical Journal. 57 (3): 298–301. doi:10.3325/cmj.2016.57.298. ISSN 0353-9504. PMC 4937233. PMID 27374832.

The information age. (1995). The Futurist, 29(6), 2. Retrieved from http://search.proquest.com.ezproxy.is.ed.ac.uk/docview/218589900?accountid=10673

Jütte, Bernd Justin (2016-03-01). “Coexisting digital exploitation for creative content and the private use exception”. International Journal of Law and Information Technology. 24 (1): 1–21. doi:10.1093/ijlit/eav020. ISSN 0967-0769.

Kim, Jinyoung; Gambino, Andrew (2016-12-01). “Do we trust the crowd or information system? Effects of personalization and bandwagon cues on users’ attitudes and behavioral intentions toward a restaurant recommendation website”. Computers in Human Behavior. 65: 369–379. doi:10.1016/j.chb.2016.08.038.

Knowledge, HBS Working. “Wikipedia Or Encyclopædia Britannica: Which Has More Bias?”. Forbes. Retrieved 2016-12-10.

Korjus, Kaspar. “Governments must embrace the Information Age or risk becoming obsolete”. TechCrunch. Retrieved 2016-12-10.

Landauer, Carl (2016-12-01). “From Moore’s Law to More’s Utopia: The Candy Crushing of Internet Law”. Leiden Journal of International Law. 29 (4): 1125–1146. doi:10.1017/S0922156516000546. ISSN 0922-1565.

Mathews, Jay (2015-09-24). “Is Hillary Clinton getting taller? Or is the Internet getting dumber?”. The Washington Post. ISSN 0190-8286. Retrieved 2016-12-10.

Mims, C. (2016, May 16). WSJ.D technology: Assessing fears of facebook bias. Wall Street Journal Retrieved from http://search.proquest.com.ezproxy.is.ed.ac.uk/docview/1789005845?accountid=10673

Mittelstadt, Brent Daniel; Floridi, Luciano (2015-05-23). “The Ethics of Big Data: Current and Foreseeable Issues in Biomedical Contexts”. Science and Engineering Ethics. 22 (2): 303–341. doi:10.1007/s11948-015-9652-2. ISSN 1353-3452.

Naughton, John (2016-12-11). “Digital natives can handle the truth. Trouble is, they can’t find it”. The Guardian. ISSN 0261-3077. Retrieved 2016-12-12.

Nolin, Jan; Olson, Nasrine (2016-03-24). “The Internet of Things and convenience”. Internet Research. 26 (2): 360–376. doi:10.1108/IntR-03-2014-0082. ISSN 1066-2243.

Proserpio, L, & Gioia, D 2007, ‘Teaching the Virtual Generation’, Academy Of Management Learning & Education, 6, 1, pp. 69-80, Business Source Alumni Edition, EBSCOhost, viewed 10 December 2016.

Rader, Emilee (2017-02-01). “Examining user surprise as a symptom of algorithmic filtering”. International Journal of Human-Computer Studies. 98: 72–88. doi:10.1016/j.ijhcs.2016.10.005.

Roach, S 1999, ‘Bubble Net’, Electric Perspectives, 24, 5, p. 82, Business Source Complete, EBSCOhost, viewed 10 December 2016.

Interfax (2016). Putin calls for countering of monopoly of Western media in world. Interfax: Russia & CIS Presidential Bulletin Retrieved from http://search.proquest.com.ezproxy.is.ed.ac.uk/docview/1800735002?accountid=10673

Seetharaman, D. (2016). Mark Zuckerberg continues to defend Facebook against criticism it may have swayed election; CEO says social media site’s role isn’t to be ‘arbiters of truth’. Wall Street Journal (Online) Retrieved from http://search.proquest.com.ezproxy.is.ed.ac.uk/docview/1838714171?accountid=10673

Selinger, Evan C 2016, ‘Why does our privacy really matter?’, Christian Science Monitor, 22 April, Academic Search Premier, EBSCOhost, viewed 10 December 2016.

Sentance, Rebecca (2016). “Everything you need to know about Wikimedia’s ‘Knowledge Engine’ so far | Search Engine Watch”. Retrieved 2016-12-10.

Smith, L.N. (2016). ‘School libraries, political information and information literacy provision: findings from a Scottish study’, Journal of Information Literacy, vol 10, no. 2, pp. 3-25. doi:10.11645/10.2.2097

Sinclair, Stephen; Bramley, Glen (2011-01-01). “Beyond Virtual Inclusion – Communications Inclusion and Digital Divisions”. Social Policy and Society. 10 (1): 1–11. doi:10.1017/S1474746410000345. ISSN 1475-3073.

Soma, Katrine; Onwezen, Marleen C; Salverda, Irini E; van Dam, Rosalie I (2016-02-01). “Roles of citizens in environmental governance in the Information Age — four theoretical perspectives”. Current Opinion in Environmental Sustainability. Sustainability governance and transformation 2016: Informational governance and environmental sustainability. 18: 122–130. doi:10.1016/j.cosust.2015.12.009

Stephens, W (2016). ‘Teach Internet Research Skills’, School Library Journal, 62, 6, pp. 15-16, Education Source, EBSCOhost, viewed 10 December 2016.

Tambini, Damian; Labo, Sharif (2016-06-13). “Digital intermediaries in the UK: implications for news plurality”. info. 18 (4): 33–58. doi:10.1108/info-12-2015-0056. ISSN 1463-6697.

“A new kind of weather”. The Economist. 2016-03-26. ISSN 0013-0613. Retrieved 2016-12-10.

Toobin, Jeffrey (29 September 2014). “Google and the Right to Be Forgotten”. The New Yorker. Retrieved 2016-12-11.

Tuffley, D, & Antonio, A 2016, ‘Ethics in the Information Age’, AQ: Australian Quarterly, 87, 1, pp. 19-40, Political Science Complete, EBSCOhost, viewed 10 December 2016.

Upworthy celebrates power of empathy with event in NYC and release of new research study. (2016, Nov 15). PR Newswire Retrieved from http://search.proquest.com.ezproxy.is.ed.ac.uk/docview/1839078078?accountid=10673

Usher-Layser, N (2016). ‘Newsfeed: Facebook, Filtering and News Consumption. (cover story)’, Phi Kappa Phi Forum, 96, 3, pp. 18-21, Education Source, EBSCOhost, viewed 10 December 2016.

van Dijck, Jose. Culture of Connectivity: A Critical History of Social Media – Oxford Scholarship. doi:10.1093/acprof:oso/9780199970773.001.0001.

Velocci, A. L. (2000). Extent of internet’s value remains open to debate. Aviation Week & Space Technology, 153(20), 86. Retrieved from http://search.proquest.com/docview/206085681?accountid=10673

Viner, Katherine (2016-07-12). “How technology disrupted the truth”. The Guardian. ISSN 0261-3077. Retrieved 2016-12-11.

Wadhera, Mike. “The Information Age is over; welcome to the Experience Age”. TechCrunch. Retrieved 2016-12-12.

Weichert, Stephan (2016-01-01). “From Swarm Intelligence to Swarm Malice: An Appeal”. Social Media + Society. 2 (1): 2056305116640560. doi:10.1177/2056305116640560. ISSN 2056-3051.

Winkler, R. (2016). Business news: Big Silicon Valley voice cuts himself off — Marc Andreessen takes ‘Twitter break’ to the bewilderment of other techies. Wall Street Journal Retrieved from http://search.proquest.com.ezproxy.is.ed.ac.uk/docview/1824805569?accountid=10673

White, Andrew (2016-12-01). “Manuel Castells’s trilogy the information age: economy, society, and culture”. Information, Communication & Society. 19 (12): 1673–1678. doi:10.1080/1369118X.2016.1151066. ISSN 1369-118X.

Wine, Lois D. (2016-03-28). “School Librarians as Technology Leaders: An Evolution in Practice”. Journal of Education for Library and Information Science. doi:10.12783/issn.2328-2967/57/2/12. ISSN 2328-2967.

Yes, I’d lie to you; The post-truth world. 2016. The Economist, 420(9006), pp. 20.

Zekos, G. I. (2016). Intellectual Property Rights: A Legal and Economic Investigation. IUP Journal Of Knowledge Management, 14(3), 28-71.

Search failure – Information Retrieval in an age of Infoglut

Search failure:

The challenges facing information retrieval in an age of information explosion.

 

Abstract:

This article takes, as its starting point, the news that Wikipedia were reportedly developing a ‘Knowledge Engine’ and focuses on the most dominant web search engine, Google, to examine the “consecrated status” (Hillis, Petit & Jarrett, 2013) it has achieved and its transparency, reliability & trustworthiness for everyday searchers.

A bit of light reading on information retrieval – Own work, CC-BY-SA.

“Commercial search engines dominate search-engine use of the Internet, and they’re employing proprietary technologies to consolidate channels of access to the Internet’s knowledge and information.” (Cuthbertson, 2016)

 

On 16th February 2016, Newsweek published a story entitled ‘Wikipedia Takes on Google with New ‘Transparent’ Search Engine’. The figure applied for, and granted by the Knight Foundation, was a reported $250,000 as part of the Wikimedia Foundation’s $2.5 million programme to build ‘the Internet’s first transparent search engine’.

The sum applied for was relatively insignificant when compared to Google’s reported $75 billion revenue in 2015 (Robinson, 2016). Yet, it posed a significant question; a fundamental one. Just how transparent is Google?

 

Two further concerns can be identified from the letter to Wikimedia granting the application: “supporting stage one development of the Knowledge Engine by Wikipedia, a system for discovering reliable and trustworthy public information on the Internet.” (Cuthbertson, 2016). This goes to the heart of the current debate on modern information retrieval: transparency, reliability and trustworthiness. How then are we faring in these three measures?

 

  1. Defining Information Retrieval

Information Retrieval is defined as “a field concerned with the structure, analysis, organisation, storage, searching, and retrieval of information.” (Salton in Croft, Metzler & Strohman, 2010, p.1).

Croft et al (2010) identify three crucial concepts in information retrieval:

  • Relevance – Does the returned value satisfy the user searching for it?
  • Evaluation – Evaluating the ranking algorithm on its precision and recall.
  • Information Needs – What needs generated the query in the first place?
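Precision and recall, the two evaluation measures named in the list above, are easier to grasp with a small worked example. The sketch below is illustrative only: the document IDs and the `precision_recall` helper are hypothetical and are not taken from Croft et al.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    precision = share of the retrieved documents that are relevant
    recall    = share of the relevant documents that were retrieved
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical query: the engine returns 4 pages, while 5 pages are truly relevant.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5", "d6", "d7"]
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four returned pages are relevant (precision 0.50), but only two of the five relevant pages were found (recall 0.40). A search engine can score well on one measure while doing poorly on the other.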

Today, since the advent of the internet, this definition needs to be understood in terms of how pervasive ‘search’ has become. “Search is the way we now live.” (Darnton in Hillis, Petit & Jarrett, 2013, p.5). We are all now ‘searchers’ and the act of ‘searching’ (or ‘googling’) has become intrinsic to our daily lives.

By Typing_example.ogv: NotFromUtrecht derivative work: Parzi [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0) or GFDL (http://www.gnu.org/copyleft/fdl.html)], via Wikimedia Commons

  1. Dominance of one search engine

 

“When you turn on a tap you expect clean water to come out and when you do a search you expect good information to come out” (Swift in Hillis, Petit & Jarrett, 2013)

 

With over 60 trillion pages (Fichter and Wisniewski, 2014) and terabytes of unstructured data to navigate, the need for speedy & accurate responses to millions of queries has never been more important.

 

Navigating the vast sea of information present on the web means the field of Information Retrieval necessitates wrestling with, and constantly tweaking, the design of complex computer algorithms (determining a top 10 list of ‘relevant’ page results through over 200 factors).

 

Google, powered by its PageRank algorithm, has dominated I.R. since the late 1990s, indexing the web like a “back-of-the-book” index (Chowdhury, 2010, p.5). While this oversimplifies the complexity of the task, modern information retrieval, in searching through increasingly multimedia online resources, has necessitated the addition of newer, more sophisticated models. Utilising ‘artificial intelligence’ & semantic search technology to complement the PageRank algorithm, Google now navigates through the content of pages & generates suggested ‘answers’ to queries as well as the 10 clickable links users commonly expect.

 

According to 2011 figures in Hillis, Petit & Jarrett (2013), Google processed 91% of searches internationally and 97.4% of the searches made using mobile devices. This undoubted & sustained dominance has led to accusations of abuse of power in two recent instances.

 

Nicas & Kendall (2016) report that the Federal Trade Commission along with European regulators are examining claims that Google has been abusing its position in terms of smartphone companies feeling they had to give Google Services preferential treatment because of Android’s dominance.

 

In addition, Robinson (2016) states that the Authors Guild are petitioning the Supreme Court over Google’s alleged copyright infringement, going back a decade to when over 20 million library books were digitised without compensation or author/publisher permission. The argument is that the content taken has since been utilised by Google for commercial gain to generate more traffic and more advertising money, and thus to confer on them market leader status. This echoes the New Yorker article’s response to Google’s aspiration to build a digital universal library: “Such messianism cannot obscure the central truth about Google Book Search: it is a business” (Toobin in Hillis, Petit & Jarrett, 2013).

 

  1. PageRank

Google’s business is powered, like every search engine, by its ranking algorithm. For Cahill et al (2009), Google’s “PageRank is a quantitative rather than qualitative system”. PageRank works by ranking pages in terms of how well linked a page is, how often it is clicked on and the importance of the page(s) that link to it. In this way, PageRank assigns importance to a page.
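To illustrate how importance can flow from the pages that link to a page, here is a minimal sketch of the textbook, link-only formulation of PageRank, computed by repeated iteration. It is an assumption-laden toy rather than Google’s production system: the four-page web is hypothetical, the damping factor of 0.85 is the commonly cited textbook value, and click data is not modelled at all.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over {page: [pages it links to]}.

    Every link target must itself appear as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page passes its rank on, split evenly across its links.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical four-page web: every page links to "hub"; "hub" links only to "a".
links = {
    "hub": ["a"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "a"],
}
for page, score in sorted(pagerank(links).items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy web "hub" comes out on top because every other page links to it, and "a" ranks well above "b" and "c" largely because the important "hub" page links to it. Nothing in the calculation looks at what any page actually says, which is the sense in which the system is quantitative rather than qualitative.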

 

Other parameters are taken into consideration including, most notably, the anchor text which provides a short descriptive summary of the page it links to. However, the anchor text has been shown to be vulnerable to manipulation, primarily from bloggers, by the process known as ‘Google bombing’. Google bombing is defined as “the activity of designing Internet links that will bias search engine results so as to create an inaccurate impression of the search target” (Price in Bar-Ilan, 2007). Two famous examples include when Microsoft came as top result for the query ‘More evil than Satan’ and when President Bush ranked as first result for ‘miserable failure’. Bar-Ilan (2007) suggests google bombs come about for a variety of reasons: ‘fun’, ‘personal promotion’, ‘commercial’, ‘justice’, ‘ideological’ and ‘political’.
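The mechanics of a google bomb can be sketched with a deliberately naive ranker that trusts anchor text completely. Everything here is hypothetical: the `rank_by_anchor_text` function, the `.example` site names and the scoring rule are illustrative assumptions, not a description of any real search engine’s algorithm.

```python
from collections import Counter


def rank_by_anchor_text(links, query):
    """Rank target pages by how many inbound links use the query as anchor text.

    `links` is a list of (source_page, anchor_text, target_page) tuples.
    This is a deliberately naive scorer, not any real engine's algorithm.
    """
    query = query.lower()
    scores = Counter()
    for _source, anchor, target in links:
        if query in anchor.lower():
            scores[target] += 1
    return scores.most_common()


# Hypothetical web: one honest link, then a coordinated campaign by five bloggers.
links = [("dictionary.example", "miserable failure", "definitions.example")]
links += [
    (f"blog{i}.example", "miserable failure", "politician.example")
    for i in range(5)
]
print(rank_by_anchor_text(links, "miserable failure"))
# prints [('politician.example', 5), ('definitions.example', 1)]
```

Five coordinated links are enough to push the targeted page above the one page that is genuinely about the phrase, even though the targeted page need not contain the words at all.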

 

Although Google was reluctant to alter search results, the reputational damage google bombs were causing necessitated a response. In the end, Google altered the algorithm to defuse a number of google bombs. Despite this, “spam or joke sites still float their way to the top.” (Cahill et al, 2009) so there is a clear argument to be had about Google, as a private corporation, continuing to ‘tinker’ with the results delivered by its algorithm and how much its coders should, or should not, arbitrate access to the web in this way. After all, the algorithm will already bear hallmarks of their own assumptions without any transparency on how these decisions are arrived at. Further, google bombs, Byrne (2004) argues, empower those web users whom the ranking system, for whatever reason, has disenfranchised.

 

Just how reliable & trustworthy is Google?

 

“Easy, efficient, rapid and total access to Truth is the siren song of Google and the culture of search. The price of access: your monetizable information.” (Hillis, Petit & Jarrett, 2013, p.7)

For Cahill et al (2009), Google has made the process of searching too easy and searchers have become lazier as a result, accepting Google’s ranking at face value. Markland in van Dijck (2010) makes the point that students’ favouring of Google means they are dispensing with the services libraries provide. The implication is that, despite library information services delivering a more relevant & higher quality search result, Google’s quick & easy ‘fast food’ approach is hard to compete with.

This seemingly default trust in the neutrality of Google’s ranking algorithm also has a ‘funnelling effect’ according to Beel & Gipp (2009); narrowing the sources clicked upon 90% of the time to just the first page of results with a 42% click through on the first choice alone. This then creates a cosy consensus in terms of the fortunate pages clicked upon which will improve their ranking while “smaller, less affluent, alternative sites are doubly punished by ranking algorithms and lethargic searchers.” (Pan et al. in van Dijck, 2010)

 

While Google would no doubt argue that all search engines closely guard how their ranking algorithms are calibrated to protect them from aggressive competition, click fraud and SEO marketing, the secrecy is clearly at odds with principles of public librarianship. Further, van Dijck (2010) argues that this worrying failure to disclose is concealing how knowledge is produced through Google’s network and the commercial nature of Google’s search engine. After all, search engines’ greatest asset is the metadata each search leaves behind. This data can be aggregated and used by the search engine to create profiles of individual search behaviour and collective profiles which can then be passed on to other commercial companies for profit. That is not to say it always does but there is little legislation to stop it in an area that is largely unregulated. The right to privacy does not, it seems, extend to metadata and ‘in an era in which knowledge is the only bankable commodity, search engines own the exchange floor.’ (Halavais in van Dijck, 2010)

The University of Edinburgh by Mihaela Bodlovic – http://www.aliceboreasphotography.com/ (CC-BY-SA)

 

  1. Scholarly knowledge and the reliability of Google Scholar

When considering the reliability, transparency & trustworthiness of Google and Google Scholar it is pertinent to look at its scope and differences with other similar sites. Unlike Pubmed and Web of Science, Google Scholar is not a human-curated database but is instead an internet search engine; its accuracy & content therefore vary greatly depending on what has been submitted to it. Google Scholar does have an advantage in that it searches the full text of articles, so users may find searching easier on Scholar compared to WoS or Pubmed, which are limited to searching according to the abstract, citations or tags.

Where Google Scholar could be more transparent is in its coverage as some notable publishers have been known, according to van Dijck (2010), to refuse to give access to their databases. Scholar has also been criticised for the lack of completeness of its citations, as well as its coverage of social science and humanities databases; the latter an area of strength for Wikipedia according to Park (2011). But the searcher utilising Google Scholar would be unaware of these problems of scope when they came to use it.

Further, Beel & Gipp (2009) state that the ranking system on Google Scholar leads to articles with lots of citations receiving higher rankings and, as a result, receiving even more citations. Hence, while the digitization of sources on the internet opens up new avenues for scholarly exploration, ranking systems can be seen to close ranks on a select few to the exclusion of others.
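The feedback loop described above can be sketched as a toy simulation. It is a thought experiment under stated assumptions, not a model of Google Scholar’s actual ranking: the `simulate_citation_feedback` function, the four articles and the attention percentages per rank position are all hypothetical figures chosen for illustration.

```python
def simulate_citation_feedback(citations, rounds=10, new_citations_per_round=100,
                               attention_percent=(50, 30, 15, 5)):
    """Rank articles by citation count, then hand out new citations by rank.

    `attention_percent` is an assumed share of new citations going to the
    article at rank 1, 2, 3, ...; it needs one entry per article. The figures
    are illustrative, not measured values.
    """
    counts = dict(citations)
    for _ in range(rounds):
        # Most-cited article first, mimicking a citation-weighted ranking.
        ranked = sorted(counts, key=lambda article: -counts[article])
        for position, article in enumerate(ranked):
            counts[article] += new_citations_per_round * attention_percent[position] // 100
    return counts


# Hypothetical starting point: four articles with nearly equal citation counts.
start = {"A": 12, "B": 11, "C": 10, "D": 9}
end = simulate_citation_feedback(start)
for article in start:
    share_before = start[article] / sum(start.values())
    share_after = end[article] / sum(end.values())
    print(f"{article}: {share_before:.0%} of citations -> {share_after:.0%}")
```

Under these assumptions article A starts with a lead of a single citation yet finishes with roughly half of all citations, while D falls from about a fifth to around six per cent. The initial ordering is never overturned because each round’s ranking rewards whoever was already ahead.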

As Van Dijck (2010) points out: “Popularity in the Google-universe has everything to do with quantity and very little with quality or relevance.” In effect, ranking systems determine which sources we can see but conceal how this determination has come about. This means that we are unable to truly establish the scope & relevance of our search results. In this way, search engines cannot be viewed as neutral, passive instruments but are instead active “actor networks” and “co-producers of academic knowledge.” (van Dijck, 2010).

Further, it can be argued that Google decides which sites are included in its top ten results. With so much to gain commercially from being discoverable on Google’s first page of results, the practice of Search Engine Optimisation (SEO), or manipulating the algorithm to get your site in the top ten search results, has become widespread. SEO techniques can be split into ‘white hat’ (legitimate businesses with a relevant product to sell) and ‘black hat’ (sites that just want clicks and tend not to care about the ‘spamming’ techniques they employ to get them). As a result, PageRank has to be constantly adjusted, as with Google bombs, to counteract the effects of increasingly sophisticated ‘black hat’ techniques. Hence, improved vigilance & critical evaluation of the searches returned by Google have become crucial skills in modern information retrieval.

 

  1. The solution: Google’s response to modern information retrieval – Answer Engines

Google is the great innovator and is always seeking newer, better ways of keeping users on its sites and improving its search algorithm. Hence, the arrival of Google Instant in 2010 to autofill suggested keywords to assist searchers. This was followed by Google’s Knowledge Graph (and its Microsoft equivalent Bing Snapshot). These new services seek not just to provide the top ten links to a search query but also to ‘answer’ it by providing a number of the most popular suggested answers on the page results screen (usually showing an excerpt of the related Wikipedia article & images along the side panel), based on, & learning from, previous users’ searches on that topic.

Google’s Knowledge Graph is supported by sources including Wikipedia & Freebase (and the linked data they provide) along with a further innovation, RankBrain, which utilises artificial intelligence to help decipher the 15% of queries Google has not seen before. As Barr (2016) recognises: “A.I. is becoming increasingly important to extract knowledge from Google’s sea of data, particularly when it comes to classifying and recognizing patterns in videos, images, speech and writing.”

Bing Snapshot does much the same. The difference being that Bing provides links to the sources it uses as part of the ‘answers’ it provides. Google provides information but does not attribute it. Without this, it is impossible to verify their accuracy. This seems to be one of the thorniest issues in modern information retrieval; link decay and the disappearing digital provenance of sources. This is in stark contrast to Wikimedia’s efforts in creating Wikidata: “an open-license machine-readable knowledge base” (Dewey 2016) capable of storing digital provenance & structured bibliographic data. Therefore, while Google Knowledge Panels are a step forward, there are issues again over its transparency, reliability & trustworthiness.

Moreover, the 2014 EU Court ruling on ‘the right to be forgotten’, which Google have stated they will honour, also muddies the waters on issues of transparency & link decay/censorship:

“Accurate search results are vanishing in Europe with no public explanation, no real proof, no judicial review, and no appeals process ... the result is an Internet riddled with memory holes — places where inconvenient information simply disappears.” (Fioretti, 2014).

The balance between an individual’s “right to be forgotten” and the freedom of information clearly still has to be found. At the moment, in the name of transparency, both Google and Wikimedia are posting notifications to affected pages that they have received such requests. For those wishing to be ‘forgotten’ this only highlights the matter & fuels speculation unnecessarily.

 

  1. The solution: Wikipedia’s ‘transparent’ search engine: Discovery

Since the setup of the ‘Discovery’ team in April 2015 and the disclosure of the Knight Foundation grant, there have been mixed noises from Wikimedia with some claiming that there was never any plan to rival Google because a newer ‘internal’ search engine was only ever being developed in order to integrate Wikimedia projects through one search portal.

Ultimately, a lack of consultation between the board and the wider Wikimedia community members reportedly undermined the project & culminated in the resignation of Lila Tretikov, Executive Director of the Wikimedia Foundation, at the end of February and the plans for Discovery were shelved.

However, Sentance (2016) reveals that, in their leaked planning documents for Discovery, the Foundation were indeed looking at the priorities of proprietary search engines, their own reliance on them for traffic and how they could recoup traffic lost to Google (through Google’s Knowledge Graph) at the same time as providing a central hub for information from across all their projects through one search portal. Wikipedia results, after all, regularly featured in the top page of Google results anyway – why not skip the middle man?

Quite how internet searchers may have taken to a completely transparent, non-commercial search engine we’ll possibly never know. However, it remains a tantalizing prospect.

 

  1. The solution: Alternative Search Engines

An awareness of the alternative search engines available for use and their different strengths and weaknesses is a key component of the information literacy needed to navigate this sea of information. Bing Snapshot, for instance, does more than Google at present to provide the digital provenance of its sources.

Notess (2016) serves notice that computational searching (e.g. Wolfram Alpha) continues to flourish along with search engines geared towards data & statistics (e.g. Zanran, DataCite.org and Google Public Data Explorer).

However, knowing about the existence of these differing search engines is one thing, but knowing how to navigate them successfully is quite another, as Notess (2016) himself concludes: “Finding anything beyond the most basic of statistics requires perseverance and experimenting with a variety of strategies.”

Information literacy, it seems, is key.

Information Literacy
By Ewa Rozkosz via Flickr (CC-BY-SA)

 

  1. The solution: The need for information literacy

Given that electronic library services are maintained by information professionals, “values such as quality assessment, weighed evaluation & transparency” (van Dijck, 2010) are in much greater evidence than in commercial search engines. That is not to say that there aren’t still issues in library OPAC systems: whether it be in terms of the changes in the classification system used over time or the differing levels of adherence by staff to these classification protocols; or the communication to users of best practice in utilising the system.

The use of any search engine requires literacy among the user group. The fundamental problem remains the disconnect between what a user inputs and what they can feasibly expect at the results stage. Understanding the nature of the search engine being used (proprietary or otherwise), a critical awareness of how knowledge is formed through its network, and the type of search statement that will maximise your chances of success are all vital. As van Dijck (2010) states, “Knowledge is not simply brokered (‘brought to you’) by Google or other search engines… Students and scholars need to grasp the implications of these mechanisms in order to understand thoroughly the extent of networked power”.

Educating users about this broadens the search landscape and defuses SEO attempts to circumvent our choices. Information literacy cannot be left to academics or information professionals alone, though they can play a large part in its dissemination. As mentioned at the beginning, we are all ‘searchers’. Therefore, it is incumbent on all of us to become literate in the ways of ‘search’ and pass it on, creating our own knowledge networks. Social media offers us a means of doing this, allowing us to filter information as never before, and filtering is “transforming how the web works and how we interact with our world” (Swanson, 2012).

 

Conclusion

Google may never become any more transparent. Hence, its reliability & trustworthiness will always be hard to judge. Wikipedia’s Knowledge Engine may have offered a distinctive model more in line with these terms but it is unlikely, at least for now, to be able to compete as a global crawler search engine.

 

 

Therefore, it is incumbent on searchers not to presume neutrality or ascribe any kind of benign munificence to any one search engine. Rather, by educating themselves as to the merits & drawbacks of Google and other search engines, users will be able to formulate their searches, and their use of search engines, with a degree of information literacy. Only then can they hope the returned results will match their individual needs with any degree of satisfaction or success.

Bibliography

  1. Arnold, A. (2007). Artificial intelligence: The dawn of a new search-engine era. Business Leader, 18(12), p. 22.
  2. Bar-Ilan, Judit (2007). “Manipulating search engine algorithms: the case of Google”. Journal of Information, Communication and Ethics in Society 5 (2/3): 155–166. doi:10.1108/14779960710837623. ISSN 1477-996X.
  3. Barr, A. (2016). WSJ.D Technology: Google Taps A.I. Chief To Replace Departing Search-Engine Head. Wall Street Journal. ISSN 00999660.
  4. Beel, J.; Gipp, B. (2009). “Google Scholar’s ranking algorithm: The impact of citation counts (An empirical study)”. 2009 Third International Conference on Research Challenges in Information Science: 439–446. doi:10.1109/RCIS.2009.5089308.
  5. Byrne, S. (2004). Stop worrying and learn to love the Google-bomb. Fibreculture, (3).
  6. Cahill, Kay; Chalut, Renee (2009). “Optimal Results: What Libraries Need to Know About Google and Search Engine Optimization”. The Reference Librarian 50 (3): 234–247. doi:10.1080/02763870902961969. ISSN 0276-3877.
  7. Chowdhury, G.G. (2010). Introduction to modern information retrieval. Facet. ISBN 9781856046947.
  8. Croft, W. Bruce; Metzler, Donald; Strohman, Trevor (2010). Search Engines: Information Retrieval in Practice. Pearson Education. ISBN 9780131364899.
  9. Cuthbertson, A. (2016). “Wikipedia takes on Google with new ‘transparent’ search engine”. Available at: http://europe.newsweek.com/wikipedia-takes-google-new-transparent-search-engine-427028. Retrieved 2016-05-08.
  10. Dewey, Caitlin (2016). “You probably haven’t even noticed Google’s sketchy quest to control the world’s knowledge”. The Washington Post. ISSN 0190-8286. Retrieved 2016-05-13.
  11. Fichter, D. and Wisniewski, J. (2014). Being Findable: Search Engine Optimization for Library Websites. Online Searcher, 38(5), pp. 74-76.
  12. Fioretti, Julia (2014). “Wikipedia fights back against Europe’s right to be forgotten”. Reuters. Retrieved 2016-05-02.
  13. Foster, Allen; Rafferty, Pauline (2011). Innovations in Information Retrieval: Perspectives for Theory and Practice. Facet. ISBN 9781856046978.
  14. Gunter, Barrie; Rowlands, Ian; Nicholas, David (2009). The Google Generation: Are ICT Innovations Changing Information-seeking Behaviour?. Chandos Publishing. ISBN 9781843345572.
  15. Halcoussis, Dennis; Halverson, Aniko; Lowenberg, Anton D.; Lowenberg, Susan (2002). “An Empirical Analysis of Web Catalog User Experiences”. Information Technology and Libraries 21 (4). ISSN 0730-9295.
  16. Hillis, Ken; Petit, Michael; Jarrett, Kylie (2012). Google and the Culture of Search. Routledge. ISBN 9781136933066.
  17. Hoffman, A.J. (2016). Reflections: Academia’s Emerging Crisis of Relevance and the Consequent Role of the Engaged Scholar. Journal of Change Management, 16(2), p. 77.
  18. Kendall, Susan. “LibGuides: PubMed, Web of Science, or Google Scholar? A behind-the-scenes guide for life scientists: So which is better: PubMed, Web of Science, or Google Scholar?”. libguides.lib.msu.edu. Retrieved 2016-05-02.
  19. Koehler, W.C. (1999). “Classifying Web sites and Web pages: the use of metrics and URL characteristics as markers”. Journal of Librarianship and Information Science 31 (1): 21–31. doi:10.1177/0961000994244336. ISSN 0000-0000.
  20. LaFrance, Adrienne (2016). “The Internet’s Favorite Website”. The Atlantic. Retrieved 2016-05-12.
  21. Lecher, Colin (2016). “Google will apply the ‘right to be forgotten’ to all EU searches next week”. The Verge. Retrieved 2016-04-29.
  22. Mendez-Wilson, D. (2000). ‘Humanizing The Online Experience’, Wireless Week, 6, 47, p. 30, Business Source Premier, EBSCOhost, viewed 1 May 2016.
  23. Milne, David N.; Witten, Ian H.; Nichols, David M. (2007). “A Knowledge-based Search Engine Powered by Wikipedia”. Proceedings of the Sixteenth ACM Conference on Conference on Information and Knowledge Management. CIKM ’07 (New York, NY, USA: ACM): 445–454. doi:10.1145/1321440.1321504. ISBN 9781595938039.
  24. Moran, Wes & Tretikov, Lila (2016). “Clarity on the future of Wikimedia search – Wikimedia blog”. Retrieved 2016-05-10.
  25. Nicas, J. and Kendall, B. (2016). “U.S. Expands Google Probe”. Wall Street Journal. ISSN 00999660.
  26. Notess, G.R. (2013). Search Engine to Knowledge Engine? Online Searcher, 37(4), pp. 61-63.
  27. Notess, G.R. (2016). SEARCH ENGINE update. Online Searcher, 40(2), pp. 8-9.
  28. Notess, G.R. (2016). SEARCH ENGINE update. Online Searcher, 40(1), pp. 8-9.
  29. Notess, G.R. (2014). Computational, Numeric, and Data Searching. Online Searcher, 38(4), pp. 65-67.
  30. Park, Taemin Kim (2011). “The visibility of Wikipedia in scholarly publications”. First Monday 16 (8). doi:10.5210/fm.v16i8.3492. ISSN 1396-0466.
  31. Price, Gary (2016). “Digital Preservation Coalition Releases New Tech Watch Report on Preserving Social Media | LJ INFOdocket”. www.infodocket.com. Retrieved 2016-05-01.
  32. Ratcliff, Chris (2016). “Six of the most interesting SEM news stories of the week | Search Engine Watch”. Retrieved 2016-05-10.
  33. Robinson, R. (2016). How Google Stole the Work of Millions of Authors. Wall Street Journal. ISSN 00999660.
  34. Rowley, J. E.; Hartley, Richard J. (2008). Organizing Knowledge: An Introduction to Managing Access to Information. Ashgate Publishing, Ltd. ISBN 9780754644316.
  35. Sandhu, A. K.; Liu, T. (2014). “Wikipedia search engine: Interactive information retrieval interface design”. 2014 3rd International Conference on User Science and Engineering (i-USEr): 18–23. doi:10.1109/IUSER.2014.7002670.
  36. Sentance, R. (2016). “Everything you need to know about Wikimedia’s ‘Knowledge Engine’ so far | Search Engine Watch”. Retrieved 2016-05-02.
  37. Simonite, Tom (2013). “The Decline of Wikipedia”. MIT Technology Review. Retrieved 2016-05-09.
  38. Swanson, Troy (2012). Managing Social Media in Libraries: Finding Collaboration, Coordination, and Focus. Elsevier. ISBN 9781780633770.
  39. Van Dijck, José (2010). “Search engines and the production of academic knowledge”. International Journal of Cultural Studies 13 (6): 574–592. doi:10.1177/1367877910376582. ISSN 1367-8779.
  40. Wells, David (2007). “What is a library OPAC?”. The Electronic Library 25 (4): 386–394. doi:10.1108/02640470710779790. ISSN 0264-0473.

 

Bibliographic databases utilised

 

OER17 – Less goat and More (empathetic) bear

I attended OER16, my first OER conference, but did not present. I had my own side room, just off the main drag, where I could provide respite from the main programme and entertain the Wiki curious.

Mostly I fired out tweets, recorded sessions and observed. And, it has to be said, had a great time doing so.

This year’s OER17 Conference was a different kettle of fish. I felt there was a lot to say, and be said, so I ill-advisedly submitted four sessions (I retracted a fifth on ‘Wikimedia vs. the Right to be Forgotten’).

Martin Poulter: Putting Wikipedia and Open Practice into the mainstream in a University at OER17 (Own work, CC-BY-SA)

Thankfully, my colleague Martin Poulter came to my aid to assist with, and improve on, two of these sessions (one about goats and one about Wikimedia games). In the end I’m glad we went for it this year because, between Lucy Crompton-Reid’s brilliant keynote and fab sessions from Alice White, Stefan Lutschinger, Sara Mörtsell, Martin Poulter and Navino Evans, I think the Wikimedia presentations played a really positive role in this year’s conference after what has been such a low year in politics. But maybe I’m just biased.

And our biases were laid out in the open this year, I think, because the theme was ‘The Politics of Open’ and politics is, no getting away from it, deeply personal. ‘Shouting from the heart’ was the mot juste. Perhaps because of this, or the steady supply of coffee and biscuits, the conference did seem that much fuller of warm embraces, smiles and laughter as much as critical discourse. People being good-natured with one another, huddling together in dark times, espousing what they held to be true. And this was not so much bonhomie as ‘bonfemie’ (doubtful this will catch on) because the conference had such a surfeit of brilliant articulate women forming its backbone with an all-female list of keynotes and plenary speakers. (The Arsenal fans in the pub next door would have appreciated such a strong backbone to their side no doubt.)

Lorna Campbell – The Distance Travelled: Reflections on open education policy in the UK since the Cape Town Declaration (Own work, CC-BY-SA)

I still need to catch up on Thursday’s talks but here’s what I observed:

I observed passion (Lorna Campbell’s blistering first talk on UK Open Education policy left scorched earth in her wake and her second ‘Shouting from the Heart’, invoking the Declaration of Arbroath, had her choked and us fair greetin’).

I observed cool logic (because logic is cool and, from what I observed, there are no greater purveyors of undeniable reasoning than the three M’s: Martin Poulter, Martin Weller and Melissa Highton).

Handy definitions from Melissa Highton’s talk – ‘Brexit, praxis and OER redux – why not being open now costs us money in the future.’ (Own work, CC-BY-SA)

I observed fun and playfulness in our Wikimedia Games session (which exposed Lucy’s competitive side) and Charlie Farley’s Board Game Jam. The #LILAC17 Credo Digital Literacy award-winning Charlie Farley no less.

Passion. Logic. Playfulness. Qualities that, to my mind, are what education should be about.

Godwin’s Law (redefined) meant that Trexit had to be discussed at some point during the conference, while calls to action and calls for solidarity were also made and answered (“Let’s make copyright right right now”, “Repeal the 8th” and “#IWill” for instance).

‘Get your smart phone out!’ – Lisette Kalshoven and Alek Tarkowski fixing copyright for teachers and students at OER17 (Own work, CC-BY-SA)

And we came out of the two days feeling pretty upbeat that there may actually be a way through the woods, out of the “unenlightenment” and into the bright future of a Viv Rolfe and David Kernohan chaired #OER18.

(I could be wrong but there may even have been a moment of demob happiness around the room watching David rise out his seat to announce we could call him #OER18 co-chair).

No mean feat anyway after a grim year.

In this respect, I think Maha Bali’s keynote was an inspired choice and really set the tone for the whole two days. If politics is personal then the act of gift-giving is personal too; imposing your choices on someone else; whether it is the ‘gift’ of an open educational resource or the ‘gift’ of your elder brother buying you a Pixies CD for your birthday when he had the only CD player in the house and you’d never heard of the Pixies at that point. (He gave me a cassette copy in the end and kept the CD).

I’m grateful to Maha for the reminder of my brother’s wiliness but also that the best quality an educator has (beyond passion, logic and playfulness) is empathy.

Being able to empathise with other learners and considering how they can best access learning materials and the kinds of barriers they come up against is critical in OEP. You may think you’re being inclusive but we are too often trapped in our own worldview, traveling those same over-trammelled thought pathways; unable to see that our solutions aren’t really solutions at all or understand, or even acknowledge, the challenges of access or licensing others face; the obstacles they may have to overcome; the risks they may have to take.

“Self-absorption in all its forms kills empathy, let alone compassion. When we focus on ourselves, our world contracts as our problems and preoccupations loom large. But when we focus on others, our world expands. Our own problems drift to the periphery of the mind and so seem smaller, and we increase our capacity for connection – or compassionate action.”
― Daniel Goleman, Social Intelligence: The New Science of Human Relationships

So that’s my takeaway:

Be less goat.

Be more empathetic bear.

Cheers to Josie, Alek, Maren and the rest of the ALT team.

Link to the summary of the Wikimedia related sessions at the Open Education Conference.

Related post: