Month: June 2016

COMING SOON: Wikidata & Wikisource Showcase for Repo-Fringe

Wikisource logo
Wikidata logo

For Repo-Fringe 2016, Histropedia’s Navino Evans and I will be co-presenting a showcase of two of Wikipedia’s sister projects: Wikisource, the free content library, and Wikidata, the structured data knowledge base. With both projects, it is not so much about what they hold in their repositories as what that knowledge means to the user able to access it. That might be the experience of communing with the past through Wikisource, for those authentic ‘shiver-inducing’ moments of digital contact with library & archival materials, or it might be manipulating & visualising structured data through Wikidata, querying & utilising information from Wikipedia in myriad ways as never before. The possibilities for both projects are endless and highlight the importance of curating & safeguarding repositories of open knowledge such as these.

Hence our showcase event, as part of Repository Fringe 2016 on 2nd August at the John McIntyre Conference Centre in Edinburgh, will focus on this and provide practical demonstrations of how to engage with the past, present & future with these two projects.

Consequently, the English teacher part of me has opted for a title which attempts to sum this up:

“It’s not what you do. It’s what it does to you.”

Wikidata and Wikisource Showcase – 2nd August 2016

Engaging with the past, present & future with Wikipedia’s sister projects.

This is a nod to Simon Armitage’s poem, ‘It ain’t what you do, it’s what it does to you’, a hymn of praise to the experiential.

 

It ain’t what you do, it’s what it does to you

 

I have not bummed across America
with only a dollar to spare, one pair
of busted Levi’s and a bowie knife.
I have lived with thieves in Manchester.

I have not padded through the Taj Mahal,
barefoot, listening to the space between
each footfall picking up and putting down
its print against the marble floor. But I

skimmed flat stones across Black Moss on a day
so still I could hear each set of ripples
as they crossed. I felt each stone’s inertia
spend itself against the water; then sink.

I have not toyed with a parachute cord
while perched on the lip of a light-aircraft;
but I held the wobbly head of a boy
at the day centre, and stroked his fat hands.

And I guess that the tightness in the throat
and the tiny cascading sensation
somewhere inside us are both part of that
sense of something else. That feeling, I mean.

In addition, I am reminded of another poem on the power of the experiential:

The famous aviation poem written in 1941 by 19-year-old Pilot Officer John Gillespie Magee Jr, three months before he was killed.

Oh! I have slipped the surly bonds of earth,
And danced the skies on laughter-silvered wings;
Sunward I’ve climbed, and joined the tumbling mirth
Of sun-split clouds, –and done a hundred things
You have not dreamed of –Wheeled and soared and swung
High in the sunlit silence. Hov’ring there
I’ve chased the shouting wind along, and flung
My eager craft through footless halls of air…
Up, up the long, delirious, burning blue
I’ve topped the wind-swept heights with easy grace
Where never lark or even eagle flew —
And, while with silent lifting mind I’ve trod
The high untrespassed sanctity of space,
Put out my hand, and touched the face of God.

A little light Summer reading – Wikipedia & the PGCAP course

I was pleased we were able to host a week themed on ‘Wikimedia & Open Knowledge’ as part of the University of Edinburgh’s Postgraduate Certificate in Academic Practice.

Participants on the course were invited to think critically about the role of Wikipedia in academia.

In particular, they were invited to read, consider, contrast and discuss four articles:

  • The first, by Dr. Martin Poulter, Wikimedian in Residence at the University of Oxford, is highly recommended for the way it articulates the role of Wikipedia & its sister projects in allowing digital ‘shiver-inducing’ contact with library & archival material;
Search Failure: The Challenge of Modern Information Retrieval in an age of information explosion.

In addition – RECOMMENDED reading on Wikipedia’s role in academia.

 

  1. https://wikiedu.org/blog/2014/10/14/wikipedia-student-writing/ – HIGHLY RECOMMENDED
  2. https://outreach.wikimedia.org/wiki/Education/Reasons_to_use_Wikipedia
  3. http://www.theatlantic.com/technology/archive/2016/05/people-love-wikipedia/482268/
  4. https://medium.com/@oiioxford/wikipedia-s-ongoing-search-for-the-sum-of-all-human-knowledge-6216fb478bcf#.5gf0mu71b  RECOMMENDED
  5. https://wikiedu.org/blog/2016/01/14/wikipedia-15-and-education/
  6. https://www.refme.com/blog/2016/01/15/wikipedia-the-digital-gateway-to-academic-research

This was my response to the reading (and some additional reading).

Title:

Search failure: the challenges facing information retrieval in an age of information explosion.

 

Abstract:

This article takes, as its starting point, the news that Wikipedia were reportedly developing a ‘Knowledge Engine’ and focuses on the most dominant web search engine, Google, to examine the “consecrated status” (Hillis, Petit & Jarrett, 2013) it has achieved and its transparency, reliability & trustworthiness for everyday searchers.

 

Introduction:

The purpose of this article is to examine the pitfalls of modern information retrieval & attempts to circumnavigate them, with a focus on the main issues surrounding Google as the world’s most dominant search engine.

 

“Commercial search engines dominate search-engine use of the Internet, and they’re employing proprietary technologies to consolidate channels of access to the Internet’s knowledge and information.” (Cuthbertson, 2016)

 

On 16th February 2016, Newsweek published a story entitled ‘Wikipedia Takes on Google with New ‘Transparent’ Search Engine’. The figure applied for, and granted by the Knight Foundation, was a reported $250,000 as part of the Wikimedia Foundation’s $2.5 million programme to build ‘the Internet’s first transparent search engine’.

The sum applied for was relatively insignificant when compared to Google’s reported $75 billion revenue in 2015 (Robinson, 2016). Yet, it posed a significant question; a fundamental one. Just how transparent is Google?

 

Two further concerns can be identified from the letter to Wikimedia granting the application: “supporting stage one development of the Knowledge Engine by Wikipedia, a system for discovering reliable and trustworthy public information on the Internet.” (Cuthbertson, 2016). This goes to the heart of the current debate on modern information retrieval: transparency, reliability and trustworthiness. How then are we faring in these three measures?

 

  1. Defining Information Retrieval

Information Retrieval is defined as “a field concerned with the structure, analysis, organisation, storage, searching, and retrieval of information.” (Salton in Croft, Metzler & Strohman, 2010, p.1).

Croft et al (2010) identify three crucial concepts in information retrieval:

  • Relevance – Does the returned value satisfy the user searching for it?
  • Evaluation – Evaluating the ranking algorithm on its precision and recall.
  • Information Needs – What needs generated the query in the first place?
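Precision and recall, the two evaluation measures named above, are simple to compute once you know which results were relevant. The snippet below is only a minimal sketch of the standard textbook definitions, not anything taken from Croft et al: the function name `precision_recall` and the document identifiers are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = share of the retrieved results that are relevant
    recall    = share of the relevant documents that were retrieved
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical query: the engine returns four documents, and three
# documents in the whole collection are actually relevant.
retrieved = ["doc1", "doc2", "doc3", "doc4"]
relevant = ["doc2", "doc4", "doc7"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case half of what was returned is relevant (precision), while one relevant document was never returned at all (recall), which is exactly the kind of gap a searcher cannot see from the results page.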

Today, since the advent of the internet, this definition needs to be understood in terms of how pervasive ‘search’ has become. “Search is the way we now live.” (Darnton in Hillis, Petit & Jarrett, 2013, p.5). We are all now ‘searchers’ and the act of ‘searching’ (or ‘googling’) has become intrinsic to our daily lives.

 

  2. Dominance of one search engine

 

“When you turn on a tap you expect clean water to come out and when you do a search you expect good information to come out” (Swift in Hillis, Petit & Jarrett, 2013)

 

With over 60 trillion pages (Fichter and Wisniewski, 2014) and terabytes of unstructured data to navigate, the need for speedy & accurate responses to millions of queries has never been more important.

 

Navigating the vast sea of information present on the web means the field of Information Retrieval necessitates wrestling with, and constantly tweaking, the design of complex computer algorithms (determining a top 10 list of ‘relevant’ page results through over 200 factors).

 

Google, powered by its PageRank algorithm, has dominated I.R. since the late 1990s, indexing the web like a “back-of-the-book” index (Chowdhury, 2010, p.5). While this oversimplifies the complexity of the task, modern information retrieval, in searching through increasingly multimedia online resources, has necessitated the addition of newer, more sophisticated models. Utilising ‘artificial intelligence’ & semantic search technology to complement the PageRank algorithm, Google now navigates through the content of pages & generates suggested ‘answers’ to queries as well as the 10 clickable links users commonly expect.

 

According to 2011 figures in Hillis, Petit & Jarrett (2013), Google processed 91% of searches internationally and 97.4% of the searches made using mobile devices. This undoubted & sustained dominance has led to accusations of abuse of power in two recent instances.

 

Nicas & Kendall (2016) report that the Federal Trade Commission along with European regulators are examining claims that Google has been abusing its position in terms of smartphone companies feeling they had to give Google Services preferential treatment because of Android’s dominance.

 

In addition, Robinson (2016) states that the Authors Guild are petitioning the Supreme Court over Google’s alleged copyright infringement, going back a decade to when over 20 million library books were digitised without compensation or author/publisher permission. The argument is that the content taken has since been utilised by Google for commercial gain to generate more traffic, more advertising money and thus confer on them market leader status. This echoes the New Yorker article’s response to Google’s aspiration to build a digital universal library: “Such messianism cannot obscure the central truth about Google Book Search: it is a business” (Toobin in Hillis, Petit & Jarrett, 2013).

 

  3. PageRank

Google’s business is powered, like every search engine, by its ranking algorithm. For Cahill et al (2009), Google’s “PageRank is a quantitative rather than qualitative system”. PageRank works by ranking pages in terms of how well linked a page is, how often it is clicked on and the importance of the page(s) that link to it. In this way, PageRank assigns importance to a page.
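To make the link-based part of this concrete, here is a minimal sketch of the classic PageRank iteration as it is usually described in textbooks. It is emphatically not Google’s production algorithm: the function name `pagerank`, the damping factor of 0.85 and the four-page toy web are assumptions for illustration, and the sketch ignores click data and every other ranking signal mentioned above.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook-style PageRank on a small link graph.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its importance on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: three pages link to "home", nothing links to "orphan".
toy_web = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}

for page, score in sorted(pagerank(toy_web).items(), key=lambda item: -item[1]):
    print(f"{page:7s} {score:.3f}")
```

Even in this toy example the page that everything links to ends up on top and the page nobody links to ends up at the bottom, whatever the quality of its content: a quantitative rather than qualitative judgement.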

 

Other parameters are taken into consideration including, most notably, the anchor text which provides a short descriptive summary of the page it links to. However, the anchor text has been shown to be vulnerable to manipulation, primarily from bloggers, by the process known as ‘Google bombing’. Google bombing is defined as “the activity of designing Internet links that will bias search engine results so as to create an inaccurate impression of the search target” (Price in Bar-Ilan, 2007). Two famous examples include when Microsoft came as top result for the query ‘More evil than Satan’ and when President Bush ranked as first result for ‘miserable failure’. Bar-Ilan (2007) suggests google bombs come about for a variety of reasons: ‘fun’, ‘personal promotion’, ‘commercial’, ‘justice’, ‘ideological’ and ‘political’.

 

Although Google was reluctant to alter search results, the reputational damage google bombs were causing necessitated a response. In the end, Google altered the algorithm to defuse a number of google bombs. Despite this, “spam or joke sites still float their way to the top.” (Cahill et al, 2009) so there is a clear argument to be had about Google, as a private corporation, continuing to ‘tinker’ with the results delivered by its algorithm and how much its coders should, or should not, arbitrate access to the web in this way. After all, the algorithm will already bear hallmarks of their own assumptions without any transparency on how these decisions are arrived at. Further, Google Bombs, Byrne (2004) argues, empower those web users whom the ranking system, for whatever reason, has disenfranchised.

 

Just how reliable & trustworthy is Google?

 

“Easy, efficient, rapid and total access to Truth is the siren song of Google and the culture of search. The price of access: your monetizable information.” (Hillis, Petit & Jarrett, 2013, p.7)

For Cahill et al (2009), Google has made the process of searching too easy and searchers have become lazier as a result, accepting Google’s ranking at face value. Markland in van Dijck (2010) makes the point that students’ favouring of Google means they are dispensing with the services libraries provide. The implication is that, despite library information services delivering a more relevant & higher quality search result, Google’s quick & easy ‘fast food’ approach is hard to compete with.

This seemingly default trust in the neutrality of Google’s ranking algorithm also has a ‘funnelling effect’ according to Beel & Gipp (2009); narrowing the sources clicked upon 90% of the time to just the first page of results with a 42% click through on the first choice alone. This then creates a cosy consensus in terms of the fortunate pages clicked upon which will improve their ranking while “smaller, less affluent, alternative sites are doubly punished by ranking algorithms and lethargic searchers.” (Pan et al. in van Dijck, 2010)
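The ‘doubly punished’ dynamic can be illustrated with a toy feedback loop. This is a deliberately crude sketch under stated assumptions, not a model of any real search engine: only the 42% share for the first result comes from the figures above, while the other click shares, the starting scores, the page names and the function `simulate_click_feedback` are hypothetical.

```python
def simulate_click_feedback(initial_scores, click_share_by_rank, rounds):
    """Re-rank pages each round, then reward each page by its position.

    initial_scores: dict of page -> starting score
    click_share_by_rank: share of clicks received at rank 1, 2, 3, ...
    """
    scores = dict(initial_scores)
    for _ in range(rounds):
        # Pages are ordered by their current score, highest first.
        ranking = sorted(scores, key=lambda page: scores[page], reverse=True)
        for position, page in enumerate(ranking):
            if position < len(click_share_by_rank):
                # Clicks earned at this position feed back into the score.
                scores[page] += click_share_by_rank[position]
    return scores


# Four hypothetical pages that start out almost identical.
start = {"page_a": 1.00, "page_b": 0.99, "page_c": 0.98, "page_d": 0.97}
# 0.42 for the first result is the figure quoted above; the rest are assumed.
click_shares = [0.42, 0.25, 0.15, 0.08]

for page, score in simulate_click_feedback(start, click_shares, rounds=50).items():
    print(f"{page} {score:.2f}")
```

A starting difference of one hundredth of a point between the first two pages grows into a gap of several points, because the page that happens to be on top keeps collecting the largest share of clicks and so is never displaced.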

 

While Google would no doubt argue that all search engines closely guard how their ranking algorithms are calibrated to protect them from aggressive competition, click fraud and SEO marketing, the secrecy is clearly at odds with principles of public librarianship. Further, van Dijck (2010) argues that this worrying failure to disclose is concealing how knowledge is produced through Google’s network and the commercial nature of Google’s search engine. After all, a search engine’s greatest asset is the metadata each search leaves behind. This data can be aggregated and used by the search engine to create profiles of individual search behaviour and collective profiles which can then be passed on to other commercial companies for profit. That is not to say it always does but there is little legislation to stop it in an area that is largely unregulated. The right to privacy does not, it seems, extend to metadata and ‘in an era in which knowledge is the only bankable commodity, search engines own the exchange floor.’ (Halavais in van Dijck, 2010)

 

  4. Scholarly knowledge and the reliability of Google Scholar

When considering the reliability, transparency & trustworthiness of Google and Google Scholar it is pertinent to look at its scope and differences with other similar sites. Unlike PubMed and Web of Science, Google Scholar is not a human-curated database but is instead an internet search engine, so its accuracy & content vary greatly depending on what has been submitted to it. Google Scholar does have an advantage in that it searches the full text of articles, so users may find searching easier on Scholar compared to WoS or PubMed, which are limited to searching according to the abstract, citations or tags.

Where Google Scholar could be more transparent is in its coverage as some notable publishers have been known, according to van Dijck (2010), to refuse to give access to their databases. Scholar has also been criticised for the lack of completeness of its citations, as well as its coverage of social science and humanities databases; the latter an area of strength for Wikipedia according to Park (2011). But the searcher utilising Google Scholar would be unaware of these problems of scope when they came to use it.

Further, Beel & Gipp (2009) state that the ranking system on Google Scholar gives articles with lots of citations higher rankings and that, as a result, these articles go on to receive even more citations. Hence, while the digitization of sources on the internet opens up new avenues for scholarly exploration, ranking systems can be seen to close ranks on a select few to the exclusion of others.

As van Dijck (2010) points out: “Popularity in the Google-universe has everything to do with quantity and very little with quality or relevance.” In effect, ranking systems determine which sources we can see but conceal how this determination has come about. This means that we are unable to truly establish the scope & relevance of our search results. In this way, search engines cannot be viewed as neutral, passive instruments but are instead active “actor networks” and “co-producers of academic knowledge.” (van Dijck, 2010).

Further, it can be argued that Google decides which sites are included in its top ten results. With so much to gain commercially from being discoverable on Google’s first page of results, the practice of Search Engine Optimisation (SEO), or manipulating the algorithm to get your site in the top ten search results, has become widespread. SEO techniques can be split into ‘white hat’ (legitimate businesses with a relevant product to sell) and ‘black hat’ (sites that just want clicks and tend not to care about the ‘spamming’ techniques they employ to get them). As a result, PageRank has to be constantly manipulated, as with Google bombs, to counteract the effects of increasingly sophisticated ‘black hat’ techniques. Hence, the need for an improved vigilance & critical evaluation of the searches returned by Google has become a crucial skill in modern information retrieval.

 

  5. The solution: Google’s response to modern information retrieval – Answer Engines

Google is the great innovator and is always seeking newer, better ways of keeping users on its sites and improving its search algorithm. Hence, the arrival of Google Instant in 2010 to autofill suggested keywords to assist searchers. This was followed by Google’s Knowledge Graph (and its Microsoft equivalent Bing Snapshot). These new services seek not just to provide the top ten links to a search query but also to ‘answer’ it by providing a number of the most popular suggested answers on the page results screen (usually showing an excerpt of the related Wikipedia article & images along the side panel), based on, & learning from, previous users’ searches on that topic.

Google’s Knowledge Graph is supported by sources including Wikipedia & Freebase (and the linked data they provide) along with a further innovation, RankBrain, which utilises artificial intelligence to help decipher the 15% of queries Google has not seen before. As Barr (2016) recognises: “A.I. is becoming increasingly important to extract knowledge from Google’s sea of data, particularly when it comes to classifying and recognizing patterns in videos, images, speech and writing.”

Bing Snapshot does much the same. The difference is that Bing provides links to the sources it uses as part of the ‘answers’ it provides. Google provides information but does not attribute it. Without attribution, it is impossible to verify the accuracy of these answers. This seems to be one of the thorniest issues in modern information retrieval: link decay and the disappearing digital provenance of sources. This is in stark contrast to Wikimedia’s efforts in creating Wikidata: “an open-license machine-readable knowledge base” (Dewey, 2016) capable of storing digital provenance & structured bibliographic data. Therefore, while Google Knowledge Panels are a step forward, there are issues again over their transparency, reliability & trustworthiness.

Moreover, the 2014 EU Court ruling on ‘the right to be forgotten’, which Google have stated they will honour, also muddies the waters on issues of transparency & link decay/censorship:

“Accurate search results are vanishing in Europe with no public explanation, no real proof, no judicial review, and no appeals process… the result is an Internet riddled with memory holes – places where inconvenient information simply disappears.” (Fioretti, 2014).

The balance between an individual’s “right to be forgotten” and the freedom of information clearly still has to be found. At the moment, in the name of transparency, both Google and Wikimedia are posting notifications to affected pages that they have received such requests. For those wishing to be ‘forgotten’ this only highlights the matter & fuels speculation unnecessarily.

 

  6. The solution: Wikipedia’s ‘transparent’ search engine: Discovery

Since the setup of the ‘Discovery’ team in April 2015 and the disclosure of the Knight Foundation grant, there have been mixed noises from Wikimedia with some claiming that there was never any plan to rival Google because a newer ‘internal’ search engine was only ever being developed in order to integrate Wikimedia projects through one search portal.

Ultimately, a lack of consultation between the board and the wider Wikimedia community members reportedly undermined the project & culminated in the resignation of Lila Tretikov, Executive Director of the Wikimedia Foundation, at the end of February, after which the plans for Discovery were shelved.

However, Sentance (2016) reveals that, in their leaked planning documents for Discovery, the Foundation were indeed looking at the priorities of proprietary search engines, their own reliance on them for traffic and how they could recoup traffic lost to Google (through Google’s Knowledge Graph) at the same time as providing a central hub for information from across all their projects through one search portal. Wikipedia results, after all, regularly featured in the top page of Google results anyway – why not skip the middle man?

Quite how internet searchers may have taken to a completely transparent, non-commercial search engine we’ll possibly never know. However, it remains a tantalizing prospect.

 

  7. The solution: Alternative Engines

An awareness of the alternative search engines available for use and their different strengths and weaknesses is a key component of the information literacy needed to navigate this sea of information. Bing Snapshot, for instance, makes greater use of providing the digital provenance for its sources than Google at present.

Notess (2016) serves notice that computational searching (e.g. Wolfram Alpha) continues to flourish along with search engines geared towards data & statistics (e.g. Zanran, DataCite.org and Google Public Data Explorer).

However, knowing about the existence of these differing search engines is one thing but knowing how to successfully navigate them is quite another, as Notess (2016) himself concludes: “Finding anything beyond the most basic of statistics requires perseverance and experimenting with a variety of strategies.”

Information literacy, it seems, is key.

 

  8. The solution: The need for information literacy

Given that electronic library services are maintained by information professionals, “values such as quality assessment, weighed evaluation & transparency” (van Dijck, 2010) are in much greater evidence than in commercial search engines. That is not to say that there aren’t still issues in library OPAC systems: whether it be in terms of the changes in the classification system used over time or the differing levels of adherence by staff to these classification protocols; or the communication to users of best practice in utilising the system.

The use of any search engine requires literacy among the user group. The fundamental problem remains the disconnect between what a user inputs and what they can feasibly expect at the results stage. Understanding the nature of the search engine being used (proprietary or otherwise), a critical awareness of how knowledge is formed through its network, and the type of search statement that will maximise your chances of success are all vital. As van Dijck (2010) states: “Knowledge is not simply brokered (‘brought to you’) by Google or other search engines… Students and scholars need to grasp the implications of these mechanisms in order to understand thoroughly the extent of networked power.”

Educating users about this broadens the search landscape and defuses SEO attempts to circumvent our choices. Information literacy cannot be left to academics or information professionals alone, though they can play a large part in its dissemination. As mentioned at the beginning, we are all ‘searchers’. Therefore, it is incumbent on all of us to become literate in the ways of ‘search’ and pass it on, creating our own knowledge networks. Social media offers us a means of doing this, allowing us to filter information as never before, and filtering is “transforming how the web works and how we interact with our world.” (Swanson, 2012)

 

Conclusion

Google may never become any more transparent. Hence, its reliability & trustworthiness will always be hard to judge. Wikipedia’s Knowledge Engine may have offered a distinctive model more in line with these terms but it is unlikely, at least for now, to be able to compete as a global crawler search engine.

 

 

Therefore, it is incumbent on searchers not to presume neutrality or assign any kind of benign munificence to any one search engine. Rather, by educating themselves as to the merits & drawbacks of Google and other search engines, users will then be able to formulate their searches, and their use of search engines, with a degree of information literacy. Only then can they hope the returned results will match their individual needs with any degree of satisfaction or success.

Bibliography

  1. Arnold, A. (2007). Artificial intelligence: The dawn of a new search-engine era. Business Leader, 18(12), pp. 22.
  2. Bar‐Ilan, Judit (2007). “Manipulating search engine algorithms: the case of Google”. Journal of Information, Communication and Ethics in Society 5 (2/3): 155–166. doi:10.1108/14779960710837623. ISSN 1477-996X.
  3. Barr, A. (2016). WSJ.D Technology: Google Taps A.I. Chief To Replace Departing Search-Engine Head. Wall Street Journal. ISSN 00999660.
  4. Beel, J.; Gipp, B. (2009). “Google Scholar’s ranking algorithm: The impact of citation counts (An empirical study)”. 2009 Third International Conference on Research Challenges in Information Science: 439–446. doi:10.1109/RCIS.2009.5089308.
  5. Byrne, S. (2004). Stop worrying and learn to love the Google-bomb. Fibreculture, (3).
  6. Cahill, Kay; Chalut, Renee (2009). “Optimal Results: What Libraries Need to Know About Google and Search Engine Optimization”. The Reference Librarian 50 (3): 234–247. doi:10.1080/02763870902961969. ISSN 0276-3877.
  7. Chowdhury, G.G. (2010). Introduction to modern information retrieval. Facet. ISBN 9781856046947.
  8. Croft, W. Bruce; Metzler, Donald; Strohman, Trevor (2010). Search Engines: Information Retrieval in Practice. Pearson Education. ISBN 9780131364899.
  9. Cuthbertson, A. (2016). “Wikipedia takes on Google with new ‘transparent’ search engine”. Available at: http://europe.newsweek.com/wikipedia-takes-google-new-transparent-search-engine-427028. Retrieved 2016-05-08.
  10. Dewey, Caitlin (2016). “You probably haven’t even noticed Google’s sketchy quest to control the world’s knowledge”. The Washington Post. ISSN 0190-8286. Retrieved 2016-05-13.
  11. Fichter, D. and Wisniewski, J. (2014). Being Findable: Search Engine Optimization for Library Websites. Online Searcher, 38(5), pp. 74-76.
  12. Fioretti, Julia (2014). “Wikipedia fights back against Europe’s right to be forgotten”. Reuters. Retrieved 2016-05-02.
  13. Foster, Allen; Rafferty, Pauline (2011). Innovations in Information Retrieval: Perspectives for Theory and Practice. Facet. ISBN 9781856046978.
  14. Gunter, Barrie; Rowlands, Ian; Nicholas, David (2009). The Google Generation: Are ICT Innovations Changing Information-seeking Behaviour?. Chandos Publishing. ISBN 9781843345572.
  15. Halcoussis, Dennis; Halverson, Aniko; Lowenberg, Anton D.; Lowenberg, Susan (2002). “An Empirical Analysis of Web Catalog User Experiences”. Information Technology and Libraries 21 (4). ISSN 0730-9295.
  16. Hillis, Ken; Petit, Michael; Jarrett, Kylie (2012). Google and the Culture of Search. Routledge. ISBN 9781136933066.
  17. Hoffman, A.J. (2016). Reflections: Academia’s Emerging Crisis of Relevance and the Consequent Role of the Engaged Scholar. Journal of Change Management, 16(2), pp. 77.
  18. Kendall, Susan. “LibGuides: PubMed, Web of Science, or Google Scholar? A behind-the-scenes guide for life scientists: So which is better: PubMed, Web of Science, or Google Scholar?”. libguides.lib.msu.edu. Retrieved 2016-05-02.
  19. Koehler, W.C. (1999). “Classifying Web sites and Web pages: the use of metrics and URL characteristics as markers”. Journal of Librarianship and Information Science 31 (1): 21–31. doi:10.1177/0961000994244336. ISSN 0000-0000.
  20. LaFrance, Adrienne (2016). “The Internet’s Favorite Website”. The Atlantic. Retrieved 2016-05-12.
  21. Lecher, Colin (2016). “Google will apply the ‘right to be forgotten’ to all EU searches next week”. The Verge. Retrieved 2016-04-29.
  22. Mendez-Wilson, D (2000). ‘Humanizing The Online Experience’, Wireless Week, 6, 47, p. 30, Business Source Premier, EBSCOhost, viewed 1 May 2016.
  23. Milne, David N.; Witten, Ian H.; Nichols, David M. (2007). “A Knowledge-based Search Engine Powered by Wikipedia”. Proceedings of the Sixteenth ACM Conference on Conference on Information and Knowledge Management. CIKM ’07 (New York, NY, USA: ACM): 445–454. doi:10.1145/1321440.1321504. ISBN 9781595938039.
  24. Moran, Wes & Tretikov, Lila (2016). “Clarity on the future of Wikimedia search – Wikimedia blog”. Retrieved 2016-05-10.
  25. Nicas, J. and Kendall, B. (2016). “U.S. Expands Google Probe”. Wall Street Journal. ISSN 00999660.
  26. Notess, G.R., (2013). Search Engine to Knowledge Engine? Online Searcher, 37(4), pp. 61-63.
  27. Notess, G.R. (2016). SEARCH ENGINE update. Online Searcher, 40(2), pp. 8-9.
  28. Notess, G.R., (2016). SEARCH ENGINE update. Online Searcher, 40(1), pp. 8-9.
  29. Notess, G.R., (2014). Computational, Numeric, and Data Searching. Online Searcher, 38(4), pp. 65-67.
  30. Park, Taemin Kim (2011). “The visibility of Wikipedia in scholarly publications”. First Monday 16 (8). doi:10.5210/fm.v16i8.3492. ISSN 1396-0466.
  31. Price, Gary (2016). “Digital Preservation Coalition Releases New Tech Watch Report on Preserving Social Media | LJ INFOdocket”. www.infodocket.com. Retrieved 2016-05-01.
  32. Ratcliff, Chris (2016). “Six of the most interesting SEM news stories of the week | Search Engine Watch”. Retrieved 2016-05-10.
  33. Robinson, R. (2016) How Google Stole the Work of Millions of Authors. Wall Street Journal. ISSN 00999660.
  34. Rowley, J. E.; Hartley, Richard J. (2008). Organizing Knowledge: An Introduction to Managing Access to Information. Ashgate Publishing, Ltd. ISBN 9780754644316.
  35. Sandhu, A. K.; Liu, T. (2014). “Wikipedia search engine: Interactive information retrieval interface design”. 2014 3rd International Conference on User Science and Engineering (i-USEr): 18–23. doi:10.1109/IUSER.2014.7002670
  36. Sentance, R. (2016). “Everything you need to know about Wikimedia’s ‘Knowledge Engine’ so far | Search Engine Watch”. Retrieved 2016-05-02.
  37. Simonite, Tom (2013). “The Decline of Wikipedia”. MIT Technology Review. Retrieved 2016-05-09.
  38. Swanson, Troy (2012). Managing Social Media in Libraries: Finding Collaboration, Coordination, and Focus. Elsevier. ISBN 9781780633770.
  39. Van Dijck, José (2010). “Search engines and the production of academic knowledge”. International Journal of Cultural Studies 13 (6): 574–592. doi:10.1177/1367877910376582. ISSN 1367-8779.
  40. Wells, David (2007). “What is a library OPAC?”. The Electronic Library 25 (4): 386–394. doi:10.1108/02640470710779790. ISSN 0264-0473.

 

Bibliographic databases utilised

 

Teaching with Wikipedia – how to get started (an Edinburgh University case study)

Editing Wikipedia is now much more straightforward thanks to the new Visual Editor interface, which makes it as easy as using Microsoft Word. Students can be taught how to edit in approximately 60 minutes and thereafter can research and write, with academic rigour, brand new Wikipedia articles.

The video interview provided by the University of Edinburgh’s Dr. Chris Harlow illustrates the Wikipedia research session he ran in September 2015.

Dr. Chris Harlow – Reproductive Biology (University of Edinburgh)

A practical example of engaging with Wikipedia in teaching and learning – watch Dr. Chris Harlow speak about his recent experience of introducing his 3rd year Honours students to researching & writing a Wikipedia article.

Teaching with Wikipedia – Dr. Chris Harlow (Reproductive Biology research session)

Duration: (7:09)
User: Ewan McAndrew – Added: 03/06/16

YouTube URL: http://www.youtube.com/watch?v=qIHlOWxepoc

Some additional resources & recent examples of approaches to teaching with Wikipedia are detailed here:

1.    Teaching with Wikipedia (University of Edinburgh examples)

2.    How to use Wikipedia as a teaching tool (PDF)

3.    Wikipedia Education Program – Case Studies: How universities are teaching with Wikipedia (PDF)

If you would like to know more about how Wikipedia fits in with academia then these recent articles make very compelling reading:

1.    Wikipedia 15 and education

2.    Wikipedia the digital gateway to academic research

The project page for the residency with details on upcoming events is located here: Wikipedia: University of Edinburgh and the latest Wikipedia training session (30th June 2016) is available to book here: bit.ly/1UdQ4f6

Further video tutorials can be found on the Wikimedian in Residence YouTube channel here.

Further examples of Teaching with Wikipedia include:

Making use of Wikipedia’s new Content Translation tool – University College, London.

  1. UCL’s Wikipedia ‘Translate-a-thon’ is written up here: https://www.ucl.ac.uk/teaching-learning/case-studies-news/e-learning/teaching-translation-wikipedia
  2. In addition, UCL ran a Wikipedia session to familiarise Year 1 undergraduates with using sources, making good use of the Wiki Education project dashboard, which allows educators to manage & monitor class Wikipedia assignments & communicate with students from a central hub: https://prezi.com/apxnjcabgtdd/when-ucl-students-write-wikipedia.
  3. The same presentation also covers how Wikipedia work complements UCL’s educational strategic aims.

Telling the stories of rural England with Wikipedia – The University of Portsmouth.
Dr Humphrey Southall, Reader in Geography at the University of Portsmouth, writing with Dr Martin Poulter, describes a Wikipedia-based assignment given to first-year students in Applied Human Geography and looks at how academics can inform the widest public about their subject and raise awareness of the reliable sources used in research.

 

In addition – Wiki Education resources

Wiki Education has a variety of materials which may be helpful. 

COMING SOON: Edinburgh Gothic editathon: Sat 12th November

Robert Louis Stevenson     By Count Girolamo Nerli (Italian, 1863 – 1926) [Public domain], via Wikimedia Commons

 

On Saturday 12th November 2016, the University’s Information Services team are partnering with the National Library of Scotland to run a Wikipedia edit-a-thon to celebrate Robert Louis Stevenson Day 2016. Full Wikipedia editing training will be given in the morning before a break for lunch. Thereafter the afternoon’s editathon will focus on improving the quality of articles about all things gothic.

Working together with liaison librarians, archivists & academic colleagues, we will provide training on how to edit and participate in an open knowledge community. Participants will be supported to develop articles in areas which could stand to be improved: gothic art, gothic architecture, gothic literature, gothic film, gothic music, gothic history etc.

We also invite participants from around the world with an interest in all things Gothic to join in & contribute remotely, whether by supplying ideas for our hitlist of Wikipedia articles to create/improve prior to the event, by editing remotely during the event, or even by arranging your own simultaneous editathon events.

Details to follow, but save the date and come along to learn about how Wikipedia works and contribute to a greater understanding of Gothic history!

The event page is here.

COMING SOON: Day of the Dead editathon – 31st October 2016

Day of the Dead Wikipedia editathon – 31st October 2016

Dario Taraborelli, head of research at Wikimedia, passed a link to http://passingon.natematias.com/ to my colleague, Melissa Highton, Assistant Vice Principal of Online Learning and Director of Learning, Teaching & Web Services at the University of Edinburgh, as the University ran two Wikipedia editathons last year on ‘Women in Science & Scottish History’ and ‘Ada Lovelace Day – celebrating Women in STEM’.

We are endeavouring to keep the momentum going this year and have already run events on Women in Art for International Women’s Day and Women in Espionage for ‘Spy Week 2016’. All of these events are mentioned on my project page.

https://en.wikipedia.org/wiki/Wikipedia:University_of_Edinburgh

So the plan is to run an event, or a series of events, in the Autumn which would make use of the brilliant application which scrapes data from the annotated corpus of 25 years’ worth of New York Times articles to help identify missing Wikipedia articles about notable women; utilising these obituary records to help celebrate the lives of those recently passed on and changing Wikipedia’s representation of notable females in the process.

The tricky part will be whether the application could incorporate Scotland/UK-based news obituaries, e.g. scraped from the Scotsman or the Guardian newspaper.

Some investigating to be done…

 

COMING SOON: Ada Lovelace Day – 11th October 2016

Ada Lovelace
Alfred Edward Chalon [Public domain], via Wikimedia Commons
This is just a gentle reminder that Ada Lovelace Day 2016 will be coming up on Tuesday 11th October 2016 and we will be looking to reconvene a working group to prepare for an Ada Lovelace day of events; incorporating a Wikipedia editathon celebrating the achievements of women in science, technology, engineering and maths (STEM).

 

http://findingada.com/

I am currently looking for expressions of interest in being involved in this event once more, and for suggestions of Wikipedia pages related to Women in STEM that we should look to create or improve.

 

NB: The focus might shift a little this year to female mentors, given that Mary Somerville is to grace the £10 note, with an extra focus on women in maths too.

 

If you know of someone who would like to be involved then please feel free to forward on the event details and let them know I’d love to hear from them.

 

https://en.wikipedia.org/wiki/Wikipedia:University_of_Edinburgh/Events_and_Workshops/Ada_Lovelace_Day_2016

I’ve created the Wikipedia event page accordingly so that we can populate it over the next few months with some notable women in STEM.

 

Other projects are in development too. If you would like to be involved in them then email me.

https://en.wikipedia.org/wiki/Wikipedia:University_of_Edinburgh#Projects_in_Development

 

Whisky (and Projects) Galore!

The residency so far

As the dust settled after the hectic days of Spy Week 2016, OER16 came to a close, and the university exam period came and went, I was left thinking… what’s next?

Projects in development (from the University of Edinburgh Wikimedia residency page)

  • History of Veterinary Medicine edit-a-thon – Event for Royal (Dick) School of Veterinary Studies staff to research & create articles relating to the history of veterinary medicine. 4th July 2016
  • Euro Stem Cell Editathon at Centre for Regenerative Medicine, Edinburgh. Editathon for UoE staff and Eurostemcell partner labs in Europe & at the Wellcome Library.
  • Wikidata (& Wikisource) Showcase (with Pauline Ward & Histropedia’s Navino Evans) at the John McIntyre Conference Centre (JMCC) – 1st & 2nd August 2016
  • Reproductive Medicine Edit-a-thon (with Dr. Chris Harlow) – 21 September and 28 September. Partnering with West Virginia University.
  • Vet School Wikipedia research session – Edit-a-thon event for Royal (Dick) School of Veterinary Studies students to research & create new Wikipedia articles on Veterinary Medicine. Proposed for October 5th 2016.
  • International Alumni project – Celebrating the international students who studied at Edinburgh University and have gone on to have a huge impact abroad (including simultaneous editathons, hopefully, in Singapore & Hong Kong to create a global edit-a-thon). Mooted for early October 2016 for Black History Month.
  • Ada Lovelace Day – Tuesday 11th October 2016 – celebrating the achievements of Women in STEM with a particular focus on female mentors given that Mary Somerville will grace the new £10 note. Truly noteworthy.
  • Day of the Dead editathon – Monday 31st October 2016 – using the obituaries from Scottish & UK newspapers to recognise & celebrate the lives of those sadly passed away.
  • Edinburgh Gothic (agreed a partnership with the National Library of Scotland) – Saturday 12th November. Marking the day before Robert Louis Stevenson Day, the National Library of Scotland will join us to celebrate the best of Edinburgh Gothic, releasing Robert Louis Stevenson images into the public domain to Wikicommons (wherever possible) and any additional material not yet transcribed onto Wikisource. Looking to see if we can combine efforts in gothic art, gothic history, gothic costume design, gothic music, gothic film, gothic literature etc. to fill any gaps on Wikipedia… in the most macabre way.
  • The Kelvin Hall relaunch (in Glasgow) – mooted for late November / early December 2016 (again in collaboration with the National Library of Scotland). The idea is to create an edit-a-thon based on the Moving Image Archive by showing participants short films from the archive on the Video Wall there, creating Wikipedia articles for the films & filmmakers, and showing a longer film afterwards at the Hunterian cinema.
  • Translate-a-thon – Reaching out to bilingual and multi-lingual students to translate articles from English Wikipedia to their own native language Wikipedia (& vice versa) using Wikipedia’s new Content Translation tool.
  • Festival of Architecture 2016 – An architecture-themed editathon to celebrate the achievements of architects.
Whisky Galore

And the whisky? It seems my less than subtle hints following my trip to Skye in April resulted in my getting a fair few bottles for my birthday.

Projects and whisky galore. Lots to be excited about and lots to get on with!