Sunday 6 October 2013

Could we go beyond Turnitin & anti-plagiarism software?

March 2008 version of TC through the Internet Archive
On 10 October 2013, I will participate in THATcamp Leadership at the RRCHNM, the Roy Rosenzweig Center for History and New Media at George Mason University, Fairfax, Virginia. THATcamp, The Humanities and Technology Camp, was conceived at George Mason in 2008 and soon became international.

THATcamps were held in Paris in 2010 and 2013, in Florence at the European University Institute in 2011, and in Lausanne and Luxembourg/Trier in 2012, and many other THATcamps have taken place in Europe and on other continents.

THATcamp Florence 2011



Participating in THATcamp allows you to engage in Digital Humanities activities in informal ways, which is why it is called an unconference. According to the CHNM's definition, "an unconference is a highly informal conference. Two differences are particularly notable. First, at an unconference, the program isn’t set beforehand: it’s created on the first day with the help of all the participants rather than beforehand by a program committee. Second, at an unconference, there are no presentations — all participants in an unconference are expected to talk and work with fellow participants in every session." During THATCamps "humanists and technologists of all skill levels learn and build together in sessions proposed on the spot".

I had been wondering for some time what to propose to THATcamp Leadership, so I decided to post a session proposal on the GMU website based on my upcoming duties for the History Department at the European University Institute, Florence, Italy.

The EUI Dean of Studies and the Academic Service decided to introduce the systematic use of anti-plagiarism software. The aim is for individual Ph.D. researchers to check the various chapters and drafts of their dissertation during the four-year research and writing process and verify the originality of the contents. They want to avoid having researchers shamed and expelled from the community of scholars, like this student in Norway.

Turnitin was chosen, and new administrative rules were introduced on how to use it. Now scholars on both sides of the Ph.D. writing process, the one who writes the thesis and the one who supervises it, are involved with digital tools. This has never happened before. At the EUI, this check, which used to be performed by the staff of the Dean of Studies and the Academic Service, must now be performed directly by the thesis supervisor before the department decides to officially accept that a candidate may submit a thesis for discussion with the jury. So, at the end of the process, when the thesis is submitted, each supervisor should run this plagiarism check directly on the manuscript of his/her supervisee. This task, and the instruments available to perform it, are evidence of the worldwide shift towards the digital. It is taken for granted that everything we write exists somewhere in virtual space and can be retrieved and analyzed to avoid using someone else's ideas without acknowledgement. This is an extraordinary shift in the humanities towards “other” humanities. In a way, it introduces a bit of digital humanities for everybody!

Introductory courses on plagiarism, originality checking, good academic practice and, finally, Turnitin itself were organized for the first time this academic year, 2013-2014, for all new doctoral researchers.
As History Information Specialist, I was asked to contribute both to the general discussion about plagiarism and to guidance on the correct way to use quotations in one's own research and writing. As far as the History Department is concerned, I am helping its members (researchers, fellows and professors) understand how to proceed with the software. I will teach some Atelier Multimédia courses about it. But it is not this specific contribution in the EUI context that I would like to question.

I would like to have the input of the participants, if my session proposal is selected of course, and to bring to the attention of THATcamp Leadership the many queries and reflections on the use of such software that complicated, at least for me, a “simple” task: showing how to use Turnitin. This task became more complicated than I thought. I started to think beyond plagiarism and to look at what an “originality check” means in a new digital scholarly process in the Humanities and History. What could we all do with Turnitin? And given that all EUI scholars will have to use it, what should I tell those who have never used any software before?



So my proposal to THATcamp Leadership is to look at this software (and similar software) from a different viewpoint. When teaching how to use plagiarism software, is it possible to let our community of humanists and social scientists integrate "text-mining", one of the most important methods that has enriched document retrieval and document analysis in the field of Digital Humanities? Here are some possible issues to discuss during THATcamp:
  • Turnitin is anti-plagiarism software. Are there any other programs you would recommend, and why? Anything in the OA/OS world?
  • Do you use such software only for originality checking and fighting plagiarism?
  • Which other tasks could it perform? Does it allow us to learn more, and more easily, about the contents of the deep web? And if so, how and why?
  • How could we trace the originality of translated texts, from English into other languages and vice versa, using corpora in different languages?
  • Could we think of using Turnitin to understand who is quoting what and in which contexts, as well as the many other ways we interact with big online commercial textual databases like EEBO, ECCO, MOMW I & II, etc., or with open access web databases like Rousseau online?
  • To what extent would these textual databases, accessed through Turnitin, allow contextualized keyword searching, similarity searching, frequency searching, etc., so that we can understand whether a quotation we plan to use has already been used, entirely or partially, in other writings, and how, where and by whom?
  • Could we perform with Turnitin a much more complex citation search than the one we have been able to perform for years with the Web of Knowledge (ISI), where, looking at the footnotes in a scholarly paper, we deduce that if somebody uses the same quotations, he/she may be researching in the same field and have similar ideas?
  • Which text-mining activities does this software allow, if we accept that Turnitin is a good Digital Humanities tool, able to perform some of the most important tasks on large amounts of data: distant/close reading, searching for contexts, tracing the origin of quotations, locating words in millions of documents?
  • And, as a consequence, could we discuss whether this is not only about plagiarism, and whether this kind of software may become a vector for introducing wider communities, not only the digital humanities community, to new ways of performing their research activities? Does it take care, in daily research activity and even without users knowing it, of some characteristics of both the linguistic turn and the digital turn, if we may use big concepts?
Turnitin seems to be an instrument that allows new digital experiments, unfortunately with some technical limitations. Our session in Virginia could try to look critically at the systematic introduction of these tools in universities worldwide: now that you know how to use it and what’s in it, which tasks do you think you could perform with such a tool? In what ways could this instrument become useful to you? And, maybe the most important question: in a global world where digital documents and primary sources aren’t all written in English, how could these experiments with digital texts take account of different cultural and linguistic frameworks?
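
To make the discussion more concrete, the "similarity searching" raised in the questions above can be illustrated with a minimal Python sketch. This is not a description of how Turnitin works internally. It only assumes a simple, well-known technique: comparing overlapping sequences of words ("shingles") between two texts. The function names `shingles`, `shared_passages` and `jaccard`, and the sample sentences, are hypothetical and chosen only for illustration.

```python
import re


def shingles(text, n=5):
    """Return the set of word n-grams ("shingles") found in a text.

    Words are lower-cased and punctuation is ignored. The \\w pattern
    matches Unicode letters, so non-English texts are tokenized too.
    """
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_passages(text_a, text_b, n=5):
    """Return the word sequences of length n that occur in both texts."""
    return shingles(text_a, n) & shingles(text_b, n)


def jaccard(text_a, text_b, n=5):
    """Return a similarity score between 0.0 and 1.0.

    The score is the number of shared shingles divided by the number
    of distinct shingles found in either text.
    """
    a, b = shingles(text_a, n), shingles(text_b, n)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical sample texts: a quotation and a draft that reuses it.
    source = "Man is born free, and everywhere he is in chains."
    draft = ("Rousseau famously argued that man is born free, and "
             "everywhere he is in chains, a claim still debated.")

    for passage in sorted(shared_passages(source, draft, n=4)):
        print(" ".join(passage))
    print(round(jaccard(source, draft, n=4), 2))
```

Such a comparison only finds word-for-word reuse within one language. It says nothing about paraphrase or about translated texts, which is precisely where the questions about different cultural and linguistic frameworks begin.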