[EAS] Scholarship and the Data Deluge

Peter J. Kindlmann pjk at design.eng.yale.edu
Fri Oct 31 18:15:53 EDT 2008

 From the current issue of TL INFOBITS 
<http://its.unc.edu/tl/infobits/bitoct08.php>, one of my favorite 
resource identifiers, comes this:



"Retrieving whole books, articles, and other documents is no longer 
sufficient for scholarly research. Faculty and students want to mine 
documents or other textual works--whether for molecules, materials, 
or mavens, depending on their field of study. . . . What is new in 
the digital environment? Information can be extracted in smaller 
units, mashed up, and recombined -- preferably with  sources. Faculty 
and students alike need assistance in learning how to think with 
these tools and services if they are to ask truly new questions with 

In "Supporting the 'Scholarship' in E-Scholarship" (EDUCAUSE REVIEW, 
vol. 43, no. 6, November/December 2008), Christine L. Borgman 
examines new forms of scholarly research -- data-intensive, 
distributed, collaborative, and multidisciplinary -- that are being 
enabled by the "data deluge," the vast amount and variety of digital 
materials available to researchers. These new forms of scholarship 
will have an impact not only on scholars, but also on academic 
libraries and campus information technology infrastructure.

You can read the article at http://connect.educause.edu/library/erm0863

EDUCAUSE Review [ISSN 1527-6619], a bimonthly print magazine that 
explores developments in information technology and education, is 
published by EDUCAUSE (http://www.educause.edu/). Articles from 
current and back issues of EDUCAUSE Review are available on the Web 
at http://www.educause.edu/pub/er/


It is a short and polite article, more polite than I typically am, 
and really more a reminder of issues of growing concern than a 
treatment of them. The references help with that.

The author does take issue with WIRED's wooly claim that "... 
science no longer needs theory, models, metadata, ontologies, or 'the 
scientific method': mining the data deluge replaces all of them."

Traditions of scholarship at all levels, undergraduate, graduate and 
professional, are rapidly dissolving. Undergraduates need to be given 
careful working definitions of plagiarism. There are no longer 
assured predispositions one can rely on. Filtering papers through 
plagiarism detection search engines is now routine. Attribution of 
ideas at the graduate and professional level is a challenge. Some 
might say it is not worth trying so hard, were it not for the lure of 
patenting, itself a broken process, and the honorable standing 
associated with the PhD degree.

Multiple authorship in collaborative research makes a hash of 
publication and citation counts, which always had more to do with 
accounting than with critical thinking and are rapidly becoming ever 
more meaningless.

Citation analysis was invented, by the way, as a tool for librarians. 
It was not initially expected that the frequency of an individual's 
citations would be used to judge the quality of that individual's 
work. Yale's Derek de Solla Price, one of the founders of 
the field of citation analysis, who died in 1983, long before the 
"mix-master" effect of the Internet, was strikingly qualified in its 

"It is absurd to give someone with 21 citations an edge over someone 
with 19, but that is simply a matter of calculating appropriate 
standard deviations and probable errors. We all hope for greater 
sophistication but the point is that citations do provide a 
reproducible and clearly useful measure of something. The big 
question is not whether it correlates with quality in some particular 
sense of that term but rather what sort of quality is being measured."

What sort of quality indeed. There are serious policy issues, the 
underpinnings of current scholarship, that need to be addressed. We 
have outgrown the age of comfortable traditions, which in their day 
weren't even all that comfortable. There isn't much at stake, only 
the future of the university.

