I’ve been toying with WordStat™ software from Provalis Research again. It is very useful for the kind of qualitative analysis required in domain analysis. One valuable tool in the content analysis package is a KWIC index. Ancient students of KO will recognize that acronym for “Keyword-in-Context,” a kind of indexing once thought potentially fruitful. Here is an example including three “contexts” for the word “model” from ISKO 13’s proceedings.
                                                                  | model | of information retrieval systems
A reference ontology for biomedical informatics: the Foundational | model |
Towards a Comprehensive                                           | model | of the Cognitive Process and the Mechanisms of Individual Sensemaking
As you see, it is very useful for comprehending the precise context of those big words that show up in the center of word clouds or the foreground of MDS plots.
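For readers curious about the mechanics, a KWIC index is easy to sketch: for each occurrence of the keyword, emit the title with its left and right context aligned around the word. Here is a minimal sketch in Python; the sample titles and column width are invented for illustration, and this is not WordStat's actual algorithm:

```python
# Minimal KWIC (Keyword-in-Context) sketch: align each occurrence of a
# keyword with its left and right context. Hypothetical illustration only.
def kwic(titles, keyword, width=40):
    """Return one aligned context line per occurrence of keyword."""
    lines = []
    for title in titles:
        words = title.split()
        for i, w in enumerate(words):
            if w.lower().strip(".,:;") == keyword.lower():
                left = " ".join(words[:i])
                right = " ".join(words[i + 1:])
                lines.append(f"{left[-width:]:>{width}} | {w} | {right[:width]}")
    return lines

# Invented sample titles, not the ISKO 13 data.
titles = [
    "Towards a Comprehensive Model of the Cognitive Process",
    "A Model of Information Retrieval Systems",
]
for line in kwic(titles, "model"):
    print(line)
```

Real concordancing tools add sorting by right or left context, but the core alignment is no more than this.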
However, the interesting thing I’ve just learned is that most occurrences of the term “information science” in our domain come not from the keywords of research papers, but rather from the title of the third most-cited journal in our domain, JASIST (forgive me for not spelling it out here, and for using that term again). Thus it is not that the term is a topic of critical interest; rather, it is that as much as 20% of our research appears in a competing journal.
If our science is going to continue to thrive and grow, our authors need to stop sending their research to competing journals. Better a world in which our journal Knowledge Organization has to split into an A for ontology, a B for epistemology, a C for domain analysis, and so on, than one in which the dispersion of our science saps its explanatory power and weakens the scientific structure of our domain.
I famously wring my metaphorical hands about the number of authors who submit manuscripts to Knowledge Organization reporting research that is topically relevant, but showing absolutely no grounding in the theories or values of the science of KO. My emotions range from demoralization to fury on these occasions. Fortunately, rational academic policies dictate manuscript acceptance, and in almost all cases we return these errant papers to the authors with instructions to go do their homework. Some of them do, happily.
I am in the midst of a domain analysis of the 75 papers presented at the recent ISKO International Conference in Krakow (http://www.isko2014.confer.uj.edu.pl/en_GB/-start). The complete results of that analysis will appear in an editorial in a future issue of KO. But the interesting thing I am seeing this time is that there is, indeed, a core of knowledge organization. Seventy-five papers from 20 countries, with some 1,200 citations in all: over 400 journal articles, 300 books, and 200 anthologies. And yet most of the citations are to a tightly knit, intellectually coherent core of KO. By far the largest share of journal citations (44%) are to Knowledge Organization, the majority of conference papers cited appeared in ISKO international or regional chapter conferences, and the most-cited monographs are by Hjørland and Ranganathan.
It is good news that there is such a strong, resilient, and theoretically useful core of knowledge organization. The challenge, it seems, is to require those interloping in our topical areas to encounter our theoretical base.
I recently completed a rich analysis of the entirety of American Documentation in order to trace the evolution of the concept of a concept across that era of the growth of the emerging field of information science. I wrote a short paper on the subject for CAIS 2014 (available here: http://www.cais-acsi.ca/ojs/index.php/cais/issue/current).
The “abstract” is this: A core entity of information science is the “concept.” Agreement on its basic definition as a mental construct representing a concrete instance conceals divergence in understanding of the nuances. A case study of the domain’s nascent era, represented by American Documentation, reveals some of the contours of the term’s evolution.
There were lots of fun things to be encountered in those years of AD, and I was going to upload some photos of things like the rapid selector and Termatrex and so on, until I went to do so and found all of those “further reproduction prohibited” notices. Oh well. The whole run is available to ASIST members in the ASIST Digital Library.
I thought it was fascinating to see how interwoven knowledge organization was in those early days of documentation’s evolution into information science. There was a lengthy evolution of something called “the duality concept,” which was an expression of the dichotomies between known-fact searching and browsing, between simple and complex terminology, and thus between isolate and hierarchy.
Stay tuned: a lengthy journal article is forthcoming.
It is appalling how many manuscripts we receive for review for Knowledge Organization that are about things like ontologies, taxonomies, and domain analyses, and yet cite absolutely no literature from the domain of knowledge organization.
Usually my first intuitive reaction is to think the authors simply were negligent in submitting their siloed papers to us without checking that our journal is published by a scientific society that might expect its own science to be used. Sometimes I have a second intuitive reaction that the authors are so siloed they do not even know that domains other than their own exist and have their own literatures. I suppose both of these are true to some extent.
Lately I have come to see that there is increasingly no connection (no synthesis, no syndesis, not even any syncopation) in the evolution of theory. I think this has something to do with the habit of researchers of conducting so-called literature reviews online using Google Scholar, or worse, Google alone, never bothering even to go to the many multidisciplinary indexing services available online through most research libraries. (This ought to be demonstrable empirically; perhaps one could take a random sample of published articles and actually search for relevant literature. Never mind that this is the responsibility of peer reviewers!) Internet resources usually provide something quasi-relevant (remember Patrick Wilson’s excoriation, in Two Kinds of Power, that relevance often means “satisfactory”?), enough to fill out the tiny tweet-like excuses for paragraphs most people manage to type these days. But this is no proper approach to science.
Theory requires connection, and connection requires sequence in human thought. To make sense of an empirical observation, all of the available science that can be brought to bear must be connected to it. To move that empirical observation forward as a hypothesis, or to move the hypothesis forward as a theory, requires that observations be classified cumulatively. It all requires “syn”: synthesis, syndesis, syncopation.
If either of the people reading this blog is considering contributing to the science of knowledge organization, let them hie at once to the ISKO website and use the powerful new KO literature search tool: http://www.isko.org/lit.html. While they’re at it, let’s urge them to go to the ISKO members’ portal at Ergon-Verlag (http://www.ergon-verlag.de/isko_ko/), where they now can find KO from 1993 to the present and AIKO from 2006 to the present (and soon will find the entire back file).
Accuracy in all aspects of scholarship is critical. It seems increasingly to me, as a journal editor, that authors are taking less care with citations than ever before. It’s a bit like what we hear about pilots getting lax because they know their planes have autopilot—authors no longer make extensive files of source publications because they can view an abstract online with a couple of clicks and use one or another citation service to get automatic citations. One problem for another time is how this seems to lead to ritual citation. But more to the point of this post, it leads to errant citations, if the author is pasting from a citation service (or worse, from another paper whose author pasted it, etc., etc.) rather than keying a citation from a source document. Of course, the story I’m about to tell might just not have anything to do with any of this; I’ve no way of knowing how this happened.
When we prepare an issue of Knowledge Organization for publication we do several things that involve cross-checking for accuracy. One of them is verifying all of the citations in the text and the accompanying references in the reference list. Sometimes, despite having three different people working on this (as a cross-check, of course), something will slip through the cracks and we’ll find ourselves at the eleventh hour having to hold up production because a mystery develops. This one had to do with a citation. The issue was ready for press when we realized nobody had answered the question of what this abbreviated citation really was for:
Ranganathan, S. R. 1967. Areas for research in library and information science (development of library science. 6). Library science 4: 235-93.
One question was immediately obvious: why was there something like a series statement in the title portion of a journal-article citation? I asked my colleagues to verify the citation and was told nothing like it could be found anywhere. We all tried looking it up in various ways. It seemed very curious that we could not find this citation online (but then again, 1967 was eons ago in digital journal time). It also was not possible to locate any journal with exactly the title Library Science from this period.
I decided to search the catalog of the library at the University of Illinois at Urbana-Champaign. I used to work there years ago and I knew the collection was nearly exhaustive in information science. Also, UIUC is relatively nearby, so it would be possible to actually go there or send someone (or beg someone there) to look at the source if necessary. What I found in their online catalog was a journal called Library Science With a Slant to Documentation, published in India by SRELS (Sarada Ranganathan Endowment for Library Science) beginning in 1964 and ending in 1999, all of which seemed promising. However, I could not find a digitized copy of this journal anywhere by searching online. Volume 4 was dated 1967, but there was no explanation for the odd series statement, and there was no way to find a table of contents for the journal online. (I thought briefly of those halcyon days when long tables full of bound periodical indexes were at my fingertips, with citations stretching back more than a century; and the closed stacks of bound volumes were just through that little door over there ….)
I decided to turn to our ISKO colleagues by placing a notice on ISKO-L. Within a few hours I had several responses from around the world, acknowledging that we had found the correct title, and apparently the citation had employed a formerly standard title abbreviation. Paper copies of the journal were located. And even more oddly, European colleagues were able to find the digitized article online using Google. Now, why couldn’t we do that from the U.S.? I also heard from others in the U.S. who couldn’t find it online! How bizarre!
The next mystery to arise concerned the phrase “library and information science,” because several people pointed out that Ranganathan would not have used that expression. Eventually a copy of the article was received from Kothi Raghavan; I’ll reproduce the first page here:
Sure enough, there is a series statement in parentheses within the title, and the title does not say “and information science” and the journal title is Library Science With a Slant to Documentation.
The upshot is that there were at least three inaccuracies in the original citation, so it was a good thing we chased it down rather than creating a bibliographic ghost by publishing it in erroneous form. But it also was a lesson in the pitfalls of relying too heavily on digitized sources alone. As I tell my doctoral students, who inevitably groan and refuse to believe me, a scholar has to look at the actual sources to verify their accuracy.
The mystery was resolved and the correct citation appeared in Knowledge Organization. Thanks to Kathryn La Barre, Gerhard Riesthuis, Thomas Dousa, Vivien Petras, Joe Tennis, F.J. Devadason and Kothi Raghavan for helping resolve this little mystery.
And remember, apparently, caveat emptor applies to citations.
Classification interaction is empirically demonstrated, and I’m thrilled about that. For the “Big Data” workshop at SIG/CR I proposed a preliminary survey research project in which a sample of the nine million UDC numbers in WorldCat would be used to match deconstructed components of the UDC expressions to content-designated components of the respective bibliographic records. The purpose was to learn about the interrelationship between a faceted classification and the artifacts it represents. All of the variables (except the age of the work) were nominal-level, so I used chi-squared tests to look for statistically significant associations. It was thrilling to find significant associations all through the study. Results (and definitions of all of these terms!) are in the paper “Big Classification: Using the Empirical Power of Classification Interaction” in the 2013 SIG/CR Proceedings (or will be). The outcome is preliminary but exciting nonetheless.
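For the curious, the kind of test involved can be sketched in a few lines of Python. This is a hypothetical illustration, not the study's actual data: the variable names, the 2x2 table, and the counts are all invented. The chi-squared statistic of independence for an r x c contingency table is computed from scratch so the arithmetic is visible:

```python
# Chi-squared test of independence for nominal variables, computed by hand.
# Hypothetical question: does the presence of a UDC place facet co-occur
# with a geographic heading in the bibliographic record?
def chi_squared(table):
    """Return (statistic, degrees of freedom) for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Invented 2x2 table: rows = UDC place facet present/absent,
# columns = geographic heading present/absent.
observed = [[30, 10],
            [10, 30]]
stat, dof = chi_squared(observed)
# Critical value for df=1 at alpha=0.05 is 3.841; stat = 20.0 here,
# so this (invented) association would be statistically significant.
```

In practice one would also want the p-value (e.g., from scipy.stats) and a check that expected cell counts are large enough for the test to be valid.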
But just when I thought it couldn’t get any better I took one more look at the largest results table and realized it was revealing a network among the correlations. I was therefore doubly thrilled (with some coaching from Laura Ridenour) to be able to create a visualization of that network structure using Gephi 0.8.2. Here is an early version (not the one that appears in the paper):
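For anyone wanting to reproduce such a visualization, Gephi can build a network directly from a spreadsheet of weighted edges. A hypothetical sketch follows; the facet and field names and the chi-squared weights are invented, but the Source/Target/Weight column headers are the ones Gephi's spreadsheet importer recognizes:

```python
# Write significant pairwise associations as a CSV edge list suitable for
# Gephi (File > Import Spreadsheet). All names and weights are invented.
import csv
import io

significant_pairs = [
    ("UDC place facet", "geographic heading", 20.0),
    ("UDC form facet", "genre heading", 12.4),
    ("UDC time facet", "chronological subdivision", 8.7),
]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["Source", "Target", "Weight"])  # headers Gephi expects
for source, target, chi2 in significant_pairs:
    writer.writerow([source, target, chi2])
edge_list = buf.getvalue()  # save to a .csv file and open it in Gephi
```

From there, Gephi's layout algorithms (ForceAtlas and kin) and its edge-weight styling do the rest.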