The reading for this week took us on a grand tour of the last
decade’s thinking about how radically changing technology influences scholarly
communication, as well as a short explanation of how search engines work (hint:
it’s not magic or wizards; it just seems that way).
We’ll begin with the Paepcke, Garcia-Molina, and Wesley
piece, cleverly titled “Dewey Meets Turing.” They sketch a brief history of the uneasy
relationship between librarians and computer scientists in developing what we
now all take for granted: the digital library.
Apparently the librarians were frustrated that the computer scientists
weren't as thorough in organizing the digitized content (what about
preservation, after all?!), and the computer scientists saw the librarians as
stodgy traditionalists who slowed down development with their endless, boring
organization. While this low-level
eye-rolling was happening, the World Wide Web blew the roof off of everyone’s
plans. Instead of crafting beautiful but
closed digital systems for libraries, everyone quickly realized that the public
nature of the Web was the future, and the future was messy. At the time this
article was written (2003), Open Access wasn't as prominent an idea as it is
today, yet OA speaks directly to the concerns the authors raise. In fact, I imagine it was concerns like these
(voiced in a kind of high-pitched “what do we do what do we do what do we do?!”
mentality) that drove the growth of OA technologies and mindsets. My favorite point from this article is that
change comes very slowly in the LIS field, driven by librarians “spending years
arguing over structures”. Get it
together, everyone, or the train will leave without us.
Still more than a decade in the past, though more
progressive in its thinking, the ARL laid out an argument for digital
repositories that has mostly come to fruition here in the second decade of the
21st century. Institutional
repositories are necessary for long-term digital preservation of scholarly
material. The digital migration is a
healthy and empowering movement, but preservation at the institutional level is
necessary for knowledge maintenance.
Moreover, building a strong institutional repository can reflect
strongly on the institution’s prestige; it’s something to be proud of. This paper presages the development of green
Open Access: a digital
repository at the institutional level that collects, organizes, preserves, and
distributes scholarly material beyond just an article accepted and
published in a journal. Instead, it allows
access to a greater body of work, such as data sets, algorithms, theses and
dissertations, and other knowledge objects outside the traditional purview of
peer-review, organized in such a way as to enable new forms of discovery and
connection in a networked environment.
The article warns against absolutely requiring
scholars to self-archive their material, although this seems to be a painless
and productive practice where it happens today. “Open Access is gonna be great, you guys!”
seems to be the theme of the article.
Moving on to the Hawking article about the structure of web
search engines: he describes the marching orders of
web crawlers (“bots” designed to fetch and index web content. Like…all of it. Or most of it, tip of the hat
to the black and white hats): be fast, be polite, fetch only what the queue
tells you to, avoid multiple copies of the same material at different URLs,
never stop, and stay strong against spam. Algorithms then index this content and make it
all searchable, no mean feat, as the amount of information on the available Web
is mind-bendingly huge. Indexing
algorithms create cross-searchable tables mapping terms to the documents that
contain them, and then rank results with respect to popularity (how many times a thing’s been clicked). Really slick algorithms that seem to infer
meaning (done through skipping, early termination, “clever assignment of
document numbers”, and caching) get famous, like Google’s. It’s fast, simple, and flexible.
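To make that indexing step concrete, here's a toy sketch (my own illustration, not Hawking's actual implementation, and wildly simpler than anything Google runs): an inverted index maps each term to the set of documents containing it, which is what lets a query touch only a few table entries instead of scanning the whole mind-bendingly huge Web.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of doc IDs whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return doc IDs containing every query term (a simple AND search)."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = index.get(terms[0], set()).copy()
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

# A three-document "Web" for demonstration.
docs = {
    1: "dewey meets turing",
    2: "turing machines and search",
    3: "open access for the people",
}
index = build_inverted_index(docs)
print(search(index, "turing"))         # docs 1 and 2
print(search(index, "turing search"))  # doc 2 only
```

Real engines layer ranking, skipping, and caching on top of this basic structure, but the term-to-documents table is the heart of it.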
The final article was about the Open Archives Initiative
Protocol for Metadata Harvesting (OAI-PMH), a protocol that is much touted in the other
articles, as well. It allows for interdisciplinary,
interoperable searching of diverse types of content that find themselves
suddenly close together in a digital world.
Though a wide variety of organizational systems previously existed
across disciplines, the exacting use of XML, Dublin Core, and other
useful metadata structures makes digital scholarly content interoperable. The OAI protocol gives different institutional repositories
a way to communicate with one another to create larger collections freely
accessible to anyone with an internet connection. In addition, as is so important with regard
to Open Access, metadata must be in place to track the provenance of a
knowledge object. Knowledge for the people. Right on, OAI Protocol for Metadata
Harvesting, right on. Of course, this
article came from 2005, a simpler time. As
we approach XML and metadata schemes in this course, it seems to me that these
protocols don’t simplify anything; instead, they manage to keep things organized
until they change. Again. Which isn't a
bad thing, of course, and is in fact baseline necessary. The tone in 2005, however, was one
of simplification. Moving toward a
controlled and universal vocabulary for organizing and providing Open Access is
more of a pipe dream; the best we can manage so far is pointing toward a
language, and then using it. We've come
a long way since 2005, but still no wizards. Dang it.
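To show what the harvesting actually looks like under the hood, here's a minimal sketch. The repository URL is made up, and the XML is a hand-written fragment in the shape of a real OAI-PMH ListRecords response carrying a Dublin Core record; the verb and metadataPrefix parameters, and the namespaces, are the real ones from the protocol.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# An OAI-PMH request is just an HTTP GET with a few query parameters.
# BASE_URL is a hypothetical example endpoint, not a real repository.
BASE_URL = "https://repository.example.edu/oai"
params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
request_url = f"{BASE_URL}?{urlencode(params)}"

# A hand-written fragment shaped like an OAI-PMH response containing
# one Dublin Core record (trimmed for readability).
SAMPLE_RESPONSE = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Dewey Meets Turing</dc:title>
          <dc:creator>Paepcke, Andreas</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

DC_TITLE = "{http://purl.org/dc/elements/1.1/}title"

def harvest_titles(xml_text):
    """Pull every dc:title out of an OAI-PMH ListRecords response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter(DC_TITLE)]

print(request_url)
print(harvest_titles(SAMPLE_RESPONSE))  # ['Dewey Meets Turing']
```

That's the whole trick: because every repository answers the same six verbs with the same XML shapes, a harvester can aggregate metadata from hundreds of repositories without knowing anything about how each one organizes itself internally.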