[OAI-eprints] Cliff Lynch on Institutional Archives (fwd)

Sun, 16 Mar 2003 02:03:11 +0000 (GMT)

As Thomas Krichel has replied to this message on this list, I
am posting the message to which he has replied. I may reply
to his reply afterward. -- SH

---------- Forwarded message ----------
Date: Sat, 15 Mar 2003 15:30:10 +0000
From: Stevan Harnad <harnad@ecs.soton.ac.uk>
Reply-To: September 1998 American Scientist Forum
    <SEPTEMBER98-FORUM@LISTSERVER.SIGMAXI.ORG>
To: SEPTEMBER98-FORUM@LISTSERVER.SIGMAXI.ORG
Subject: Cliff Lynch on Institutional Archives

Quote/Comments on:

    Clifford A. Lynch: "Institutional Repositories:
    Essential Infrastructure for Scholarship in the Digital Age"
    http://www.arl.org/newsltr/226/ir.html

Cliff Lynch makes many very good points. I disagree with him only on one
point, but it is a fundamental one, with important practical and
strategic implications for the immediate future: What is the most pressing
reason for creating and filling institutional repositories at this
time? Cliff thinks it is to promote new forms of scholarship whereas
I think it is to promote refereed research. The new scholarship
is coming too, and will certainly grow in importance, but the immediate
rationale for creating and filling institutional repositories is for the
self-archiving of institutional research output, in order to maximize
its research impact, by maximizing user access to it, through open access:
http://www.soros.org/openaccess/

> faculty have been exploring ways in which works of authorship in the new
> digital medium can enhance teaching and learning and the communication
> of scholarship

This is the familiar and valid complaint that the university has not
been sufficiently supportive of online innovations by faculty, in terms
of either resourcing or rewarding them. This is true, and it is indeed
a problem, and no doubt slowing innovation. But it is
also being remedied, by increasing recognition and support, and the
persistence of innovative faculty. It is *not* the reason universities
need digital repositories urgently at this time, and this is *not* the
(main) content that will fill them.

> faculty have exploited the Net as a vehicle for sharing their ideas
> worldwide, whether these ideas are expressed in relatively familiar
> forms such as digital versions of traditional journal articles or (less
> commonly) in entirely new forms...

This is a combination of the two kinds of content that are at issue
here. I am putting the primary emphasis on the "familiar forms" rather
than the new ones (important and valuable though they too are). The
progress, productivity and funding of scholarly and scientific research
depend directly on its visibility and accessibility: the degree to which
it is found, seen, read, used, cited, applied, built-upon by other
researchers. In a word, it all depends on *research impact.* And research
impact depends on research access. Whatever blocks access blocks impact.

There are 20,000 peer-reviewed research journals, across all disciplines
worldwide, publishing 2,000,000 articles annually. Almost all of these
articles are accessible to researchers (i.e., to their potential users)
only if their institution can afford the toll-access (subscription,
license) to the journal in which they were published. And most
universities cannot afford toll-access to most journals -- even the
richest can only afford a minority of the 20,000. This means that *all*
research on the planet is inaccessible to *most* of its potential
users. And every single case of access-denial is a case of potential
impact loss. The overwhelming, pressing rationale for institutional
repositories is accordingly: to put an end to this daily impact loss --
a legacy of the paper era when the true costs of paper access made it
unavoidable, but no longer necessary in the online era, when open access
can be provided by institutions for their own refereed research output.

It is quite natural for researchers to self-archive their own refereed
research output in their own institutional archives, giving it away to
all of its would-be users worldwide for free, in order to maximize its
research impact, for they have been giving it away free to their
publishers for the very same reason throughout the paper era: Unlike all
other authors, researchers have always given away their work, written
only for impact, not for royalty revenue from toll-income. Hence it is
only natural that now that it has become possible to do so, they should
self-archive it in their own institutional archives so as to put an end
to the needless daily impact loss that is a legacy of the paper era.

This -- and not new forms of scholarship -- is the immediate, pressing
rationale for creating and filling institutional repositories at this
time. And this (refereed research output) is the content with which they
need to be filled, as soon as possible. With it -- and their newfound
role as *outgoing* collections of a university's own research output
instead of *incoming* collections of the output of other universities --
the institutional archives will also become the repositories for new
forms of scholarship. But the first and most urgent step is to put an
end to the needless daily impact loss for peer-reviewed research.

What about the peer-reviewed journals? Their toll-access mechanism of
cost-recovery may continue to co-exist with the open-access versions in
the institutional repositories, with those researchers whose institutions
can afford it using the former and those who cannot using the latter
-- or the journals may eventually have to cut costs and downsize to
the essentials in the online era, which may well prove to be just
peer-review service-provision alone, with the access, storage and
distribution offloaded onto the institutional repositories.

Peer review costs only about $500 per outgoing paper, whereas
those institutions that can afford it are paying an average of $2000
(collectively) per incoming paper in access-tolls -- in exchange for
the very limited access this provides, restricted to the minority who
can afford it.
http://www.nature.com/nature/debates/e-access/Articles/harnad.html#B1

> faculty are well motivated to rise above the institutional failures to
> help them disseminate their works

Indeed they are, in the service of maximizing their research impact and
putting an end to its needless loss. But maximizing research impact is
in the interest of their institutions too, as the benefits of research
impact (research funding, prizes, prestige) are shared by faculty and
their institutions.

Let me count the three most obvious ways that the self-archiving of
institutional research output benefits researchers' institutions:

(1) Open access to an institution's research output maximizes its
impact and its rewards, as noted.

(2) Open access, being reciprocal if practised by other institutions too,
maximizes faculty access to the research output of *other* institutions,
generating better-informed and more current research (using the research
output of others, as you would have them use yours!).

(3) If/when there is ever an eventual downsizing of peer-reviewed
journals to the remaining online-age essentials (probably only peer
review itself), then there is also the prospect of eventual institutional
windfall savings of up to 75% on serials budgets.
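
The "up to 75%" in (3) appears to follow from the two rough per-paper
estimates quoted earlier in this message ($500 for peer review, $2000 in
collective access-tolls). A minimal sketch of that arithmetic, assuming
those two figures and nothing else (the function and constant names are
illustrative only):

```python
# Illustrative arithmetic only. Both dollar figures are the rough
# per-paper estimates quoted in this message, not measured data.
PEER_REVIEW_COST_PER_PAPER = 500   # approx. peer-review cost per outgoing paper
TOLL_COST_PER_PAPER = 2000         # approx. collective tolls paid per incoming paper


def savings_fraction(toll_cost: float, essential_cost: float) -> float:
    """Fraction of current toll spending freed if only the essential cost remained."""
    return (toll_cost - essential_cost) / toll_cost


fraction = savings_fraction(TOLL_COST_PER_PAPER, PEER_REVIEW_COST_PER_PAPER)
print(f"Potential saving on serials spending: {fraction:.0%}")
```

This is an upper bound under the assumption that peer review is the only
cost that remains after downsizing.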

> a faculty member seeking... broader dissemination and availability of
> his or her traditional journal articles...faces several time-consuming
> problems...  [F]aculty time is being wasted, and expended ineffectively,
> on system administration activities and content curation.

Cliff here means the time-consuming problem of maintaining a website for
self-archiving one's own research output. An institutional archive
is certainly a more sensible solution than having each researcher
maintain his own archive.

> Institutional repositories can maintain data in addition to authored
> scholarly works. In this sense, the institutional repository is a
> complement and a supplement, rather than a substitute, for traditional
> scholarly publication venues.

Not only is the institutional archive a supplement rather than a
substitute when it self-archives data that could not be included with
the published article, but it is a supplement even when it self-archives
the article: The self-archived open-access version is a supplement to the
journal's toll-access version, to maximize its research impact. It is not
a substitute for journal publication -- and certainly not a substitute
for peer review -- though it might one day become a substitute for
toll-access (for those who can afford it: for those who cannot, it
is already a substitute today!).

> where the disciplinary practice is ready, institutional repositories can
> feed disciplinary repositories directly. In cases where the disciplinary
> culture is more conservative, where scholarly societies or key journals
> choose to hold back change, institutional repositories can help
> individual faculty take the lead in initiating shifts in disciplinary
> practice.

There is no need -- in the age of OAI-interoperability -- for
institutional archives to "feed" central disciplinary archives: They
need only feed OAI metadata harvesters. The institution is the natural
locus for self-archiving its own research output, for each of
its disciplines. And it is individual researchers, not disciplines,
who will overcome the old habits, with the incentive to self-archive
coming from the discipline-universal benefits of maximizing research
impact. These benefits are shared by researchers and their institutions,
not by researchers and their disciplines (which are more of a locus
for *competing* for impact than for *sharing* it!). And journals are not
holding back change (and cannot): They are themselves changing with the
new possibilities the online medium has provided to allow researchers to
maximize their research impact:
http://www.lboro.ac.uk/departments/ls/disresearch/romeo/Romeo%20Publisher%20Policies.htm
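
To make the point about feeding harvesters rather than central archives
concrete, here is a rough sketch of what a harvester does with each
institutional archive: it issues an OAI-PMH ListRecords request and reads
the Dublin Core metadata that comes back. The base URLs, the sample
response and the function names are hypothetical; only the verb, the
metadataPrefix parameter and the XML namespaces come from the OAI-PMH
2.0 specification. No network access is attempted.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical response, shaped like an OAI-PMH ListRecords reply.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:archive.example.edu:1</identifier>
        <datestamp>2003-03-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical refereed article</dc:title>
          <dc:creator>Researcher, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build the ListRecords request URL for one archive."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_records(response_xml: str) -> list:
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = record.findtext(f".//{{{DC_NS}}}title")
        pairs.append((identifier, title))
    return pairs


# A harvester simply repeats the same request for every archive it knows.
for base in ["http://archive.example.edu/oai", "http://eprints.example.ac.uk/oai"]:
    print(list_records_url(base))
print(extract_records(SAMPLE_RESPONSE))
```

Because every archive answers the same request in the same format, the
harvester needs no knowledge of which institution or discipline it is
talking to.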

But it is certainly true that university archives can help faculty take
the lead by providing the resources and policy that facilitate
self-archiving:
http://www.eprints.org/self-faq/#institution-facilitate-filling

> Institutional repositories can encourage the exploration and adoption of
> new forms of scholarly communication... This, to me, is perhaps the most
> important and exciting payoff

Here is where Cliff and I disagree. Exciting as they are, the new forms
are not the immediate priority: Open access to the "old forms" is. Then
the new forms will come too. But first the full research impact of the
old forms, at last. They will pave the way for the rest.

> The first potential danger is that institutional repositories are cast
> as tools of institutional (administrative) strategies to exercise
> control over what has typically been faculty controlled intellectual
> work. I believe that any institutional repository approach that requires
> deposit of faculty or student works and/or uses the institutional
> repository as a means of asserting control or ownership over these works
> will likely fail, and probably deserves to fail... This is not to say
> that policies mandating the deposit of materials that are broadly
> recognized as part of the institutional record ... are inappropriate.

I agree completely. The purpose of institutional archives and
archive-filling policies is not to assert control or ownership over
faculty research output! It is to maximize its research impact by
maximizing user access to it.
http://www.ecs.soton.ac.uk/~harnad/Temp/Ariadne-RAE.htm
http://paracite.eprints.org/cgi-bin/rae_front.cgi

Mixing up the open-access agenda with other university dreams about
generating new revenue streams from faculty intellectual output (software,
patents, courseware, distance education, electronic publishing) is not
only wrong-headed, but it risks delaying the real and sizeable benefits
of open access to refereed research output, turning the institutional
repository movement into aimless gridlock for some time to come.

> My second concern is... [that] administrators, librarians, and faculty
> members wishing to challenge existing systems of scholarly publishing
> (specifically their economic models and their creation of barriers to
> access through intellectual property control and licensing arrangements)
> may try to link their efforts too directly to institutional repositories
> by imposing inappropriate policy constraints

I agree. See above. And here is a model for an appropriate policy:
http://www.ecs.soton.ac.uk/~lac/archpol.html

> it dramatically underestimates the importance of institutional
> repositories to characterize them as instruments for restructuring the
> current economics of scholarly publishing

I agree again. It is not the business of universities to restructure the
economics of scholarly publishing. It is the business of universities to
do research, publish their findings, and make sure that those findings are
put to full use. Maximizing all would-be users' access to them is the
way to ensure the latter. And that might (but just might) eventually
have some effects on the economics of refereed journal publication. But
that would only be a side-effect, not the direct motivation or
justification at all: That direct motivation and justification is
to maximize the impact of institutional research output by making it
open-access -- by self-archiving it in the institutional repository.

> the institutional repository isn't a journal, or a collection of
> journals, and should not be managed like one. That's not the point or
> the purpose of an institutional repository.

Correct. It is an open-access supplement to toll-access via the journals.

> Institutional repositories are not a challenge or alternative to
> disciplinary repositories; rather, they complement them, just as they
> can complement existing venues of scholarly publication.

In the era of OAI, institutional and disciplinary archives are equivalent,
because they are completely interoperable. However, the shared interest of
researchers and their institutions in maximizing the impact of their
research output makes institutional archives a better bet for hastening
open access, especially as they are in a position to modify their
existing publish/perish policies so as to mandate self-archiving in
order to maximize research impact.
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0293.html

> It is desirable to make this as simple as possible... with a simple and
> stable submission interface to the institutional repository.

The simple solution is available already: See the 60+ Eprints.org
institutional archives http://software.eprints.org/#ep2
in use for over 2 years and growing:
http://www.ecs.soton.ac.uk/~harnad/Temp/tim.ppt

The challenging part is not in creating the free self-archiving software,
nor in making it simple, nor in getting it adopted, but in getting
the archives filled, which requires a clear, coherent institutional
self-archiving policy -- with a clear sense of *what* needs to be
self-archived, *how* and *why*:
http://www.ecs.soton.ac.uk/~lac/archpol.html

> It's vital that institutions recognize institutional repositories as a
> serious and long-lasting commitment to the campus community (and to the
> scholarly world, and the public at large)

Yes, but *far* more important than this advance long-lasting commitment
to an empty archive is a coherent policy for getting it filled!

> An institutional repository can fail over time for many reasons: policy
> (for example, the institution chooses to stop funding it), management
> failure or incompetence, or technical problems. Any of these failures
> can result in the disruption of access...I worry a great deal about what
> the various impacts and implications of the first few major failures of
> institutional repositories

And I worry a great deal about worries about the permanence of empty
or even non-existent archives, instead of directing all energies and
resourcefulness to filling the archives! Get the precious intellectual
eggs into the basket, and their very presence there will be the best
guarantor that they will be maintained in perpetuum. Worry instead
about permanence now and all you do is add another item to the
long list of needless worries that are holding back self-archiving:
http://www.eprints.org/self-faq/#1.Preservation

And this is also the point to remind ourselves, again, that
self-archiving is a *supplement* to, not a *substitute* for journal
publication. Until and unless there is a transition and downsizing
from toll-access journal publication to open-access journal publication,
the primary preservation burden is not on the institutional archives!
Their burden is merely to provide open access to that research, now, as a supplement
for those who cannot afford toll-access.

So stop worrying about archives failing and work instead on archives
filling!

> Not every higher education institution will need or want to run an
> institutional repository, though I think ultimately almost every such
> institution will want to offer some institutional repository services to
> its community. We will see various forms of consortial or cluster
> institutional repositories.

Maybe. But it seems to me that this is only a substantive question if we
are talking about the industrial strength archive software such as
DSpace. For "light" software such as Eprints, there is so little
start-up time and maintenance required that I would think any
institution that generated research output could and would run its own.
(Again, there is not enough *content* yet to talk about fancy consortial
schemes! Let's get the culture of self-archiving rolling before we worry
about the load being too great for an institution to manage on its own!)

> Federation of institutional repositories may also subsume the
> development of arrangements that recognize and facilitate faculty
> mobility and cross-institutional collaborations.

This can be managed at the metadata level without any special need to
"federate" (over and above OAI-interoperability). A metadata tag
indicating the current institution, plus tags indicating prior
institutions and dates, will allow all research to be triangulated upon
(for where it was done, and when).
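
As a rough illustration of such tagging, the sketch below records an
author's current and prior institutions with dates and looks up where a
given piece of work was done. The class names, field names and example
institutions are hypothetical; they are not part of OAI, Dublin Core or
the Eprints software.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Affiliation:
    institution: str
    start: date
    end: Optional[date] = None  # None marks the current institution


@dataclass
class AuthorRecord:
    name: str
    affiliations: List[Affiliation] = field(default_factory=list)

    def institution_on(self, when: date) -> Optional[str]:
        """Return the institution the author belonged to on a given date."""
        for aff in self.affiliations:
            if aff.start <= when and (aff.end is None or when <= aff.end):
                return aff.institution
        return None


# Hypothetical author who moved institutions at the end of 1999.
author = AuthorRecord(
    name="A. Researcher",
    affiliations=[
        Affiliation("University A", date(1995, 9, 1), date(1999, 12, 31)),
        Affiliation("University B", date(2000, 1, 1)),
    ],
)
print(author.institution_on(date(1998, 6, 1)))
print(author.institution_on(date(2003, 3, 15)))
```

A paper's publication date plus this kind of record is enough to credit
the right institution, whichever archive the paper happens to sit in.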

> The MIT [free repository] software is not the only option available,
> although I believe it is the most general-purpose; for example, there
> is [free repository] software from the University of Southampton in
> the U.K. <http://www.eprints.org/> designed more specifically for
> institutional or disciplinary repositories of papers, as opposed to
> arbitrary digital materials.

And I have here tried to give the reasons why the pressing challenge now
is not general-purpose archiving of arbitrary digital materials, but
the self-archiving of institutional refereed research output, to
maximize its research impact by maximizing its visibility and
accessibility, through open access.
http://www.ecs.soton.ac.uk/~harnad/Temp/unto-others.html

Stevan Harnad

-------------------------------------------------------------------
NOTE: A complete archive of the ongoing discussion of providing open
access to the peer-reviewed research literature online is available at
the American Scientist September Forum (98 & 99 & 00 & 01 & 02):

    http://amsci-forum.amsci.org/archives/september98-forum.html
                            or
    http://www.cogsci.soton.ac.uk/~harnad/Hypermail/Amsci/index.html

Discussion can be posted to: september98-forum@amsci-forum.amsci.org

See also the Budapest Open Access Initiative:
    http://www.soros.org/openaccess

the BOAI Forum:
    http://www.eprints.org/boaiforum.php/

the Free Online Scholarship Movement:
    http://www.earlham.edu/~peters/fos/timeline.htm

the SPARC position paper on institutional repositories:
    http://www.unites.uqam.ca/src/sante.htm

the OAI site:
    http://www.openarchives.org

and the free OAI institutional archiving software site:
    http://www.eprints.org/