No subject

Mon Oct 23 03:33:13 EDT 2006

> A major study of librarian purchasing preferences has shown that
> librarians will show a strong inclination towards the acquisition [sic] 
> of Open Access (OA) materials as they discover that more and more
> learned material has become available in institutional repositories.

(1) OA materials are not "acquired" (and it is both misleading and
absurd to cast either the questions or the responses in an acquisitions
context). Non-OA products are acquired, and the availability of OA
versions of them might or might not induce *cancellation* in favour of
other non-OA products under various circumstances (that are not even 
touched upon by this study or its methodology).

Why would the model assume arbitrary differential rates of OA growth
among journals rather than roughly uniform growth across all journals in
each field (apart from random fluctuations)? And if there were systematic
differential OA growth within a field, wouldn't librarians' decisions
depend very much on the field, and on which journal contents happen to
become OA faster, rather than on any general predictions generated from
this theoretical model?

(2) Nothing whatsoever was determined about what happens as more and
more OA becomes available all round, nor about how availability would
be ascertained, nor at what rate OA would grow and be ascertained. There
were merely static questions about 3 hypothetical competing "products,"
some stipulated to be PP% OA within MM months.

> Overall the survey shows that a significant number of librarians
> are likely to substitute OA materials for subscribed resources,
> given certain levels of reliability, peer review and currency of
> the information available. This last factor is a critical one --
> resources become much less favoured if they are embargoed for a
> significant length of time.

The survey shows nothing whatsoever about libraries substituting OA
material for anything, because free self-archived content is not
something a *subscriber* institution (library) provides (by buying it
in) but something an *author* institution provides, via its IR, by
self-archiving it.

If the questions had been forthrightly put as pertaining to cancellation
decisions under various hypothetical conditions, then at least we would
have had librarians' speculations about what they think they would
cancel under those hypothetical conditions. But instead we have
inferences from a model based on least- and most-preferred "product"
options having little or no bearing on any question other than the
librarians' preferences for the hypothetical properties: They prefer
journals with lower prices, whose content is higher quality, more
reliable, more immediate, peer-reviewed, and preferably 100% of it.
(Librarians don't much care whether the peer-reviewed article is the
author's final draft or the publisher's PDF, as long as it's
peer-reviewed: That *is* a genuine finding of this study!)

There is no way at all to interpolate or extrapolate from data like
these to draw valid or even coherent conclusions about self-archiving
and cancellations, with or without a "conjoint analysis" model.

> One of the key benefits of the conjoint analysis approach used in
> this survey was the removal of bias by not referring, when testing
> different product configurations, to any named incarnations of
> content types, including subscription journals, licensed full-text
> (or aggregated) databases, or articles on OA repositories. 

This "bias" was eliminated at the cost of making it a questionnaire
about *acquisitions* among a variety of competing "products" when it
should have been a questionnaire about *cancellations* under a variety
of hypothetical OA conditions (many of them unascertainable, hence
moot).

> The survey tested librarians' preferences for a series of hypothetical
> and unnamed products frequently showing unfamiliar combinations of
> attributes -- such as a fully priced journal embargoed for 24 months,
> or content at 25% of the price but through an unreliable service. By
> taking this approach, the survey measured librarians' preferences for
> an abstract set of potential products thus avoiding any pre-conceived
> preferences for named products, such as journals, licensed full-
> text (aggregated) databases or content on OA repositories.

Indeed. But OA is not an alternative product for acquisition: it is a
property that might or might not induce cancellation in favour of *other*
products under certain hypothetical (and presumably competitive)
conditions.

> The data were abstracted into a "Share of Preference" model (or
> simulator) which has then been used to model real-life products and
> thus create predictions for librarians' real-life preferences for
> these products. It is therefore possible to go beyond the comparisons,
> in this work, of journals versus OA and to model other preferences,
> such as between OA and licensed full-text databases.

The "Share of Preference model" might be viable when the preference really
concerns competing products for acquisition, with a variety of rival
properties, but it fails completely when applied to free non-products,
not for acquisition at all, but treated as if they were just another
among the rival properties of products competing for acquisition.

We could have said a priori that librarians (like all consumers) will
prefer a higher quality product over a lower quality product, 100% of a
product over 60% of a product, an immediate product over a delayed
product, a lower-priced product over a higher-priced product. A "Share
of Preference model" could give some rough rank orders for those various
combinations. 

It seems natural to add to such a "Share of Preference model" that
*consumers* will prefer a free product over a priced product, except
that we are talking here about acquisitions librarians, who do not
"acquire" free products but merely buy or cancel priced journals. This
study simply does not and cannot indicate under what OA conditions they
will cancel what for what.

The following (mild) conclusions are the only ones that can be drawn:

>   There is a strong preference for content that has undergone
>   peer review.

Yes, and librarians don't much care whether the peer-reviewed content is
the publisher's PDF version or the author's final version -- except that
the publisher's PDF is for sale and the author's final draft is not! Nor
does the model tell us under what conditions, if both versions are
available for a journal X, librarians would cancel the publisher's PDF
(and in favour of what journal Y?). The question is never even raised.
That's the question the study was designed to answer, but the method
could not answer it. The survey might as well have asked the librarians
directly, for X/Y pairs of hypothetical or actual journals -- rather
than A/B/C triplets of hypothetical "products" -- banal questions such as:

    "If 100% of X were immediately available for free online and Y
    were not, and your users needed X and Y equally, and you could not
    afford both, and you currently subscribed to X and not to Y, would
    you cancel X for Y?"

I suspect that the survey resorted to "Share of Preference" modelling
because -- in the absence of any actual evidence of self-archiving
causing cancellations -- a survey on hypothetical cancellations of
journal X in favour of journal Y (or no journal at all) under various
%OA and months-delay conditions would not have been very convincing or
informative. But I'm afraid the outcome is even less convincing.

> How soon content is made available is a key determinant of content
> model preference in librarian's acquisition behaviour; delay in
> availability reduces the attractiveness of a product offering.

Yes, immediate access is preferable to delayed access. And, no doubt,
if/when librarians are ever inclined to cancel a journal X because PP%
of its articles are freely available, they are more likely to do so if
that PP% is immediately available than if it is only available 24 months
after publication. But we could have guessed that without this study.

The question is: Under what circumstances are librarians going to cancel
what, when?  This study does not and cannot tell us. Relative preference
models can only tell us that they are more likely to do it under these
conditions than under those conditions (and we already knew all that).

Having said all this, it is important to state clearly that, although
there is still no evidence at all of self-archiving causing
cancellations, it is possible, indeed probable, that self-archiving will
cause some cancellations, eventually. No one knows (1) how soon it will
cause cancellations, nor (2) how many cancellations it will cause. That
all depends on (a) how much demand there still is for the print edition,
(b) how much for the journal's online edition at that time, (c) how long
that demand lasts, and (d) how quickly self-archiving grows and
approaches 100%. (Perhaps someone should do a survey on people's
predictions about those factors!)

But regardless of any of this -- and regardless also of the validity or
invalidity of the present survey -- the possibility or probability of
cancellation pressure is most definitely *not* the basis on which the
research community should decide whether or not to self-archive and
whether or not to mandate self-archiving. That decision must be based
entirely on the benefits of OA self-archiving for research access,
impact, productivity and progress -- definitely not on the possibility
of revenue losses for publishers.

We do well to remind ourselves that these questions are not primarily
about what is or is not good for the publishing industry. They are about
what is and is not good for research, researchers, their institutions,
their funders, and the tax-paying public that funds the funders.
Research is supported and conducted and peer-reviewed and published for
the sake of research progress and applications, not in order to support
the publishing industry, or to protect it from risk.

And what is certain is that peer-reviewed research publishing can and
will successfully adapt to Open Access: How can it fail to do so, when
it is researchers who conduct the research, write the articles, perform
the peer review, read, use, apply and cite the research, and, now,
provide online access to it as well? Publishers are performing a
valuable service (in implementing the peer review and in providing a
paper and online edition) but it is publishing that must adapt to what
is best for research in the online age, definitely not research that
must adapt to what is best for publishing. And publishing can and will
adapt.

    http://www.publications.parliament.uk/pa/cm200304/cmselect/cmsctech/399/399we152.htm

    Berners-Lee, T., De Roure, D., Harnad, S. and Shadbolt, N. (2005)
    Journal publishing and author self-archiving: Peaceful Co-Existence
    and Fruitful Collaboration.  http://eprints.ecs.soton.ac.uk/11160/

I might add that Dr. Alma Swan is not the super-ennuated (sic) Proustian
personage repeatedly cited in this PRC survey, but the cygnine author
of a number of landmark surveys, one of them reporting the only existing
evidence -- negative -- for a causal connection between OA self-archiving
and cancellations.

    Swan, A. (2005) Open access self-archiving: An Introduction.
    JISC Technical Report. http://eprints.ecs.soton.ac.uk/11006/

Stevan Harnad
AMERICAN SCIENTIST OPEN ACCESS FORUM:
http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html