[OAI-eprints] From ROAR to DOAR

Stevan Harnad harnad at ecs.soton.ac.uk
Sat Jan 28 08:02:20 EST 2006


On Fri, 27 Jan 2006, Hélène Bosc wrote:

> Can you explain to me what is behind OpenDOAR?

I'll reply in English to your question of what is behind OpenDOAR,
so I can post the reply more widely:

> Evidently [it repeats] the work already done at Southampton...

It is true that -- so far -- DOAR is mostly just re-doing, funded, what Tim
had already done, unfunded (with ROAR). DOAR so far covers about 3/5 of
the archives in ROAR and 1/2 the number in OAIster, and does not yet measure
or provide a way to display the time-course of their growth in contents
or number, as ROAR does. (DOAR will need Tim's Celestial to do that.)

    http://www.opendoar.org/
    http://archives.eprints.org/
    http://oaister.umdl.umich.edu/o/oaister/
    http://celestial.eprints.org/
   
However, DOAR does provide an OAI Base URL in what looks (to my eyes: DOAR
does not yet give tallies) to be a much larger proportion of archives than
ROAR does (c. 80%), and this is presumably because DOAR has directly
contacted each archive for which the OAI Base URL was missing.

(This is something I had asked Tim to do, but it is perhaps too much
to expect from an unfunded doctoral student, primarily working on his
thesis! The solution of course is for archives to expose their own
OAI Base URLs for harvesters to pick up automatically, and this will
of course be the ultimate outcome. For now, there is no Registry that
all archives use or aspire to be covered by. If DOAR incorporates all
of the useful features of ROAR (especially celestial), and adds value,
it may succeed in becoming that Registry. So far, ROAR's periodic calls
to Archives to register have had insufficient success. Most of ROAR's
new archives for the past year or more have been hand-imported by me and
Tim! At least DOAR will be funded to do that thankless task, from now on!)
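
To illustrate what "exposing an OAI Base URL" means for a harvester, here
is a minimal sketch in Python. It assumes only what the OAI-PMH 2.0
protocol defines: an Identify verb whose response carries repositoryName
and baseURL elements. The function names, the sample response and the
archive.example.org address are hypothetical, and the sketch parses a
canned response rather than contacting any real archive.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def identify_url(base_url):
    """Build the OAI-PMH Identify request for a candidate base URL."""
    return base_url + "?" + urlencode({"verb": "Identify"})


def parse_identify(xml_text):
    """Return (repositoryName, baseURL) from an Identify response."""
    root = ET.fromstring(xml_text)
    ident = root.find("oai:Identify", OAI_NS)
    if ident is None:
        raise ValueError("not an OAI-PMH Identify response")
    name = ident.findtext("oai:repositoryName", default="", namespaces=OAI_NS)
    base = ident.findtext("oai:baseURL", default="", namespaces=OAI_NS)
    return name.strip(), base.strip()


# Hypothetical response, trimmed to the elements used above.
SAMPLE_IDENTIFY = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2006-01-28T13:02:20Z</responseDate>
  <request verb="Identify">http://archive.example.org/perl/oai2</request>
  <Identify>
    <repositoryName>Example Institutional Archive</repositoryName>
    <baseURL>http://archive.example.org/perl/oai2</baseURL>
    <protocolVersion>2.0</protocolVersion>
  </Identify>
</OAI-PMH>"""

if __name__ == "__main__":
    print(identify_url("http://archive.example.org/perl/oai2"))
    print(parse_identify(SAMPLE_IDENTIFY))
```

A registry could store the baseURL that the archive itself reports, so
that a hand-collected value only needs to be right once.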

The second potentially useful feature of DOAR is that it seems to classify
the different content types separately and (I think -- I'm not sure) to
have checked that those are all full-texts (rather than just
bibliographic metadata; DOAR will have to make this more explicit in
its documentation).

If so, then DOAR can potentially provide size and growth-rate charts by
content types (preprints, postprints, theses, etc.), though there is
no way to do this (or boolean combinations) in DOAR yet. (The Eprints
software already tags and exposes content types as well as whether or
not each entry is a full-text; I expect that the other archive software
packages will soon follow suit. Then it's up to the archives to provide and
expose those metadata, so the harvesters can pick them up and tally them.)
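
As a rough sketch of how a harvester might tally content types once
archives expose them, the Python below counts the values of the Dublin
Core dc:type element in an OAI-PMH ListRecords response. That archives
put their content type in dc:type, and the particular labels used
("preprint", "thesis"), are assumptions made for this example; the
sample records are invented and no real archive is queried.

```python
import xml.etree.ElementTree as ET
from collections import Counter

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def tally_types(list_records_xml):
    """Count records per dc:type value (case-insensitive)."""
    root = ET.fromstring(list_records_xml)
    counts = Counter()
    for record in root.iterfind(".//oai:record", NS):
        # A set, so a record repeating the same type is counted once.
        types = {
            t.text.strip().lower()
            for t in record.iterfind(".//dc:type", NS)
            if t.text and t.text.strip()
        }
        counts.update(types)
    return counts


def _record(dc_type):
    """Build one hypothetical oai_dc record with the given type."""
    return (
        "<record><metadata>"
        '<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"'
        ' xmlns:dc="http://purl.org/dc/elements/1.1/">'
        "<dc:type>" + dc_type + "</dc:type>"
        "</oai_dc:dc></metadata></record>"
    )


# Invented sample: two preprints and one thesis.
SAMPLE_RECORDS = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
    + _record("Preprint") + _record("Thesis") + _record("preprint")
    + "</ListRecords></OAI-PMH>"
)

if __name__ == "__main__":
    for content_type, n in sorted(tally_types(SAMPLE_RECORDS).items()):
        print(content_type, n)
```

Repeating such a tally at intervals would give the growth-by-content-type
figures discussed above, provided the archives label their records
consistently.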

Right now, the DOAR entry for an archive looks a lot like a library
card catalogue entry for a journal or a book (perhaps by analogy with
DOAJ) or even a collection. 

    http://www.doaj.org/

This does not quite make sense to me, since users do not consult or use
individual online institutional archives as they do for individual books
or journals or collections. For one thing, most of the archives will be
university IRs. Most universities produce contents of all of the types
listed, and in all of the subjects listed; and rarely will any user want
all/only, say, articles on subject X from individual institution Y: They
will instead use an OAI harvester and service-provider like OAIster or
Citebase or CiteSeer or even Google Scholar, that searches across all
institutions on that subject, or even all subjects.

Hence the only likely use for those type and subject classifications
is either (1) for an automatic OAI harvester, using them to mediate
in harvesting the archives' metadata directly or (2) for individuals
interested in gathering summary statistics on individual archive
offerings. (And again, the optimal and most likely outcome is that the
archives themselves will expose these metadata to be picked up directly
by harvesters, rather than having to be mediated by a middle-service,
hand-gathering and checking any missing data.)

So there are still functionality issues to be thought through if DOAR is
to provide a useful service. But I expect these things will be resolved,
and that DOAR will build on ROAR something that provides genuine value
to the OA community and the research community in general, helping to
hasten the day of 100% OA.

Stevan Harnad





