[OAI-eprints] Introducing the Subject Categorization
discussion
Jessie Hey
jmnh@ecs.soton.ac.uk
Mon, 20 Jan 2003 16:46:18 +0000
<html>
Another comment from Chris Gutteridge on Google and harvesting stemming
from our discussion.<br>
Jessie<br><br>
<blockquote type=cite class=cite cite>Date: Fri, 17 Jan 2003 15:31:14
+0000<br>
From: ePrints Support <eprints-support@ecs.soton.ac.uk><br>
To: EPrints Underground List
<eprints-underground@ecs.soton.ac.uk><br>
Subject: Re: [EP-underground] Re: Interoperability - subject
classification/terminology (fwd)<br>
User-Agent: Mutt/1.2.5.1i<br>
X-ECS-MailScanner: Found to be clean, Found to be clean<br>
Sender: owner-eprints-underground@ecs.soton.ac.uk<br>
Reply-To: EPrints Underground List
<eprints-underground@ecs.soton.ac.uk><br><br>
There are some problems with Google searches for certain things,
however.<br><br>
One of which is that the smartest Google can get (currently) is
"does<br>
it contain word X", but many words have multiple meanings, or
have different<br>
subject area-specific meanings. One good example is that "wave"
means something<br>
utterly different in <br>
* physics - energy wave, radiation etc.<br>
* oceanography - waves on the ocean<br>
* combat tactics - attack waves<br>
Probably more. This is, I believe, the primary argument for subject
<br>
classification, and also, possibly, the ability to browse or get
updates<br>
on items in your field of interest.<br><br>
The interesting question is: do we expect/need OAI harvesters that
can<br>
harvest just history, or just art? And if so, to what
granularity?<br><br>
On Wed, Jan 15, 2003 at 09:36:45 +0000, Stevan Harnad wrote:<br>
> <br>
> <br>
> ---------- Forwarded message ----------<br>
> Date: Wed, 15 Jan 2003 15:06:26 +0000<br>
> From: Steve Hitchcock <sh94r@ecs.soton.ac.uk><br>
> To: Pauline Simpson <ps@soc.soton.ac.uk>,
OAI-eprints@fafner.openlib.org<br>
> Subject: Re: Interoperability - subject
classification/terminology<br>
> <br>
> At 13:31 15/01/03 +0000, Pauline Simpson wrote:<br>
> <br>
> >Following on from the OAI Geneva meeting - to open the
discussion please see<br>
>
><a href="http://tardis.eprints.org/discussion/" eudora="autourl">http://tardis.eprints.org/discussion/</a><br>
> <br>
> Pauline, A thought-provoking page that helpfully outlines all the
<br>
> issues. A few points below, but first we need to make a distinction
between <br>
> works where the full text is not available digitally, and those
where it <br>
> is. So the question whether there is a need for classification boils
down <br>
> to: Yes for the former, and (mostly) No for the latter.<br>
> <br>
> By (mostly) I mean let's make it optional. That means, in the case
of <br>
> institutional repositories of research papers (the latter category),
don't <br>
> burden the repository with the need to maintain categorization as a
core <br>
> task. Leave that to services. If it's worth doing, then people will
find <br>
> the resources to do it, but it must not compromise the task of
<br>
> repositories, which is to make the texts available.<br>
> <br>
> If full texts are available, we have the chance to automate search
and <br>
> indexing, say full-text indexing or citation indexing. This is
vastly more <br>
> powerful and cost-effective, but we have to recognise it is not the
same <br>
> thing as classification. Full text indexing can begin to tell us
what a <br>
> text is *about*, rather than simply where it is located, the
classical <br>
> purpose of classification. Through knowing what a text is about, we
can <br>
> make connections with other works in ways that are much more
flexible than <br>
> is offered by classification.<br>
> <br>
> You ask: Can we rely on web search engines like Google to search
deeply or <br>
> accurately enough?<br>
> <br>
> At the moment, simply, yes. It's not the fault of Google that it
can't <br>
> index most of the journal literature.<br>
> <br>
> Where I think classification may continue to have a role is in
interface <br>
> design - you give examples. Classification can inform browsing. This
brings <br>
> us back to services. Services will produce interfaces. In principle,
<br>
> repositories do not need to produce user (as opposed to author or
<br>
> management) interfaces, although in practice there will be few
<br>
> institutional repositories that will be able to resist doing so, for
good <br>
> reasons, but again, they don't have to, and it should be optional
and minimal.<br>
> <br>
> When you ask if the 'push' scenario should replace harvesting,
that's <br>
> interesting because it is counter to the framework OAI has put in
place. <br>
> That is, to reduce the burden on data providers at the expense of
service <br>
> providers, recognising that we have to make the entry threshold for
authors <br>
> and repositories as low as possible. That can make it difficult for
service <br>
> providers, see Liu et al.<br>
>
<a href="http://www.dlib.org/dlib/april01/liu/04liu.html" eudora="autourl">http://www.dlib.org/dlib/april01/liu/04liu.html</a><br>
> but overall it probably remains the best approach, especially if
<br>
> repositories concentrate on optimising the submitted metadata within
the <br>
> OAI framework.<br>
> <br>
> Steve Hitchcock<br>
> Open Citation (OpCit) Project
<<a href="http://opcit.eprints.org/" eudora="autourl">http://opcit.eprints.org/</a>><br>
> IAM Research Group, Department of Electronics and Computer
Science<br>
> University of Southampton SO17 1BJ, UK<br>
> Email: sh94r@ecs.soton.ac.uk<br>
> Tel: +44 (0)23 8059 3256 Fax: +44
(0)23 8059 2865<br>
> <br>
> <br>
> _______________________________________________<br>
> OAI-eprints mailing list<br>
> OAI-eprints@lists.openlib.org<br>
>
<a href="http://lists.openlib.org/mailman/listinfo/oai-eprints" eudora="autourl">http://lists.openlib.org/mailman/listinfo/oai-eprints</a><br><br>
-- <br><br>
Christopher Gutteridge
eprints-support@ecs.soton.ac.uk<br>
ePrints2 Coder, Support and Stuff
+44 23 8059 4833</blockquote>
<x-sigsep><p></x-sigsep>
~~~~~~~~<br>
Jessie M.N. Hey <br>
Research Fellow TARDIS eprints project,<br>
NOL, University of Southampton Waterfront Campus, European Way,<br>
Southampton, SO14 3ZH, England<br>
Tel: +44 (0)23 8059 6112 Fax +44 (0)23 8059 6115<br>
<a href="http://tardis.eprints.org/" eudora="autourl"><font color="#0000FF"><u>http://tardis.eprints.org/</u></font></a><br><br>
</html>