[rclis] Julio's data in AMF

Thomas Krichel krichel at openlib.org
Thu Nov 6 17:31:03 EST 2003


  Hi,

  this is just to say that there is a first release of 
  Julio's data (the one that forms the bulk of DoIS) 
  in AMF, within a Geneva style archive, at
 
  http://wotan.liu.edu/rclis/jul

  It has been a struggle of about two weeks to get
  this to work. It will have to be tested in practice
  if and when we build a successor to, or new implementation
  of, DoIS that is based on AMF/XML. 

  I am still working on converting DBLP to AMF. I have done the conversion
  of the journal data (about 1/3 of the total), but the rest,
  essentially data on conference papers, is still to be done. The problem
  here is the representation of conferences. If each conference is its own
  collection, we have a huge number of conference collections. This is not
  problematic in itself (apart from being labor-intensive to maintain),
  but it limits the usefulness of collection-level data. On the other
  hand, we can try to collect conference series data, since many
  conferences are held annually or so. Grouping by series would give a
  much better subject classification through the work of the conferences,
  but it would be more work and needs real expertise to maintain.

  In the meantime, I have started working on the Konz project, see
  http://rclis.org/internal/konz.html.  Progress there has been quite
  good. I completed a first implementation in about three weeks working
  on this full-time in Novosibirsk. But when I started to run it, at the
  end of my stay in Siberia, the disk I worked with, based in New
  York, crashed.  I suspected the problem is that it is too big at 
  160G, and that the kernel cannot see it. Fixing
  the disk and the computer has cost me quite a bit of time, and it is
  still not fully stable. But I now have it at home in my closet,
  and I monitor it constantly. I predict that using konz, on the full
  DBLP, we will be able to get 30,000 full texts. This is really pretty
  good. I don't want to go into more details right now. Konz
  is obviously a sophisticated piece of work. I expect to complete
  the DBLP conversion and to run konz on the whole set
  by the end of the year.  Of course, since I am essentially
  working on my own on data collection, it will take more
  time until we have a really good set. But I will not give up and I am
  optimistic that this work will reap great rewards.

  Other good news is that Google really seems to love portals. 
  Just look, for example, at my boss Michael Koenig: last time 
  I searched for his name on Google, DoIS came right up as
  the first hit. Thus I am sure that, once we get the coverage
  of DoIS extended, we will disseminate quite well if we get
  a few people to open links to it.



  Cheers,

  Thomas Krichel                      mailto:krichel at openlib.org
                                 http://openlib.org/home/krichel
                             RePEc:per:1965-06-05:thomas_krichel

 


