[CollEc] New CollEc

Düben, Christian Christian.Dueben at uni-hamburg.de
Wed Oct 7 15:56:45 UTC 2020


app.collec.repec.org works. I adjusted the documentation accordingly.

As long as Nikos does not complain, let us stick to the current code.

I suggest that we discuss the migration after observing the app's performance when accessed by many users. Do you log CPU and memory use? It would be helpful to have data spanning multiple weeks for an analysis.
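
In case no such logging exists yet, here is a minimal sketch of what I have in mind, assuming a Linux host that exposes /proc/meminfo. The function names, the CSV columns, and the file name collec_usage.csv are hypothetical and not part of CollEc. It uses the 1-minute load average as a rough stand-in for CPU use.

```python
import csv
import os
import time

FIELDS = ["timestamp", "load1", "mem_used_pct"]  # hypothetical column names


def parse_meminfo(text):
    """Parse the contents of /proc/meminfo into a dict of integer values."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def sample(meminfo_text, load1):
    """Build one log row from meminfo text and the 1-minute load average."""
    mem = parse_meminfo(meminfo_text)
    total = mem["MemTotal"]
    # MemAvailable is missing on very old kernels; fall back to MemFree.
    available = mem.get("MemAvailable", mem.get("MemFree", 0))
    used_pct = round(100.0 * (total - available) / total, 1)
    return {"timestamp": int(time.time()), "load1": load1, "mem_used_pct": used_pct}


def append_sample(path, row):
    """Append one row to a CSV file, writing the header on first use."""
    new_file = not os.path.exists(path)
    with open(path, "a", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if new_file:
            writer.writeheader()
        writer.writerow(row)


def log_once(path="collec_usage.csv"):
    """Take one sample from the running system (Linux only) and store it."""
    with open("/proc/meminfo") as fh:
        text = fh.read()
    append_sample(path, sample(text, os.getloadavg()[0]))
```

Calling log_once() every few minutes, for example from a cron job, would build up the multi-week series over time.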

Christian Düben
Research Associate
Chair of Macroeconomics
Hamburg University
Von-Melle-Park 5, Room 3102
20146 Hamburg
Germany
+49 40 42838 1898
christian.dueben at uni-hamburg.de
http://www.christian-dueben.com

-----Original Message-----
From: Thomas Krichel <krichel at openlib.org> 
Sent: Wednesday, October 7, 2020 09:19
To: Düben, Christian <Christian.Dueben at uni-hamburg.de>
Cc: CollEc Run <collec-run at lists.openlib.org>
Subject: Re: [CollEc] New CollEc

  Düben, Christian writes

> Do you mean the shortest paths between any two authors in the network? 
> Users can look at their shortest paths in the Shortest Paths tab. That 
> functionality computes them directly from the graph. The distances and 
> the closeness measures also rely on these shortest paths. But at no 
> point are they written to disk. They are only stored in memory. I did 
> some extensive testing and writing these paths to disk is incredibly 
> inefficient. There is a reason why the new CollEc's daily updating 
> routine takes only a tiny fraction of the time and computational 
> resources required by the old CollEc. It is the eradication of 
> inefficiencies along multiple dimensions.

  I'm not sure I understand, but it's clear that the approach
  I took recalculated the same paths over and over again. I did not
  know better. 
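
To illustrate the point about recalculation: a single breadth-first search from one author yields the distances and shortest paths to every reachable author, all held in memory, and the closeness measure follows from the same distances. The sketch below is only an illustration under the assumption of an unweighted co-authorship graph; it is not CollEc's actual code, and the toy graph and function names are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return (dist, parent) for all nodes reachable from source."""
    dist = {source: 0}
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                parent[neighbour] = node
                queue.append(neighbour)
    return dist, parent


def path_to(parent, target):
    """Rebuild the shortest path to target by following parent links back."""
    path = []
    while target is not None:
        path.append(target)
        target = parent[target]
    return path[::-1]


def closeness(dist):
    """Closeness of the source: reachable others divided by summed distances."""
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


# Toy graph (hypothetical): authors a, b, c, d connected in a chain.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
dist, parent = bfs_distances(graph, "a")
```

One search per source author is enough; the paths themselves never need to be written to disk, since path_to() can rebuild any of them from the parent links on demand.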

> Unless there is a very good reason why these paths should be written 
> to disk, I do not see a point of doing so. Are they used anywhere 
> outside CollEc?

  They have been exported to IZA. But I have not read Nikos complaining.
  He is on the list. 

> app.collec.repec.org sounds good. Let me know when you have changed it. I 
> am then going to adjust the documentation on the entry points.

  It looks like this is done, with app pointing to darni. 

> It is not a problem that Helos is an Ubuntu system. In fact, I prefer 
> Ubuntu over Debian. Nonetheless, I am really not looking forward to 
> that migration as I have to set up and test the entire framework 
> again. There should be easier ways of directly migrating the CollEc 
> system as a whole instead of installing it again piece by piece. But 
> that is beyond my current technical skills.

  I can get migrations done quite easily with my ways of working.
  It involves writing all in the user space with minimum root
  involvement, and relying on o/s packages as much as possible. 

  There is nothing we can do about the migration. Helos is sponsored
  to do CollEc. We can't run it on darni. On the upside, helos is more
  powerful than darni. 

> The new CollEc requires rather little disk space. The only component 
> that appears unexpectedly large is the containerized MariaDB. I am 
> going to check tomorrow what might be wrong with it.

  helos has more space. It's used for the RePEc snapshot. That
  requires huge space. It is also used for backup, since aigtu is so
  full it can't handle it all. The plan is to migrate the
  snapshot to archec. But that may take a few years more.

-- 

  Cheers,

  Thomas Krichel                  http://openlib.org/home/krichel
                                              skype:thomaskrichel


