[CollEc] New CollEc

Düben, Christian Christian.Dueben at uni-hamburg.de
Thu Nov 26 16:09:39 UTC 2020


I guess Thomas will clarify this.

Christian Düben
Research Associate
Chair of Macroeconomics
Hamburg University
Von-Melle-Park 5, Room 3102
20146 Hamburg
Germany
+49 40 42838 1898
christian.dueben at uni-hamburg.de
http://www.christian-dueben.com


-----Original Message-----
From: Christian Zimmermann <zimmermann at stlouisfed.org> 
Sent: Thursday, 26 November 2020 16:50
To: Düben, Christian <Christian.Dueben at uni-hamburg.de>
Cc: Thomas Krichel <krichel at openlib.org>; collec-run at lists.openlib.org
Subject: RE: [CollEc] New CollEc

Corrected.

Now all I need is the new data feed.

Christian Zimmermann                          FIGUGEGL!
Economic Research
Federal Reserve Bank of St. Louis
P.O. Box 442
St. Louis MO 63166-0442 USA
https://ideas.repec.org/zimm/   @CZimm_economist

On Thu, 26 Nov 2020, Düben, Christian wrote:

> There is just a small error in the "volunteering options" link, which is why it does not work. It is "https://ideas.repec.org/volunteers.htmnl", but should be "https://ideas.repec.org/volunteers.html".
>
>
> -----Original Message-----
> From: Christian Zimmermann <zimmermann at stlouisfed.org>
> Sent: Thursday, 26 November 2020 15:45
> To: Düben, Christian <Christian.Dueben at uni-hamburg.de>
> Cc: Thomas Krichel <krichel at openlib.org>; collec-run at lists.openlib.org
> Subject: RE: [CollEc] New CollEc
>
> All I needed was this image file.
>
> Blog post is published.
>
> First batch of IDEAS user profiles has links to new CollEc.
>
> When I search for "Christian D" your name does not appear (No umlaut 
> on the current keyword)
>
> On Thu, 26 Nov 2020, Düben, Christian wrote:
>
>> I do not know the externally available app data's ftp address. I simply put it in /home/icanis/ftp/opt on darni. And Thomas takes care of the ftp part.
>>
>> Why am I not in the database? When I enter my name in the distances tab, which the plot is based on, it works.
>>
>> You mention that you want a fresh and clear picture. If you tell me how exactly you want to change the plot, I can generate it and send it to you. Do you need a higher resolution? See the figure attached to this e-mail, i.e. the plot used in the pdf, as a reference.
>>
>>
>> -----Original Message-----
>> From: Christian Zimmermann <zimmermann at stlouisfed.org>
>> Sent: Thursday, 26 November 2020 05:01
>> To: Düben, Christian <Christian.Dueben at uni-hamburg.de>
>> Cc: Thomas Krichel <krichel at openlib.org>; collec-run at lists.openlib.org
>> Subject: RE: [CollEc] New CollEc
>>
>> Yes, I mean the app data.
>>
>> Blog post is almost ready. I cannot find a way to share it with you without publishing it... but I think I copied everything over and only changed links a little and what you suggested below.
>>
>> I wanted to make a fresh and clear picture of the graph you included in your pdf. Unfortunately, I cannot recreate it, as you do not seem to be in the database...
>>
>> On Wed, 25 Nov 2020, Düben, Christian wrote:
>>
>>> As far as I am concerned, the app is going to remain at app.collec.repec.org.
>>>
>>> Do you mean the app's data that Thomas made available through the ftp server?
>>>
>>> Could you change "Co-authored research has been on the rise over the past decades forming collaborations over vast geographic distances and fields of research" in the blog post's second paragraph to "Co-authored research has been on the rise over the past decades, forming collaborations over enormous geographic distances and many fields of research"?
>>>
>>>
>>> -----Original Message-----
>>> From: Christian Zimmermann <zimmermann at stlouisfed.org>
>>> Sent: Wednesday, 25 November 2020 20:21
>>> To: Düben, Christian <Christian.Dueben at uni-hamburg.de>
>>> Cc: Thomas Krichel <krichel at openlib.org>; collec-run at lists.openlib.org
>>> Subject: RE: [CollEc] New CollEc
>>>
>>> I'll take the co-authors. app.collec.repec.org is not going to change, right?
>>>
>>> Also, authors will start complaining that the data in the new CollEc does not match the data in their statistics. Where can I find the new data?
>>>
>>> On Wed, 25 Nov 2020, Düben, Christian wrote:
>>>
>>>> Thanks. I am glad you like it.
>>>>
>>>> Which links you use for the author pages depends on you. Which tab would you like to use as a landing page? Each of the output-generating tabs (Distances, Closeness, Betweenness, Co-Authors, Shortest Paths) has an entry point.
>>>>
>>>> -----Original Message-----
>>>> From: Christian Zimmermann <zimmermann at stlouisfed.org>
>>>> Sent: Wednesday, 25 November 2020 20:09
>>>> To: Düben, Christian <Christian.Dueben at uni-hamburg.de>
>>>> Cc: Thomas Krichel <krichel at openlib.org>; collec-run at lists.openlib.org
>>>> Subject: RE: [CollEc] New CollEc
>>>>
>>>> Finally getting to it...
>>>>
>>>> Excellent post. I will try to massage it into the blog format. Send me the proper links.
>>>>
>>>> On Tue, 6 Oct 2020, Düben, Christian wrote:
>>>>
>>>>> As requested, I came up with a RePEc blog post. You can find a draft attached to this e-mail. How do you like it?
>>>>>
>>>>> Do not post it yet. The hyperlinks already point to app.collec.repec.org which is yet to be connected to the app.
>>>>>
>>>>> -----Original Message-----
>>>>> From: CollEc-run <collec-run-bounces at lists.openlib.org> On Behalf Of Düben, Christian
>>>>> Sent: Monday, 5 October 2020 22:04
>>>>> To: Thomas Krichel <krichel at openlib.org>
>>>>> Cc: collec-run at lists.openlib.org
>>>>> Subject: Re: [CollEc] New CollEc
>>>>>
>>>>> I reinstalled the containerized MariaDB, which fixed the issue of the unexpectedly large ibdata1 file in it. A script is currently recomputing CollEc's results and inserting the tables into the new containerized database. The app does not work as intended until this process is complete, i.e. until around 11 pm CEST. This temporary "shutdown" of the app only happens because I set up a new containerized database. The regular daily updating routine does not impede the app's availability.
>>>>>
>>>>> The cleanup freed around 50 GB of disk space. The new CollEc's total size is around 28 GB.
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: CollEc-run <collec-run-bounces at lists.openlib.org> On Behalf Of Düben, Christian
>>>>> Sent: Monday, 5 October 2020 19:04
>>>>> To: Thomas Krichel <krichel at openlib.org>
>>>>> Cc: collec-run at lists.openlib.org
>>>>> Subject: Re: [CollEc] New CollEc
>>>>>
>>>>> I fixed the Per archive issue. The files are not missing on Aigtu. An intermediate step in the daily updating routine was incomplete. It works now. Sorry about that.
>>>>>
>>>>> -----Original Message-----
>>>>> From: CollEc-run <collec-run-bounces at lists.openlib.org> On Behalf Of Düben, Christian
>>>>> Sent: Sunday, 4 October 2020 13:12
>>>>> To: Thomas Krichel <krichel at openlib.org>
>>>>> Cc: collec-run at lists.openlib.org
>>>>> Subject: Re: [CollEc] New CollEc
>>>>>
>>>>> Do you mean the shortest paths between any two authors in the network? Users can look at their shortest paths in the Shortest Paths tab. That functionality computes them directly from the graph. The distances and the closeness measures also rely on these shortest paths. But at no point are they written to disk. They are only stored in memory. I did some extensive testing, and writing these paths to disk is incredibly inefficient. There is a reason why the new CollEc's daily updating routine takes only a tiny fraction of the time and computational resources required by the old CollEc. It is the eradication of inefficiencies along multiple dimensions. Unless there is a very good reason why these paths should be written to disk, I do not see the point in doing so. Are they used anywhere outside CollEc?
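
To illustrate the idea of deriving distances and closeness from the graph in memory, here is a minimal Python sketch. It is not CollEc's actual code. The tiny co-authorship graph, the function names, the unweighted edges and the closeness formula ((reachable authors) / (sum of distances)) are all assumptions made for illustration; CollEc's own definitions may differ.

```python
from collections import deque

def bfs_distances(graph, source):
    """Shortest-path lengths (in hops) from source to every reachable author."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def closeness(graph, source):
    """Assumed definition: number of reachable authors / sum of their distances."""
    dist = bfs_distances(graph, source)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

# Hypothetical co-authorship graph: A - B - C - D (undirected, unweighted).
graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
}

print(bfs_distances(graph, "A"))  # {'A': 0, 'B': 1, 'C': 2, 'D': 3}
print(closeness(graph, "B"))      # 0.75
```

Nothing here touches the disk: the distances live in a dictionary for as long as they are needed and are then discarded.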
>>>>>
>>>>> app.collec.repec.org sounds good. Let me know when you have changed it. I am then going to adjust the documentation on the entry points.
>>>>>
>>>>> It is not a problem that Helos is an Ubuntu system. In fact, I prefer Ubuntu over Debian. Nonetheless, I am really not looking forward to that migration, as I have to set up and test the entire framework again. There should be easier ways to migrate the CollEc system as a whole directly, instead of installing it again piece by piece. But that is beyond my current technical skills.
>>>>>
>>>>> The new CollEc requires rather little disk space. The only component that appears unexpectedly large is the containerized MariaDB. I am going to check tomorrow what might be wrong with it.
>>>>>
>>>>> The authors' full names are derived from that local copy of the Per archive. The hundreds of missing files set the names of hundreds of people to NA. I am going to sync the archive to another server today and am going to check whether the problem remains. I need to make sure that the files are really missing on Aigtu and there is not a glitch somewhere in my code.
>>>>>
>>>>> Have a nice day.
>>>>>
>>>>> -----Original Message-----
>>>>> From: Thomas Krichel <krichel at openlib.org>
>>>>> Sent: Saturday, 3 October 2020 17:11
>>>>> To: Düben, Christian <Christian.Dueben at uni-hamburg.de>
>>>>> Cc: Christian Zimmermann <zimmermann at stlouisfed.org>; collec-run at lists.openlib.org
>>>>> Subject: Re: [CollEc] New CollEc
>>>>>
>>>>>  Düben, Christian writes
>>>>>
>>>>>> The files meant for the ftp server are in /home/icanis/ftp/opt.
>>>>>> The rank variable in these files handles ties with the "min"
>>>>>> option. Assume we have four authors with closeness values of
>>>>>> 0.5, 0.3, 0.3 and 0.2 respectively. With the "min" option their
>>>>>> ranks are 1, 2, 2 and 4.
>>>>>
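
The tie handling described above can be sketched in a few lines of Python. This is only an illustration of "min" ranking, not the code that produces the files; the function name is hypothetical, and it assumes that a higher closeness value means a better (lower) rank, as the example in the message implies.

```python
def min_rank_desc(values):
    """Rank values from highest to lowest; tied values share the smallest rank."""
    ordered = sorted(values, reverse=True)
    # Record the first 1-based position at which each distinct value appears.
    first_pos = {}
    for pos, value in enumerate(ordered, start=1):
        first_pos.setdefault(value, pos)
    return [first_pos[value] for value in values]

closeness_values = [0.5, 0.3, 0.3, 0.2]
print(min_rank_desc(closeness_values))  # [1, 2, 2, 4]
```

The two authors tied at 0.3 both receive rank 2, and rank 3 is skipped, so the last author gets rank 4.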
>>>>>  Seems fine, but where is the path data?
>>>>>
>>>>>  CollEc generates path data. As long as your system does not
>>>>> produce path data, it is not a replacement for the existing CollEc.
>>>>>
>>>>>> You mentioned that your code fixes the rsync permission problem 
>>>>>> over time. What time frame did you have in mind? It does not work yet.
>>>>>
>>>>>  Then it does not work. The issue had nothing to do with 
>>>>> permissions, but with the fact that remi at aigtu was completely 
>>>>> broken. That is fixed.
>>>>>
>>>>>> Do I need to add anything to the shell script?
>>>>>
>>>>>  I doubt it. If the data does not update the problem is that aigtu 
>>>>> does not have an up-to-date copy of RePEc:per.
>>>>>
>>>>>> Before Christian Zimmermann links any IDEAS content to the web 
>>>>>> application's entry points, we should discuss the URL.
>>>>>
>>>>>  Collec2 is no good. What about app.collec? It will become
>>>>> collec.repec.org when it can replace the functionality. I'm very much
>>>>> looking forward to that, as the current calculations block helos
>>>>> for any other tasks.
>>>>>
>>>>>  We need to move this to helos. There you will have more computing power.
>>>>>
>>>>>> Could you add a placeholder HTML page at graphec.repec.org? Something like "GraphEc is currently publicly unavailable. Contact <a href="http://www.christian-dueben.com">Christian Düben</a> regarding the application's public release.". The new CollEc already links to GraphEc and I think that a placeholder page with a short informative text would look nicer than the Apache2 Debian default page.
>>>>>>
>>>>>
>>>>>  I suggest that if graphec is not available, you should not link to it.
>>>>>
>>>>>  When helos was installed, debian had no controller for its NIC.
>>>>> Thus it's an ubuntu machine. Will that be a problem? I could 
>>>>> reinstall helos, but I don't have disk space to at least 
>>>>> temporarily store the data it houses now.
>>>>>
>>>>>  Generally, my machines are running full. We are short of disk space.
>>>>>  It has been difficult to get further resources because of the 
>>>>> crisis.
>>>>>
>>>>> --
>>>>>
>>>>>  Cheers,
>>>>>
>>>>>  Thomas Krichel                  http://openlib.org/home/krichel
>>>>>                                              skype:thomaskrichel
>>>>>
>>>>> _______________________________________________
>>>>> CollEc-run mailing list
>>>>> CollEc-run at lists.openlib.org
>>>>> http://lists.openlib.org/cgi-bin/mailman/listinfo/collec-run
>>>>>
>>>>
>>>
>>
>


