[CollEc] CollEc

Thomas Krichel krichel at openlib.org
Fri Nov 21 15:02:09 UTC 2014


  Nikos Askitas writes

> I hope that you remember me

  I have a very poor memory, but I remember. 

> I have been looking at CollEc recently and started programming the
>  search interface a bit.

  CollEc has a search interface. 

> I would basically like to write a script which will get the shortest
> path(s) between two authors. So I would like to say
> get_paths(Askitas, Krichel) and get the data in some form I can work
> on further. I can strip the HTML with regexs
  
  I'd rather give you the complete dataset.

> 1.  Is there a file which maps RealName to repec handles? Can you point me somewhere?

icanis at katri:~/icanis/input$ cat ras_mans_nodes_1416097129.xml | head -10
<nodes>
  <node ref="paa1" name="Arild Aakvik" homepage="http://people.uib.no/secaa"/>
  <node ref="paa11" name="Arnstein Aassve" homepage="http://faculty.unibocconi.eu/arnsteinaassve/"/>
  <node ref="paa12" name="susan aaronson" homepage="http://www.gwu.edu/~elliott/faculty/aaronson.cfm"/>
  <node ref="paa13" name="Daniel Aaronson"/>
  <node ref="paa2" name="David Aadland" homepage="http://www.uwyo.edu/aadland/"/>
  <node ref="paa22" name="Knut Are Aastveit" homepage="http://www.norges-bank.no/en/about/research/economists/aastveit-knut-are/"/>
  <node ref="paa23" name="Knut Aase" homepage="http://www.nhh.no/Default.aspx?ID=2004"/>
  <node ref="paa6" name="Rolf Aaberge"/>
  <node ref="paa8" name="Rob Aalbers"/>
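
  A minimal sketch of how that nodes file could be turned into a
  name-to-handle lookup, assuming the full file is well-formed XML
  with a closing </nodes> tag (the head -10 excerpt above is cut off
  before it). The function names load_name_index and handles_for are
  made up for illustration; they are not part of icanis.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def load_name_index(nodes_file):
    """Map lower-cased author names to the list of handles carrying that name.

    Assumes each author is a <node ref="..." name="..."/> element, as in
    the excerpt above. Names are not guaranteed unique, hence the list.
    """
    index = defaultdict(list)
    # iterparse streams the file, so the whole tree is never held in memory.
    for _, elem in ET.iterparse(nodes_file, events=("end",)):
        if elem.tag == "node":
            index[elem.get("name", "").lower()].append(elem.get("ref"))
            elem.clear()
    return index


def handles_for(index, query):
    """Return all handles whose name contains query, case-insensitively."""
    q = query.lower()
    return sorted(h for name, hs in index.items() if q in name for h in hs)
```

  With the index loaded from ras_mans_nodes_1416097129.xml,
  handles_for(index, "Krichel") does the same job as the grep
  commands further down, and returns every match if a surname is shared.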

> 2.  Do you have a script somewhere which would work for pairs of names?

  No, but I have the data. Let's look at an example

icanis at katri:~/icanis/input$ grep Krichel ras_mans_nodes_1416097129.xml
  <node ref="pkr1" name="Thomas Krichel" homepage="http://openlib.org/home/krichel"/>
icanis at katri:~/icanis/input$ grep Askitas ras_mans_nodes_1416097129.xml
  <node ref="pas55" name="Nikos Askitas" homepage="http://www.iza.org/home/askitas"/>

  This gives us the handles. From there we can find the paths in two ways.

icanis at katri:~$ grep 'pas55$' icanis/paths/ras/biwe/k/r/1/paths 
5       pzi1    pur18   psc166  pzi13   pas55

icanis at katri:~$ grep 'pkr1$' icanis/paths/ras/biwe/a/s/55/paths 
5       pzi13   psc166  pur18   pzi1    pkr1

  I use the latest of the two path files that I can grep. That's
  how the search works. BTW, there are close to 1 billion paths.
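
  The get_paths(Askitas, Krichel) call asked for above could be
  sketched along these lines. Everything about the layout is inferred
  from the two grep examples only: that a handle such as pkr1 maps to
  the directory k/r/1 under the paths root, that each line of a paths
  file is whitespace-separated, that the last field is the target
  handle, and that the leading number can be skipped (its meaning is
  not stated here). The names paths_file and get_paths are
  hypothetical, and handles that do not fit the pattern "p", two
  letters, digits are rejected.

```python
import re
from pathlib import Path

# Assumed handle shape, based on pkr1 and pas55: "p" + two letters + digits.
HANDLE_RE = re.compile(r"^p([a-z])([a-z])(\d+)$")


def paths_file(root, handle):
    """Return the assumed location of the paths file for one handle,
    e.g. pkr1 -> <root>/k/r/1/paths."""
    m = HANDLE_RE.match(handle)
    if m is None:
        raise ValueError(f"unexpected handle format: {handle!r}")
    return Path(root, *m.groups(), "paths")


def get_paths(root, source, target):
    """Return every stored path from source to target as a list of handles.

    Reads only the source author's file and keeps lines whose last
    field is the target, like grep 'target$' in the examples above.
    """
    found = []
    with open(paths_file(root, source)) as fh:
        for line in fh:
            fields = line.split()
            if fields and fields[-1] == target:
                # fields[0] is the leading number; the rest are handles.
                found.append([source] + fields[1:])
    return found
```

  Called as get_paths("icanis/paths/ras/biwe", "pkr1", "pas55"), this
  yields a list of lists; ",".join(path) on each entry gives the
  one-path-per-line, comma-separated form. It does not pick the more
  recent of the two files the way the search does.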

> The search cgi you have now works for Name and handle or for two
> handles.

  Yes, I should write an interface for two names.

> 3.  Is there a cgi you can post which will deliver shortest paths as
> arrays of arrays without html? Even a path per line separated by
> comma would do.

  I have not done this, but you could do it. 

> 4.  Would you share ALL the coauthorship data with me?

  Sure. Currently I don't have a backup of the paths, since they can
  be recomputed. But it would take a very long time. If you would
  provide a backup for me that would help me.
 
  Kindly open an account "icanis" on a machine with rsync installed
  and authorise the key

icanis at katri:~$ cat ~/.ssh/id_rsa.pub 
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDOhbsMyNTv9O3h0tu7eMjqdR5t2BG1jxqErz2YVoVcPGfSZ2mOgwMLMSeFQY3YwNzkC/lfzbtGKybm9dl+CfvnERKL3tA5+NiQiXyJ0WZVp/bEvePMMsDzcLv+Z4RHStl9mU+9OuTPDYzpheTWbjswszQc9kE2ECqlRUqfIb1MQ1/UWDfaq9p8O0USYc25t/uM95VOtHTPHvy75d2+8m3NJPhdFX2yqTiD0ra2phdwMrgw+Csivgs/gfhIhMepep29+V+sBvHDhuVOVBS9N87ae7AJlw5HDceIKwkxasFfMB9mw7RxK7OGTGBcM78ykFRustlKAiDy3M9t/iqb9NNh icanis at fricka

  This is the size of the account right now, in the 1K blocks that
  du reports by default (roughly 78 GiB)

icanis at katri:~$ du -s .
81919236        .

> I would like to write some papers with it?

  Write as many or as few as you like.

  Three additional points.

  (1) If you can calculate summary statistics on the network,
  I can use them in the web site. The site is written using
  XSLT, so an XML file with the stats will be perfect.

  (2) I have taken the liberty of adding you to the CollEc-run mailing
  list.  You are the sixth member on the list. It has very little
  traffic, but all correspondence about CollEc should be sent there.

  (3) I was in Cologne at GESIS between October 13 and 31. I even hiked
  the Rheinsteig between Bonn and the Drachenfels on 25th October, so I
  could have met you. But I had forgotten about you, sorry.  I am back
  in NYC now, homeless and jobless, so if you have a consultancy
  opportunity I would look into it. I will probably be back at GESIS
  at some stage. 

-- 

  Cheers,

  Thomas Krichel                  http://openlib.org/home/krichel
                                              skype:thomaskrichel


