Submitted by MDBourgeois on Sat, 12/29/2012 - 13:19
I am working with an external program for extended data analysis. When I export my Analysis files and open them in my network analysis programs (Pajek, UCINet, etc.) the datasets do not show a URL. Rather, they have a numeric ID.
Two Questions:
1) Is there a way to preserve the URLs rather than the ID's?
2) Which ID is being used (row/pagegroup/other)? That is, how do I know which exported ID corresponds with which URL?
Thanks,
Michael
Hi,
I'm assuming you are exporting the network in Pajek format and then importing it into Pajek or UCINet.
The following is part of a pajek format export file for a test database. Only the first 4 vertices and first 4 edges are shown.
------
*Network uberlink
%uberlink dataset (version: 0.5.17.6)
%dataset SQL statement: WHERE prune_status=0
%Note: vertex ids are sequential and _do not_ correspond uberlink (i.e. pagegroup) ids.
*Vertices 40
1 "http://voson.anu.edu.au/"
2 "http://www.webfoundation.org/"
3 "http://citasa.org/"
4 "http://www.anu.edu.au/"
...
*Arcs
5 1 2
6 1 2
8 1 9
12 1 15
...
-----
So if you open up the pajek file in a text editor you can see which URL (pagegroup URL) the ID numbers correspond to. Note that for pajek format, the ID numbers need to be sequential and that is why they do not correspond to the internal ID numbers used in VOSON.
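For a large network, reading the lookup by eye is impractical, so the same mapping can be pulled out with a short script. The following is only a rough sketch in Python, assuming the export looks like the excerpt above (a `*Vertices` section of `id "url"` lines, `%` comment lines, and other sections starting with `*`). The function name `read_pajek_vertices` and the variable names are made up for illustration and are not part of VOSON or Pajek.

```python
import re

def read_pajek_vertices(lines):
    """Return {sequential_id: url} from the *Vertices section of a Pajek file."""
    vertices = {}
    in_vertices = False
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and % comment lines
        if line.startswith("*"):
            # A new section begins; only *Vertices holds the id-to-URL lines.
            in_vertices = line.lower().startswith("*vertices")
            continue
        if in_vertices:
            match = re.match(r'(\d+)\s+"([^"]*)"', line)
            if match:
                vertices[int(match.group(1))] = match.group(2)
    return vertices

# The excerpt from the post, with the elided lines left out.
sample = '''*Network uberlink
%uberlink dataset (version: 0.5.17.6)
*Vertices 40
1 "http://voson.anu.edu.au/"
2 "http://www.webfoundation.org/"
3 "http://citasa.org/"
4 "http://www.anu.edu.au/"
*Arcs
5 1 2
6 1 2
'''

id_to_url = read_pajek_vertices(sample.splitlines())
for vertex_id, url in sorted(id_to_url.items()):
    print(vertex_id, url)
```

For a real export, passing an open file object instead of `sample.splitlines()` should work the same way, since the function only iterates over lines.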
Regarding having the maps display the URLs rather than the ID numbers, that is something I can't tell you as I don't know much about pajek. I use R for my network analysis, and in R (igraph or statnet) it is possible to display different vertex attributes in the maps.
Rob
Yes, I'm exporting to Pajek and UCInet (and others).
I'm managing large networks, so I am cautious about changing ID numbers or other identifiers, and I prefer that process to be automated where possible to avoid human error. I do see that the PageGroup ID and the sequential ID generated for the Pajek export are different, but is there any reason to suspect that the VOSON-generated *Row ID* differs from the VOSON-generated sequential ID? In other words, if I sorted the file generated for Pajek by the sequential ID number and sorted my master spreadsheet by the Row ID created internally by VOSON, would the URLs match?
A spot check suggests that they do, but it is impossible to confirm by eye with such a large network.
Thanks,
M
Yes, what you say is correct unless there is some bug I am unaware of.
The order of URLs in the pajek file should be exactly the same as the order of URLs in the csv file for the corresponding voson-analysis database.
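Since a spot check by eye does not scale, the ordering can also be checked automatically. This is a hedged sketch rather than a VOSON feature: it assumes the csv file has a header row with a column holding the URL, and the column names `row_id` and `url`, the sample data, and the function names are all hypothetical and would need adjusting to the real file.

```python
import csv
import io
import re
from itertools import zip_longest

def pajek_urls(net_text):
    """URLs in the order they appear in the *Vertices section."""
    urls, in_vertices = [], False
    for line in net_text.splitlines():
        line = line.strip()
        if line.startswith("*"):
            in_vertices = line.lower().startswith("*vertices")
        elif in_vertices:
            match = re.match(r'\d+\s+"([^"]*)"', line)
            if match:
                urls.append(match.group(1))
    return urls

def csv_urls(csv_text, column="url"):
    """URLs in csv row order; the column name is an assumption."""
    return [row[column] for row in csv.DictReader(io.StringIO(csv_text))]

def order_mismatches(net_text, csv_text, column="url"):
    """Return (position, pajek_url, csv_url) wherever the two orders differ."""
    pairs = zip_longest(pajek_urls(net_text), csv_urls(csv_text, column))
    return [(i, p, c) for i, (p, c) in enumerate(pairs, start=1) if p != c]

# Small made-up samples; real files would be read from disk instead.
net_sample = '''*Vertices 3
1 "http://voson.anu.edu.au/"
2 "http://www.webfoundation.org/"
3 "http://citasa.org/"
*Arcs
'''
csv_sample = '''row_id,url
1,http://voson.anu.edu.au/
2,http://www.webfoundation.org/
3,http://citasa.org/
'''

mismatches = order_mismatches(net_sample, csv_sample)
print("orders match" if not mismatches else mismatches)
```

An empty list means every position holds the same URL in both files. Any entry in the list gives the position and the two URLs that disagree, which is also how a length difference between the files would show up.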
Rob