Visualisation of the LAK dataset using Unfolding – Geert
The LAK dataset
The dataset was available in RDF format, which I had no experience with whatsoever, so it took me some time to get acquainted with it. To work with the dataset in Java, I used the Jena library to search the RDF model for the data I needed. The data I chose to process was the name of the country each author is linked to.
The LAK dataset contains no geodata apart from the name of the country each author is based in and the names of the organisations an author is linked to, so I needed another way to retrieve coordinates that can be visualised. I used SPARQL queries against DBpedia to find the coordinates of the countries. Unfortunately, country names can be written in many different ways. For example, the United States of America appears in the dataset as “USA”. DBpedia returned no coordinates for “USA”, because the resource in DBpedia is named United_States and the page found when searching for “USA” only redirects to the page for “United_States”. For countries like this, where DBpedia gave me no coordinates, I looked up the country name from the dataset in GeoNames instead.
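To make the lookup step more concrete, here is a minimal sketch of how such a coordinate query could be built in Java. It is not the code used for this visualisation: the class and method names are made up for illustration, and the query assumes that DBpedia exposes coordinates through the W3C `geo:lat` and `geo:long` properties and redirects through `dbo:wikiPageRedirects`. Following the redirect inside the query is one possible way to handle names like “USA”, as an alternative to a fallback service, and the endpoint may not return a result for every name.

```java
// Hypothetical helper, not part of the original project.
class CountryLookup {

    /** Turns a country name from the dataset into a DBpedia resource URI. */
    static String toResourceUri(String countryName) {
        // DBpedia resource names use underscores instead of spaces.
        return "http://dbpedia.org/resource/" + countryName.trim().replace(' ', '_');
    }

    /**
     * Builds a SPARQL query that asks for the coordinates of a country.
     * The path dbo:wikiPageRedirects* matches the resource itself or any
     * resource it redirects to, so "USA" can resolve to "United_States".
     */
    static String buildCoordinateQuery(String countryName) {
        return "PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>\n"
             + "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
             + "SELECT ?lat ?long WHERE {\n"
             + "  <" + toResourceUri(countryName) + "> dbo:wikiPageRedirects* ?country .\n"
             + "  ?country geo:lat ?lat ; geo:long ?long .\n"
             + "} LIMIT 1";
    }

    public static void main(String[] args) {
        // Print the query that would be sent to the DBpedia SPARQL endpoint.
        System.out.println(buildCoordinateQuery("USA"));
    }
}
```

The resulting string could then be passed to whatever SPARQL client is in use, for example Jena's query API, and a service such as GeoNames would remain the fallback when the result set is empty.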
The visualisation itself is very simple: it just shows the number of authors linked to each country. As the number of authors for a country increases, the color of the marker shifts from green through yellow to red and its size grows. I also experimented with hovering over markers, but this is not visible in the following screenshot of my visualisation.
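The mapping from author count to marker appearance can be sketched roughly as follows. This is an illustration under assumptions rather than the actual drawing code: the class name, the linear scaling against the largest count, and the radius range are all hypothetical, and the sketch only computes plain RGB values and a radius without touching the Unfolding API.

```java
// Hypothetical helper, not part of the original project.
class MarkerStyle {

    /** Scales a count to the range 0..1 relative to the largest count. */
    static double normalise(int count, int maxCount) {
        if (maxCount <= 0) {
            return 0.0;
        }
        return Math.max(0.0, Math.min(1.0, (double) count / maxCount));
    }

    /** Returns {r, g, b}: green at t = 0, yellow at t = 0.5, red at t = 1. */
    static int[] color(double t) {
        int r = (int) Math.round(255 * Math.min(1.0, 2 * t));       // red ramps up over the first half
        int g = (int) Math.round(255 * Math.min(1.0, 2 * (1 - t))); // green ramps down over the second half
        return new int[] {r, g, 0};
    }

    /** Interpolates the marker radius linearly between minRadius and maxRadius. */
    static float radius(double t, float minRadius, float maxRadius) {
        return (float) (minRadius + t * (maxRadius - minRadius));
    }

    public static void main(String[] args) {
        double t = normalise(5, 10);
        int[] rgb = color(t);
        System.out.println(rgb[0] + "," + rgb[1] + "," + rgb[2] + " radius " + radius(t, 5f, 25f));
    }
}
```

A linear scale is the simplest choice here. If a few countries dominate the counts, a logarithmic scale might keep the smaller markers easier to tell apart.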