16 Aug 2020

Let's explore the Nobel Prize dataset

An overview of the official Nobel Prize Linked Data dataset with some example SPARQL queries.

Since 1901, the Nobel Prizes and the Prizes in Economic Sciences have been awarded 597 times. 950 people and organisations have received the award in the following categories: Physics, Chemistry, Physiology or Medicine, Literature, Peace, and Economic Sciences.

The official Nobel Prize Linked Data dataset is an authoritative source of information about Nobel Prizes and laureates. Importantly, a Nobel Prize is often shared between multiple people, and the same person or organisation can receive multiple Nobel Prizes. RDF, being a graph data model, is well suited to representing such many-to-many relationships.

The RDF vocabulary for expressing Nobel Prizes as Linked Data

The Nobel Prize dataset reuses classes and properties from existing vocabularies and also defines some custom classes and properties in the http://data.nobelprize.org/terms/ (nobel) namespace. For example, the nobel:Laureate class represents a person or organisation that receives a Nobel Prize and is a subclass of foaf:Agent:

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix nobel: <http://data.nobelprize.org/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

nobel:Laureate a owl:Class .
nobel:Laureate rdfs:subClassOf foaf:Agent .
nobel:Laureate rdfs:label "Laureate" .

The same data in JSON-LD:

{
    "@context": {
        "foaf": "http://xmlns.com/foaf/0.1/",
        "nobel": "http://data.nobelprize.org/terms/",
        "owl": "http://www.w3.org/2002/07/owl#",
        "rdfs": "http://www.w3.org/2000/01/rdf-schema#"
    },
    "@id": "nobel:Laureate",
    "@type": "owl:Class",
    "rdfs:label": "Laureate",
    "rdfs:subClassOf": {
        "@id": "foaf:Agent"
    }
}

The full ontology can be downloaded here.
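To make the role of the prefixes concrete, here is a minimal Python sketch of how a prefixed name such as nobel:Laureate expands to a full IRI. The context is copied from the JSON-LD example above; the expand helper is hypothetical and only covers the simple prefix:local form, not the full JSON-LD expansion algorithm.

```python
# Prefix-to-namespace mapping copied from the JSON-LD "@context" above.
CONTEXT = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "nobel": "http://data.nobelprize.org/terms/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(term: str, context: dict[str, str] = CONTEXT) -> str:
    """Expand 'prefix:local' to a full IRI (hypothetical helper).

    Terms whose prefix is not in the context are returned unchanged,
    so full IRIs such as 'http://...' pass through untouched.
    """
    prefix, sep, local = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return term


print(expand("nobel:Laureate"))  # http://data.nobelprize.org/terms/Laureate
print(expand("foaf:Agent"))      # http://xmlns.com/foaf/0.1/Agent
```

Both the Turtle and the JSON-LD serialisations rely on this kind of expansion, which is why they describe the same three statements about nobel:Laureate.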

The Nobel Prize and laureate data

In the Nobel Prize Linked Data dataset, the URIs of all instances begin with http://data.nobelprize.org/resource/. For example, the fact that Wilhelm Röntgen was awarded the 1901 Nobel Prize in Physics is represented as follows:

@prefix nobel: <http://data.nobelprize.org/terms/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<http://data.nobelprize.org/resource/nobelprize/Physics/1901> nobel:laureate <http://data.nobelprize.org/resource/laureate/1> .
<http://data.nobelprize.org/resource/laureate/1> nobel:nobelPrize <http://data.nobelprize.org/resource/nobelprize/Physics/1901> .
<http://data.nobelprize.org/resource/nobelprize/Physics/1901> rdfs:label "Physics 1901" .
<http://data.nobelprize.org/resource/laureate/1> rdfs:label "Wilhelm Conrad Röntgen" .

The same data in JSON-LD:

{
    "@context": {
        "nobel": "http://data.nobelprize.org/terms/",
        "rdfs": "http://www.w3.org/2000/01/rdf-schema#"
    },
    "@id": "http://data.nobelprize.org/resource/nobelprize/Physics/1901",
    "rdfs:label": "Physics 1901",
    "nobel:laureate": {
        "@id": "http://data.nobelprize.org/resource/laureate/1",
        "rdfs:label": "Wilhelm Conrad Röntgen",
        "nobel:nobelPrize": {
            "@id": "http://data.nobelprize.org/resource/nobelprize/Physics/1901"
        }
    }
}
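To check that the nested JSON-LD really carries the same four statements as the Turtle version, we can flatten it into triples. The sketch below uses only the Python standard library; the expand and triples helpers are hypothetical and handle just the shape used in this example (nested nodes with "@id", string values, one "@context"), not JSON-LD in general.

```python
import json

# The JSON-LD document from the example above.
DOC = json.loads("""
{
    "@context": {
        "nobel": "http://data.nobelprize.org/terms/",
        "rdfs": "http://www.w3.org/2000/01/rdf-schema#"
    },
    "@id": "http://data.nobelprize.org/resource/nobelprize/Physics/1901",
    "rdfs:label": "Physics 1901",
    "nobel:laureate": {
        "@id": "http://data.nobelprize.org/resource/laureate/1",
        "rdfs:label": "Wilhelm Conrad Röntgen",
        "nobel:nobelPrize": {
            "@id": "http://data.nobelprize.org/resource/nobelprize/Physics/1901"
        }
    }
}
""")


def expand(term, context):
    """Expand 'prefix:local' using the context; leave other terms as-is."""
    prefix, sep, local = term.partition(":")
    return context[prefix] + local if sep and prefix in context else term


def triples(node, context):
    """Yield (subject, predicate, object) tuples for a nested node.

    Simplification: IRIs and literals are both returned as plain strings.
    """
    subject = node["@id"]
    for key, value in node.items():
        if key.startswith("@"):
            continue  # skip keywords such as @id and @context
        predicate = expand(key, context)
        if isinstance(value, dict):
            yield (subject, predicate, value["@id"])
            yield from triples(value, context)  # descend into the nested node
        else:
            yield (subject, predicate, value)


result = list(triples(DOC, DOC["@context"]))
for triple in result:
    print(triple)
```

The loop prints four triples: the two labels, the nobel:laureate link from the prize to the laureate, and the nobel:nobelPrize link back from the laureate to the prize.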

The Nobel Prize data is available via the D2R Server running at http://data.nobelprize.org/. A SNORQL server is provided at http://data.nobelprize.org/snorql/ so that we can issue SPARQL queries directly in the browser.

Querying the data

You can run some interesting queries directly in NobelPrize.org's SNORQL browser. Some example queries are described below.
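The same queries can also be sent from a script using the standard SPARQL protocol, which passes the query text in a query parameter. The following is a sketch under one assumption: the endpoint address http://data.nobelprize.org/sparql is a guess based on the usual D2R Server layout and is not stated in this article. The build_request and run helpers are hypothetical names.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: D2R Server usually exposes its SPARQL endpoint at /sparql.
# This article only names the site root and the SNORQL page.
ENDPOINT = "http://data.nobelprize.org/sparql"

QUERY = """
PREFIX nobel: <http://data.nobelprize.org/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?laureate ?laureateLabel
WHERE {
    ?laureate a nobel:Laureate .
    ?laureate rdfs:label ?laureateLabel .
}
LIMIT 3
"""


def build_request(query: str, endpoint: str = ENDPOINT) -> Request:
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def run(query: str) -> dict:
    """Send the query and decode the JSON response (needs network access)."""
    with urlopen(build_request(query), timeout=30) as response:
        return json.load(response)


request = build_request(QUERY)
print(request.full_url)
```

Only build_request is called here, so the script runs offline; call run(QUERY) to actually contact the server.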

All Nobel Laureates who were born in New Zealand

This query will list all Nobel Laureates who were born in New Zealand:

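# SNORQL predeclares these prefixes; they are spelled out here
# so that the query also runs against other SPARQL clients.
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX nobel: <http://data.nobelprize.org/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?laureate ?laureateLabel
WHERE {
    ?laureate a nobel:Laureate .
    ?laureate rdfs:label ?laureateLabel .
    ?laureate dbpedia-owl:birthPlace <http://data.nobelprize.org/resource/country/New_Zealand> .
}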

Try this query in NobelPrize.org's SNORQL

Query result:

laureate | laureateLabel
<http://data.nobelprize.org/resource/laureate/374> | "Maurice Hugh Frederick Wilkins"
<http://data.nobelprize.org/resource/laureate/167> | "Ernest Rutherford"
<http://data.nobelprize.org/resource/laureate/730> | "Alan G. MacDiarmid"
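When a query like this is run from code rather than in the browser, endpoints commonly return the result in the SPARQL JSON results format. The sketch below shows how the three rows above would look in that format and how to turn them into plain tuples. The JSON is reconstructed by hand from the result table, not captured from the server, and rows is a hypothetical helper.

```python
import json

# The result table above, re-expressed in the SPARQL JSON results format.
RESULT = json.loads("""
{
  "head": {"vars": ["laureate", "laureateLabel"]},
  "results": {"bindings": [
    {"laureate": {"type": "uri", "value": "http://data.nobelprize.org/resource/laureate/374"},
     "laureateLabel": {"type": "literal", "value": "Maurice Hugh Frederick Wilkins"}},
    {"laureate": {"type": "uri", "value": "http://data.nobelprize.org/resource/laureate/167"},
     "laureateLabel": {"type": "literal", "value": "Ernest Rutherford"}},
    {"laureate": {"type": "uri", "value": "http://data.nobelprize.org/resource/laureate/730"},
     "laureateLabel": {"type": "literal", "value": "Alan G. MacDiarmid"}}
  ]}
}
""")


def rows(result):
    """Yield one tuple per solution, ordered like the SELECT variables.

    Variables that are unbound in a solution are reported as None.
    """
    variables = result["head"]["vars"]
    for binding in result["results"]["bindings"]:
        yield tuple(
            binding[name]["value"] if name in binding else None
            for name in variables
        )


for laureate, label in rows(RESULT):
    print(f"{label:32} {laureate}")
```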

Chinese names of all Nobel Laureates who were born in New Zealand

The Nobel Prize dataset not only uses existing established vocabularies but is also linked to other RDF datasets through owl:sameAs. For example, Nobel Prize Linked Data contains the following statement about Wilhelm Röntgen:

<http://data.nobelprize.org/resource/laureate/1> <http://www.w3.org/2002/07/owl#sameAs> <http://dbpedia.org/resource/Wilhelm_Röntgen> .

With SPARQL federation, a single query can combine the Nobel Prize dataset with other linked data sources that expose a SPARQL endpoint. This makes it possible to issue a wide range of queries that go beyond the Nobel Prize dataset.

The Nobel Prize dataset only specifies the names of the laureates in English, but since it links each laureate to DBpedia, we can get the Chinese names of all Nobel Laureates who were born in New Zealand using a single query:

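# SNORQL predeclares these prefixes; they are spelled out here
# so that the query also runs against other SPARQL clients.
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX nobel: <http://data.nobelprize.org/terms/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?laureate ?laureateLabel ?dbpediaLaureateLabel
WHERE {
    ?laureate a nobel:Laureate .
    ?laureate rdfs:label ?laureateLabel .
    ?laureate dbpedia-owl:birthPlace <http://data.nobelprize.org/resource/country/New_Zealand> .
    ?laureate owl:sameAs ?dbpediaLaureate .
    # This part of the pattern is evaluated by the remote DBpedia endpoint.
    SERVICE <http://dbpedia.org/sparql> {
        ?dbpediaLaureate rdfs:label ?dbpediaLaureateLabel .
        FILTER (lang(?dbpediaLaureateLabel) = "zh")
    }
}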

Try this query in NobelPrize.org's SNORQL

Query result:

laureate | laureateLabel | dbpediaLaureateLabel
<http://data.nobelprize.org/resource/laureate/374> | "Maurice Hugh Frederick Wilkins" | "莫里斯·威爾金斯"
<http://data.nobelprize.org/resource/laureate/167> | "Ernest Rutherford" | "欧内斯特·卢瑟福"
<http://data.nobelprize.org/resource/laureate/730> | "Alan G. MacDiarmid" | "艾伦·麦克德尔米德"
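Conceptually, the federated query joins two sets of bindings on the shared variable ?dbpediaLaureate: one set from the Nobel Prize dataset and one from DBpedia, filtered by language. The following Python sketch imitates that join in memory. It is a simplification, since real query engines may plan the remote call differently. The laureate IRIs and the Chinese labels are taken from the results above; the DBpedia IRIs and the English label are illustrative placeholders, not copied from either dataset.

```python
NOBEL = "http://data.nobelprize.org/resource/laureate/"
DBPEDIA = "http://dbpedia.org/resource/"

# Local side: bindings produced by the patterns outside the SERVICE block.
# The owl:sameAs targets are illustrative placeholders.
LOCAL = [
    {"laureate": NOBEL + "167", "laureateLabel": "Ernest Rutherford",
     "dbpediaLaureate": DBPEDIA + "Ernest_Rutherford"},
    {"laureate": NOBEL + "730", "laureateLabel": "Alan G. MacDiarmid",
     "dbpediaLaureate": DBPEDIA + "Alan_MacDiarmid"},
]

# Remote side: (subject, label, language tag) as DBpedia might hold them.
REMOTE_LABELS = [
    (DBPEDIA + "Ernest_Rutherford", "Ernest Rutherford", "en"),  # placeholder
    (DBPEDIA + "Ernest_Rutherford", "欧内斯特·卢瑟福", "zh"),
    (DBPEDIA + "Alan_MacDiarmid", "艾伦·麦克德尔米德", "zh"),
]


def service(labels, lang):
    """Imitate the SERVICE block: label bindings filtered by language tag."""
    return [
        {"dbpediaLaureate": subject, "dbpediaLaureateLabel": label}
        for subject, label, tag in labels
        if tag == lang
    ]


def join(left, right, key):
    """Join two lists of bindings on one shared variable."""
    index = {}
    for binding in right:
        index.setdefault(binding[key], []).append(binding)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


result = join(LOCAL, service(REMOTE_LABELS, "zh"), "dbpediaLaureate")
for row in result:
    print(row["laureateLabel"], row["dbpediaLaureateLabel"])
```

The English label on the remote side is dropped by the language filter, so each laureate ends up with exactly one Chinese name, as in the query result.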

People and organisations that received more than one Nobel Prize

To find out which Nobel laureates were honoured with the award multiple times, this query can be used:

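PREFIX nobel: <http://data.nobelprize.org/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?laureate ?laureateLabel (COUNT(?prize) AS ?prizeCount)
WHERE {
    ?laureate a nobel:Laureate .
    ?laureate rdfs:label ?laureateLabel .
    ?prize a nobel:NobelPrize .
    ?laureate nobel:nobelPrize ?prize .
}
GROUP BY ?laureate ?laureateLabel
# The aggregate is repeated here because not every SPARQL engine
# accepts the SELECT alias ?prizeCount inside HAVING.
HAVING (COUNT(?prize) > 1)
ORDER BY DESC(?prizeCount)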

Try this query in NobelPrize.org's SNORQL

Query result:

laureate | laureateLabel | prizeCount
<http://data.nobelprize.org/resource/laureate/482> | "Comité international de la Croix Rouge (International Committee of the Red Cross) " | 3
<http://data.nobelprize.org/resource/laureate/217> | "Linus Carl Pauling" | 2
<http://data.nobelprize.org/resource/laureate/222> | "Frederick Sanger" | 2
<http://data.nobelprize.org/resource/laureate/515> | "Office of the United Nations High Commissioner for Refugees (UNHCR) " | 2
<http://data.nobelprize.org/resource/laureate/6> | "Marie Curie, née Sklodowska" | 2
<http://data.nobelprize.org/resource/laureate/66> | "John Bardeen" | 2
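The grouping logic of this query can be mirrored in a few lines of Python, which may help if GROUP BY and HAVING are unfamiliar. This is a sketch on a tiny hand-made sample: the laureate IRIs come from the results above, while the prize IRIs are assumed to follow the same pattern as the Physics 1901 example earlier in the article.

```python
from collections import Counter

BASE = "http://data.nobelprize.org/resource/"

# (laureate, prize) pairs, standing in for "?laureate nobel:nobelPrize ?prize".
# Prize IRIs are assumed to follow the nobelprize/<Category>/<year> pattern.
AWARDS = [
    (BASE + "laureate/1", BASE + "nobelprize/Physics/1901"),     # Röntgen
    (BASE + "laureate/6", BASE + "nobelprize/Physics/1903"),     # Marie Curie
    (BASE + "laureate/6", BASE + "nobelprize/Chemistry/1911"),   # Marie Curie
    (BASE + "laureate/66", BASE + "nobelprize/Physics/1956"),    # John Bardeen
    (BASE + "laureate/66", BASE + "nobelprize/Physics/1972"),    # John Bardeen
]

# GROUP BY ?laureate with COUNT(?prize)
counts = Counter(laureate for laureate, _prize in AWARDS)

# HAVING (COUNT(?prize) > 1)
multiple = {laureate: n for laureate, n in counts.items() if n > 1}

# ORDER BY DESC(?prizeCount)
ranked = sorted(multiple.items(), key=lambda item: item[1], reverse=True)

for laureate, prize_count in ranked:
    print(laureate, prize_count)
```

Röntgen appears once in the sample and is therefore filtered out, just as single-prize laureates are dropped by the HAVING clause.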

Youngest Nobel laureates

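The five youngest recipients across all categories can be retrieved using the following query. It approximates the age as the award year minus the year of birth, and it only covers laureates that have a birthday, which excludes organisations: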

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX nobel: <http://data.nobelprize.org/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?laureate ?laureateLabel (?laureateAwardYear - year(?laureateBirthday) AS ?laureateAgeWhenAwarded)
WHERE {
    ?laureate a nobel:Laureate .
    ?laureate rdfs:label ?laureateLabel .
    ?laureate foaf:birthday ?laureateBirthday .
    ?laureateAward a nobel:LaureateAward .
    ?laureate nobel:laureateAward ?laureateAward .
    ?laureateAward nobel:year ?laureateAwardYear .
}
ORDER BY ASC(?laureateAgeWhenAwarded)
LIMIT 5

Try this query in NobelPrize.org's SNORQL

Query result:

laureate | laureateLabel | laureateAgeWhenAwarded
<http://data.nobelprize.org/resource/laureate/914> | "Malala Yousafzai" | 17
<http://data.nobelprize.org/resource/laureate/21> | "William Lawrence Bragg" | 25
<http://data.nobelprize.org/resource/laureate/38> | "Werner Karl Heisenberg" | 31
<http://data.nobelprize.org/resource/laureate/40> | "Paul Adrien Maurice Dirac" | 31
<http://data.nobelprize.org/resource/laureate/43> | "Carl David Anderson" | 31
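The age expression in the query is simply the award year minus the year of birth. The sketch below reproduces that arithmetic in Python for three of the laureates listed above. The birth dates and award years are entered by hand from general knowledge rather than pulled from the dataset, and age_when_awarded is a hypothetical helper.

```python
from datetime import date

# (label, birthday, award year); values entered by hand for illustration.
LAUREATES = [
    ("Werner Karl Heisenberg", date(1901, 12, 5), 1932),
    ("Malala Yousafzai", date(1997, 7, 12), 2014),
    ("William Lawrence Bragg", date(1890, 3, 31), 1915),
]


def age_when_awarded(birthday: date, award_year: int) -> int:
    """Mirror the SPARQL expression ?laureateAwardYear - year(?laureateBirthday)."""
    return award_year - birthday.year


# ORDER BY ASC(?laureateAgeWhenAwarded) LIMIT 2
youngest = sorted(
    LAUREATES, key=lambda row: age_when_awarded(row[1], row[2])
)[:2]

for label, birthday, award_year in youngest:
    print(label, age_when_awarded(birthday, award_year))
```

Because only the years are compared, the value is an approximation: it can differ by one from the laureate's actual age on the day of the announcement, depending on whether the birthday had already passed.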

Copyright © 2020 Hesper NZ. Various trademarks held by their respective owners.