Master's Thesis
Fine-Grained Description of Scientific Data Sets
Research Area
Intelligent Information Management
Advisers
Description
The digitalization of science has opened up new possibilities for the collaborative use of existing research data (Open Research / Open Data). Projects such as Google's Dataset Search or re3data.org show that there are already many data publication platforms that provide datasets with accurate metadata descriptions. However, finding datasets for a specific research problem is still difficult. One reason might be that the provided information focuses primarily on discovery metadata, that is, provenance information, and less on structured data that describes the content of the dataset in a homogeneous fashion.
This project focuses on metadata standards for describing scientific datasets. After a requirements and state-of-the-art analysis, existing standards such as OAI-PMH, PREMIS, RADAR / DataCite etc. are to be compared, and an overview of their strengths and weaknesses is to be presented with respect to the uniform usage of explicit Linked Data URIs. Then, a vocabulary for a metadata profile is to be derived that is capable of describing multiple aspects of research datasets. The vocabulary shall be simple, provider-independent and ready for distributed usage. An evaluation has to show the strengths of the developed approach. The final version may then become relevant to the schema.org project.
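To illustrate what a content-level description with explicit URIs could look like, the following is a minimal sketch in Python that builds a JSON-LD record using existing schema.org terms (`Dataset`, `variableMeasured`, `PropertyValue`, `DataDownload`). It is not the vocabulary to be developed in the thesis. The helper `describe_dataset`, the dataset itself, and all `example.org` URIs are hypothetical and serve only as placeholders.

```python
import json


def describe_dataset(name, description, variables, download_url):
    """Build a schema.org Dataset description as a JSON-LD-style dict.

    `variables` is a list of (name, unit, uri) tuples; the URI identifies
    the measured property (here: placeholder URIs, assumed for illustration).
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        # Content-level metadata: what the dataset actually measures.
        "variableMeasured": [
            {
                "@type": "PropertyValue",
                "name": var_name,
                "unitText": unit,
                "propertyID": uri,
            }
            for var_name, unit, uri in variables
        ],
        # Discovery-level metadata: where the data can be obtained.
        "distribution": {
            "@type": "DataDownload",
            "encodingFormat": "text/csv",
            "contentUrl": download_url,
        },
    }


# Hypothetical example dataset; all values are made up for illustration.
record = describe_dataset(
    name="Example river temperature measurements",
    description="Hourly water temperature readings (illustrative only).",
    variables=[
        ("water temperature", "degree Celsius",
         "https://example.org/vocab/waterTemperature"),
        ("measurement depth", "metre",
         "https://example.org/vocab/measurementDepth"),
    ],
    download_url="https://example.org/data/river-temperature.csv",
)

print(json.dumps(record, indent=2))
```

The `variableMeasured` entries carry the content description, while `distribution` covers discovery. A profile as envisaged above would presumably fix which such properties are mandatory and which URIs are allowed as values.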