Enabling scientific data on the web

by Raymond Alexander Milowski

Institution: University of Edinburgh
Year: 2014
Keywords: web; science; RDFa; semantics
Scientific data does not exist on the Web in the same way as the written word. Reviews, media, wikis, social networks, and blogs all contribute to the interconnected nature of ordinary language on the Web, and network effects create additional value from seemingly minor contributions. Nothing comparable exists for scientific data. Simply put, within the Open Web Platform, we cannot currently apply similar mechanisms to scientific work without great effort. Thus, the Web has not so far enabled Science as well as it has enabled dissemination and interconnection for the written word: to truly enable Science on the Web, we must endeavor to make data and its semantics first-class Web constituents. This thesis focuses on this problem by enabling scientific data to exist on the Web in such a way that it can be both viewed as content and consumed as data. Starting from the principles on which the Web has so far thrived, we propose solutions to enable complex data exchanges while preserving the Web as it stands. We introduce the Partition Annotate Name (PAN) methodology, which embraces the core architectural principles of the Web: name things with URIs; process common data formats; use common rules under a shared contract between publisher, developer, and consumer.