Title
Preserving Websites Of Research & Development Projects: Paper - iPRES 2016 - Swiss National Library, Bern
Language
English
Description (en)
Research and Development (R&D) websites often provide valuable and unique information such as software used in experiments, test data sets, gray literature, news or dissemination materials. However, these sites frequently become inactive after the project ends. For instance, only 7% of the project URLs for the FP4 work programme (1994-1998) were still active in 2015. This study describes a pragmatic methodology that enables the automatic identification and preservation of R&D project websites. It combines open data sets with free search services, so it can be applied immediately even in contexts with very limited resources. The “CORDIS EU research projects under FP7 dataset” provides information about R&D projects funded by the European Union during the FP7 work programme. It is publicly available at the European Union Open Data Portal. However, this dataset is incomplete with regard to project URL information. We applied our proposed methodology to the FP7 dataset and improved the completeness of its project URL information by 86.6%. Using the 20 429 newly identified project URLs as a starting point, we collected and preserved 10 449 947 Web files, totalling 1.4 TB of information related to R&D activities. All the outputs from this study are publicly available [16], including the CORDIS dataset updated with our newly found project URLs.
Author of the analogue object
Daniel Bicho
Daniel Gomes
Publisher
Swiss National Library, Bern
Format
application/pdf
Size
177.1 kB
Licence Selected
CC BY-NC-SA 3.0 AT
Details
Object type
PDFDocument
Created
27.01.2017 04:25:38