Title
Data Mining Web Archives
Language
English
Description (en)
Many institutions are now building rich, significant archives of web content. Though the number of web archiving programs has grown, access models for these collections have remained focused on URL-based discovery and traditional live-web-style browsing. Given the resources required to build and maintain web archives, finding new forms of access for these collections will help increase use and thus allow institutions to better advocate for the value of collecting and preserving web content. Distant reading, text mining, digital humanities, and other data-driven forms of analysis have become increasingly popular methods of using digitized and digital collections. Web archives, being born-digital, of notable size and temporal breadth, having extensive metadata, and often created with a curated topical focus, are ideal resources for data mining and other forms of computational analysis. This workshop will explore new ways of using web archives in research by giving attendees exposure to, and training in, the tools, methods, and types of analysis possible when working with datasets extracted from the entirety of curated web archive collections. Giving researchers datasets of specific extracted metadata elements, link graph data, named entities, and other post-processed data can help facilitate new uses and new types of visualization, inquiry, and analysis.
Keywords (en)
Web archiving, data mining, research, access, iPres 2015
ISBN
978-0-692-59881-8
Author of the digital object
Jefferson Bailey
Lori Donovan
Format
application/pdf
Size
315.9 kB
Licence Selected
CC BY 4.0 International
Conferences
Conference 2015
Name of Publication (en)
Proceedings of the 12th International Conference on Digital Preservation
Publisher
School of Information and Library Science, University of North Carolina at Chapel Hill
Object type
PDFDocument
Created
06.03.2016 08:35:34