Description (en)
The National Library of China (NLC) is an important component of China’s public cultural system, and its web archiving program started in 2003. In 2005, using the open source software (OSS) Heritrix, NLC started a project to collect, catalogue and archive public government information and important websites and webpages at home and abroad. NLC has since accumulated abundant practical experience and has united libraries nationwide to carry out web archiving and related services jointly. In 2018, NLC upgraded its technology and developed a “Web Archiving and Service Platform” built on a distributed cloud storage infrastructure. The platform supports the management and use of at least 1 million metadata records, which enables NLC to collaborate with multiple libraries and institutes on web collection services. Based on an analysis of the web archiving strategy and the requirements for the system platform, this paper describes in detail the platform's design concepts, technical approach and key technologies. It is hoped that this will provide a reference for other institutes carrying out related work.