Description (en)
In the open access environment, many textual resources have become available on the Web in PDF format. This research surveys PDF files in Japanese institutional repositories (IRs) to identify problems that could affect their long-term preservation. To that end, 1.5 million PDF files collected from Japanese IRs were analyzed with regard to file format, encryption, and metadata. Most PDF files did not conform to PDF/A. A total of 30.5% of the PDF files were encrypted, and many lacked embedded metadata. These results indicate that PDF files in Japanese IRs pose several serious problems for long-term preservation.