Abstract

The Portable Document Format (PDF) has become the standard and preferred form for the digital edition of scholarly journal articles. However, its current use among scholarly publishers has been largely restricted to making research articles print-ready, and this greatly limits the potential capacity of the PDF research article to form a greater part of a digital knowledge ecology.

While this article considers historical issues of design and format in scholarly publishing, it also takes a very practical approach, providing demonstrations and examples to assist publishers and scholars in finding greater scholarly value in the way the PDF is used for journal articles.

This involves, but is not limited to, graphic design and bibliographic linking, the deployment of metadata and research data, and the ability to combine elements of improved machine and human readability.

The PDF was created to assist the circulation of digital documents among the newly networked computers that were spreading through offices, whether in local area networks (LANs) or through the Internet.

What had become apparent was that documents were being prepared by various word-processing programs, each with its own proprietary file format. Nearly a decade earlier, the resourceful Warnock, working with Charles Geschke, figured they had solved the same problem with PostScript, marking the beginning of Adobe Systems.

However, PostScript was itself not proving universally applicable. The goal of Camelot was to develop a lightweight file format that would serve the broadest possible range of users, at least until widespread computing power caught up with the demands of PostScript.

Camelot was intended, then, as a temporary, transitional solution to the view-and-print-anywhere problem. Adobe later released the PDF as an open standard for others to develop applications for writing and reading it, in what we might think of as the new twenty-first-century corporate spirit of open standards and open source software.

While finding articles online is becoming a common practice, most academic faculty print out a good proportion of the PDFs they wish to read, with younger and more research-oriented scholars leading the way in reading articles on their computer screens.

Yet to use the PDF in that earlier fashion, as simply a view-and-print-anywhere file format, reflects a continuing failure to keep up with increased networking capacities for scholarly communication.

The problem here is not entirely with the technical limits of the PDF, although those are being challenged by some pushing for a more dynamic format. However, we contend that the short- and medium-term problem is that the PDF has technical and graphical capacities that, if fully utilized, could serve science and scholarship far better in advancing the flow of ideas and the circulation of knowledge.

The challenge we have set for ourselves with this article, as students of scholarly communication, is to set out the various ways in which journal publishers and readers can use the PDF far more effectively to deliver, circulate, read, and utilize the literature so vital for the advancement of knowledge.

Plays Well With Others?

The file compatibility issue of that earlier era has evolved, with the growing sophistication of the web, into one of effectively discovering, accessing, linking, and extracting the information necessary to make the most of the published literature to advance research and scholarship.

To date, broadly two responses to working with the PDF in this new environment have emerged. Automated data-mining systems are currently able to extract sufficient indexing and bibliographic information to form rough-and-ready citation indexes that enable scholars not only to see who cited whom, but also to see what was cited in what context.

As well, PDF management systems, such as Zotero and Mendeley, enable individual scholars to manage their own collection of PDF articles by similarly extracting bibliographic information from the articles.

The very metaphors in play here (the crawling of PDFs in hopes of scraping or extracting some bibliographic data from them, and yes, it can be like pulling teeth) suggest how crudely this approach currently works, leading to irregularities in the data, whether in the form of, say, a missing journal title or the same article listed twice because of a variation in the extraction.
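To make the double-listing problem concrete, the following is a minimal sketch in Python of how two slightly different extractions of the same article might be reconciled. The records, the field names, and the helpers `normalize_title` and `dedupe` are all hypothetical, invented here for illustration; they are not drawn from Zotero, Mendeley, or any indexing service, whose actual methods may differ considerably.

```python
import re
import unicodedata


def normalize_title(title):
    """Reduce a title to a comparison key: no accents, case, or punctuation."""
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    words = re.findall(r"[a-z0-9]+", ascii_only.lower())
    return " ".join(words)


def dedupe(records):
    """Keep one record per normalized title, preferring the most complete one."""
    best = {}
    for record in records:
        key = normalize_title(record["title"])
        # Count non-empty fields as a rough measure of completeness.
        filled = sum(1 for value in record.values() if value)
        if key not in best or filled > best[key][0]:
            best[key] = (filled, record)
    return [record for _, record in best.values()]


# Hypothetical records, as two extraction passes over the same PDFs might return them.
extracted = [
    {"title": "Open Access and the PDF", "journal": "", "year": "2010"},
    {"title": "Open access and the PDF.", "journal": "Example Journal", "year": "2010"},
    {"title": "Reading on Screen", "journal": "Example Journal", "year": "2011"},
]

unique = dedupe(extracted)
for record in unique:
    print(record["title"], "|", record["journal"], "|", record["year"])
```

Matching on a normalized title alone is a deliberately crude choice: it merges the two variants above and keeps the one with the journal title filled in, but it would also merge distinct articles that happen to share a title. Reliable metadata embedded in the PDF by the publisher would remove the need for this kind of guesswork.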

Others working in this area of scholarly communication are looking for entirely new forms of publishing, beyond the traditional blind-peer-reviewed research article. Annotum is another radical instance, in which authors compose their articles inside the journal website, which uses no more than a special plugin for WordPress designed for rapid review and publication.

While we applaud the innovative indexing and management of PDFs, as well as explorations of more open formats and new designs for scholarly communication, what follows is concerned with improving current use of the PDF within the traditional journal context.

We are particularly interested in helping what amounts to a renaissance of scholar-publishers, who are taking advantage of online systems to run independent journals, to make better use of PDFs in the service of their authors and readers (Edgar and Willinsky). In light of this widespread use, we offer a number of ways for making more of the PDF for scholarly publishing.

We have reason to believe that PDFs can do more to assist reader-researchers, as well as others, in the comprehension, evaluation, and utilization of research and scholarship. By making the PDF article good for printing alone, rather than also for reading and connecting on the screen, the journal publisher encourages readers, in effect, to print out the PDF article.

We will review how it is that a PDF read on the screen has the potential to do all of that, as well as to be printed by those so inclined. We provide both descriptions and demonstrations of how publishers can improve the use of the PDF for scholarly communication, without giving up the capacity to print the article.

