Hal R. Varian∗
Revised: February 10, 2006
This is an economic analysis of the Google Library project. I describe the project and outline why it is consistent with the legal doctrine of fair use. I go on to examine the transactions costs associated with opt-in and opt-out models for publisher participation. I conclude that the
Google Library Project is legally sound and economically sensible. In particular, an opt-in model would incur very substantial transactions costs, making the entire undertaking problematic.
∗ Prepared for the AEI-Brookings discussion, “The Google Copyright Controversy: Implications of Digitizing the World’s Libraries,” February 24, 2006, Washington, DC. This document represents my personal views only and does not represent in any way the views of Google. I have consulted for Google since 2002 on various matters, but I have never had any direct involvement with the Google Book project. I have shown drafts of this document to members of that team in order to ensure that my description of their procedures is accurate, but any remaining errors are, of course, solely my responsibility. I also wish to thank Pamela Samuelson and Stan Leibowitz, who offered helpful suggestions and references.
The Google Library Project is an attempt to organize the world’s information about books available in libraries. It is essentially an online card catalog that allows for more efficient search than conventional card catalogs.
The Google Library Project is part of Google Book Search, which is available at http://books.google.com; the “About Google Book Search” link on that page contains the definitive description of the program. Note that Google Book Search is currently in beta, so various features of the program may change.
Google Book Search contains scanned images of books from two sources: the “Partner Program” and the “Library Project.” In order to join the
Partner Program, publishers send a message to Google asking that specific books be entered into a database of scanned images. Google then adds the scanned images to the Google Book Search database either by 1) scanning in a physical copy of the book provided by the publisher or a library or 2) using a PDF file provided by the publisher. Note that the Google Partner
Program is an opt-in program: publishers have to specifically request that their content be added to the index.
The Google Library Project does not require the publisher to make a specific request to join the program, though publishers are free to remove any of their books from the program at any time (assuming that they hold the copyright to the work). Google has made arrangements with several libraries, such as Harvard University Library and the University of Michigan Library, that enable it to scan the works in their collections. This is done at Google’s expense using special technology it developed for the project. If things go as planned, there will eventually be about 25 to 30 million books in the Google
It is important to distinguish the copyright status of the various types of content in the Google Library Project. The Partner Program generally consists of copyrighted material provided by the publisher. If the publisher provides physical books or scanned images, the works are typically
“in print”; that is, available via normal retail channels. If the publisher simply provides permission to use books scanned in as part of the Library
Project, the books may be in or out of print.
In contrast to the Partner Program, the Library Project contains both in-copyright and out-of-copyright books, and both in-print and out-of-print books.
Users can search the contents of these two collections using the same interface: typing a query into a search box just as though they were searching the Web. Google searches through the text of books in its index using special
algorithms optimized for this purpose and finds books