About ArchiveGrid

ArchiveGrid includes over 5 million records describing archival materials, bringing together information about historical documents, personal papers, family histories, and more. With over 1,000 different archival institutions represented, ArchiveGrid helps researchers looking for primary source materials held in archives, libraries, museums and historical societies.

If you'd like to see your collections included in ArchiveGrid or have questions about the ArchiveGrid project, please get in touch with us.

Frequently Asked Questions

How do we get our collections included in ArchiveGrid?
One of the best ways to be represented is to include your collection descriptions in OCLC's WorldCat database. ArchiveGrid is largely made up of MARC records from WorldCat; if you are already contributing your MARC cataloging to WorldCat and don't see your records in ArchiveGrid, please let us know.

Otherwise, if you have finding aids in EAD, HTML or PDF format and aren't currently contributing MARC records to WorldCat, please complete the form to include your collections. Your role in getting your finding aids into ArchiveGrid is simply to give us permission to harvest and use them. OCLC membership is not required to contribute your finding aids to ArchiveGrid, and there are no costs involved beyond the time and effort on your part to provide us with a way to harvest your finding aid documents. We harvest your finding aids from a webpage you provide to us, and index them in ArchiveGrid. About once every six weeks, we update the ArchiveGrid index by removing all finding aid content, re-harvesting, and re-indexing. Any changes you make to your finding aids, such as adding, editing, or removing them, will be reflected in ArchiveGrid at its next index update.
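The update cycle described above can be pictured with a minimal sketch. This is only an illustration, not the ArchiveGrid code: the function name `rebuild_index`, the dictionary layout, and the example URLs are all hypothetical, and the sketch assumes that MARC-derived entries are left untouched by the finding aid refresh.

```python
def rebuild_index(index: dict, harvested: dict) -> dict:
    """Return a new index after one full finding aid refresh.

    index:     maps an identifier to {"source": ..., "text": ...} (hypothetical layout)
    harvested: maps each finding aid URL found in the latest harvest to its text
    """
    # Step 1: remove all finding aid content, keeping other entries (assumed: MARC records).
    rebuilt = {key: entry for key, entry in index.items()
               if entry["source"] != "finding_aid"}
    # Steps 2 and 3: re-harvest and re-index every finding aid currently published.
    for url, text in harvested.items():
        rebuilt[url] = {"source": "finding_aid", "text": text}
    return rebuilt


old_index = {
    "marc:1": {"source": "marc", "text": "Family papers"},
    "https://example.org/a.html": {"source": "finding_aid", "text": "Old text"},
    "https://example.org/b.html": {"source": "finding_aid", "text": "Withdrawn"},
}
latest_harvest = {
    "https://example.org/a.html": "Edited text",  # edited since the last update
    "https://example.org/c.html": "New guide",    # newly added
}
new_index = rebuild_index(old_index, latest_harvest)
print(sorted(new_index))
```

Because the finding aid portion is rebuilt from scratch, a withdrawn finding aid (b.html here) disappears without any explicit delete request from the contributor.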
What's the connection between ArchiveGrid and OCLC Research?
As a discovery system focused on archival materials, and as a means of making these collections easier to find in search engines and elsewhere, ArchiveGrid illustrates OCLC Research's interest in advancing issues of importance to the archival community. It also serves as a basis for text mining and data analysis projects, and for experimentation and evaluations of discovery system features, carried out by OCLC Research staff.
Where do you get your collection descriptions?
We index finding aids harvested directly from contributors, and we also include MARC bibliographic records from WorldCat that we have identified as archival material. WorldCat records describing archival materials constitute more than 90% of the collection descriptions in ArchiveGrid.
How do you select WorldCat records for inclusion in ArchiveGrid?
There isn't a simple way to identify a MARC record that describes the types of materials held in archives, manuscript collections, and special collections. In order for a WorldCat record to be extracted into ArchiveGrid, it needs to meet the following criteria:
  • Has only one library holding symbol attached (though we relax this rule for NUCMC records)
  • Has a value of b, d, f, g, i, j, k, p, r, or t in MARC Leader byte 6, or the value "a" (language material) in Leader byte 6 and the value "c" (collection) in Leader byte 7, or the value "a" (archival) in Leader byte 8
  • Has no value of any kind in MARC 260 or 264 subfield "a" or "b" (to filter out published works), though 264 fields with a second indicator of 0 (indicating Production, not Publication) are accepted
  • Does not have "Bibliography" in the beginning string of a MARC subject heading subfield "a" or "v"
  • Does not have a MARC 502 field (Theses or dissertation note)
  • Does not have a MARC 015 field (National bibliography number)
  • If the record has the material type "book" or "serial", has no value in the MARC 008 or 006 "Nature of Contents" bytes (to eliminate theses, reference works, and other non-archival materials)
This filter isn't always successful. Especially for minimally-cataloged materials, we sometimes see descriptions of unpublished manuscripts of various kinds filter through. But we continue to evaluate and improve the filter as best we can.
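To make the criteria above concrete, here is a rough sketch of how such a filter might be expressed. It is not the ArchiveGrid implementation. The `Record` class is a hypothetical, simplified stand-in for a MARC record, and several details are assumptions: that "subject heading" means any 6XX field, that NUCMC records skip the holdings check entirely, and that material type and Nature of Contents have already been derived from the record.

```python
from dataclasses import dataclass, field

# Leader byte 6 (type of record) values accepted on their own.
ARCHIVAL_TYPES = set("bdfgijkprt")


@dataclass
class Record:
    """Hypothetical, simplified stand-in for a WorldCat MARC record."""
    leader: str                    # 24-character MARC leader
    holdings: int                  # number of library holding symbols attached
    is_nucmc: bool = False
    # Each field is (tag, second_indicator, {subfield_code: value}).
    fields: list = field(default_factory=list)
    material_type: str = ""        # e.g. "book" or "serial" (assumed precomputed)
    nature_of_contents: str = ""   # 008/006 "Nature of Contents" bytes


def is_archival(rec: Record) -> bool:
    # Only one holding symbol, relaxed (here: skipped) for NUCMC records.
    if rec.holdings != 1 and not rec.is_nucmc:
        return False
    # Leader bytes 6, 7 and 8.
    rec_type, bib_level, control = rec.leader[6], rec.leader[7], rec.leader[8]
    if not (rec_type in ARCHIVAL_TYPES
            or (rec_type == "a" and bib_level == "c")
            or control == "a"):
        return False
    for tag, ind2, subfields in rec.fields:
        # 260/264 $a or $b signals a published work, unless the field is a
        # 264 with second indicator 0 (Production).
        if tag in ("260", "264") and not (tag == "264" and ind2 == "0"):
            if subfields.get("a") or subfields.get("b"):
                return False
        # Subject headings (assumed 6XX) beginning with "Bibliography" in $a or $v.
        if tag.startswith("6") and any(
                subfields.get(code, "").startswith("Bibliography")
                for code in ("a", "v")):
            return False
        # Thesis notes (502) and national bibliography numbers (015).
        if tag in ("502", "015"):
            return False
    # Books and serials must have empty Nature of Contents bytes.
    if rec.material_type in ("book", "serial") and rec.nature_of_contents.strip():
        return False
    return True


papers = Record(leader="00000npc a2200000 a 4500", holdings=1,
                fields=[("245", "0", {"a": "Family papers"})])
thesis = Record(leader="00000ntm a2200000 a 4500", holdings=1,
                fields=[("502", " ", {"a": "Thesis (M.A.)"})])
print(is_archival(papers))  # True
print(is_archival(thesis))  # False
```

Each test is an early exit, so a record is accepted only if it passes every criterion. That mirrors the list above, where a single disqualifying field is enough to exclude a record.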
Our finding aids are only available as HTML or PDF files. Will those formats work well in ArchiveGrid?
With MARC records and EAD XML finding aids, we can take advantage of the document markup to identify "facets" of information in the description, including the names of people, groups, places, topics, and events. HTML pages and PDF files do not provide us that same level of detailed markup for their descriptive data, but there are ways to optimize those formats for searching and display in ArchiveGrid. For HTML files, it's important to include a specific title for the finding aid in the HTML Head Title element.
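As a quick self-check for HTML finding aids, a contributor could confirm that each page carries a specific title before it is harvested. The sketch below uses only Python's standard `html.parser` module. The helper names, the list of "generic" titles, and the sample collection name are hypothetical and are not part of ArchiveGrid.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# Hypothetical examples of titles too generic to identify a collection.
GENERIC_TITLES = {"", "finding aid", "untitled", "untitled document", "home"}


def finding_aid_title(html: str) -> str:
    """Return the page title with whitespace normalized, or "" if absent."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.title.split())


def has_specific_title(html: str) -> bool:
    return finding_aid_title(html).lower() not in GENERIC_TITLES


page = ("<html><head><title>Guide to the Jane Doe Papers, 1900-1950</title>"
        "</head><body><h1>Jane Doe Papers</h1></body></html>")
print(finding_aid_title(page))
print(has_specific_title(page))
```

A page whose title is missing or is simply "Finding Aid" would fail this check, which is a hint that it may be hard to tell apart from other documents in brief search results.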
PDF files include internal document properties for the title, author, subject and other fields. If you supply values in these fields, the ArchiveGrid system can use those values for the document title and for an abstract in brief search results. Elizabeth Post and her colleagues at Boston College Libraries have put together an excellent summary of their experiences in enriching PDF documents, for improved interoperability not just for ArchiveGrid but for any search engine that indexes their finding aids. Read more about it in their paper "Embedding metadata in PDF finding aids to enhance discoverability".
Why does the same collection sometimes have two different entries in the ArchiveGrid index?
Duplicate entries result from harvesting finding aids and extracting WorldCat MARC records for the same collection. While we continue to work on a way to effectively cluster or de-duplicate these two forms of the collection description from the same contributor, we have hesitated to favor one over the other, as each includes access points not found in the other. We are trying to maximize access and discovery, so in this situation we've decided to favor recall over precision.
What about copyright?
OCLC does not claim copyright ownership of individual collection descriptions contributed to ArchiveGrid. More information on rights and responsibilities and a legal statement are available on the web page where you can request to include your collections.
How many institutions contribute?
Collection descriptions from around 1,000 institutions - libraries, museums, historical societies, etc. - are included in ArchiveGrid.
Why is ArchiveGrid a free service?
We transitioned from a subscription-based service to a freely available system in 2012, in order to make the index available to everyone - faculty, scholars, family history researchers, students, and others. ArchiveGrid remains an OCLC Research project, allowing us to improve archives and special collections research through studies of researchers, description, and discovery.
What do you know about your users?
When ArchiveGrid started in the late 1990s as a way to test whether EAD finding aids from different sources could work well together in one search system, we mostly had faculty, college students, and genealogists in mind as our researchers. We later gathered data about these user groups from studies, and we continued to make system design improvements based on their needs. In a study conducted in 2012, we surveyed archives and special collections users and learned that primary source research is still an important focus for faculty, students and genealogists, but it also plays a role for people motivated to search for archival materials by something in their personal and professional lives. This important segment of archives and special collections users includes filmmakers, writers, designers, hobbyists, and many more.
What are your site visit statistics like?
We use Google Analytics to track site visits and show selected current statistics here. We recognize that discovery frequently begins with search engines, so we put special effort into promoting ArchiveGrid's collection descriptions in Google, Yahoo, Bing, and other popular services.
Does ArchiveGrid make use of cookies?
Yes, it does. We create temporary "functional" cookies that last only for the length of your session, to help remember your preferred location on the map shown on the ArchiveGrid home page, and to store identifiers for records you may have saved to a list for downloading. ArchiveGrid also uses Google Analytics to evaluate how the system is behaving, and temporary performance tracking cookies are set for that purpose. We're not using cookies to personally identify users or for behavioral advertising purposes. You're not required to have cookies enabled in your browser to use ArchiveGrid. For more information, consult the OCLC Cookie Policy document describing the use of cookies in OCLC services.

The ArchiveGrid Team

Bruce Washburn is the lead developer and software engineer for ArchiveGrid. He's grateful to all the other software engineers in OCLC Research who make it possible to work so easily with WorldCat MARC records, and of course to all the ingenious and helpful staff at ArchiveGrid contributing institutions who help us harvest their finding aids. Bruce works in the San Mateo, California office of OCLC Research.

Merrilee Proffitt is a senior Program Officer at OCLC Research and the guiding light of the ArchiveGrid project. Whether it's setting priorities for the discovery system, thinking about how best to connect ArchiveGrid with other systems, or developing projects to study users or to analyze archival materials metadata, Merrilee points the way. Merrilee works in the San Mateo, California office of OCLC Research.

Jeff Mixter is a Software Engineer at OCLC Research. Jeff's expert knowledge of data modeling and linked data has been key to the progress we've made in recent experimentation with the ArchiveGrid data, and in remodeling the ArchiveGrid discovery system to be "of the web", not just "on" it. Jeff works in the Dublin, Ohio office of OCLC Research.


Last Updated May 4, 2018

ArchiveGrid Index Growth

Interpreting These Statistics
The growth of MARC records in the last year represents the identification and inclusion of additional contributors of data to WorldCat, and the ongoing growth of WorldCat in general. While there has also been a concerted effort to increase the number of finding aid contributors, that form of archival description is a relatively small percentage of the entire aggregation. Some MARC records may also be removed from the current set as we improve our selection algorithms to identify archival collection descriptions.

ArchiveGrid Weekly Visits, November 2011 - April 2018

75% Visits referred by search engines
7% Visits via links from other websites
10% Visits via direct links
8% Visits via other starting points

ArchiveGrid Contributors

By Country

By State in the United States

Interpreting These Statistics
ArchiveGrid contributing institutions are currently hard to count. In some cases, what might be considered as one institution may be represented in ArchiveGrid by several different contributors, one for each archive or special collections department affiliated with the larger institution. But in other instances, one contributing institution provides descriptions for a great many more individual archives (for example, the NUCMC records contributed by the Library of Congress). Until we can more accurately identify and count these contributors, the current count of contributors will be much lower than the actual number of institutions represented.

Search Widget

To put a small ArchiveGrid search box on your web page, highlight this HTML, copy it to the clipboard, and paste it into the content of your site.

Here's how the widget code will appear

Search ArchiveGrid
Find archival collections and primary source materials

Feel free to make any adjustments you'd like to the styles and text in this form. As long as you keep the form element names and action as they are, it should still work.