About ArchiveGrid

ArchiveGrid is a collection of over two million archival material descriptions, including MARC records from WorldCat and finding aids harvested from the web. It's supported by OCLC Research as the basis for our experimentation and testing in text mining, data analysis, and discovery system applications and interfaces. Archival collections held by thousands of libraries, museums, historical societies, and archives are represented in ArchiveGrid.

ArchiveGrid provides access to detailed archival collection descriptions, making information available about historical documents, personal papers, family histories, and other archival materials. It also provides contact information for the institutions where the collections are kept.

ArchiveGrid data is primarily focused on archival material descriptions for institutions in the United States. This reflects the contribution patterns for descriptions of materials under archival control in WorldCat, which make up the majority of descriptions in ArchiveGrid. We may extend ArchiveGrid beyond its current scope if it is necessary to support OCLC Research experimental objectives.

ArchiveGrid illustrates OCLC's interest in advancing issues important to the archival community. Our work within ArchiveGrid gives OCLC Research a foundation for collaboration and interactions with others in the archival community. We expect to share the results of MARC and EAD tag analysis, provide discovery system analytics for contributors, document investigations of text mining and data visualization, participate in community working groups pursuing improvements to description and discovery, and more. To support those interests and objectives, we'll continue to build this extensive and current aggregation of archival material descriptions, within the constraints of OCLC Research's committed and on-going support for this project.

OCLC previously offered ArchiveGrid as a subscription-based discovery service. The subscription service was discontinued in 2012. While the new, freely-available OCLC Research ArchiveGrid interface is not a full production service, it shares some of the same attributes. Researchers can expect to use it for discovery of archival materials, and archives can work with OCLC Research to have their materials represented in the aggregation in a reliable and persistent way.

If you have questions about your collection descriptions in ArchiveGrid, please get in touch with us. Interested in contributing? Please let us know that as well.

Frequently Asked Questions

How do we get our finding aids in ArchiveGrid?
Your role in getting your finding aids into ArchiveGrid is simply to give us permission to harvest and use them. There are no costs involved, beyond the time and effort on your part to provide us with a way to harvest your finding aid documents. We harvest your finding aids from a webpage you provide to us, and index them in ArchiveGrid. About once every six weeks, we update the ArchiveGrid index by removing all finding aid content, re-harvesting, and re-indexing. Any changes you make to your finding aids, such as adding, editing, or removing them, will be reflected in ArchiveGrid in its next index update.
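As a rough illustration of why a full rebuild picks up additions, edits, and removals alike, the update cycle described above might be sketched like this in Python. This is not ArchiveGrid's actual code: the index layout, the "kind" label, and the fetch function are all hypothetical, and the sketch assumes that entries which are not finding aids are carried over untouched.

```python
def rebuild_finding_aid_index(index, sources, fetch):
    """Drop every finding aid entry, then re-harvest and re-index from scratch.

    index   -- dict mapping an entry id to {"kind": ..., "text": ...} (hypothetical layout)
    sources -- contributor web pages to harvest (hypothetical)
    fetch   -- function returning {entry id: text} for one source (hypothetical)
    """
    # Assumption: anything that is not a finding aid is kept as-is.
    rebuilt = {key: doc for key, doc in index.items() if doc["kind"] != "finding_aid"}
    # Re-harvest every source; only what is currently published gets indexed.
    for url in sources:
        for entry_id, text in fetch(url).items():
            rebuilt[entry_id] = {"kind": "finding_aid", "text": text}
    return rebuilt


# Usage: "fa2" was withdrawn by the contributor, "fa1" was edited, "fa3" is new.
old_index = {
    "m1": {"kind": "marc", "text": "WorldCat record"},
    "fa1": {"kind": "finding_aid", "text": "old text"},
    "fa2": {"kind": "finding_aid", "text": "withdrawn"},
}
published = {"fa1": "edited text", "fa3": "new finding aid"}
new_index = rebuild_finding_aid_index(
    old_index, ["https://example.org/finding-aids"], lambda url: published
)
```

Because nothing from the previous harvest survives the rebuild, no separate bookkeeping of deletions is needed; a finding aid that is no longer on the contributor's page simply does not come back.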
Can I contribute finding aids even if I don't belong to OCLC?
Yes, you can; OCLC membership is not required to contribute your finding aids to ArchiveGrid. There is no cost to contribute your finding aids, and we can accept them in EAD, HTML, or PDF formats.
Where do you get your collection descriptions?
We index finding aids harvested directly from contributors, and we also include MARC bibliographic records from WorldCat that we have identified as describing archival material. These WorldCat records constitute about 90% of the collection descriptions in ArchiveGrid.
How do you select WorldCat records for inclusion in ArchiveGrid?
There isn't a simple way to identify a MARC record that describes the types of materials held in archives, manuscript collections, and special collections. In order for a WorldCat record to be extracted into ArchiveGrid, it needs to meet the following criteria:
  • Has only one library holding symbol attached (though we relax this rule for NUCMC records)
  • Has a value of b, d, f, g, i, j, k, p, r, or t in MARC Leader byte 6, or the value "a" (language material) in Leader byte 6 and the value "c" (collection) in Leader byte 7, or the value "a" (archival) in Leader byte 8
  • Has no value of any kind in MARC 260 subfield "a" or "b" (to filter out published works)
  • Does not have "Bibliography" in the beginning string of a MARC subject heading subfield "a" or "v"
  • Does not have a MARC 502 field (Theses or dissertation note)
  • Does not have a MARC 015 field (National bibliography number)
  • If the record has the material type "book" or "serial", has no value in the MARC 008 or 006 "Nature of Contents" bytes (to eliminate theses, reference works, and other non-archival materials)
This filter isn't always successful. Especially for minimally-cataloged materials, we sometimes see descriptions of unpublished manuscripts of various kinds filter through. But we continue to evaluate and improve the filter as best we can.
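To make the selection criteria above concrete, here is a minimal Python sketch of how such a filter could be expressed. It is not the code we run. The Record class is a hypothetical, simplified stand-in for a MARC record, and several details are assumptions: NUCMC records simply skip the single-holding check, "subject heading" means any 6XX field, the "Bibliography" match is case-sensitive, and the Nature of Contents bytes are supplied already extracted.

```python
from dataclasses import dataclass, field

# Leader byte 6 values that qualify on their own, per the criteria above.
QUALIFYING_TYPES = set("bdfgijkprt")


@dataclass
class Record:
    """Hypothetical, simplified stand-in for a WorldCat MARC record."""
    leader: str                                 # 24-character MARC leader
    holdings: list                              # holding library symbols
    fields: dict = field(default_factory=dict)  # tag -> list of {subfield code: value}
    is_nucmc: bool = False                      # assumed flag for NUCMC records
    material_type: str = ""                     # e.g. "book" or "serial"
    nature_of_contents: str = ""                # pre-extracted 008/006 bytes


def leader_qualifies(leader):
    return (
        leader[6] in QUALIFYING_TYPES
        or (leader[6] == "a" and leader[7] == "c")
        or leader[8] == "a"
    )


def has_subfield_value(record, tag, codes):
    # True if any occurrence of the field has a non-blank value in one of the subfields.
    return any(
        occurrence.get(code, "").strip()
        for occurrence in record.fields.get(tag, [])
        for code in codes
    )


def has_bibliography_heading(record):
    # Assumption: subject headings are the 6XX fields.
    for tag, occurrences in record.fields.items():
        if not tag.startswith("6"):
            continue
        for occurrence in occurrences:
            if any(occurrence.get(code, "").startswith("Bibliography") for code in "av"):
                return True
    return False


def is_archival(record):
    if len(record.holdings) != 1 and not record.is_nucmc:
        return False
    if not leader_qualifies(record.leader):
        return False
    if has_subfield_value(record, "260", "ab"):  # published works
        return False
    if has_bibliography_heading(record):
        return False
    if "502" in record.fields or "015" in record.fields:
        return False
    if record.material_type in ("book", "serial") and record.nature_of_contents.strip():
        return False
    return True


# Usage: a mixed-materials collection passes; an ordinary monograph does not.
papers = Record(leader="00000npc a2200000 a 4500", holdings=["ABC"])
monograph = Record(leader="00000nam a2200000 a 4500", holdings=["ABC"])
print(is_archival(papers), is_archival(monograph))
```

Each test only excludes records, so the order of the checks does not change the result; it only affects how quickly a non-archival record is rejected.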
Why does the same collection sometimes have two different entries in the ArchiveGrid index?
Duplicate entries result from harvesting finding aids and extracting WorldCat MARC records for the same collection. While we continue to work on a way to effectively cluster or de-duplicate these two forms of the collection description from the same contributor, we have hesitated to favor one over the other, as each includes access points not found in the other. We are trying to maximize access and discovery, so in this situation we've decided to favor recall over precision.
What about copyright?
OCLC does not claim copyright ownership of individual collection descriptions contributed to ArchiveGrid. More information on rights and responsibilities and a legal statement are available on the web page where you can request to include your collections.
How many institutions contribute?
Collection descriptions from around 1,000 institutions - libraries, museums, historical societies, etc. - are included in ArchiveGrid.
Why is ArchiveGrid a free service?
We transitioned from a subscription-based service to a freely available system in 2012, in order to make the index available to everyone - faculty, scholars, family history researchers, students, and others. ArchiveGrid remains an OCLC Research project, allowing us to improve archives and special collections research through studies of researchers, description, and discovery.
What do you know about your users?
When ArchiveGrid started in the late 1990s as a way to test whether EAD finding aids from different sources could work well together in one search system, we mostly had faculty, college students, and genealogists in mind as our researchers. We later gathered data about these user groups from studies, and we continued to make system design improvements based on their needs. In a study conducted in 2012, we surveyed archives and special collections users and learned that primary source research is still an important focus for faculty, students, and genealogists, but it also plays a role for people motivated to search for archival materials by something in their personal and professional lives. This important segment of archives and special collections users includes filmmakers, writers, designers, hobbyists, and many more.
What are your site visit statistics like?
We use Google Analytics to track site visits and show selected current statistics here. We recognize that discovery frequently begins with search engines, so we put special effort into promoting ArchiveGrid's collection descriptions in Google, Yahoo, Bing, and other popular services.

The ArchiveGrid Team

Bruce

Bruce Washburn is the lead developer and software engineer for ArchiveGrid. More accurately, he is ArchiveGrid's only developer. He's grateful to all the other software engineers in OCLC Research who make it possible to work so easily with WorldCat MARC records, and of course to all the ingenious and helpful staff at ArchiveGrid contributing institutions who help us harvest their finding aids. Bruce works in the San Mateo, California office of OCLC Research.

Ellen

Ellen Eckert is primarily responsible for keeping the ArchiveGrid wheels turning. She's the front line for questions, comments, and problem reports that we receive via email, and is your expert help when you have questions about how to contribute your finding aids to ArchiveGrid. In addition, Ellen is the editor of the ArchiveGrid blog ... you should follow that, it's always entertaining. Ellen works from her home office in Portland, Oregon.

Merrilee

Merrilee Proffitt is a senior Program Officer at OCLC Research and the guiding light of the ArchiveGrid project. Whether it's setting priorities for the discovery system, thinking about how best to connect ArchiveGrid with other systems, or developing projects to study users or to analyze archival materials metadata, Merrilee points the way. Merrilee works in the San Mateo, California office of OCLC Research.

Statistics

Last Updated March 10, 2014

ArchiveGrid Index Growth

Interpreting These Statistics
The growth of MARC records in the last year represents identification and inclusion of additional contributors of data to WorldCat, and the on-going growth of WorldCat in general. While there has also been a concerted effort to increase the number of finding aid contributors, that form of archival description is a relatively small percentage of the entire aggregation. And some MARC records may be removed from the current set, as we improve our selection algorithms to identify archival collection descriptions.

ArchiveGrid Weekly Visits, November 2011 - March 2014

527,984 Visits
435,635 Unique Visitors
1,325,800 Pageviews
2.45 Pages/Visit
73% Visits referred by search engines
11% Visits via links from other websites
13% Visits via direct links

ArchiveGrid Contributors

By Country

By State in the United States

Interpreting These Statistics
ArchiveGrid contributing institutions are currently hard to count. In some cases, what might be considered as one institution may be represented in ArchiveGrid by several different contributors, one for each archive or special collections department affiliated with the larger institution. But in other instances, one contributing institution provides descriptions for a great many more individual archives (for example, the NUCMC records contributed by the Library of Congress). Until we can more accurately identify and count these contributors, the current count of contributors will be much lower than the actual number of institutions represented.

Search Widget

To put a small ArchiveGrid search box on your web page, highlight this HTML, copy it to the clipboard, and paste it into the content of your site.


Here's how the widget code will appear

Search ArchiveGrid
Find archival collections and primary source materials

Feel free to make any adjustments you'd like to the styles and text in this form. As long as you keep the form element names and action as they are, it should still work.