ArchiveGrid is a collection of over three million archival material descriptions, including MARC records from WorldCat and finding aids harvested from the web.
It's supported by OCLC Research as the basis for our experimentation and testing in text mining, data analysis, and discovery system
applications and interfaces. Archival collections held by thousands of libraries, museums, historical societies, and archives are represented in ArchiveGrid.
ArchiveGrid provides access to detailed archival collection descriptions, making information available about historical documents, personal papers,
family histories, and other archival materials. It also provides contact information for the institutions where the collections are kept.
ArchiveGrid data is primarily focused on archival material descriptions for institutions in the United States but now also includes information about collections in Australia.
This reflects the contribution patterns for
descriptions of materials under archival control in WorldCat, which make up the majority of descriptions in ArchiveGrid.
We may extend ArchiveGrid beyond its current scope if it is necessary to support OCLC Research experimental objectives.
ArchiveGrid illustrates OCLC's interest in advancing issues important to the archival community. Our work within ArchiveGrid gives OCLC Research
a foundation for collaboration and interactions with others in the archival community. We expect to share the results of MARC and EAD tag analysis,
provide discovery system analytics for contributors, document investigations of text mining and data visualization, participate in community
working groups pursuing improvements to description and discovery, and more. To support those interests and objectives, we'll continue to
build this extensive and current aggregation of archival material descriptions, within the
constraints of OCLC Research's committed and on-going support for this project.
OCLC previously offered ArchiveGrid as a subscription-based discovery service. The subscription service was discontinued in 2012. While the
freely available OCLC Research ArchiveGrid interface is not a full production service, it shares some of the same attributes. Researchers can
expect to use it for discovery of archival materials, and archives can work with OCLC Research to have their materials represented in the
aggregation in a reliable and persistent way.
If you have questions about your collection descriptions in ArchiveGrid, please get in touch with us.
Interested in contributing? Please let us know that as well.
Frequently Asked Questions
How do we get our finding aids in ArchiveGrid?
Your role in getting your finding aids into ArchiveGrid is simply to give us permission to harvest and use them.
There are no costs involved, beyond the time and effort on your part to provide us with a way to harvest your finding aid documents.
We harvest your finding aids from a webpage you provide to us, and index them in ArchiveGrid. About once every six weeks, we update the ArchiveGrid index
by removing all finding aid content, re-harvesting, and re-indexing. Any changes you make to your finding aids, such as adding, editing, or removing finding aids,
will be reflected in ArchiveGrid in its next index update.
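The update cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ArchiveGrid implementation: the index is modeled as a plain dictionary, `rebuild_index` and `harvest` are hypothetical names, the URLs are placeholders, and the sketch assumes that entries from other sources (such as MARC records) are left untouched during the rebuild.

```python
def rebuild_index(index, harvest):
    """Return a new index whose finding aid entries come from a fresh harvest.

    index:   dict mapping an identifier to {"source": ..., "text": ...}
    harvest: callable returning {url: text} for finding aids currently online
    (Both shapes are assumptions made for this sketch.)
    """
    # Step 1: remove all finding aid content; entries from other sources stay.
    rebuilt = {key: entry for key, entry in index.items()
               if entry["source"] != "finding_aid"}
    # Steps 2 and 3: re-harvest, then re-index whatever is published right now.
    for url, text in harvest().items():
        rebuilt[url] = {"source": "finding_aid", "text": text}
    return rebuilt


# Hypothetical state before an update: one finding aid has since been withdrawn.
index = {
    "marc:1": {"source": "marc", "text": "Smith family papers"},
    "https://example.org/fa/old.xml": {"source": "finding_aid", "text": "Withdrawn guide"},
    "https://example.org/fa/jones.xml": {"source": "finding_aid", "text": "Jones papers"},
}


def harvest():
    # Stand-in for fetching the contributor's current finding aids.
    return {
        "https://example.org/fa/jones.xml": "Jones papers, revised",
        "https://example.org/fa/new.xml": "Newly added guide",
    }


index = rebuild_index(index, harvest)
```

Because the finding aid portion is rebuilt from scratch rather than patched, a withdrawn finding aid disappears from the index without any explicit delete request from the contributor.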
Can I contribute finding aids even if I don't belong to OCLC?
Yes, you can. OCLC membership is not required to contribute your finding aids to ArchiveGrid. There is no cost to contribute your finding aids, and we can accept them in EAD, HTML, or PDF formats.
Where do you get your collection descriptions?
We index finding aids harvested directly from contributors, and we also include MARC bibliographic records from WorldCat that we have identified as archival material.
These WorldCat records describing archival materials constitute about 90% of the collection descriptions in ArchiveGrid.
How do you select WorldCat records for inclusion in ArchiveGrid?
There isn't a simple way to identify a MARC record that describes the types of materials held in archives, manuscript collections,
and special collections.
In order for a WorldCat record to be extracted into ArchiveGrid, it needs to meet the following criteria:
- Has only one library holding symbol attached (though we relax this rule for NUCMC records)
- Has a value of b, d, f, g, i, j, k, p, r, or t in MARC Leader byte 6, or the value "a" (language material) in Leader byte 6 and the value "c" (collection) in Leader byte 7, or the value "a" (archival) in Leader byte 8
- Has no value of any kind in MARC 260 subfield "a" or "b" (to filter out published works)
- Does not have "Bibliography" in the beginning string of a MARC subject heading subfield "a" or "v"
- Does not have a MARC 502 field (Dissertation note)
- Does not have a MARC 015 field (National bibliography number)
- If the record has the material type "book" or "serial", has no value in the MARC 008 or 006 "Nature of Contents" bytes (to eliminate theses, reference works, and other non-archival materials)
This filter isn't always successful. Especially for minimally-cataloged materials, we sometimes see descriptions of unpublished manuscripts of various kinds filter through. But we continue to evaluate and improve the filter as best we can.
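The selection criteria listed above can be expressed as a single predicate. The following Python sketch is illustrative only and is not the code OCLC Research runs: `Record` is a hypothetical, simplified stand-in for a parsed MARC record, "subject heading" is assumed to mean any 6XX field, and the relaxed rule for NUCMC records is assumed to mean that the single-holding check is skipped for them.

```python
from dataclasses import dataclass, field

# Leader byte 6 values that qualify a record on their own.
ARCHIVAL_RECORD_TYPES = set("bdfgijkprt")


@dataclass
class Record:
    """Hypothetical simplified MARC record used only for this sketch."""
    leader: str                      # the 24-character MARC leader
    holding_symbols: list            # library holding symbols attached
    # tag -> list of field occurrences, each a dict of subfield code -> value
    fields: dict = field(default_factory=dict)
    material_type: str = ""          # e.g. "book", "serial"
    nature_of_contents: str = ""     # from the 008 or 006 bytes, blank if unset
    is_nucmc: bool = False


def leader_qualifies(leader):
    return (leader[6] in ARCHIVAL_RECORD_TYPES
            or (leader[6] == "a" and leader[7] == "c")
            or leader[8] == "a")


def has_bibliography_heading(record):
    # Assumption: subject headings are the 6XX fields.
    for tag, occurrences in record.fields.items():
        if tag.startswith("6"):
            for occurrence in occurrences:
                if any(occurrence.get(code, "").startswith("Bibliography")
                       for code in ("a", "v")):
                    return True
    return False


def is_archival(record):
    """Return True only if the record meets every criterion in the list."""
    if len(record.holding_symbols) != 1 and not record.is_nucmc:
        return False
    if not leader_qualifies(record.leader):
        return False
    # Any 260 $a or $b suggests a published work.
    if any(occ.get("a") or occ.get("b") for occ in record.fields.get("260", [])):
        return False
    if has_bibliography_heading(record):
        return False
    if record.fields.get("502") or record.fields.get("015"):
        return False
    if (record.material_type in ("book", "serial")
            and record.nature_of_contents.strip()):
        return False
    return True


# Mixed materials (Leader byte 6 = "p") held by a single library.
papers = Record(leader="00000npc a2200000 a 4500", holding_symbols=["XYZ"])
# A published book: Leader bytes 6-7 are "am" and a 260 field is present.
book = Record(leader="00000nam a2200000 a 4500", holding_symbols=["XYZ"],
              fields={"260": [{"a": "New York", "b": "Example Press"}]})
```

Under these assumptions `is_archival(papers)` is true and `is_archival(book)` is false. Each check is a cheap test on coded values, so a sparsely cataloged record that leaves those values empty gives the filter little to work with.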
Why does the same collection sometimes have two different entries in the ArchiveGrid index?
Duplicate entries result from harvesting finding aids and extracting WorldCat MARC records for the same collection.
While we continue to work on a way to effectively cluster or de-duplicate these two forms of the collection description from the same contributor,
we have hesitated to favor one over the other, as each includes access points not found in the other.
We are trying to maximize access and discovery, so in this situation we've decided to favor recall over precision.
What about copyright?
OCLC does not claim copyright ownership of individual collection descriptions contributed to ArchiveGrid.
More information on rights and responsibilities, along with a legal statement, is available on the web page where
you can request to include your collections.
How many institutions contribute?
Collection descriptions from around 1,000 institutions - libraries, museums, historical societies, etc. - are included in ArchiveGrid.
Why is ArchiveGrid a free service?
We transitioned from a subscription-based service to a freely available system in 2012, in order to make the index
available to everyone - faculty, scholars, family history researchers, students, and others.
ArchiveGrid remains an OCLC Research project, allowing us to improve archives and special collections research through studies of researchers, description, and discovery.
What do you know about your users?
When ArchiveGrid started in the late 1990s as a way to test if EAD finding aids from different sources could work well together in one search system,
we mostly had faculty, college students, and genealogists in mind as our researchers. We later gathered data about these user groups from studies,
and we continued to make system design improvements based on their needs.
In a study conducted in 2012, we surveyed archives and special collections users and learned that primary source research is still an important focus for
faculty, students and genealogists, but it also plays a role for people motivated to search for archival materials by something in their personal and professional lives.
This important segment of archives and special collections users includes filmmakers, writers, designers, hobbyists, and many more.
What are your site visit statistics like?
We use Google Analytics to track site visits and show selected current statistics here.
We recognize that discovery frequently begins with search engines, so we put special effort into promoting ArchiveGrid's collection descriptions in Google, Yahoo, Bing, and other search engines.
The ArchiveGrid Team
Last Updated May 12, 2014
ArchiveGrid Index Growth
Interpreting These Statistics
The growth of MARC records in the last year represents identification and inclusion of additional contributors of data to WorldCat, and the on-going
growth of WorldCat in general. While there has also been a concerted effort to increase the number of finding aid contributors, that form of
archival description is a relatively small percentage of the entire aggregation. And some MARC records may be removed from the current set, as we improve our
selection algorithms to identify archival collection descriptions.
ArchiveGrid Weekly Visits, November 2011 - May 2014
- 86%: Visits referred by search engines
- 7%: Visits via links from other websites
- 7%: Visits via direct links
By State in the United States
Interpreting These Statistics
ArchiveGrid contributing institutions are currently hard to count. In some cases, what might be considered as one institution may be represented in ArchiveGrid by several
different contributors, one for each archive or special collections department affiliated with the larger institution. But in other instances, one contributing institution
provides descriptions for a great many more individual archives (for example, the NUCMC records contributed by the Library of Congress). Until we can more accurately
identify and count these contributors, the current count of contributors will be much lower than the actual number of institutions represented.
Here's how the widget code will appear
Find archival collections and primary source materials
Feel free to make any adjustments you'd like to the styles and text in this form. As long as you keep the form element names and action as they are, it should still work.