Author Archives: Bruce
Last month Ellen posted a note about some of the ways in which we routinely harvest finding aids from ArchiveGrid contributors’ websites.
This month we’re working with our first ArchiveGrid contributor to make their finding aids available with the Site Map protocol. In a way it’s surprising that this is our first opportunity to harvest finding aids this way. The Site Map protocol has been around for years, is a widely used method of making website content visible to search engines, and is relatively easy to set up. At any rate, we’re very pleased to have a Site Map to guide our way.
In our experience supporting ArchiveGrid, institutions that want to use a protocol beyond simply following links on their website have sometimes expressed interest in OAI-PMH. In these cases a Site Map may prove to be a more effective mechanism for sharing finding aids. Site Maps can help search engines see the documents you want them to see (Google withdrew support for OAI-PMH in 2008), may already be supported as part of content management systems or web server platforms, and are familiar to a wide array of harvesters. For valuable insights on the role Site Maps and metadata play for institutional repositories in Google Scholar, we recommend the Library Hi Tech article Invisible institutional repositories: addressing the low indexing ratios of IRs in Google Scholar by Kenning Arlitsch and Patrick O’Brien.
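To give a sense of how little a harvester needs from a Site Map, here is a minimal Python sketch that reads one and lists the document URLs it advertises. It is an illustration under stated assumptions, not the ArchiveGrid harvester: the example.edu URLs and the function name list_sitemap_entries are made up, and only the sitemaps.org namespace and the urlset/url/loc/lastmod elements come from the published protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical Site Map for an imaginary repository; the URLs are placeholders.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://archives.example.edu/findingaids/smith-papers.xml</loc>
    <lastmod>2012-10-15</lastmod>
  </url>
  <url>
    <loc>https://archives.example.edu/findingaids/jones-family.xml</loc>
  </url>
</urlset>
"""


def list_sitemap_entries(sitemap_xml):
    """Return (url, lastmod) pairs for every <url> entry in a Site Map.

    lastmod is None when the entry does not supply one.
    """
    root = ET.fromstring(sitemap_xml)
    entries = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS)
        if loc:  # skip malformed entries that have no location
            entries.append((loc, lastmod))
    return entries


if __name__ == "__main__":
    for loc, lastmod in list_sitemap_entries(EXAMPLE_SITEMAP):
        print(loc, lastmod or "(no lastmod)")
```

A harvester could compare each lastmod value with the date of its previous visit and fetch only the finding aids that have changed since then.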
If you have Site Maps in place that we could use to harvest your finding aids, and of which we’re not yet aware, please let us know.
Archivists and librarians spend much of their time sorting the names of people, groups and places. Authority control systems are an integral part of processing archival materials and manuscripts, and an important area of innovation, as we’re seeing with work around the EAC-CPF.
Hurricane names represent an interesting alternate approach. As described on the National Weather Service website, there was once a practice of naming tropical storms and hurricanes in the West Indies after a particular saint’s feast day. Given that hurricane season in the North Atlantic (generally from June through November) would encompass the same limited set of saints’ days, the same name could be attributed to more than one storm system.
The first hurricane named this way was the Hurricane of San Bartolomé in 1568; earlier storms were named years later by historians. Two major hurricanes named after San Felipe occurred on exactly the same day, but 52 years apart, in 1876 and 1928.
As Wayne Neely writes in The Great Hurricane of 1780, “This system for naming them was haphazard and not really a system at all.”
The Great Hurricane of 1780 is also known as Hurricane San Calixto II. It’s thought to be the deadliest Atlantic hurricane on record, responsible for, among other things, the sinking of 40 French ships involved in the American Revolutionary War, with 4,000 souls lost. You may then be left wondering whether it’s named after Pope Callisto II, or if it’s the second hurricane named after the Feast of Pope Saint Callisto I. We’re not sure.
The practice of naming hurricanes after women began in 1953 in the United States, and in 1978-1979, male names were added to the storm lists. The six storm name lists for Atlantic hurricanes are developed by the World Meteorological Organization (WMO), and each year’s list includes 21 names. If a given year has more than 21 named storms, the Greek alphabet is used. The lists are used in rotation, so each one repeats every seventh year. So, these names can recur, perhaps with some of the same ambiguity as Hurricane San Felipe. Remember Tropical Storm Alberto from May, notable as the earliest-forming tropical storm in the Atlantic in nearly 10 years? It was also the name of a tropical storm that caused considerable damage in Florida and Georgia in 1994.
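The rotation is simple arithmetic, and a small Python sketch makes the recurrence explicit. The function name shares_name_list is ours, and we assume the Alberto of this May belongs to the 2012 season; retired names are replaced on their list, so sharing a list does not guarantee that every name recurs.

```python
def shares_name_list(year_a, year_b, cycle_years=6):
    """Return True when two Atlantic seasons draw on the same rotating list.

    Assumes the six lists have been used in strict rotation between the
    two years, which holds for seasons from 1979 onward.
    """
    return (year_a - year_b) % cycle_years == 0


if __name__ == "__main__":
    # Alberto in 1994 and again 18 years (three full rotations) later.
    print(shares_name_list(1994, 2012))
    # Consecutive seasons always use different lists.
    print(shares_name_list(2011, 2012))
```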
For certain calamitous storms, the name is retired. There are currently 76 names on the “retired” list, including the notorious Andrew, Donna, and Katrina. The list of names is controlled by the WMO, and given recent events, we suspect they will retire Sandy too at their next annual meeting.
For an inside view of what’s involved in search and rescue operations following a major hurricane, take a look at this transcript of a 2005 interview after Hurricane Katrina with Commander Meredith Austin, provided through ArchiveGrid by the U.S. Coast Guard Historian’s Office.
“You know in an average hurricane we’ll fly out to the impacted area and get RV’s if we have to because it’s not that you want to be pampered or anything but when it’s really hot and it’s really humid and you want people to work in really harsh conditions for 12 to 14 or 18 hours a day, you’ve got to have a place for them to recover or they’re going to be no good to you the next day. So to have them sleeping out in tents we have to worry about fire ants and your stuff getting wet. You can do that for a couple of days, anyone can, but we’re here for the long term. There are going to be Strike Team folks down in these areas for probably a year.”
The inspiration for this blog post came from the realization that, after touting system improvements we made over the weekend to the way ArchiveGrid looks and adapts to smartphones and tablets, we forgot to add one feature we had worked on in tandem with said improvements: A “frequently asked questions” section to our About ArchiveGrid page. It’s there now, addressing almost every question contributors and potential data suppliers ask.
Except one, which this post attempts to explain.
The question: How do we filter the MARC records out of WorldCat?
As shown in the statistics on our updated About ArchiveGrid page, MARC records extracted from WorldCat make up the bulk of ArchiveGrid’s content … about 90%. But there isn’t a simple way to identify a MARC record that describes the types of materials held in archives, manuscript collections, and special collections.
We look at every one of the 280 million or more records in WorldCat, and exclude those that have any of these characteristics:
- Have more than one library holding symbol attached
- Do not meet at least one of these conditions: the value b, d, f, p, r, or t in MARC Leader byte 6 (see table below); the value “a” (language material) in Leader byte 6 together with the value “c” (collection) in Leader byte 7; or the value “a” (archival) in Leader byte 8
- Have a value of any kind in MARC 260 subfield “a” or “b” (to filter out published works)
- Have a MARC subject heading with a subfield “a” or “v” beginning with the word “Bibliography”
- Have a MARC 502 field (Theses or dissertation note)
- Have the material type “book” or “serial” and any value in the MARC 008 or 006 “Nature of Contents” bytes (to eliminate theses, reference works, and other non-archival materials)
This filter isn’t always successful. Especially for minimally-cataloged materials, we sometimes see descriptions of unpublished manuscripts of various kinds filter through. But we continue to evaluate and improve the filter as best we can.
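For readers who think in code, the exclusion rules above can be sketched in Python roughly as follows. This is an illustration only, not the code that builds the ArchiveGrid index: the Record class, its field names, and the make_leader helper are hypothetical stand-ins for values that would really be read from a MARC record and its WorldCat holdings, and the exact "Bibliography" prefix match is our reading of the rule.

```python
from dataclasses import dataclass, field

# Leader byte 6 values treated as archival or manuscript material.
ARCHIVAL_TYPES = set("bdfprt")


@dataclass
class Record:
    """Hypothetical, simplified view of a WorldCat MARC record."""
    leader: str                        # 24-character MARC leader
    holding_count: int = 1             # library holding symbols attached
    has_260_a_or_b: bool = False       # any value in 260 $a or $b
    subject_a_v: list = field(default_factory=list)  # subject $a / $v values
    has_502: bool = False              # thesis or dissertation note present
    material_type: str = ""            # e.g. "book", "serial"
    nature_of_contents: str = ""       # 008/006 Nature of Contents bytes


def make_leader(rec_type, bib_level, control=" "):
    """Build a placeholder leader with bytes 6, 7 and 8 set (illustrative)."""
    return "00000n" + rec_type + bib_level + control + "a2200000   4500"


def leader_suggests_archival(leader):
    """Apply the three Leader tests; any one of them is enough."""
    rec_type, bib_level, control = leader[6], leader[7], leader[8]
    return (
        rec_type in ARCHIVAL_TYPES
        or (rec_type == "a" and bib_level == "c")
        or control == "a"
    )


def is_excluded(rec):
    """Return True if any one exclusion rule applies to the record."""
    if rec.holding_count > 1:
        return True
    if not leader_suggests_archival(rec.leader):
        return True
    if rec.has_260_a_or_b:
        return True
    if any(s.startswith("Bibliography") for s in rec.subject_a_v):
        return True
    if rec.has_502:
        return True
    if rec.material_type in ("book", "serial") and rec.nature_of_contents.strip():
        return True
    return False


if __name__ == "__main__":
    mixed = Record(leader=make_leader("p", "c"))
    published = Record(leader=make_leader("a", "m"), has_260_a_or_b=True)
    print(is_excluded(mixed), is_excluded(published))
```

Because the rules are independent, a record is dropped as soon as any single test matches, which is also why minimally cataloged records with few populated fields are the hardest to classify.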
MARC Leader byte 6 values:
- a Language material
- b Archival and manuscripts control (value obsolete)
- c Printed music
- d Manuscript music
- e Cartographic material
- f Manuscript cartographic material
- g Projected medium
- i Non-musical sound recording
- j Musical sound recording
- k 2-dimensional non-projectable graphic
- m Computer file
- o Kit
- p Mixed material
- r 3-dimensional artifact or naturally occurring object
- t Manuscript language material
Over the past few months we’ve been working on an update to the ArchiveGrid interface.
Along with some bug fixes and cosmetic changes, the new interface has two major new features: A “result overview” display that summarizes important access points in a search result, and an “adaptive” (sometimes also called “responsive”) layout to improve how the system works on tablets and smartphones.
You can try a preview version of the new interface now. We’re testing and updating it still, but expect to replace the current interface with the new one in the next month or two.
The Result Overview
Here’s an example of the Result Overview for a search. The search began by looking for collections that match golden gate bridge photographs. With 377 matching collections it would take a while to scroll through each brief record, but the Result Overview helps identify some key access points at a glance:
And as access points are selected, the search result is narrowed. This can be a quick and effective way to reduce a large result to something that can be more easily checked for a deeper dive into the collection descriptions. The Result List and Overview are different views of the same result: selecting their tabs makes it easy to switch from one to the other.
The Adaptive Display
Though we aren’t yet seeing very much ArchiveGrid use by people with smartphones and tablet devices, we want to ensure that all users have an experience in ArchiveGrid that is best suited to the capabilities of their browser. While still a work in progress, the new interface adapts its layout and features based on the device that has connected. We’ve been testing this on a range of computers: an iPhone, a Nexus 7 tablet, a “One Laptop per Child” computer, a variety of notebook and desktop computers, and a large flat screen display connected to a Chromebox.
As with everything else about ArchiveGrid, we’d love to hear your comments and suggested improvements to this new version of the user interface.
In April and May of 2012 we conducted a survey to update our understanding of how special collections research is carried out by faculty, graduate students, genealogists, and unaffiliated scholars. We’re currently analyzing the 695 survey responses, though one clear finding is the importance of librarians and archivists as a source for recommendations. Over 80% of survey respondents identified librarians and archivists when answering the question “Is there a particular type of user whose comments, recommendations, etc. you find most valuable?”
The full set of survey questions and choices is listed here. They are also downloadable here.
Expect to hear more on the ArchiveGrid blog as our analysis of the survey responses continues.
- Have you used special collections materials?
Special collections materials are defined as library and archival materials in any format, generally characterized by their value, physical format, uniqueness or rarity. For example: rare books, manuscripts, photographs, institutional archives including digital items.
- What kind of special collections materials did you use?
- What are the important attributes of these materials for you?
Unique, Primary Source, Digital, Other
- In the last year or so, what have been the subjects of your research?
Family History, Genealogy, History (unaffiliated/conducting personal research), History (conducting professional research), Academic Coursework, Instruction/Lesson Planning, Other
- What is the intended purpose of your research?
For publication, for degree or coursework, for hire, for personal interest, other
- When using special collections what is your usual role?
Faculty affiliated with a college or university, Post graduate/Graduate student, Undergraduate, Unaffiliated Scholar, Genealogist (professional), Genealogist (conducting personal research)
- Remembering your research in the past year or so, as you begin the research process where do you typically go for help in your initial investigations?
Web search engines, Library catalogs/databases, Colleagues and friends, Email lists/discussion boards, Print materials, None of the above
- When you are in the middle of, or completing, your research, which resources are the most useful to you?
Web search engines, Library catalogs/databases, Colleagues and friends, Email lists/discussion boards, Print materials, None of the above
- When you complete your research, do you need to make sure that all potential sources have been checked?
Never, Sometimes, Always
- How do you discover new websites and other research resources?
Colleagues and friends (via email, word of mouth, etc.), Professional/trade literature, Events and meetings, Email/posts from communities and groups (listservs, chat boards, etc.), Twitter, Facebook, None of the above
- When you want to share information about a new website or research resource, how do you usually identify it?
Website name, Website URL, URL from a search engine, The resource’s institution name, The resource’s collection name, Finding aid or collection description, Library catalog reference, Other
- When you want to share information about a new website or other research resources what are your preferred ways to communicate?
Word of mouth with colleagues and friends, Professional/trade literature, Email with colleagues and friends, Email/posts to communities and groups (listservs, chat boards, etc.), Twitter, Facebook, Other
- Which of these website features are valuable for your research?
User comments, Tags, Reviews, Recommendations, Saving to a list, Connecting with others, None of these are relevant for me
- Comments, tags, reviews and recommendations can come from a variety of sources. Is there a particular type of user whose comments, recommendations, etc. you find most valuable?
A scholar whose reputation I know, Faculty affiliated with any college or university, Faculty affiliated with a specific college or university, Library or archive staff, Undergraduate, Post graduate/graduate student, Genealogist (professional), Genealogist (conducting personal research), Colleagues and friends
The idea that a website should be adaptable and responsive to its users and uses isn’t new, but achieving or even setting that goal has sometimes been difficult. It isn’t uncommon to look at website statistics and see most activity coming from what we imagine are desktop browsers, and then design for that use. Designing for specific devices also is a challenging and expensive enterprise when done properly.
Another approach, sometimes called Adaptive or Responsive web design, has attracted attention recently. It is often related to a Mobile First design philosophy, suggesting that by thinking about mobile users whose devices lack extensive displays and have cumbersome keyboard entry tools, you’ll naturally think first about the system’s key features and build those for efficient use. The Adaptive/Responsive approach then leads you to use the same base of code, as much as is practical, for any device your users may find you with, incorporating display and input features accordingly.
Although ArchiveGrid’s use statistics suggest only a very small number of users reach us with smartphone or tablet devices, that isn’t a reason not to provide them with a good user experience. In fact, the low-use statistics could be directly correlated with issues that a design oriented toward desktop browsers has when it’s delivered to a different type of device.
We find the current design isn’t working well on smartphones and on some tablet devices, although it’s fine at medium to higher-range desktop displays. Here’s how it would look on a display with a resolution of 1280 x 960 pixels (a pretty common size for desktop systems these days):
We’re testing an adaptive/responsive redesign now, which would give the system a different look. At that same desktop resolution, we’ve made some adjustments to allow more information to be visible without scrolling, to highlight the search box, and to fix the display width so that it works more effectively on much higher screen resolutions:
When the same page is viewed on a tablet device, we start to make some choices on what’s most important to see and do. We drop some of the collection highlights and tune the amount of space used by the search box.
And for smartphone users, we highlight the search box, drop a few other less-critical home page widgets, and expect some scrolling to reach other widgets.
There’s still more to think about with this design, and much to learn. We expect to promote it to the Beta version of ArchiveGrid soon. When it surfaces, we will be glad to hear your reactions and suggestions to improve it.
As we began rethinking ArchiveGrid in OCLC Research in 2011, one of our first steps was to develop some personas to represent the system’s users. We felt that earlier user studies had presented a fairly good picture of the general audiences that a system like ArchiveGrid could well serve. It appeared to be best for faculty members, upper-level undergrads and post-graduates, and researchers of other types including amateur and professional genealogists.
The development of personas to give life to anticipated audiences is a common practice in user-centered design. Though there are more details we developed for ours than represented here, a few of the people we envisioned using ArchiveGrid included:
Dr. Matthew Simon, a 59-year-old History Department head. On typical work days Dr. Simon may teach mid- to upper-level history courses, serve on multiple faculty committees, work in collaboration with others and advise students. He carries an iPhone, uses a desktop computer at work and owns a laptop, which he mostly uses at home, uses email regularly, and checks Facebook several times a week. He owns a digital camera and his collection of photographs from his travels is well organized on his home computer. The proximity to campus libraries and the role librarians play in helping Dr. Simon develop course material keep him in the know about information resources and services and what’s going on in the world of libraries and online searching. Dr. Simon and his peers are highly skilled information seekers because they are well-versed in at least a dozen of the paid databases and periodicals the campus libraries subscribe to. However, he has become attached to the convenience of Google and Wikipedia for finding facts fast, but is hesitant to call them information influencers because he believes quality research shouldn’t be as easy and convenient as search engines have made it.
Elizabeth Mann, a 45-year-old 5th grade teacher. Elizabeth carries a smartphone, uses a desktop computer at work and at home, uses email regularly, texts close friends and family … she recently joined Facebook. Elizabeth has researched family history for nearly a decade and in that time she has become familiar with online searching beyond search engines, although she uses Google and Wikipedia to help focus her information search into keywords. She is familiar with some big-name databases for family history researchers and she knows how to evaluate her information sources, although when she thinks she has completed her research at her tried-and-true places online and at libraries, she doesn’t always know where to look next. Elizabeth prefers websites with simple interfaces and search options. Databases should be intuitive and easy to navigate and she increasingly expects free and open access to materials.
Amy Powell, a 32-year-old journalist. Amy researches and prepares stories in multimedia packages for print and online audiences, and aims to be an effective and accurate storyteller. She’s influenced by colleagues, websites, radio, TV, books, magazines, and social networks; she considers her Google search abilities to be advanced, but if she can’t find what she’s looking for after a couple of keyword search attempts, she will move on. Amy recognizes and values authority in her research resources. She uses software and web-based tools on a daily basis but is less comfortable relying on them to make decisions for her. Amy sometimes looks for new ways to carry out research online, and figures out software and web-based tools on her own.
These are just a few of the personas we developed in the early design stages, and we’ve returned to them at times to give life to, and help us focus on, the audiences we think ArchiveGrid is best able to serve.