Chronicling America provides access to information about historic newspapers and select digitized newspaper pages. To encourage a wide range of potential uses, we designed several different views of the data we provide, all of which are publicly visible. Each uses common Web protocols, and access is not restricted in any way. You do not need to apply for a special key to use them. Together they make up an extensive application programming interface (API) which you can use to explore all of our data in many ways.
Details about these interfaces are below. In case you want to dive right in, though, we use HTML link conventions to advertise the availability of these views. If you are a software developer or researcher or anyone else who might be interested in programmatic access to the data in Chronicling America, we encourage you to look around the site, "view source" often, and follow where the different links take you to get started. When describing Chronicling America as the source of content, please use the URL and a Web site citation, such as "from the Library of Congress, Chronicling America: Historic American Newspapers site".
For more information about the open source Chronicling America software please see the LibraryOfCongress/chronam GitHub site. Also, please consider subscribing to the ChronAm-Users discussion list if you want to discuss how to use or extend the software or data from its APIs.
If you’re interested in other data and machine-usable interfaces available from the Library of Congress, you might find the LC for Robots page helpful at https://labs-dev.loc.gov/lc-for-robots/
The directory of newspaper titles contains nearly 140,000 records of newspapers and libraries that hold copies of these newspapers. The title records are based on MARC data gathered and enhanced as part of the NDNP program.
Searching the title records is possible using the OpenSearch protocol. This is advertised in a LINK header element of the site's HTML template as "NDNP Title Search", using this OpenSearch Description document.
Title search parameters:
Note that all example URLs below use the same protocol and server name, http://chroniclingamerica.loc.gov/. We only show the URL paths and parameters below to save space.
There are millions of digitized newspaper pages in Chronicling America. These pages span several decades and many U.S. states and territories. New batches of data come in from partner institutions throughout the year and are added to the site regularly.
Searching newspaper pages is also possible via OpenSearch. This is advertised in a LINK header element of the site's HTML template as "NDNP Page Search", using this OpenSearch Description document.
Page search parameters:
The Chronicling America Directory contains hundreds of thousands of bibliographic records for American newspaper titles. To integrate the directory into your own applications, you can use the OpenSearch AutoSuggest API to look up these newspaper titles dynamically. For example:
The response will be application/x-suggestions+json as described by the OpenSearch Suggestions extension.
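As a rough illustration of consuming that response, the sketch below builds a suggestion URL and turns the four-element suggestions array into a list of records. The function names are hypothetical, and the reading of the third array as LCCNs is an assumption based on the sample response for q=manh in this document.

```python
import json
from urllib.parse import urlencode

# Server name taken from the example URLs in this document.
BASE = "http://chroniclingamerica.loc.gov"


def suggest_url(query: str) -> str:
    """Build the title-suggestion URL for a partial title (hypothetical helper)."""
    return BASE + "/suggest/titles/?" + urlencode({"q": query})


def parse_suggestions(body: str) -> list:
    """Convert a suggestions array into one dict per suggested title.

    Assumes the layout seen in the sample response:
    [query, [titles], [LCCNs], [URLs]].
    """
    _query, titles, lccns, urls = json.loads(body)
    return [
        {"title": title, "lccn": lccn, "url": url}
        for title, lccn, url in zip(titles, lccns, urls)
    ]


# Body copied from the sample response for q=manh.
SAMPLE = """[
  "manh",
  ["Manhasset life. (Manhasset, N.Y.) 19??-19??",
   "Manhasset mail. (Manhasset, N.Y.) 1927-1986"],
  ["sn97063690", "sn95071148"],
  ["http://chroniclingamerica.loc.gov/lccn/sn97063690/",
   "http://chroniclingamerica.loc.gov/lccn/sn95071148/"]
]"""

records = parse_suggestions(SAMPLE)
```

Here `records` holds one dictionary per suggested title, so an autocomplete widget could display `title` and link to `url`.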
The Chronicling America Web site uses links that follow a straightforward pattern. You can use this pattern to construct links into specific newspaper titles, to any of its available issues and their editions, and even to specific pages. These links can be readily bookmarked and shared on other sites.
We are committed to supporting this link pattern over time, so even if we change how the site works, we will redirect any requests to the system using this specific pattern into the new site. We established redirect rules for links into the previous version of the site when we released a new version in early 2009, and we intend to sustain those rules.
The link pattern uses LCCNs, dates, issue numbers, edition numbers, and page sequence numbers.
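A minimal sketch of building such links follows. Only the title-level form, /lccn/<LCCN>/, appears in the sample responses in this document. The deeper segments used here (an ISO date, then `ed-<n>`, then `seq-<n>`) are an assumption about the pattern, and `page_link` is a hypothetical helper name.

```python
from datetime import date

# Server name taken from the example URLs in this document.
BASE = "http://chroniclingamerica.loc.gov"


def page_link(lccn, issue_date=None, edition=None, sequence=None):
    """Build a link to a title, an issue, an edition, or a single page.

    Each deeper level is added only when the level above it is present.
    The "ed-" and "seq-" segment names are assumptions, not confirmed
    by the text above.
    """
    parts = ["lccn", lccn]
    if issue_date is not None:
        parts.append(issue_date.isoformat())  # e.g. 1927-01-05
        if edition is not None:
            parts.append(f"ed-{edition}")
            if sequence is not None:
                parts.append(f"seq-{sequence}")
    return BASE + "/" + "/".join(parts) + "/"


title_link = page_link("sn95071148")
page_three = page_link("sn95071148", date(1927, 1, 5), edition=1, sequence=3)
```

Because each level only extends the previous path, trimming trailing segments from a page link yields the edition, issue, and title links in turn.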
In addition to the use of JSON in OpenSearch results, there are also JSON views available for various resources in Chronicling America. These JSON views are typically linked from their HTML representation using the <link> element. For example:
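One way to discover these views programmatically is to scan a page's HTML for link elements marked as alternates. The sketch below uses only the Python standard library. The sample markup and its href values are hypothetical stand-ins for what "view source" might show, not copied from the site.

```python
from html.parser import HTMLParser


class AlternateLinkFinder(HTMLParser):
    """Collect (type, href) pairs from <link rel="alternate"> elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if "alternate" in rels and attributes.get("href"):
            self.links.append((attributes.get("type"), attributes["href"]))


def find_alternates(html, media_type=None):
    """Return hrefs of alternate views, optionally filtered by media type."""
    finder = AlternateLinkFinder()
    finder.feed(html)
    return [
        href
        for link_type, href in finder.links
        if media_type is None or link_type == media_type
    ]


# Hypothetical markup for illustration only.
SAMPLE_HTML = """<html><head>
<link rel="alternate" type="application/json" href="/lccn/sn97063690.json">
<link rel="alternate" type="application/rdf+xml" href="/lccn/sn97063690.rdf">
<link rel="stylesheet" href="/static/site.css">
</head><body></body></html>"""

json_views = find_alternates(SAMPLE_HTML, "application/json")
```

Filtering on the `type` attribute rather than on the file extension keeps the script independent of how the alternate URLs happen to be spelled.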
Linked Data allows us to connect the information in Chronicling America directly and explicitly to related data on the Web. Chronicling America provides several Linked Data views to make it easy to connect with other information resources and to process and analyze newspaper information with conceptual precision.
We use concepts like Title (defined in DCMI Metadata Terms) and Issue (defined in the Bibliographic Ontology) to describe newspaper titles and issues available in the data. Using these concepts, defined in existing ontologies, can help to ensure that what we mean by "title" and "issue" is consistent with the intent of other publishers of linked data. We also define other concepts not already defined in existing ontologies. This vocabulary includes elements suitable for newspaper information and the NDNP program, including these elements:
These elements are used in RDF views of several types of pages, ranging from a list of the newspaper titles available on the site and information about each, to enumerations of all the pages that make up each issue and all of the files available for each page.
Comparing the RDF versions of the links above with their HTML counterpart links, you might notice that the URI pattern we follow for these views is to remove the final slash, replacing it with ".rdf". We follow this pattern to comply with best practices for publishing linked data, and also to keep the URIs easy to understand and use.
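The rule described above (drop the final slash, append ".rdf") can be sketched as a small helper. The function name is hypothetical, and the example URI reuses an LCCN from the sample responses in this document.

```python
def rdf_uri(html_uri: str) -> str:
    """Derive the RDF view URI from an HTML page URI.

    Follows the stated pattern: remove the final slash and
    replace it with ".rdf".
    """
    if not html_uri.endswith("/"):
        raise ValueError("expected an HTML URI ending in '/'")
    return html_uri[:-1] + ".rdf"


title_rdf = rdf_uri("http://chroniclingamerica.loc.gov/lccn/sn97063690/")
```

Raising an error for URIs without a trailing slash is a design choice of this sketch; it avoids silently producing a URI that does not follow the pattern.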
For each of the HTML pages with a linked data counterpart in RDF, we provide links to those alternate views from the HTML page using the LINK header element. This can support automating the process of using the RDF data in tools like bookmarklets, plugins, and scripts, and it also helps us to advertise the availability of the additional views. In many views, such as newspaper page images, we also provide LINK elements pointing to the various available files (image, text, OCR coordinate XML) for each available page or other potentially useful information. We encourage you to explore the entire site and to look for and use these LINK elements if you find them useful when working with NDNP data. Just follow your nose, and view the source.
In addition to the concepts described above, we use concepts from several other vocabularies in describing NDNP materials and also in linking to related data available on other sites. These additional vocabularies and external sites include:
We are grateful to all of these providers and we hope we can follow their lead in encouraging additional connections between data and vocabulary providers. Please be aware that how we use these vocabularies will likely change over time, as they continue to develop, and as new vocabularies are introduced.
In certain situations the granular access provided by the API may be somewhat constraining. For example, perhaps you are a researcher who would like to try out new indexing techniques on the millions of pages of OCR data in Chronicling America. Or perhaps you are a service provider who anticipates needing to support a high volume of full-text searches across the corpus and does not want the Chronicling America API as an external dependency. To support these and other potential use cases we are beginning to provide bulk access to the underlying data sets. The initial bulk data sets include:
curl -i 'http://chroniclingamerica.loc.gov/suggest/titles/?q=manh'

HTTP/1.1 200 OK
Date: Mon, 28 Mar 2011 19:45:34 GMT
Expires: Tue, 29 Mar 2011 19:45:37 GMT
ETag: "7d786bec2ca003d86009f8ccdfd72912"
Cache-Control: max-age=86400
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: X-Requested-With
Content-Length: 7045
Last-Modified: Mon, 28 Mar 2011 19:45:37 GMT
Content-Type: application/x-suggestions+json

[
  "manh",
  [
    "Manhasset life. (Manhasset, N.Y.) 19??-19??",
    "Manhasset mail. (Manhasset, N.Y.) 1927-1986"
  ],
  [
    "sn97063690",
    "sn95071148"
  ],
  [
    "http://chroniclingamerica.loc.gov/lccn/sn97063690/",
    "http://chroniclingamerica.loc.gov/lccn/sn95071148/"
  ]
]
curl -i 'http://chroniclingamerica.loc.gov/suggest/titles/?q=manh&callback=suggest'

HTTP/1.1 200 OK
Date: Mon, 28 Mar 2011 19:45:34 GMT
Expires: Tue, 29 Mar 2011 19:45:37 GMT
ETag: "7d786bec2ca003d86009f8ccdfd72912"
Cache-Control: max-age=86400
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: X-Requested-With
Content-Length: 7045
Last-Modified: Mon, 28 Mar 2011 19:45:37 GMT
Content-Type: application/x-suggestions+json

suggest([
  "manh",
  [
    "Manhasset life. (Manhasset, N.Y.) 19??-19??",
    "Manhasset mail. (Manhasset, N.Y.) 1927-1986"
  ],
  [
    "sn97063690",
    "sn95071148"
  ],
  [
    "http://chroniclingamerica.loc.gov/lccn/sn97063690/",
    "http://chroniclingamerica.loc.gov/lccn/sn95071148/"
  ]
]);
CORS is arguably a more elegant solution, and is supported by most modern browsers. However, JSONP might be a better option if your application needs to support legacy browsers.
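Outside a browser neither mechanism is needed, but a script may still encounter a JSONP-wrapped body. As a sketch, assuming the wrapper has the `callback(...);` shape shown in the second sample response, the padding can be stripped before parsing. The helper name `unwrap_jsonp` is hypothetical.

```python
import json


def unwrap_jsonp(body: str, callback: str):
    """Strip a JSONP wrapper such as suggest(...); and parse the JSON inside."""
    text = body.strip()
    if text.endswith(";"):
        text = text[:-1].rstrip()
    prefix = callback + "("
    if not (text.startswith(prefix) and text.endswith(")")):
        raise ValueError("body is not wrapped in the expected callback")
    return json.loads(text[len(prefix):-1])


# Shortened version of the JSONP sample response (one title only).
SAMPLE_JSONP = (
    'suggest(["manh", '
    '["Manhasset life. (Manhasset, N.Y.) 19??-19??"], '
    '["sn97063690"], '
    '["http://chroniclingamerica.loc.gov/lccn/sn97063690/"]]);'
)

payload = unwrap_jsonp(SAMPLE_JSONP, "suggest")
```

Parentheses inside the title strings are not a problem, since only the leading `suggest(` and the final `)` are removed.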