millercenter.org
The Miller Center of Public Affairs often receives requests for bulk downloads of our data. This web page is intended to help satisfy these requests.
If you are interested in downloading data from millercenter.org, please read this document. If you have questions, contact Miles Efron (mefron@virginia.edu).
Our most commonly requested data set is the presidential speeches collection. This is a corpus of text data--speeches given by U.S. presidents, from George Washington's time to the contemporary presidency. The collection is not exhaustive; inclusion is an editorial decision by Miller Center staff. However, we have over 1,000 speeches available, and many NLP and computational humanities / social science researchers find the data useful.
With this in mind, we have made the presidential speeches collection available for bulk download. See below for details.
These data are offered as-is, with no warranty or support, for the use of the research and academic communities. The speeches are in the public domain, but please cite like so:
Visitors can download the Miller Center speech archive here: https://data.millercenter.org/miller_center_speeches.tgz.
After downloading, expanding the tar archive will create a directory called speeches that contains a large number of JSON files. Each file contains a single speech, along with metadata such as the title of the speech, its date of delivery, and which president delivered it.
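For readers who prefer to script the download, the steps above can be sketched in Python roughly as follows. This is a minimal illustration, not an official tool: the function names are hypothetical, it assumes the files in speeches end in .json, and it does not assume any particular field names inside each file, since those are not documented here.

```python
import json
import tarfile
import urllib.request
from pathlib import Path

# Archive URL as given on this page.
ARCHIVE_URL = "https://data.millercenter.org/miller_center_speeches.tgz"


def download_archive(url=ARCHIVE_URL, dest="miller_center_speeches.tgz"):
    """Fetch the archive to a local file and return its path."""
    path, _headers = urllib.request.urlretrieve(url, dest)
    return Path(path)


def extract_archive(archive_path, target_dir="."):
    """Expand the gzipped tar archive into target_dir."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(target_dir)


def load_speeches(speech_dir="speeches"):
    """Parse every .json file in speech_dir.

    Returns a dict mapping file name to the parsed JSON content.
    Assumption: each speech file has a .json extension.
    """
    speeches = {}
    for json_path in sorted(Path(speech_dir).glob("*.json")):
        with json_path.open(encoding="utf-8") as handle:
            speeches[json_path.name] = json.load(handle)
    return speeches


def main():
    """Download, extract, and load the collection (requires network access)."""
    archive = download_archive()
    extract_archive(archive)
    speeches = load_speeches()
    print(len(speeches), "speech files loaded")
    # Inspect one record to discover the actual metadata field names.
    for name, record in speeches.items():
        if isinstance(record, dict):
            print(name, sorted(record.keys()))
        break
```

Calling main() runs the whole sequence. Printing the keys of one record is a simple way to find out which fields hold the title, date, and president before writing any further analysis code.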
Again, please direct questions to Miles Efron (mefron@virginia.edu).