If you wish to undertake a text or data mining project with content from the Library's licensed databases, please contact a Librarian to investigate options, which may include negotiating with the vendor or purchasing access to the data. Although many database licenses prohibit text and data mining and the use of software such as scripts, agents, or robots, we are actively negotiating text mining rights with database vendors. Unauthorized text or data mining in violation of our licenses can result in loss of access for the entire Wellesley College community.
Please also see our Best Practice Tips for mining licensed databases.
Resource | Details |
---|---|
Adam Matthew | Primary source collections spanning the 15th to 21st centuries and containing millions of pages. Adam Matthew allows data mining/text analysis free of charge for fair use/academic research. Secure online access to the data via an API can be provided on submission of an information form. Librarians may contact Adam Matthew via info@amdigital.co.uk to discuss data extraction from the main collection website by automated software. See the Adam Matthew Text Mining/Data Mining Statement for more information. |
Early English Books Online (EEBO) | Digital facsimile page images of virtually every work printed in the English-speaking world from 1473 to 1700, as well as some items printed after 1700. 25,000+ selected texts from the EEBO corpus are available to download for text analysis through the Text Creation Partnership. |
Eighteenth-Century Collections Online (ECCO) | Digital facsimile page images of significant English-language and foreign-language titles printed in the United Kingdom during the 18th century, along with thousands of important works from the Americas; includes books, pamphlets, broadsides, and ephemera. All data is available for bulk download for text mining through the Text Creation Partnership. |
JSTOR | JSTOR's Data for Research self-service site provides datasets for the journals, books, research reports, and pamphlets in the JSTOR digital library at no cost to researchers and libraries. Researchers may create a dataset of up to 25,000 documents (metadata and/or n-grams) using the self-service option. (See How to Create a Dataset.) Large and full-text datasets are provided by request and require an agreement about the use of the data. |
Oxford English Dictionary (OED) | Oxford University Press offers a free prototype API or a developer plan to access all data and functionality. See API FAQ. Read more about research partnerships that use OED datasets. |
ScienceDirect | Researchers at subscribing academic institutions can text mine subscribed full-text ScienceDirect content via the Elsevier APIs for non-commercial purposes. |
Women Writers Online | A full-text collection of early women’s writing in English, including full transcriptions of texts published between 1526 and 1850, focusing on materials that are rare or inaccessible. |