Data Services

Get Access to Data

The UW–Madison Libraries have licensed a number of full-text, bibliographic, and other data sets that may be suitable for text mining or similar research uses. We may be able to work with you to gain access to these data sets in a format suitable for your research. Please contact your liaison librarian or Research Data Services to see how we can help you.

Clarivate Web of Science Data Set

The Web of Science data set from Clarivate is based on the content of the UW–Madison Libraries’ subscriptions to the Web of Science database. The raw data set allows researchers to analyze the foundations of, and connections within, current science, with records dating back to 1900.

Learn how to access the data set.

Gale Digital Scholar Lab

Gale Digital Scholar Lab is an online tool for building data sets from content in the UW–Madison Libraries’ subscriptions to Gale Primary Sources databases. These data sets can be analyzed using the text analysis and data visualization tools built into the Digital Scholar Lab. Available digital humanities analysis methods include named entity recognition, topic modelling, parts of speech tagging, and more.

Learn how to access the lab.

HathiTrust Research Center

HathiTrust is a partnership of academic and research institutions, offering a collection of millions of titles digitized from libraries around the world. The HathiTrust Research Center (HTRC) enables large-scale analysis of works in the HathiTrust Digital Library (HTDL) to facilitate non-profit research and educational uses of the collection.

Learn how to access the HathiTrust Research Center.

TDM Studio

TDM Studio is an online text and data mining tool for research, teaching, and learning. It allows users to build data sets from content available through the UW–Madison Libraries’ subscription to ProQuest. Available content includes current and historical newspapers, dissertations and theses, scholarly journals, and primary sources from collections in science, technology, medicine, public policy, history, and literature.

Learn how to access TDM Studio.

Al-Ahram Digital Archive Data Set

The Al-Ahram Digital Archive Data Set comprises content from the UW–Madison Libraries’ subscription to the digitized archive (1876–2020) of Al-Ahram, one of the longest-running newspapers in the Middle East. Nationalized in 1960, the newspaper is considered by many to be the de facto voice of the central government of Egypt. This data set is available upon request.


Constellate

Constellate is a text and data analytics platform for learning, teaching, and performing text analysis. Users can create data sets and access training materials, workshops, and webinars. The platform provides access to materials from JSTOR, Portico, and other sources regardless of the UW–Madison Libraries’ subscription status.

Learn how to access the Constellate trial.

Research Data Management & Curation

Do you have questions about how to manage your research data?

Research Data Services (RDS) is an interdisciplinary organization committed to advancing research data management practice on the UW–Madison campus. RDS focuses on providing researchers with the tools and resources that support their efforts to store, analyze, and share data.

To learn more about working with Research Data Services, visit the RDS website.