Datasets

3 Million Russian Troll Tweets A collection of Russian troll tweets released by FiveThirtyEight.
Authorship Detection with Authors from Project Gutenberg A folder containing raw text files to use for authorship detection.
EBSCOhost A database of academic articles.
Eviction Lab A repository of eviction datasets.
Kaggle Kaggle is a repository of datasets, data competitions, and coding notebooks for analyzing datasets.
Law & Order Season 16, Episode 12 Network A character network for an episode of Law & Order.
Lexis Nexis A legal and professional document search engine.
Medicare Provider Utilization and Payment Data Provides data about what hospitals charge Medicare for inpatient and outpatient services.
Moby Dick & Pride and Prejudice combined A concatenation of "Moby Dick" and "Pride and Prejudice".
Opioid crisis news articles News articles covering the opioid crisis and an accompanying Voyant page.
Project Gutenberg A collection of public domain books in a variety of formats, including plain text.
Rhode Island lead poisoning data Location data of lead poisoning incidents in Rhode Island.
The Salem Witchcraft Papers Transcriptions of court records.
The Wire network of character speech interactions A list of character speech interactions in season 1 of The Wire.