The Large Movie Review Dataset, originally published in the paper "Learning Word Vectors for Sentiment Analysis" by Andrew L. Maas et al., can be downloaded from http://ai.stanford.edu/~amaas/data/sentiment/.
The downloaded archive contains two folders labeled train and test. The train folder holds 12,500 positive reviews and 12,500 negative reviews, which we will use to train a classifier. The test folder contains the same number of positive and negative reviews, for a grand total of 50,000 reviews across the two folders.
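As a quick sanity check on these numbers, the reviews in each folder can be counted after extraction. The following is a minimal sketch, not a definitive loader: it assumes each review is stored as its own .txt file, and that the positive and negative reviews sit in subfolders named pos and neg under train and test. The helper name count_reviews is hypothetical.

```python
from pathlib import Path


def count_reviews(root):
    """Count .txt review files per (split, label) under the extracted archive.

    Assumed layout (not stated in the text): <root>/<split>/<label>/*.txt,
    with split in {"train", "test"} and label in {"pos", "neg"}.
    """
    root = Path(root)
    counts = {}
    for split in ("train", "test"):
        for label in ("pos", "neg"):
            folder = root / split / label
            # A missing folder is reported as zero rather than raising.
            if folder.is_dir():
                counts[(split, label)] = sum(1 for _ in folder.glob("*.txt"))
            else:
                counts[(split, label)] = 0
    return counts
```

If the extracted archive matches the description above, calling count_reviews on its top-level folder should report 12,500 files for each of the four combinations, and sum(counts.values()) should come to 50,000.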
Let's look at an example of one review to see what the data looks like:
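One way to pull up a single review is sketched below. It is an illustration under assumptions rather than the original listing: the folder names pos and neg, the one-file-per-review layout, and the helper name load_one_review are all assumed here, and the review is simply the first file in sorted path order.

```python
from pathlib import Path


def load_one_review(root, split="train", label="pos"):
    """Return (label, text) for the first review file in <root>/<split>/<label>.

    The folder layout and the .txt extension are assumptions about how
    the archive is organized after extraction.
    """
    folder = Path(root) / split / label
    # Pick the smallest path so the result is deterministic across runs.
    first = min(folder.glob("*.txt"), default=None)
    if first is None:
        raise FileNotFoundError(f"no .txt reviews found in {folder}")
    text = first.read_text(encoding="utf-8")
    return label, text
```

Printing the returned pair shows everything available for that review: the sentiment label implied by the folder it came from, and the raw review text.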
It appears that all we have to work with is the raw text of the movie review and its sentiment; we know nothing about the date posted, who posted the review, or any other data that may or may not be helpful to us aside from the text.