This corpus consists of 128,951,238 words (tokens) of web-extracted text. It covers the period January 2000 to December 2010 (132 months), with approximately 1 million words per month.
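
As a quick sanity check on these figures, the short sketch below derives the number of months in the covered period and the average token count per month. It assumes the period is inclusive of both January 2000 and December 2010; the variable and function names are illustrative and not part of the corpus distribution.

```python
from datetime import date

# Figures taken from the corpus description above.
TOTAL_TOKENS = 128_951_238
START = date(2000, 1, 1)   # first month covered (assumed inclusive)
END = date(2010, 12, 1)    # last month covered (assumed inclusive)


def months_inclusive(start: date, end: date) -> int:
    """Count calendar months from start to end, including both endpoints."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


n_months = months_inclusive(START, END)
avg_tokens_per_month = TOTAL_TOKENS / n_months

print(f"months covered: {n_months}")
print(f"average tokens per month: {avg_tokens_per_month:,.0f}")
```

The average comes out slightly below 1 million, which is consistent with the "approximately 1 million words" per month stated above; individual months may of course deviate from this mean.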