Start mining - FREE 100MM tweet db

_hfqa · on May 26, 2011

Download the db here. Please don't abuse it: http://scramblermedia.com/twitter.sql.gz

sidmitra · on May 26, 2011

Awesome, thanks. I'll see if I can seed a torrent. Will post here if I do that.

_hfqa · on May 26, 2011

super

_hfqa · on May 26, 2011

I had to remove the file from the server because of bandwidth issues.

_hfqa · on May 26, 2011

BTW, the SQL contains:
- bio data (7MM)
- tweets (90MM)
- followers (10MM)
- following (10MM)
- location (7MM)
- profileName (10MM)
- relationships (100MM)
- websites (4.5MM)
- users (20MM)

-- 350+MM rows total --
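
For anyone who wants to check what is in the dump before importing it, here is a rough sketch that streams the gzipped file and tallies INSERT statements per table without decompressing it to disk. It assumes the file is a mysqldump-style dump with one `INSERT INTO ...` statement per line; the actual layout and table names were not posted, and the function names below are made up for illustration.

```python
import gzip
import re
from collections import Counter

# Assumption: statements look like  INSERT INTO `tweets` VALUES (...);
# The real dump format was not described in the thread.
INSERT_RE = re.compile(r"^INSERT INTO\s+`?(\w+)`?", re.IGNORECASE)


def count_insert_statements(lines):
    """Count INSERT statements per table from an iterable of SQL lines."""
    counts = Counter()
    for line in lines:
        match = INSERT_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def count_in_dump(path):
    """Stream a gzip-compressed SQL dump line by line and tally INSERTs."""
    # errors="replace" keeps the scan going if some tweets are not valid UTF-8.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return count_insert_statements(handle)
```

Calling `count_in_dump("twitter.sql.gz")` would return a per-table tally. Note that this counts statements, not rows: if the dump uses multi-row inserts, one statement can carry many rows, so the numbers will be lower than the row counts listed above.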

cstrouse · on May 25, 2011

If you upload it to my server I will help you seed it from two locations. Email me for details.

_hfqa · on May 26, 2011

Thanks to Jason for putting this up on the archive.org site: http://www.archive.org/details/2011-05-calufa-twitter-sql

jparicka · on May 26, 2011

http://codebiatch.com/ .. the file is still uploading if it's not in there yet. Good luck with your project!

_hfqa · on May 26, 2011

seems to work fine ;) ... thanks

fhsdfh · on June 4, 2011

Can someone help a novice and explain what types of things can be achieved with such a dump?

uptown · on May 26, 2011

Thanks for the data. Guess it's time to see whether my ISP has a data cap or not.

JoachimSchipper · on May 26, 2011

So, you are scraping Twitter (likely violating their ToS) to get users' Tweets (likely violating their copyright) and now posting about it on HN? When Twitter is selling chunks of its stream, e.g. via InfoChimps?

I don't want to be mean, but this doesn't strike me as a very good idea.

MrMcDowall · on June 8, 2011

It didn't work out well for the last guys who tried it :(

http://discovertext.com/osamabinladen.aspx

mikelbring · on May 25, 2011

Throw it on a torrent?

_hfqa · on May 25, 2011

Tried that before and I don't know why it doesn't seed. I have had problems with my OSX admin user before... reinstalling everything is not really an option... :/