Friday, March 1, 2013

Common Crawl: going after Google on a non-profit budget

Google Search is still one of the most powerful, reliable products on the web, thanks to a smart algorithm and the sheer brute-force quantity of data that Google pulls off the web on a daily basis. The power of Google Search rests on its crawl being bigger and faster than anyone else's. But there's a problem, or at least there is if you're a researcher: once Google has collected that data, it's not interested in sharing. It makes sense as a business play (you wouldn't want to do Bing any favors), but it cuts researchers off from one of the most powerful stockpiles of web data we have.


via The Verge - All Posts http://www.theverge.com/2013/3/1/4043374/common-crawl-going-after-google-on-a-non-profit-budget
