AI Bots

02023-04-20 | Computer, Internet | 0 comments

RANK – DOMAIN – TOKENS – PERCENT OF ALL TOKENS

768,560 – ottmarliebert.com – 30k – 0.00002%

See the websites that make AI bots like ChatGPT sound so smart – Washington Post

It would be one thing if we were building something together, some kind of open source chat bot, but instead this is all fodder for a proprietary, corporate machine that costs money to use. 

The three biggest sites were patents.google.com No. 1, which contains text from patents issued around the world; wikipedia.org No. 2, the free online encyclopedia; and scribd.com No. 3, a subscription-only digital library. Also high on the list: b-ok.org No. 190, a notorious market for pirated e-books that has since been seized by the U.S. Justice Department. At least 27 other sites identified by the U.S. government as markets for piracy and counterfeits were present in the data set.

See the websites that make AI bots like ChatGPT sound so smart – Washington Post

I get that Wikipedia would rank highly, since it is a free online encyclopedia, but how did they gain access to the subscription-only digital library ranked third? Did someone pay for an account and then use it to scrape the entire website? And a market for pirated e-books, since seized by the U.S. Justice Department?!?!

Interesting times!
