Excuse my ignorance, but is it time to update the open source licenses in the light of this behavior?
If so, what should the evolved license wording be?
I appreciate that this could be easily circumvented by a 'bad actor', but it would make this abuse overt...
From my limited understanding, we have a sort of agreement in place with a file called robots.txt that's more or less a handshake with such scrapers. Of course, the issue is that these scrapers are blatantly ignoring robots.txt.
A license can help as well, but what's a license without enforcement? These companies are simply treating the courts as a cost of doing business.
Close. robots.txt was originally for web crawlers, to reduce accidental denial-of-service attacks. It had nothing to do with scraping (i.e. downloading content and parsing the HTML tags in a programmatic manner).
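For anyone who hasn't seen what honoring robots.txt looks like from the bot's side, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the bot names (`ExampleBot`, `OtherBot`), and the example.com URLs are all made up for illustration; `build_parser` and `may_fetch` are hypothetical helper names, not part of any library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Crawl-delay: 10
Disallow: /tmp/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# A polite bot would run this check before every request.
for agent, url in [
    ("ExampleBot", "https://example.com/private/report.html"),
    ("OtherBot", "https://example.com/private/report.html"),
    ("OtherBot", "https://example.com/tmp/cache.html"),
]:
    print(agent, url, may_fetch(parser, agent, url))
```

Note that the check runs entirely on the client: a bot that never calls it can still request any URL, so the file works only as a convention between cooperating parties.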
What do you think a search engine’s crawler bot is doing exactly? I could sure be wrong, but I have a hunch that “downloading content and parsing the HTML tags in a programmatic manner” describes it.
Yes, but the difference is that the term "scraping" also targets things like automatically generating RSS feeds from HTML pages, which is not covered by robots.txt.
I thought robots.txt covered all automated, programmatic access by third parties where a bot slurps stuff and follows links, without splitting hairs about it.
But what do I know, the young whippersnappers will just word lawyer me to death, so I better shut up and go away.