by pawelkobojek on 7/7/22, 12:33 PM with 227 comments
by iandanforth on 7/7/22, 2:03 PM
"scraping attacks"
Scraping is not an attack. Monopolists want to pretend they own your data because they get unlimited access to monetize it whereas competitors should have none.
"self-compromised"
Monopolists want to sell you, thus it's imperative they maintain the fiction of "one person, one account". By admitting you own your account, they'd have to allow sharing and they wouldn't be able to provide their customers (advertisers) with reliable data about individuals.
"protect people from scraping"
Monopolists will protect themselves and call it protecting you. They will attempt to make you afraid of some other actor using your data in harmful ways so as to distract from how they monetize you and use your data in harmful ways.
"deter the abuse"
Monopolists don't want to argue about what constitutes abuse. Anything they write in their TOS is entirely for their benefit and only constrained by local law (if that). They will abuse you to the fullest extent they can get away with while arguing that any action to use your rights is "abuse."
"safeguard people against clone sites"
Monopolists want to maintain their monopoly, and there is no greater threat to that monopoly than allowing data to move freely.
--
More subtle but even more ironic rhetorical points
"for hire" / "paying for access"
Emphasizing that people making money (gasp) by providing this service is bad.
"industry leader in taking legal action" + "across many platforms and national boundaries, also requires a collective effort from platforms, policymakers and civil society"
Monopolists can pay high priced marketers to rebrand them as patriotic hero figures fighting valiantly for the little guy.
by fxtentacle on 7/7/22, 1:43 PM
But account hijacking and mass-creation of accounts just to access private pages are clear violations of the Facebook and Instagram ToS, so they surely can sue for that.
by HeckFeck on 7/7/22, 1:22 PM
by rustdeveloper on 7/7/22, 1:15 PM
by PhilipA on 7/7/22, 1:42 PM
It is interesting how they try to position this as a Chinese attack on them.
by throwaway_meta on 7/7/22, 3:28 PM
With Cambridge Analytica:
- Facebook allowed users (with informed consent) to allow external developers to access their data and limited data about their friends, in order to build social-enabled apps.
- CA exploited this to scrape basic profile data from a large number of users. It broke the ToS by doing so (in particular by using the data for purposes other than those stated).
Here the same is happening:
- people are giving a third-party company access to their profile, which includes access to friends' data (in fact a lot more than the app platform allowed)
- the company is scraping all the data.
At the time of CA, the criticism was that Facebook didn't do enough to enforce its ToS (or maybe that the data sharing should not have been allowed in the first place? But the terms were common knowledge and the attack potential became clear only in hindsight). Here, people are criticizing Facebook for in fact enforcing its ToS.
Also note that strong enforcement against scraping is one of the mandates that came from the FTC settlement.
It seems inevitable that any news about Facebook/Meta is read in the worst possible light these days, even when the criticism is self-contradictory. I would expect less superficial commentary from HN.
by carride on 7/7/22, 1:58 PM
by htrp on 7/7/22, 1:40 PM
by i_have_an_idea on 7/7/22, 1:59 PM
"self-compromised" lol
clearly these people just wanted an automated way to access their own data
by pclmulqdq on 7/7/22, 1:25 PM
by ok123456 on 7/7/22, 3:03 PM
Google blocked them.
There was animus between the two companies that resulted in Facebook not making an official Android app until 2010.
by pid-1 on 7/7/22, 1:27 PM
by almog on 7/7/22, 2:12 PM
Sorry for being vague here, I haven't publicly disclosed it yet, but will probably have to if it doesn't get fixed.
by nicholasjarnold on 7/7/22, 2:53 PM
I was a webmaster of a set of servers on a major university's network. I also had access (enough to run arbitrary programs that had pretty much full ingress/egress to the public internet) to a number of machines across the campus's network. Through some of my coursework and ACM chapter activities I met some other similarly minded technical people with similar levels of access.
We decided that it would be fun to use our superpowers (access + programming abilities + curiosity) to sign up for various accounts on FB and essentially scrape and friend as much as possible. At the time they had some rate limiting and some IP banning (which wasn't terrible because the Uni gave public IPv4 addrs to all machines on campus by default), and then added some early CAPTCHA, which we ended up breaking pretty trivially with some Python and image recognition code.
Never got sued... :) Never really did much with the scripts or data except test that they worked. Fun times.
by cosmiccatnap on 7/7/22, 1:43 PM
by paultopia on 7/7/22, 1:47 PM
by samsoftstuff on 7/7/22, 2:11 PM
by Nextgrid on 7/7/22, 3:32 PM
> After paying for access to the scraping software, customers self-compromised their Facebook and Instagram accounts by providing their authentication information to Octopus.
They didn't "self-compromise" their account. They trust Octopus to act on their behalf, and unlike Facebook, Octopus' interests are most likely more aligned with their users' since their service is paid. This is no different from handing your Facebook credentials to your social media manager or secretary. There's no evidence that Octopus misused this access in any way.
> Octopus designed the software to scrape data accessible to the user when logged into their accounts, including data about their Facebook Friends such as email address, phone number, gender and date of birth, as well as Instagram followers and engagement information such as name, user profile URL, location and number of likes and comments per post.
This is either information people intend to be public or information they trust their friends to keep private. Now if Octopus were leaking the private information to third parties it would be one thing, but so far I see no evidence Octopus was disclosing the scraped information to anyone but their customer (who is already authorized to access it).
> Meta is an industry leader in taking legal action to protect people from scraping and exposing these types of services
Translation: Meta is an industry leader in protecting its disgusting business model that hinges on keeping public data behind a walled garden with an unacceptable "privacy" policy. There wouldn't be a market for Octopus (or other scrapers) if Facebook already allowed customers to efficiently access information they're already entitled to, but that would be against their interests as their entire business hinges on information being held hostage.
They've created a problem, are selling the cure (well in this case monetizing it via ads) and are now pissed off that someone else is selling the cure for cheaper.
by Litost on 7/7/22, 2:38 PM
by allenleein on 7/7/22, 1:53 PM
by viburnum on 7/7/22, 2:08 PM
by dangerlibrary on 7/7/22, 1:39 PM
https://www.nytimes.com/2020/01/18/technology/clearview-priv...
by oxff on 7/7/22, 2:10 PM
by trasz on 7/7/22, 1:45 PM
by jmyeet on 7/7/22, 2:06 PM
On one side, you have people who say any form of scraping should be disallowed, even prosecutable. This went so far that the Department of Justice, on behalf of AT&T, prosecuted a case of URL modification [1]. One of the few bright spots for this psychotic Supreme Court was to curtail the government's power under the CFAA by limiting what constituted "unauthorized" access [2].
On the other hand, there are those who think that any level of scraping should be fine and I think that's untenable too. Consider Yahoo indexing of Stack Overflow [3]:
> In the meantime, since Yahoo (via Slurp!) is about 0.3% of our traffic, but insists on rudely consuming a huge chunk of our prime-time bandwidth, they’re getting IP banned and blocked.
Do these "scraping extremists" think such actions should be illegal? It's actually not that far-fetched given the Ninth Circuit decided LinkedIn wrongly blocked HiQ scraping [4]. Like if you change your website with the intent that it'll make scraping more difficult, is that a problem? What if it's an unintended side effect?
Additionally, companies like Meta, Google and Apple are going to be way more accountable for abiding by data retention laws and regulations than any scraper. If it's OK to scrape FB.com completely, that information is out there forever.
I certainly think the government shouldn't prosecute on behalf of companies. At least that should expose to people how the government's #1 priority is in fact to protect the true constituents: corporations and the capital-owning class.
[1]: https://www.techdirt.com/2013/09/30/dojs-insane-argument-aga...
[2]: https://en.wikipedia.org/wiki/Van_Buren_v._United_States
[3]: https://stackoverflow.blog/2009/06/16/the-perfect-web-spider...
[4]: https://blog.ericgoldman.org/archives/2019/09/ninth-circuit-...
by romanovcode on 7/7/22, 2:23 PM
Sure, as long as Meta is not the one selling the data to Cambridge Analytica it's wrong.
by xvector on 7/7/22, 1:51 PM
by throwaway5959 on 7/7/22, 2:00 PM
by NelsonMinar on 7/7/22, 2:36 PM
by typon on 7/7/22, 2:32 PM
by dmje on 7/7/22, 4:00 PM
by rmbyrro on 7/7/22, 6:00 PM
by upupandup on 7/7/22, 3:21 PM
I don't know how far Facebook can get with this. I thought the LinkedIn court ruling made scraping de facto legal.
by jascii on 7/7/22, 2:16 PM
by postalrat on 7/7/22, 3:53 PM
by neya on 7/7/22, 2:21 PM
Well, color me surprised /s
Fuck Facebook. Meta. Or whatever you want to call it.
by Hedepig on 7/7/22, 1:30 PM
by throw20220707 on 7/7/22, 2:10 PM
3rd parties don't have the consent from users. Users don't even have an idea these companies might be holding their data.
by uhtred on 7/7/22, 4:07 PM
by Komodai on 7/7/22, 1:53 PM
by jacooper on 7/7/22, 1:37 PM