by pixelmonkey on 10/13/19, 5:41 PM
The summary here is that LinkedIn tried to argue that it could prevent scraping of public LinkedIn profile data under its ToS, but the courts have ruled that if data is public and provided by users, it can be scraped/crawled; that is, it isn’t LinkedIn’s property. This is generally a positive outcome for people/companies turning web text and HTML into structured data, e.g. tools like Puppeteer and Scrapy can be used more freely on sites like LinkedIn, Twitter, and Reddit. Now, you might still get into trouble if you re-publish that data, but you can, at least, safely use the data “internally”, and the act of scraping/crawling (politely) is not, per se, something unlawful.
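As a rough sketch of what crawling "politely" could look like: consult the site's robots.txt and respect any crawl delay before fetching a page. The robots.txt text, the bot name, and the URLs below are invented for illustration. The code uses only Python's standard `urllib.robotparser` and makes no network requests. Honoring robots.txt is a courtesy convention among crawlers, shown here only as one possible reading of "politely".

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_policy(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a policy object without touching the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def plan_fetch(parser: RobotFileParser, user_agent: str, url: str):
    """Return (allowed, delay_seconds) for a single URL."""
    allowed = parser.can_fetch(user_agent, url)
    # Fall back to a 1-second pause when the site states no crawl delay (an assumption).
    delay = parser.crawl_delay(user_agent) or 1
    return allowed, delay


policy = build_policy(ROBOTS_TXT)
```

A crawler built on this would skip any URL where `allowed` is false and sleep for `delay` seconds between requests to the same host.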
by echelon on 10/13/19, 5:20 PM
This is fantastic. I would like to see wider legislation making scraping of IMDB, Genius, Reddit, Facebook, and Google legal. These services receive free input from users. The data should remain free.
Edit (sort of off topic): There's still value in building and providing services at scale, but this lowers the barrier to cross the moat for small players. The first step is data liberation. Then we can work to bring down the other cost barriers. It's a lot easier to build services that scale in 2019 than it was in 2005.
The semantic web was misguided in 200X, but we might want to take another swing at it in the future.
by perspective1 on 10/13/19, 5:27 PM
I'm torn. On the one hand, scraping helps break down walled gardens. On the other, we're talking about personal details being used in novel ways that probably no LinkedIn user understands. I doubt any LinkedIn user writes their profile expecting HiQ to scrape it, assign a "flight risk" score and alert their bosses.
by undefined3840 on 10/13/19, 5:50 PM
I recently learned from a recruiter that one license for one recruiter for LinkedIn is $10k a year, so that is what they are protecting.
by phs318u on 10/13/19, 9:36 PM
I’m a very active user of LinkedIn, effectively cultivating my “professional brand” on it. I’ve been contracting for years and use my network to find gigs. While I don’t have an issue with the business that HiQ are in (informing businesses of employee flight risk), I do believe there’s a qualitative difference between data that I publish for consumption by human eyeballs for free (a use of my data that I’ve authorised), and someone harvesting such data en masse for commercial purposes that I have not authorised. HiQ have not asked for my permission to use my data, and they have not made any commitments about how they will and will not use it. Given that they have access to my contact details (even via LI itself), they are capable of contacting me to request permission to use my data.
by danielrhodes on 10/13/19, 7:19 PM
LinkedIn has played a very poor strategy here. The value of the service should be in the network, which is quite defensible. Instead, they’ve made the value in the profiles, which is not defensible. Few people curate their network on LinkedIn because you can't see profiles unless you are closely connected, so you are incentivized to add as many people as possible, thus devaluing the entire network. Then they go and sell unlimited access to profiles to recruiters and sales people. Thus, when other services come around and scrape their data, which LinkedIn needs to make somewhat publicly available for SEO juice, it becomes an existential threat.
If you look at Facebook, there is some limited profile data publicly available, but they will go to the wall to prevent people from seeing how those people are connected. In addition, they started from a very walled-off position, so they didn't become reliant on SEO traffic.
by crazygringo on 10/13/19, 5:42 PM
Question:
This seems to mean LinkedIn can't sue to prevent scraping.
I assume it's still legal for them to implement technological anti-scraping measures? So the two companies can play cat-and-mouse if they wish with rate-limiting, IP addresses, etc...
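For illustration, the rate-limiting half of that cat-and-mouse game might look like a per-IP sliding-window limiter. This is a minimal sketch and not a description of what LinkedIn actually runs; the class name, the limits, and the example IP addresses are all hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.hits = defaultdict(deque)  # client IP -> timestamps of recent requests

    def allow(self, client_ip):
        now = self.clock()
        recent = self.hits[client_ip]
        # Forget requests that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False  # over the limit: the server would reject or challenge
        recent.append(now)
        return True
```

The "mouse" side of the game is visible in the data structure: limits are keyed by IP address, so a scraper that spreads requests over many addresses stays under each individual limit.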
by tempestn on 10/13/19, 7:34 PM
by lr4444lr on 10/13/19, 6:27 PM
What cracks me up about this is how these massive companies go to such lengths to call themselves mere platforms in order to avoid liability for content, and then when someone actually takes the content in this case they cry, "Foul! That's ours!" Can't have it both ways.
by playing_colours on 10/13/19, 7:06 PM
I do not like the hide-and-seek game with the "who viewed your profile" functionality: upgrade to a paid subscription to see who viewed, upgrade to another tier to hide that you looked at someone.
It looks like a lack of imagination or business prowess to come up with more advanced, valuable, and less annoying ways of monetisation. If only they could make it easier to connect people with matching mutual interests, in a way more flexible than a plain traditional job board and a database of CVs.
by datelinereader on 10/13/19, 6:45 PM
by xupybd on 10/13/19, 9:25 PM
After finding this https://github.com/Greenwolf/social_mapper, I strongly recommend against having a profile photo on LinkedIn. It has caused me to be far more careful about my presence on the internet.
In the post privacy age I don't want my personal opinions to come back and haunt me. I grow as a person but the internet remembers all. If I make a dumb mistake and it's published online that's not a problem for me in 10 years if that fades away. But people are collecting and correlating info now. I don't like it one bit. It means someone you've never met, in a country you've never been to could extort you. It's getting very scary.
by gist on 10/13/19, 5:41 PM
I think what most people also don't realize is that LinkedIn's current model makes it difficult to view someone's profile without them knowing (if they pay for it and have the option on their account, they can see who is looking at their profile). As such, a user who looks at a person's profile has no privacy about having done so. There could be many reasons someone looks at someone else's profile (even just some kind of curiosity or a mistake), so this to me is an issue in itself.
Sure, there are ways around this (you can make up a fake profile, and some info is public, but normally what I run into is a request to log in to LinkedIn to view something that I am interested in).
by ChrisMarshallNY on 10/13/19, 8:18 PM
Personally, this doesn't bother me too much. I use LinkedIn specifically because it is public. I'm an "open kimono" type of person. Not particularly interested in hiding stuff.
However, the general principle of "Data Scraping as a Business Model" bothers me. This is by no means the only company that does it (I suspect that MS does it with their access to LinkedIn).
There are far more egregious instances, and many of them have ways to get users to voluntarily cede information (can you think of a rather obvious example?).
LinkedIn is a sandwich board. It's meant to be a public showcase. If you want private, I suspect there are much more focused (and probably valuable) venues that cater to particular communities.
by hooloovoo_zoo on 10/13/19, 6:06 PM
What if LinkedIn adds a visibility option in addition to public/private profile that says "I want LinkedIn to prevent robots from scraping my profile."? What if LinkedIn enables that mode by default? Can they then continue preventing scrapers?
by myth_buster on 10/13/19, 9:11 PM
by conjectures on 10/14/19, 8:10 AM
IP aside, anyone else concerned about the business of HiQ?
I presume what they are doing is:
* Scrape profiles.
* Calculate time delta in jobs.
* 'Predict' churn rate for (prospective) employee.
With respect to prospective employees in particular this seems likely to entail lots of risks. Average job time delta is going to be a massively overdetermined variable, and noisy wrt 'next job delta'. I'm worried how they're going to sell that to employers.
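The guessed pipeline above can be sketched in a few lines. To be clear, this illustrates the commenter's guess and is not hiQ's actual method; the function names and the sample job history are made up.

```python
from datetime import date
from statistics import mean


def tenures_in_months(jobs):
    """jobs: list of (start, end) date pairs, one per past position."""
    return [
        (end.year - start.year) * 12 + (end.month - start.month)
        for start, end in jobs
    ]


def naive_next_tenure(jobs):
    """Naively 'predict' the next tenure as the mean of the past ones."""
    return mean(tenures_in_months(jobs))


# Invented job history standing in for a scraped profile.
profile = [
    (date(2012, 1, 1), date(2014, 1, 1)),
    (date(2014, 2, 1), date(2015, 2, 1)),
    (date(2015, 3, 1), date(2018, 3, 1)),
]
```

With only a handful of past jobs per person, a single long or short stint moves the mean a lot, which is the noise concern raised above.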
by mminer237 on 10/13/19, 8:29 PM
by spider-mario on 10/13/19, 10:42 PM
> “And as to the publicly available profiles, the users quite evidently intend them to be accessed by others”
How is it evident that the users intend them to be accessed by scrapers and not just humans? Since the ToS forbid scraping, it seems very reasonable to me to imagine users making their profiles public because of that assumption that scraping is not tolerated.
by alkonaut on 10/13/19, 8:00 PM
What is the limit for what is "user provided"? My entire facebook profile, including my social graph is "user provided".
Does this mean that it would likely be possible for a competing network to have a "click here to import your friend list" for example?
by brushfoot on 10/13/19, 5:52 PM
This is great news. The data is public; it shouldn't matter whether you hire humans to parse it or develop a bot. LinkedIn was trying to have its cake and eat it too.
by Causality1 on 10/13/19, 5:41 PM
Would it really be that difficult for LinkedIn to require users to be logged in before viewing profiles and include anti-automation rules in the EULA?
by donohoe on 10/13/19, 7:25 PM
In case it's not clear, this is from September.
by mherdeg on 10/14/19, 5:00 AM
Hmm, how does this compare versus the Craigslist/3Taps/Radpad litigation? Are these similar issues?
by EGreg on 10/13/19, 7:28 PM
It sounded like this was going to be an opinion piece about how LinkedIn is losing its appeal to users.
by atombender on 10/13/19, 5:48 PM
Anyone versed in U.S. law who can comment on whether the judgement in this case sets a precedent?
by Barrin92 on 10/13/19, 6:19 PM
As expected, a lot of people here are talking about public data and whatnot, but this is a horrible decision.
"Circuit Judge Marsha Berzon said hiQ, which makes software to help employers determine whether employees will stay or quit, showed it faced irreparable harm absent an injunction because it might go out of business without access.[...]
“LinkedIn has no protected property interest in the data contributed by its users, as the users retain ownership over their profiles,” Berzon wrote. “And as to the publicly available profiles, the users quite evidently intend them to be accessed by others,” including prospective employers."
This isn't some sort of empowerment of the public, it's surveillance capitalism. No end-user in their right mind publishes data on LinkedIn with the expectation that the information is bought up by a third party, analysed, and then sold back to your employer in a way that exposes your personal intent and may even threaten your job. The only thing this accomplishes is enabling shady business models that feed off a sort of internet voyeurism, and at the end of the day it'll lead to people turning their profiles private and making LinkedIn more difficult to use if you're someone who is looking for information in good faith.
by onetimemanytime on 10/13/19, 6:54 PM
> that required LinkedIn, a Microsoft Corp unit with more than 645 million members, to give hiQ Labs Inc access to publicly available member profiles.
Not sure this is a win for the web. Sure, it's user-submitted, but the users agreed that LinkedIn owns that after they submit.
by rgross1 on 10/13/19, 6:33 PM
Are there any useful bots for scraping LI profiles out there?
by buboard on 10/13/19, 7:49 PM
OK, how is that going to work for Facebook?
by NKosmatos on 10/13/19, 9:23 PM
This whole situation with public data, personal information, data scraping, GDPR, and us putting our own info on various sites that display it publicly and then complaining if someone collects and uses it, has gotten out of hand :-(
I think I’ll have to side with hiQ on this.
by pkilgore on 10/13/19, 7:17 PM
> September 9, 2019 / 1:34 PM / a month ago