from Hacker News

Show HN: A structured list of jobs from “Who is hiring?”, parsed with GPT

by marcotm on 3/22/23, 12:15 PM with 28 comments

  • by marcotm on 3/22/23, 12:16 PM

    I wanted to share a little side project of mine that I created while tinkering around with GPT-3.

    The project uses the Algolia HN Search API [1] to retrieve the "Who is hiring?" posts from HN and then parses them with the help of GPT-3 / GPT-3.5 (I do not have API access to GPT-4 yet, but it already works quite well even with the older models). It then puts the job postings into a structured list that is hopefully easier to skim than the original postings. There are some additional features like sorting jobs by semantic similarity (based on the text embeddings from OpenAI). Filtering, sorting, and saving favorites are implemented client-side, so your data and preferences remain local to your browser.
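
    A minimal sketch of the retrieval step, not the project's actual code. It assumes the public Algolia HN Search endpoints `search_by_date` and `items/<id>`, that the threads are posted by the `whoishiring` account, and that each top-level comment is one job posting. All helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the Algolia HN Search API.
API_BASE = "https://hn.algolia.com/api/v1"


def hiring_threads_url(limit: int = 5) -> str:
    """Build a search URL for the newest stories by the `whoishiring` account."""
    query = urlencode({"tags": "story,author_whoishiring", "hitsPerPage": limit})
    return f"{API_BASE}/search_by_date?{query}"


def item_url(story_id: int) -> str:
    """Build the URL that returns a story together with its comment tree."""
    return f"{API_BASE}/items/{story_id}"


def is_hiring_thread(title: str) -> bool:
    """The same account also posts other monthly threads, so check the title."""
    return "who is hiring" in title.lower()


def fetch_json(url: str) -> dict:
    """Download and decode one JSON document (performs a network request)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def top_level_postings(item: dict) -> list[dict]:
    """Keep top-level comments that still have text; replies are ignored."""
    return [
        {"id": child["id"], "author": child.get("author"), "text": child["text"]}
        for child in item.get("children", [])
        if child.get("text")
    ]
```

    With network access, `fetch_json(hiring_threads_url())` would return the search hits, and the text of each entry from `top_level_postings(fetch_json(item_url(...)))` is what would be handed to the model for parsing.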

    Originally, this wasn't even meant to be a public product, but if people find it useful (and HN is fine with it), I'll try to keep it running. I've also written a short article about how the parsing works behind the scenes [2]. It's quite amazing how easy many of the classic NLP tasks have become with the newer LLMs.

    Happy to answer any questions about the project!

    [1] https://hn.algolia.com

    [2] https://marcotm.com/articles/information-extraction-with-lar...
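
    For the semantic-similarity sorting mentioned above, a rough sketch of the idea: rank jobs by cosine similarity between a reference embedding and each job's embedding. It assumes every job already carries a precomputed vector under a hypothetical `embedding` key. The toy vectors are two-dimensional, and this is an illustration, not the site's client-side code.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def sort_by_similarity(reference: list[float], jobs: list[dict]) -> list[dict]:
    """Return jobs ordered from most to least similar to the reference vector."""
    return sorted(
        jobs,
        key=lambda job: cosine_similarity(reference, job["embedding"]),
        reverse=True,
    )


# Toy data: real text embeddings have far more dimensions.
jobs = [
    {"name": "a", "embedding": [0.0, 1.0]},
    {"name": "b", "embedding": [1.0, 0.0]},
    {"name": "c", "embedding": [1.0, 1.0]},
]
ranked = [job["name"] for job in sort_by_similarity([1.0, 0.0], jobs)]
```

    The reference vector could be the embedding of a job the user has starred, so the most similar postings float to the top.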

  • by flanbiscuit on 3/22/23, 2:23 PM

    This is cool! I am definitely going to use this

    A couple of small things.

    First, a request: would you be able to add filtering by location other than #remote? Say I wanted to see only jobs in the US; there's no way of doing that. It would also mean that "Santa Monica, CA" should show up in the US filter, so that could get tricky. Same for Europe, where "Munich, DE" should show up in a filter for "Europe".
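
    One way such a region filter could work, as a sketch under stated assumptions: take the last comma-separated part of the location and look it up in small tables. The tables, the `location` key, and the function names are all hypothetical and deliberately incomplete. The tricky part is ambiguity: "CA" could also mean Canada and "DE" could also mean Delaware, which a plain lookup like this cannot resolve.

```python
# Deliberately incomplete sample tables (assumptions for illustration only).
US_SUFFIXES = {"US", "USA", "CA", "NY", "TX", "WA", "MA"}
EUROPE_SUFFIXES = {"DE", "FR", "NL", "ES", "SE"}


def regions_for(location: str) -> set[str]:
    """Map a location such as 'Munich, DE' to the coarse regions it belongs to."""
    suffix = location.split(",")[-1].strip().upper()
    regions = set()
    if suffix in US_SUFFIXES:
        regions.add("US")
    if suffix in EUROPE_SUFFIXES:
        regions.add("Europe")
    return regions


def filter_by_region(jobs: list[dict], region: str) -> list[dict]:
    """Keep jobs whose location falls inside the requested region."""
    return [job for job in jobs if region in regions_for(job.get("location", ""))]


jobs = [
    {"company": "A", "location": "Santa Monica, CA"},
    {"company": "B", "location": "Munich, DE"},
    {"company": "C", "location": "Remote"},
]
us_jobs = [job["company"] for job in filter_by_region(jobs, "US")]
europe_jobs = [job["company"] for job in filter_by_region(jobs, "Europe")]
```

    Since the postings are already parsed by GPT, the model could instead be asked to emit a normalized country field, which would sidestep the suffix ambiguity.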

    and 2nd, the first three icon buttons at the beginning of each row are not accessible.

    - You are using <div><img></div>, but since they are clickable items that perform actions on the page they should be <button type="button">s OR the less recommended way is to use aria attributes + tabindex + role="button" (but honestly you don't need to really do that because buttons come with it built in, just use buttons and css). If you go the non-button route: https://developer.mozilla.org/en-US/docs/Web/Accessibility/A...

    - your icons need some kind of accessibility text because it is not obvious to me what they do. The only one that is clear (to me) is the star; the other two do something that I did not expect. I thought I was sorting, but then something popped up above the table, which was confusing

      a. add screen reader friendly text for them using the visually-hidden css class in the link below
      b. add a `title` attribute for everyone else. 
    
    https://www.a11yproject.com/posts/how-to-hide-content/

    awesome work!

  • by rwhyan on 3/22/23, 4:05 PM

    Looks great!

    I've been playing around with GPT information extraction, and I think your prompt can be simplified to save on token costs:

    Instead of:

    `The company name (field name: "companyName", field type: string)`

    I use a prompt that looks like:

    `... The JSON should consist of the following information, using the format <field name: field type>: The company name <companyName: string>`

    I've also played around using JSON structure in the prompt, such as:

    `Return a JSON object with following model, with the format <field type: instructions to extract> { "companyName": <string: The company name>, ... }`

    In my experience, often the attribute name is enough and GPT can infer how to extract the information (i.e. { "companyName": string ... }).
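
    To make the comparison concrete, here is a small sketch that builds both field styles from the same schema. Only `companyName` comes from this thread; the other two fields and all function names are hypothetical. Character counts are only a rough proxy for token counts, which would need the model's tokenizer to measure.

```python
# (description, field name, field type); only companyName is from the thread.
FIELDS = [
    ("The company name", "companyName", "string"),
    ("The job title", "jobTitle", "string"),
    ("Whether the job is remote", "isRemote", "boolean"),
]

HEADER = (
    "The JSON should consist of the following information, "
    "using the format <field name: field type>:"
)


def verbose_field(description: str, name: str, field_type: str) -> str:
    """The longer style: spell out 'field name' and 'field type' every time."""
    return f'{description} (field name: "{name}", field type: {field_type})'


def compact_field(description: str, name: str, field_type: str) -> str:
    """The shorter style: explain the format once, then use <name: type>."""
    return f"{description} <{name}: {field_type}>"


def compact_prompt(fields: list[tuple[str, str, str]]) -> str:
    """One header line describing the format, then one compact line per field."""
    return "\n".join([HEADER] + [compact_field(*field) for field in fields])


# Characters saved per field line by the compact style.
savings = [len(verbose_field(*f)) - len(compact_field(*f)) for f in FIELDS]
```

    The header costs a fixed number of characters, so the compact style only pays off once the schema has more than a handful of fields.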

  • by ordx on 3/22/23, 1:07 PM

    It would be great to add location and role filters.

  • by devstein on 3/22/23, 3:46 PM

    This is awesome! Well done and thanks for sharing. Just subscribed :)

    I'm actually working on something similar, but specific to inbound job opportunities (Email and LinkedIn). The goal is to use GPT to parse unstructured, unstandardized jobs into a structured, standardized job format that makes it easy for candidates to search and review once they start their job search.

    It's in its early stages, but you can check it out here and let me know what you think: https://sharedrecruiting.co/

    I'd love to chat more about this if you're up for it! You can reach me at team at sharedrecruiting.co

  • by jkmcf on 3/22/23, 2:37 PM

    Pretty cool. While looking around I noticed

      https://news.ycombinator.com/item?id=34984365
    
    was flagged as part-time, possibly because they mention "PART REMOTE".

  • by chrisan on 3/22/23, 1:17 PM

    Very cool!

    Would be nice to remove "competitive salary" from the "salary stated" filter. Are we going to assume everyone else offers a noncompetitive salary?

    Maybe require a number for "salary stated" to count.
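
    A minimal sketch of that rule, assuming each parsed job has a free-text salary under a hypothetical `salary` key: the posting only counts as "salary stated" if the text contains at least one digit.

```python
import re
from typing import Optional

# Any digit at all; phrases like "competitive salary" have none.
DIGIT = re.compile(r"\d")


def has_numeric_salary(salary_text: Optional[str]) -> bool:
    """True only if the salary text is present and contains a number."""
    return bool(salary_text) and DIGIT.search(salary_text) is not None


def salary_stated(jobs: list[dict]) -> list[dict]:
    """Keep only the jobs whose salary text includes an actual figure."""
    return [job for job in jobs if has_numeric_salary(job.get("salary"))]


jobs = [
    {"company": "A", "salary": "competitive salary"},
    {"company": "B", "salary": "$120k - $150k"},
    {"company": "C", "salary": None},
]
with_numbers = [job["company"] for job in salary_stated(jobs)]
```

    A single digit is a loose test, so text such as "401k match" would still pass; a stricter pattern would be needed to rule that out.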

  • by candleknight on 3/22/23, 2:28 PM

    Looks awesome! A filter for internships would be really helpful.

  • by moneywoes on 3/22/23, 1:41 PM

    What’s the advantage of GPT parsing here? The "has comp" filter?

  • by sfc32 on 3/22/23, 1:29 PM

    Nice work. A filter against job title would be helpful.