from Hacker News

Nemotron-4-340B

by bcatanzaro on 6/14/24, 4:01 PM with 46 comments

  • by vineyardmike on 6/14/24, 7:35 PM

    > The Nemotron-4 340B family includes base, instruct and reward models that form a pipeline to generate synthetic data used for training and refining LLMs.

    I feel like everyone is missing this in the announcement. They are explicitly releasing this to help generate synthetic training data. Most big models and APIs have clauses banning their use to improve other models. Sure, it may be able to compete with other big commercial models at normal tasks, but this would be a huge opportunity for ML labs and startups to expand the training data of smaller models.

    Nvidia must see a limit to the growth of new models (and new demand for training with their GPUs) based on the availability of training data, so they're seeking to provide a tool to bypass those restrictions.

    All for the low price of 2x A100s...
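    The base/instruct/reward pipeline the announcement describes could be sketched roughly like this. This is a minimal, hypothetical sketch: generate() and score() are stand-ins for real Nemotron-4-340B-Instruct and -Reward inference calls, and the threshold is invented for illustration.

```python
# Hypothetical synthetic-data pipeline: the instruct model drafts
# candidate responses, the reward model scores them, and only the
# best-scoring pairs are kept for fine-tuning a smaller model.

def generate(prompt: str, n: int = 4) -> list[str]:
    # Placeholder for Nemotron-4-340B-Instruct inference.
    return [f"response {i} to: {prompt}" for i in range(n)]

def score(prompt: str, response: str) -> float:
    # Placeholder for Nemotron-4-340B-Reward scoring.
    return float(len(response) % 7)

def synthesize(prompts: list[str], threshold: float = 3.0) -> list[dict]:
    dataset = []
    for prompt in prompts:
        candidates = generate(prompt)
        best = max(candidates, key=lambda r: score(prompt, r))
        if score(prompt, best) >= threshold:
            dataset.append({"prompt": prompt, "response": best})
    return dataset

data = synthesize(["Explain FP8 quantization."])
```

    In a real setup the two placeholder functions would be replaced by calls into whatever serving stack hosts the 340B models, with the filtered pairs written out as a fine-tuning dataset.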

  • by observationist on 6/14/24, 4:31 PM

    This is (possibly) a GPT-4 level dense model with an open source license. Nvidia has released models with issues before, but reports on this so far indicate it's a solid contender without any of the hiccups of previous releases.

    A 340B model should require around 700GB of VRAM or RAM to run inference. To train or finetune, you're looking at almost double, which is probably why Nvidia recommends 2x A100 nodes with 1.28TB of VRAM.
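    A back-of-envelope check on those numbers (assuming 2 bytes per parameter for FP16/BF16 weights, and ignoring KV cache and activations, which only add to the total):

```python
# Rough memory math for a 340B-parameter dense model, weights only.
PARAMS = 340e9

inference_fp16_gb = PARAMS * 2 / 1e9   # 2 bytes/param in FP16/BF16
# Fine-tuning keeps gradients and optimizer state on top of the
# weights, so roughly doubling the inference figure is a floor.
finetune_floor_gb = inference_fp16_gb * 2

print(inference_fp16_gb)   # 680.0 -> roughly the ~700GB figure above
print(finetune_floor_gb)   # 1360.0 -> in the ballpark of 1.28TB
```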

    Jensen Huang is the king of AI summer.

  • by diggan on 6/14/24, 4:55 PM

    The "open" and "permissive" license has an interesting section on "AI Ethics":

    > AI Ethics. NVIDIA is committed to safety, trust and transparency in AI development. NVIDIA encourages You to (a) ensure that the product or service You develop, use, offer as a service or distributes meets the legal and ethical requirements of the relevant industry or use case, (b) take reasonable measures to address unintended bias and to mitigate harm to others, including underrepresented or vulnerable groups, and (c) inform users of the nature and limitations of the product or service. NVIDIA expressly prohibits the use of its products or services for any purpose in violation of applicable law or regulation, including but not limited to (a) illegal surveillance, (b) illegal collection or processing of biometric information without the consent of the subject where required under applicable law, or (c) illegal harassment, abuse, threatening or bullying of individuals or groups of individuals or intentionally misleading or deceiving others

    https://developer.download.nvidia.com/licenses/nvidia-open-m...

    Besides limiting the freedom of use (making it less "open" in my eyes), it's interesting that they tell you to meet "ethical requirements of the relevant industry or use case". Seems like that'd be super hard to pin down in a precise way.

  • by kirilligum on 6/27/24, 12:54 PM

    It's 5x the price of Llama 3 / Qwen2 70B, and benchmark performance is similar. But with a 70B model you can break a task into steps and do 5+ of them, so it doesn't seem worth it for the price in general cases. Is 340B better for synthetic data generation (my primary use case)? Are there tests for that? Synthetic data would seem to benefit from multi-step reasoning and reduced hallucination, and on those tests the difference is small.

  • by option on 6/14/24, 4:47 PM

    Three models are included: base, instruct, and reward, all under a license permitting synthetic data generation and commercial use.

  • by ilaksh on 6/14/24, 5:59 PM

    Has anyone run evaluations comparing the instruct version with GPT-4o or Llama 3 70B, etc.? It's so much larger than the leading open source models, so one would hope it performs significantly better.

    Or is it on one of the chat arenas or whatever? Very curious to see some numbers on its performance.

    But if it's at least somewhat better than the existing open source models then that is a big boost for open source training and other use cases.

  • by belter on 6/15/24, 11:40 AM

    https://d1qx31qr3h6wln.cloudfront.net/publications/Nemotron_...

    "...Nemotron-4-340B-Base was trained using 768 DGX H100 nodes"

    That is 350 million dollars for you... Poor startups had better have a rich sponsor.

  • by hilux on 6/20/24, 12:49 AM

    I'm so confused.

    Isn't "training LLMs on LLM output" the very definition of "model collapse" or "model poisoning"?

  • by WithinReason on 6/14/24, 7:33 PM

    "...and were sized to fit on a single DGX H100 with 8 GPUs when deployed in FP8 precision"

    OK, I see: the goal is to sell more H100s. They made it big enough that it doesn't fit on any cheaper GPU setup.
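    The arithmetic behind that sizing claim checks out (weights only; KV cache and activations are ignored here, so the real headroom is smaller than it looks):

```python
# At FP8, each parameter is 1 byte, so 340B parameters ~= 340 GB of weights.
params_gb_fp8 = 340e9 * 1 / 1e9      # 340.0 GB
dgx_h100_gb = 8 * 80                 # one DGX H100: 8 GPUs x 80 GB HBM3
assert params_gb_fp8 < dgx_h100_gb   # fits on one 8-GPU node, not on less
```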

  • by bguberfain on 6/14/24, 6:51 PM

    "Nemotron-4-340B-Instruct is a chat model intended for use for the English language" - frustrating

  • by Something1234 on 6/14/24, 6:18 PM

    What is it? Is it an LLM or what?

  • by vosper on 6/14/24, 7:15 PM

    Why does Nvidia release models that compete with its customers' businesses but don't make any money for Nvidia?

    Are they commoditising their complements?