from Hacker News

Ask HN: Is AI trained on GPL code subject to the GNU GPL?

by andrewclunn on 12/28/23, 2:13 PM with 11 comments

If the AI is considered a derivative of that code, would the resulting trained AI also have to be readily accessible, with its source, weights, and bit code made freely available (to whatever extent is possible)?
  • by samspenc on 12/28/23, 7:15 PM

    I am not a lawyer, but I wonder whether the same argument can be made about derived works such as text: the AI is learning from open-source code and outputting not the exact same code, but 'derived' code based on patterns in the original code.

    In other words, somewhat similar to what a programmer studying existing and open-source code would do: read the code, understand it, and try to reimplement by writing their own code.

  • by cjbprime on 12/28/23, 8:24 PM

    There are many open questions here. It is unclear whether an LLM that has "seen" GPL code is a derived work of that code. Your brain can "see" GPL code without violating the license, even when you later write different code. Redistributing the GPL-licensed code that was seen earlier is what would be most obviously infringing.
  • by randombits0 on 12/28/23, 8:24 PM

    AI output is not copyrightable.