from Hacker News

TypeChat

by DanRosenwasser on 7/20/23, 4:41 PM with 169 comments

  • by verdverm on 7/20/23, 8:24 PM

    I don't see the value add here.

    Here's the core of the message sent to the LLM: https://github.com/microsoft/TypeChat/blob/main/src/typechat...

    You are basically getting a fixed prompt to return structured data with a small amount of automation and vendor lock-in. All these LLM libraries are just crappy APIs to the underlying API. It is trivial to write a script that does the same and will be much more flexible as models and user needs evolve.

    As an example, think about how you could change the prompt or use python classes instead. How much work would this be using a library like this versus something that lifts the API calls and text templating to the user like: https://github.com/hofstadter-io/hof/blob/_dev/flow/chat/llm...

  • by andy_xor_andrew on 7/21/23, 3:44 AM

    Here's one thing I don't get.

    Why all the rigamarole of hoping you get a valid response, adding last-mile validators to detect invalid responses, trying to beg the model to pretty please give me the syntax I'm asking for...

    ...when you can guarantee a valid JSON syntax by only sampling tokens that are valid? Instead of greedily picking the highest-scoring token every time, you select the highest-scoring token that conforms to the requested format.

    This is what Guidance does already, also from Microsoft: https://github.com/microsoft/guidance

    But OpenAI apparently does not expose the full scores of all tokens; it only exposes the highest-scoring token. Which is so odd, because if you run models locally, using Guidance is trivial, and you can guarantee your JSON is correct every time. It's faster to generate, too!
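
    In toy form, constrained decoding is just greedy decoding with a filter (the candidates and scores below are hypothetical; a real implementation works over the model's full logit vector and a tokenizer):

```typescript
// Toy sketch of constrained decoding: instead of taking the single
// highest-scoring token, take the highest-scoring token that the
// target format (here, "must open a JSON object or array") allows.
type Candidate = { token: string; score: number };

function constrainedPick(
  candidates: Candidate[],
  allowed: (token: string) => boolean
): string {
  let best: Candidate | undefined;
  for (const c of candidates) {
    if (allowed(c.token) && (best === undefined || c.score > best.score)) {
      best = c;
    }
  }
  if (best === undefined) throw new Error("no candidate satisfies the constraint");
  return best.token;
}

// The model "wants" to say "Sure," but the grammar only permits JSON openers.
const next = constrainedPick(
  [
    { token: "Sure,", score: 0.9 },
    { token: "{", score: 0.4 },
    { token: "[", score: 0.2 },
  ],
  (t) => t === "{" || t === "["
);
// next === "{"
```

    This is roughly what Guidance does with local models, where the full distribution is available at every decoding step.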

  • by paxys on 7/20/23, 6:28 PM

    I swear I think of something and Anders Hejlsberg builds it.

    Structured requests and responses are 100% the next evolution of LLMs. People are already getting tired of chatbots. Being able to plug in any backend without worrying about text parsing and prompts will be amazing.

  • by dvt on 7/20/23, 10:38 PM

    This is my hot take: we're slowly entering the "tooling" phase of AI, where people realize there's no real value generation here, but people are so heavily invested in AI that money is still being pumped into building stuff (and of course, it's one of the best ways to guarantee your academic paper gets published). I mean, LangChain is kind of a joke and they raised $10M seed lol.

    DeFi/crypto went through this phase 2 years ago. Mark my words, it's going to end up being this weird limbo for a few years where people will slowly realize that AI is a feature, not a product. And that its applicability is limited and that it won't save the world. It won't be able to self-drive cars due to all the edge cases, it won't be able to perform surgeries because it might kill people, etc.

    I keep mentioning that even the most useful AI tools (Copilot, etc.) are marginally useful at best. At the very best it saves me a few clicks on Google, but the agents are not "intelligent" in the least. We went through a similar bubble a few years ago with chatbots[1]. These days, no one cares about them. "The metaverse" was much more short-lived, but the same herd mentality applies. "It's the next big thing" until it isn't.

    [1] https://venturebeat.com/business/facebook-opens-its-messenge...

  • by bottlepalm on 7/20/23, 8:06 PM

    How has no voice assistant (Apple, Google, Amazon, Microsoft) integrated LLMs into its service yet, and how has OpenAI not released its own voice assistant?

    Also, like RSS, if there were some standard URL that websites exposed for AI interaction, using TypeChat to expose the interfaces, we'd be well on our way here.

  • by joefreeman on 7/20/23, 6:33 PM

    > It's unfortunately easy to get a response that includes { "name": "grande latte" }

        type Item = {
            name: string;
            ...
            size?: string;
    
    I'm not really following how this would avoid `name: "grande latte"`?

    But then the example response:

        "size": 16
    
    > This is pretty great!

    Is it? It's not even returning the type being asked for?

    I'm guessing this is more of a typo in the example, because otherwise this seems cool.

  • by 33a on 7/20/23, 7:00 PM

    Looks like it just runs the LLM in a loop until it spits out something that type checks, prompting with the error message.

    This is a cute idea and it looks like it should work, but I could see this getting expensive with larger models and input prompts. Probably not a fix for all scenarios.

  • by garrett_makes on 7/20/23, 9:16 PM

    I built and released something really similar to this (but smaller scope) for Laravel PHP this week: https://github.com/adrenallen/ai-agents-laravel

    My take on this is, it should be easy for an engineer to spin up a new "bot" with a given LLM. There's a lot of boring work around translating your functions into something ChatGPT understands, then dealing with the response and parsing it back again.

    With systems like these you can just focus on writing the actual PHP code, adding a few clear comments, and then the bot can immediately use your code like a tool in whatever task you give it.

    Another benefit to things like this, is that it makes it much easier for code to be shared. If someone writes a function, you could pull it into a new bot and immediately use it. It eliminates the layer of "converting this for the LLM to use and understand", which I think is pretty cool and makes building so much quicker!

    None of this is perfect yet, but I think this is the direction everything will go so that we can start to leverage each other's code better. Think about how we use package managers in coding today; I want a package manager for AI-specific tooling. Just install the "get the weather" library, add it to my bot, and now it can get the weather.

  • by katamaster818 on 7/20/23, 8:30 PM

    Hang on, so this is doing runtime validation of an object against a typescript type definition? Can this be shipped as a standalone library/feature? This would be absolutely game changing for validating api response payloads, etc. in typescript codebases.
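
    A hand-rolled sketch of the idea (TypeChat itself does this by running the TypeScript compiler at runtime; this toy version checks values against a small schema description instead):

```typescript
// Toy runtime structural validator. A real solution would derive the
// schema from the TypeScript type, as TypeChat does via the compiler,
// or declare the schema first, as libraries like Zod do.
type Schema =
  | { kind: "string" }
  | { kind: "number" }
  | { kind: "object"; fields: Record<string, Schema>; optional?: string[] };

function matches(value: unknown, schema: Schema): boolean {
  switch (schema.kind) {
    case "string":
      return typeof value === "string";
    case "number":
      return typeof value === "number";
    case "object": {
      if (typeof value !== "object" || value === null) return false;
      const v = value as Record<string, unknown>;
      return Object.entries(schema.fields).every(([key, field]) =>
        key in v ? matches(v[key], field) : (schema.optional ?? []).includes(key)
      );
    }
  }
}

// The Item type from the article as a schema: size is optional but must
// be a string, so a response containing {"size": 16} fails validation.
const itemSchema: Schema = {
  kind: "object",
  fields: { name: { kind: "string" }, size: { kind: "string" } },
  optional: ["size"],
};
```
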
  • by parentheses on 7/21/23, 12:09 AM

    I'm very surprised that they're not using `guidance` [0] here.

    It would not only allow them to ensure that required fields are completed (avoiding the need for validation [1]) but probably also save them GPU time in the end.

    There must be a reason and I'm dying to know what it is! :)

    Side-note: I was in the process of building this very thing and good ol' Microsoft just swung in and ate my lunch... :/

    [0] https://github.com/microsoft/guidance

    [1] https://github.com/microsoft/TypeChat/blob/main/src/typechat...

  • by Zaheer on 7/20/23, 8:28 PM

    It's not super clear how this differs from another recently released library from Microsoft: Guidance (https://github.com/microsoft/guidance).

    They both seem to aim to solve the problem of getting typed, valid responses back from LLMs.

  • by tlrobinson on 7/21/23, 12:43 AM

        const schema = fs.readFileSync(path.join(__dirname, "sentimentSchema.ts"), "utf8");
        const translator = typechat.createJsonTranslator<SentimentResponse>(model, schema, "SentimentResponse"); 
    
    It would have been much nicer if they had taken this as an opportunity to build generic runtime type introspection into TypeScript.

  • by mahalex on 7/20/23, 8:49 PM

    So, it's a thing that appends "please format your response as the following JSON" to the prompt, then validates the actual response against the schema, all in a "while (true)" loop (literally) until it succeeds. This unbelievable achievement is the work of seven people (the authors of the blog post).

    Honestly, this is getting beyond embarrassing. How is this the world we live in?

  • by gigel82 on 7/21/23, 12:28 AM

    I agree with comments saying this is basically a 10-line "demo script" everyone could write and it is weird to have big names associated with it.

    But I heard from MS friends that AI is an absolute "need to have". If you're not working on AI, you're not getting (as much) budget. I suspect this is more about ticking the box than producing some complex project. Unfortunately, throughout the company, folks are doing all kinds of weird things to tick the box like writing a "copilot" (with associated azure openai costs) fine-tuned on a handful of documentation articles :(

  • by huac on 7/21/23, 1:02 AM

    I've written a version of this in Golang (tied to OpenAI API, mostly): https://github.com/stillmatic/gollum/blob/main/dispatch.go

    Define a struct and tag it with Go's JSON struct tags. Then give it a prompt and ...

        type dinnerParty struct {
            Topic       string   `json:"topic" jsonschema:"required" jsonschema_description:"The topic of the conversation"`
            RandomWords []string `json:"random_words" jsonschema:"required" jsonschema_description:"Random words to prime the conversation"`
        }
        completer := openai.NewClient(os.Getenv("OPENAI_API_KEY"))
        d := gollum.NewOpenAIDispatcher[dinnerParty]("dinner_party", "Given a topic, return random words", completer, nil)
        output, _ := d.Prompt(context.Background(), "Talk to me about dinosaurs")
    
    and you should get a response like

        expected := dinnerParty{
            Topic:       "dinosaurs",
            RandomWords: []string{"dinosaur", "fossil", "extinct"},
        }

  • by trafnar on 7/20/23, 6:53 PM

    It's not clear to me how they ensure the responses will be valid JSON. Are they just asking for it, then parsing the result with error checking?

  • by robbie-c on 7/20/23, 10:03 PM

    This is funny, I have something pretty similar in my code, except it's using Zod for runtime typechecking, and I convert Zod schemas to json schemas and send that to gpt-3.5 as a function call. I would expect that using TypeScript's output is better for recovering from errors than with Zod's output, so I can definitely see the advantage of this.

  • by sandkoan on 7/20/23, 10:19 PM

    Relevant: I built this, which generalizes to arbitrary regex patterns / context-free grammars with 100% adherence and is model-agnostic — https://news.ycombinator.com/item?id=36750083

  • by _andrei_ on 7/21/23, 3:31 PM

    Just use function calling: declare your function schema using Zod and convert it to JSON Schema automatically. You don't have to write your types more than once, you get proper validation with great error messages, and you can extend it.

  • by abhinavkulkarni on 7/21/23, 6:16 AM

    There already are techniques to guide LLMs into producing output that adheres to a schema, e.g. forcing LLMs to stick to a context-free grammar: https://matt-rickard.com/context-free-grammar-parsing-with-l...

    Just like many similar methods, this is based on logit biasing, so it may have an impact on quality.
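
    In toy form, logit biasing looks something like this (hypothetical tokens and logits; a hard constraint is just the limiting case of a very large negative bias):

```typescript
// Toy sketch of logit biasing: tokens the grammar disallows get a large
// negative bias before the next token is picked. Softer biases still
// reshape the distribution the model was trained to produce, which is
// the quality impact mentioned above.
type Logit = { token: string; logit: number };

function biasAndPick(logits: Logit[], allowed: string[], bias = -100): string {
  const biased = logits.map((l) => ({
    token: l.token,
    logit: allowed.includes(l.token) ? l.logit : l.logit + bias,
  }));
  // Greedy pick after biasing; a sampler would soft-max over `biased` instead.
  biased.sort((a, b) => b.logit - a.logit);
  return biased[0].token;
}
```
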

  • by geysersam on 7/20/23, 11:50 PM

    Anyone knows in what situations this approach is superior to jsonformer (https://github.com/1rgs/jsonformer) and vice versa?

    Or are they solving different problems?

    It seems jsonformer has some advantages, such as only generating tokens for the values and not the structure of the JSON. But this project seems to have more of a closed feedback loop, prompting the model to do the right thing.
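
    The jsonformer trick in sketch form (`fillValue` stands in for a constrained model call; everything structural is emitted by the template, so it can't be malformed):

```typescript
// Sketch of the jsonformer approach: the JSON skeleton comes from the
// schema, and the model only ever generates the values. Structural
// tokens ({, }, quotes, commas) are never sampled at all.
type FieldType = "string" | "number";
type Template = Record<string, FieldType>;

function buildJson(
  template: Template,
  fillValue: (field: string, type: FieldType) => string | number
): string {
  const obj: Record<string, unknown> = {};
  for (const [field, type] of Object.entries(template)) {
    obj[field] = fillValue(field, type); // the only model-generated part
  }
  return JSON.stringify(obj); // structure is valid by construction
}
```
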

  • by waffletower on 7/21/23, 10:25 PM

    At least for llama.cpp users, this recent PR -- https://github.com/ggerganov/llama.cpp/pull/1773 -- introducing grammar-based sampling could potentially improve the structural reliability of LLaMA output. They provide an example JSON grammar as well.

  • by rvz on 7/20/23, 6:48 PM

    Someone should just get this working on Llama 2 instead of O̶p̶e̶n̶AI.com [0]

    All this is doing is talking to an AI model sitting on someone else's server.

    [0] https://github.com/microsoft/TypeChat/blob/main/src/model.ts...

  • by canadaduane on 7/21/23, 5:02 AM

    "Using Zod to Build Structured ChatGPT Queries"[1] is a pattern I found useful. This doesn't seem too different.

    [1] https://medium.com/@canadaduane/using-zod-to-build-structure...

  • by jensneuse on 7/20/23, 9:52 PM

    This looks quite similar to how we're using OpenAI functions and Zod (JSON Schema) to have OpenAI answer with JSON and interact with our custom functions to answer a prompt: https://wundergraph.com/blog/return_json_from_openai

  • by xigoi on 7/21/23, 7:58 AM

    Why are we trying to get structured output out of something that was specifically designed to produce natural-language output?

  • by davrous on 7/20/23, 6:19 PM

    This is a fantastic concept! It's going to be super useful to map users' intent to API / code in a super reliable way.

  • by waffletower on 7/21/23, 5:36 PM

    Reliance on strong typing for LLM output coercion is a potentially lossy and inefficient approach that can introduce redundant LLM queries and costs. LLM output is far more subtle than this. But the strongly typed hammer is very attractive to many developers, particularly those in the Typescript ecosystem.

  • by nurettin on 7/21/23, 1:31 AM

    This is rather trivial. The real challenge would be to make it choose what type to return. The function API does that, but then natural conversations sometimes involve calling multiple functions, and there isn't a good schema for that.
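
    One sketch of letting the model choose: request a tagged union, so the choice of type is itself part of the validated output (the variants and names here are hypothetical):

```typescript
// Sketch: request a discriminated union and validate the tag, so
// "which type to return" becomes an ordinary field the model fills in.
// The two variants below are hypothetical examples.
type Action =
  | { kind: "orderDrink"; name: string; size?: string }
  | { kind: "askQuestion"; question: string };

function parseAction(text: string): Action {
  const value = JSON.parse(text);
  if (value.kind === "orderDrink" && typeof value.name === "string") return value;
  if (value.kind === "askQuestion" && typeof value.question === "string") return value;
  throw new Error("response does not match any variant of Action");
}
```

    Multi-function turns are still awkward; asking for an array of such actions is one common workaround.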
  • by vbezhenar on 7/21/23, 10:42 AM

    That's an interesting way to validate JSON: basically they run the whole compiler (making it a runtime dependency). Hopefully this horrible implementation will nudge TypeScript developers in the direction of implementing RTTI.

  • by TillE on 7/21/23, 12:20 AM

    I wish Copilot did something like this. I've found it'll regularly invent C# methods which don't exist, an error which seems trivial to catch and hide from the user. No output is better than bad output.

  • by phillipcarter on 7/20/23, 9:00 PM

    I'd love to see a robust study on the effectiveness of this and several other ways to coax a structured response out:

    - Lots of examples / prompt engineering techniques

    - MS Guidance

    - TypeChat

    - OpenAI functions (the model itself is tuned to do this, a key differentiator)

    - ...others?

  • by obiefernandez on 7/20/23, 11:10 PM

    If I can use this instead of functions, it's gonna save me a buttload of API usage, because the TypeScript interface syntax is so concise. Can't wait to try it.

  • by ianzakalwe on 7/20/23, 9:23 PM

    I am not sure why this exists; maybe I am missing something, but it does not seem like there is much value past "hey, check this out, this is possible".

  • by ameyab on 7/20/23, 6:41 PM

    Here's a relevant paper that folks may find interesting: <snip>Semantic Interpreter leverages an Analysis-Retrieval prompt construction method with LLMs for program synthesis, translating natural language user utterances to ODSL programs that can be transpiled to application APIs and then executed.</snip>

    https://arxiv.org/abs/2306.03460

  • by bestcoder69 on 7/20/23, 6:54 PM

    Why this instead of GPT Functions?

  • by yanis_t on 7/20/23, 7:50 PM

    TL;DR: This is ChatGPT + TypeScript.

    I'm totally happy to be able to receive structured responses, but I'm also not 100% sure TypeScript is the right tool; it seems like overkill. I mean, obviously you don't need the power of TS with all its enums, generics, etc.

    Plus, given that it will run multiple queries in a loop, it might end up very expensive for it to abide by your custom-made complex type.

  • by nchase on 7/21/23, 12:30 AM

    This is going to create space for some hilarious and funky input attacks.

  • by arc9693 on 7/20/23, 9:05 PM

    TL;DR: It's asking ChatGPT to format response according to a schema.