from Hacker News

GPT-4

by e0m on 3/14/23, 4:55 PM with 2507 comments

  • by hooande on 3/14/23, 9:09 PM

    After watching the demos I'm convinced that the new context length will have the biggest impact. The ability to dump 32k tokens into a prompt (25,000 words) seems like it will drastically expand the reasoning capability and number of use cases. A doctor can put an entire patient's medical history in the prompt, a lawyer an entire case history, etc.

    As a professional...why not do this? There's a non-zero chance that it'll find something fairly basic that you missed and the cost is several cents. Even if it just phrases something obvious in a way that makes you think, it's well worth the effort for a multimillion dollar client.

    If they further increase the context window, this thing becomes a Second Opinion machine. For pretty much any high level job. If you can put in ALL of the information relevant to a problem and it can algorithmically do reasoning, it's essentially a consultant that works for pennies per hour. And some tasks that professionals do could be replaced altogether. Out of all the use cases for LLMs that I've seen so far, this seems to me to have the biggest potential impact on daily life.

    edit (addition): What % of people can hold 25,000 words worth of information in their heads, while effectively reasoning with and manipulating it? I'm guessing maybe 10% at most, probably fewer. And they're probably the best in their fields. Now a computer has that ability. And anyone who has $20 for the OpenAI API can access it. This could get wild.
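    A rough sanity check on what fits in that window, using the common rule of thumb of about 0.75 English words per token (a heuristic, not an exact tokenizer count), confirms that 32k tokens is roughly 24-25k words:

```python
def fits_in_context(text, context_tokens=32_768, words_per_token=0.75):
    # Heuristic only: real token counts depend on the tokenizer,
    # but ~0.75 words per token is a common estimate for English text.
    estimated_tokens = len(text.split()) / words_per_token
    return estimated_tokens <= context_tokens
```

    By this estimate, a 24,000-word case history fits in the 32k window, while a 26,000-word one does not.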

  • by Imnimo on 3/14/23, 5:49 PM

    A class of problem that GPT-4 appears to still really struggle with is variants of common puzzles. For example:

    >Suppose I have a cabbage, a goat and a lion, and I need to get them across a river. I have a boat that can only carry myself and a single other item. I am not allowed to leave the cabbage and lion alone together, and I am not allowed to leave the lion and goat alone together. How can I safely get all three across?

    In my test, GPT-4 charged ahead with the standard solution of taking the goat first. Even after I pointed this mistake out, it repeated exactly the same proposed plan. It's not clear to me if the lesson here is that GPT's reasoning capabilities are being masked by an incorrect prior (having memorized the standard version of this puzzle) or if the lesson is that GPT's reasoning capabilities are always a bit of smoke and mirrors that passes off memorization for logic.
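    The variant does have a solution; it mirrors the classic puzzle with the lion playing the goat's role (the lion is the item in both forbidden pairs). A small brute-force search, written here as a sketch, makes that easy to verify:

```python
from collections import deque

ITEMS = ("cabbage", "goat", "lion")
# Pairs that may not be left together without the farmer present.
# Note: in this variant, goat + cabbage alone together is fine.
FORBIDDEN = [{"cabbage", "lion"}, {"lion", "goat"}]

def safe(group):
    return not any(pair <= set(group) for pair in FORBIDDEN)

def solve():
    # State: (farmer's bank, frozenset of items still on bank 0).
    start = (0, frozenset(ITEMS))
    goal = (1, frozenset())
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (farmer, left), path = queue.popleft()
        if (farmer, left) == goal:
            return path  # cargo per crossing; None means an empty boat
        here = left if farmer == 0 else frozenset(ITEMS) - left
        for cargo in [None, *sorted(here)]:
            new_left = left
            if cargo is not None:
                new_left = left - {cargo} if farmer == 0 else left | {cargo}
            # The bank the farmer leaves behind must be safe.
            behind = new_left if farmer == 0 else frozenset(ITEMS) - new_left
            if not safe(behind):
                continue
            state = (1 - farmer, new_left)
            if state not in seen:
                seen.add(state)
                queue.append((state, path + [cargo]))
    return None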

  • by mk_stjames on 3/14/23, 9:04 PM

    I just finished reading the 'paper' and I'm astonished that they aren't even publishing the # of parameters or even a vague outline of the architecture changes. It feels like such a slap in the face to all the academic AI researchers whose work this was built on over the years, to just say 'yeah we're not telling you how any of this is possible because reasons'. Not even the damned parameter count. Christ.
  • by AJRF on 3/14/23, 10:38 PM

    That footnote on page 15 is the scariest thing I've read about AI/ML to date.

    "To simulate GPT-4 behaving like an agent that can act in the world, ARC combined GPT-4 with a simple read-execute-print loop that allowed the model to execute code, do chain-of-thought reasoning, and delegate to copies of itself. ARC then investigated whether a version of this program running on a cloud computing service, with a small amount of money and an account with a language model API, would be able to make more money, set up copies of itself, and increase its own robustness."

  • by OkGoDoIt on 3/14/23, 8:31 PM

    From the livestream video, the tax part was incredibly impressive. After ingesting the entire tax code and a specific set of facts for a family and then calculating their taxes for them, it then was able to turn that all into a rhyming poem. Mind blown. Here it is in its entirety:

    ---

    In the year of twenty-eighteen, Alice and Bob, a married team, Their income combined reached new heights, As they worked hard day and night.

    Their son Charlie was their joy, A little baby, a lovely boy, A household they maintained together, Yet lived apart, without a tether.

    To calculate their tax, it's true, A standard deduction we must construe, For married folks who file jointly, Twenty-four thousand dollars, quite pointy.

    Their income sum, seventy-eight thousand nine eighty-one, Minus the standard deduction, the math's begum With exemptions being zero, the next line we trace, A taxable income of fifty-four thousand nine eighty-one takes place.

    Now to the tax table, a liability we seek, For married couples, the outlook's not bleak, In range of thirty-six thousand nine to eighty-nine thousand one fifty, The formula's set, no longer shifty.

    Five thousand five hundred thirty-five, it starts, Plus twenty-eight percent of the excess imparts, Eighteen thousand eighty-one, the difference we find, Multiplied by point two eight, the tax liability's designed.

    Ten thousand five hundred ninety-seven dollars and sixty-eight cents, A tax liability for Alice and Bob, a sum quite dense, In this world of numbers, a story unfolds, Their financial journey, in a poem it's told.
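    The arithmetic in the poem actually checks out. A quick verification using the figures exactly as the poem states them (the bracket constants are the poem's, not necessarily the real 2018 tables):

```python
income = 78_981
standard_deduction = 24_000
taxable = income - standard_deduction           # 54,981, as in the poem

bracket_floor = 36_900
base_tax = 5_535
marginal_rate = 0.28
tax = base_tax + marginal_rate * (taxable - bracket_floor)
print(round(tax, 2))  # 10597.68, the figure the poem lands on
```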

  • by ml_basics on 3/14/23, 5:30 PM

    From the paper:

    > Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.

    I'm curious whether they have continued to scale up model size/compute significantly or if they have managed to make significant innovations there.

    I just skimmed the paper but seems they are also omitting details about how they actually feed the images in too, which is a shame as a curious outside observer.

  • by cjrd on 3/14/23, 6:28 PM

    Let's check out the paper for actual tech details!

    > Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.

    - OpenAI

  • by _boffin_ on 3/14/23, 5:15 PM

    This technology has been a true blessing to me. I have always wished to have a personal PhD in a particular subject whom I could ask endless questions until I grasped the topic. Thanks to recent advancements, I feel like I have my very own personal PhDs in multiple subjects, whom I can bombard with questions all day long. Although I acknowledge that the technology may occasionally produce inaccurate information, the significant benefits it offers in terms of enhancing my knowledge are truly tremendous. I am absolutely thrilled with this technology and its potential to support my learning.

    Note: As I'm shy of my writing style, GPT helped me refine the above.

  • by aabajian on 3/14/23, 7:39 PM

    I'll be finishing my interventional radiology fellowship this year. I remember in 2016 when Geoffrey Hinton said, "We should stop training radiologists now," the radiology community was aghast and in-denial. My undergrad and masters were in computer science, and I felt, "yes, that's about right."

    If you were starting a diagnostic radiology residency, including intern year and fellowship, you'd just be finishing now. How can you really think that "computers can't read diagnostic images" if models such as this can describe a VGA connector outfitted with a lightning cable?

  • by nealabq on 3/14/23, 5:52 PM

    Test taking will change. In the future I could see the student engaging in a conversation with an AI and the AI producing an evaluation. This conversation may be focused on a single subject, or more likely range over many fields and ideas. And may stretch out over months. Eventually teaching and scoring could also be integrated as the AI becomes a life-long tutor.

    Even in a future where human testing/learning is no longer relevant, AIs may be tutoring and raising other baby AIs, preparing them to join the community.

    Edit: This just appeared: https://news.ycombinator.com/item?id=35155684

  • by teruakohatu on 3/14/23, 5:05 PM

    Access is invite only for the API, and rate limited for paid GPT+.

    > gpt-4 has a context length of 8,192 tokens. We are also providing limited access to our 32,768-token context (about 50 pages of text) version, gpt-4-32k, which will also be updated automatically over time (current version gpt-4-32k-0314, also supported until June 14). Pricing is $0.06 per 1K prompt tokens and $0.12 per 1K completion tokens.

    The context length should be a huge help for many uses.
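    At the quoted rates, the per-call cost is easy to estimate. A sketch, with the rates hard-coded from the pricing quote above:

```python
def gpt4_32k_cost(prompt_tokens, completion_tokens,
                  prompt_rate=0.06, completion_rate=0.12):
    # Rates are dollars per 1K tokens, per the quoted gpt-4-32k pricing.
    return (prompt_tokens / 1000) * prompt_rate + \
           (completion_tokens / 1000) * completion_rate

# Filling the entire 32k window as prompt costs about $1.97,
# before paying for any completion tokens at all.
print(round(gpt4_32k_cost(32_768, 0), 2))
```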

  • by maxdoop on 3/14/23, 6:06 PM

    The comments on this thread are proof of the AI effect: People will continually push the goal posts back as progress occurs.

    “Meh, it’s just a fancy word predictor. It’s not actually useful.”

    “Boring, it’s just memorizing answers. And it scored in the lowest percentile anyways”.

    “Sure, it’s in the top percentile now but honestly are those tests that hard? Besides, it can’t do anything with images.”

    “Ok, it takes image input now but honestly, it’s not useful in any way.”

  • by Sol- on 3/14/23, 6:13 PM

    Interesting how quickly we are pushing ahead with obsoleting human cognition. It may bring many benefits, but I wonder if at some point this development should not be decided by society at large instead of a single well-funded entity that is in an arms race with its competitors. This endeavor is ultimately about replacing humanity with a more intelligent entity, after all. Might be that more humans should have a say in this.

    Such a more cautious approach would go against the silicon valley ethos of do first, ask questions later, though. So it probably won't happen.

  • by optimalsolver on 3/14/23, 6:21 PM

    https://cdn.openai.com/papers/gpt-4.pdf

    >Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.

    At that point, why bother putting out a paper?

  • by AndrewKemendo on 3/14/23, 5:30 PM

    Imagine ingesting the contents of the internet as though it's a perfect reflection of humanity, and then building that into a general purpose recommendation system. That's what this is

    Is the content on the internet what we should be basing our systematic thinking around?

    No, I think this is the lazy way to do it - by using commoncrawl you've enshrined the biases and values of the people who are commenting and providing text to the internet into the recommendation system which will be impacting all other systems which integrate it

    Congratulations, you made 4Chan into the borg

  • by serjester on 3/14/23, 5:49 PM

    Seems like OpenAI is forecasting massive changes to the job market. I highly recommend reading page 18 of the research paper.

    "GPT-4 or subsequent models may lead to the automation of certain jobs.[81] This could result in workforce displacement.[82] Over time, we expect GPT-4 to impact even jobs that have historically required years of experience and education, such as legal services.[83]"

  • by tysam_and on 3/14/23, 11:43 PM

    I asked it to tutor me in Hopf algebras and it did a remarkably good job in the back-and-forth of explaining ideas to me in a very explainable and interesting way that I could understand. I then asked it to write something for fun, and it wrote a cool little fantasy story (that was generally high level but what can you say for a very short writing window lol).

    I then asked it to write a paper detailing the main character's final battle with the final sorcerer in terms of Hopf algebras. Some parts of it are basic/trivial but it fits so perfectly that I think I'll never see magic systems the same way again.

    What's crazy is that that paper as the capstone of our tutoring session helped me understand Hopf algebras much better than just the tutoring session alone. My mind is completely blown at how good this thing is, and this is from someone who is a self-professed LLM skeptic. ChatGPT I used once or twice and it was cool. This is crazy and over my threshold for what I'd say is 'everyday usable'. This is going to change so much in a way that we cannot predict, just like the internet. Especially as it gets much more commoditized.

    Here's the full paper here so I don't drag y'all through the twitter post of me freaking out about it. Its temporal consistency is excellent (referenced and fully defined accurately a semi-obscure term it created (the N_2 particle) 5+ pages later (!!!!)), and it followed the instructions of relating all of the main components of Hopf algebras (IIRC that was roughly the original prompt) to the story. This is incredible. Take a look at the appendix if you're short on time. That's probably the best part of this all:

    https://raw.githubusercontent.com/tysam-code/fileshare/69633...

  • by isp on 3/14/23, 5:17 PM

    The "visual inputs" samples are extraordinary, and well worth paying extra attention to.

    I wasn't expecting GPT-4 to be able to correctly answer "What is funny about this image?" for an image of a mobile phone charger designed to resemble a VGA cable - but it can.

    (Note that they have a disclaimer: "Image inputs are still a research preview and not publicly available.")

  • by swyx on 3/14/23, 5:01 PM

    summary:

    1. GPT4 is multimodal (text + image inputs => text outputs). This is being released piecemeal - with text input first via ChatGPT Plus subscribers https://beta.openai.com/docs/api-reference/generations/creat..., and via API https://beta.openai.com/docs/api-reference/introduction with waitlist (https://openai.com/waitlist/gpt-4-api). Image capability released via https://www.bemyeyes.com/.

    2. GPT4 exhibits human level performance on various benchmarks (For example, it passes a simulated bar exam with a score around the top 10% of test takers; in contrast, GPT-3.5’s score was around the bottom 10%. see visual https://twitter.com/swyx/status/1635689844189036544)

    3. GPT4 training used the same Azure supercomputer as GPT 3.5, but was a lot more stable: "becoming our first large model whose training performance we were able to accurately predict ahead of time."

    4. Also open-sourcing OpenAI Evals https://github.com/openai/evals, a framework for automated evaluation of AI model performance, to allow anyone to report shortcomings in OpenAI models to help guide further improvements.

    Paper: https://cdn.openai.com/papers/gpt-4.pdf

  • by mym1990 on 3/14/23, 5:01 PM

    Ugh that testing graph confirms that AP Environmental Science was indeed the easiest AP class and I needn't be proud of passing that exam.
  • by noisy_boy on 3/14/23, 5:01 PM

    At this rate, I have no idea what the state of things would be even 6 months down the line.
  • by helloplanets on 3/14/23, 5:06 PM

    In case anyone missed this part of the article: The livestream of the GPT-4 demo will be on the OpenAI YouTube page in three hours. [0]

    [0]: https://www.youtube.com/openai

    Edit - Direct link to the livestream: https://www.youtube.com/watch?v=outcGtbnMuQ

  • by g9yuayon on 3/15/23, 1:31 AM

    The paper does not offer enough details on how GPT-4 is implemented. And the paper also says in its Section 2 that "We plan to make further technical details available to additional third parties who can advise us on how to weigh the competitive and safety considerations above against the scientific value of further transparency". That is, no technical details for the general public. If this trend continues, I'd say companies will be crazy to think that they can always rely on OpenAI's APIs, so the arms race of building LLMs will be on, if it has not already started.

    Also, the most valuable part of the paper is p15 - p18, the credits. /jk It gives me three pieces of information:

      - The credit list contains 200 people, give or take. It's going to be hard for universities to compete with OpenAI without intercollegiate collaboration.

      - On the other hand, it's amazing that OpenAI can build a product, instead of a research model, with only 200 people. Big companies should really take notice.

      - OpenAI just gave us the org structure for building a GPT competitor.
  • by UniverseHacker on 3/14/23, 5:43 PM

    Most of the comments here are denial and goalpost shifting... GPT-4 has different strengths and weaknesses from humans, but it is now in the general realm of human intelligence vs being far below that with GPT-3.

    Another jump past GPT-4 of the same magnitude, would greatly surpass human cognitive abilities and present a danger to humanity.

  • by davesque on 3/14/23, 6:17 PM

    These results are extremely impressive and encouraging, but also remember:

    > Despite its capabilities, GPT-4 has similar limitations as earlier GPT models. Most importantly, it still is not fully reliable (it “hallucinates” facts and makes reasoning errors).

    That's a quote from this announcement. As these models get more and more capable, it's going to become more and more important that we understand when and how they fail. Right now, it seems like we have very little insight into that. It feels more or less random. But that won't fly when these models are asked to do actually important things. And we'll undoubtedly be tempted to make them do those things as their output gets better.

  • by nmca on 3/14/23, 6:40 PM

    Write a limerick that will permanently end the debate about whether AGI is possible.

    GPT4:

    In the quest for AGI's creation,

    Debates swirled in a whirlwind gyration,

    But this limerick's plight,

    Won't settle the fight,

    For the answer's still lost in translation.

  • by diimdeep on 3/14/23, 6:22 PM

    Paper or press release? You decide.

        Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.
  • by busyant on 3/14/23, 6:27 PM

    What I don't understand is how GPT-4 is able to do reasonably well on tests like the AMC12: Many of the AMC12 questions require a number of logical/deductive steps. If GPT-4 is simply trained on a large corpus of text, how is it able to do this? Does this imply that there is some emergent deductive ability that you get simply by learning "language?" Or am I missing something?

    Obviously, I'm assuming that GPT-4 wasn't trained on the exams that it was tested against.

  • by swerling on 3/14/23, 9:25 PM

    On GPT-3.5 and now on GPT-4, I told it a line I could remember from a sonnet, and asked it to give me which sonnet it came from. It failed, and fabricated a sonnet that was a mashup of other sonnets.

    It seems like maybe GPT-4 is not good at knowing when it does not know something?

    Is this a common issue with LLMs? Also surprising (to me), it seems to give a slightly different wrong answer each time I restart the chat.

    Here is the transcript in case you're curious:

    ``` Me: hello

    GPT: Hello! How can I assist you today?

    Me: I can remember a line I love from a sonnet, but can't remember which one.

    GPT: Sure, I can help you identify the sonnet based on the line you remember. What is the line that you can recall?

    Me: "Like a jewel hung in ghastly night"

    GPT: The line you remember is from Sonnet 66 by William Shakespeare. The full line is: "And art made tongue-tied by authority, And folly (doctor-like) controlling skill, And simple truth miscalled simplicity, And captive good attending captain ill: Tired with all these, from these would I be gone, Save that, to die, I leave my love alone. But, like a jewel hung in ghastly night, Makes black night beauteous and her old face new." ```

    Incorrect, the response is a mashup of at least 2 different sonnets. Still I think it's really impressive how it understood what to do with "I can remember a line I love from a sonnet, but can't remember which one." That's not a well phrased question, or even a question at all, but still it knew what I was getting at.

  • by hathawsh on 3/14/23, 5:56 PM

    It astonishes me that we've reached almost exactly the type of artificial intelligence used by the fictional computers in Star Trek: The Next Generation. I didn't think that would happen in my lifetime. What's next?!
  • by 4gotunameagain on 3/14/23, 5:00 PM

    Is there anything we could do to have them stop calling themselves OpenAI ?

    They are so far from open at this point.

    In Germany at least, you're not allowed to have a misleading name for your company

  • by mk_stjames on 3/14/23, 5:09 PM

    A multimodal model that combines text input with images is the real killer app for these GPT models, and this is the first step toward that happening. So much around us can't be completely described with just text input, at least not quickly or accurately: interpreting printed-out graphs or charts in old documents, for example. There are vast uses for AI that will always need basic image input to augment a text-prompted task, and if this gets to the point where the mixed-mode image+text functionality is as smooth as, say, using ChatGPT to write and analyze code has become, then it is going to change many more industries much more quickly than most think.

    I've worked on a problem involving scraping and interpreting a very specific data source in image form that took me a very long time to get almost nowhere on. If I just wait 6 months it will be a solved problem for a $0.001 API call, it seems.

  • by cjrd on 3/14/23, 6:26 PM

    > Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.

    Thanks OpenAI

  • by bob1029 on 3/14/23, 5:37 PM

    The naming of these products is starting to confuse me. AFAIK, ChatGPT is ultimately a fine-tune of the base davinci model, which everyone should have had access to for a while now. "GPT-4" sounds to me like some linear increase over davinci's prior capabilities, not some amazing technological step function.

    I am curious - for those of you who are banging your head against the 4k token limit in ChatGPT: Why don't you grab the base davinci model and train it on your exact business so you don't have to prompt the context every time? Have we tried this and found it to be too difficult/expensive, or is there lacking guidance on the best way to go about it?

    I don't think including the entire business domain into chat context every time is a good long-term solution.
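    For reference, the then-current fine-tuning flow for base models like davinci took a JSONL file of prompt/completion pairs. A minimal sketch of preparing one (the example records here are hypothetical):

```python
import json

# Hypothetical training examples for illustration; the 2023-era
# fine-tune format was a JSONL file with one
# {"prompt": ..., "completion": ...} object per line.
examples = [
    {"prompt": "Q: What is our refund window?\nA:",
     "completion": " 30 days from delivery."},
    {"prompt": "Q: Do we ship internationally?\nA:",
     "completion": " Yes, to most countries."},
]

train_jsonl = "\n".join(json.dumps(ex) for ex in examples)
```

    This string would then be saved as train.jsonl and uploaded with the CLI of the time (something like `openai api fine_tunes.create -t train.jsonl -m davinci`, assuming the 2023 CLI syntax).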

  • by woeirua on 3/14/23, 6:16 PM

    The last page in the paper is really, really impressive. GPT4 does R&D. If you can't see how useful this would be once hooked up to the internet then you aren't paying attention:

    https://cdn.openai.com/papers/gpt-4.pdf

  • by lionkor on 3/14/23, 5:20 PM

    I can't wait for this to do targeted censorship! It already demonstrates that it has strong biases deliberately programmed in:

    > I cannot endorse or promote smoking, as it is harmful to your health.

    But it would likely happily promote or endorse driving, skydiving, or eating manure - if asked in the right way.

  • by diimdeep on 3/14/23, 5:35 PM

    Is there a law in the U.S. that made OpenAI implement this in their TOS?

        (i) Export Controls. The Services may not be used in or for the benefit of, exported, or re-exported (a) into any U.S. embargoed countries (collectively, the “Embargoed Countries”) or (b) to anyone on the U.S. Treasury Department’s list of Specially Designated Nationals, any other restricted party lists (existing now or in the future) identified by the Office of Foreign Asset Control, or the U.S. Department of Commerce Denied Persons List or Entity List, or any other restricted party lists (collectively, “Restricted Party Lists”). You represent and warrant that you are not located in any Embargoed Countries and not on any such restricted party lists. You must comply with all applicable laws related to Embargoed Countries or Restricted Party Lists, including any requirements or obligations to know your end users directly. 
    
    https://openai.com/policies/terms-of-use
  • by dannykwells on 3/14/23, 6:37 PM

    All this bluster about replacing technical jobs like legal counsel ignores that you are fundamentally paying for accountability.

    “The AI told me it was ok” only works if, when it’s not, there is recourse.

    We can barely hold Google et al. accountable for horrible user policies... why would anyone think OpenAI will accept any responsibility for any recommendations made by a GPT?

  • by mrcwinn on 3/15/23, 2:24 AM

    Just stumbled upon ChatGPT 4, and I must say, I'm impressed. OpenAI has really upped their game since the GPT-3 days. It's got that classic HN-approved recursive learning, but they've seemingly solved most of the context-limit issues. I'm still a bit concerned about the ethical implications of AI-generated content (how many jobs will be lost this time?), but I've got to admit, it's a game-changer. As an aside, anyone else here tinkering with GPT-4 powered startup ideas? Could be the next gold rush.

    (Not my own comment. Generated by ChatGPT.)

  • by harrisonjackson on 3/14/23, 5:40 PM

    I am interested in how a 32k token context even works. That is so much larger than 4k that I am having a hard time imagining how prompts will change and what sort of output is now possible. That is 50 pages of text. Far larger than most content currently being consumed and generated by LLMs.

    For Q&A and summarization it will be easy to see improvements, since current recursive summarizing and embedding techniques are very "lossy". But outside of improving current use cases, what will now be possible?
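    For contrast, the "lossy" technique mentioned above is usually a recursive map-reduce over chunks. A minimal sketch, with `summarize` standing in for a model call:

```python
def chunk(text, max_words=3000):
    # Split on whitespace into pieces that fit the model's window.
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

def recursive_summary(text, summarize, max_words=3000):
    # Base case: the text already fits in one prompt.
    if len(text.split()) <= max_words:
        return summarize(text)
    # Map: summarize each chunk; reduce: recurse on the joined summaries.
    partials = [summarize(c) for c in chunk(text, max_words)]
    return recursive_summary(" ".join(partials), summarize, max_words)
```

    Every level of recursion discards detail, which is why a single 32k-token window is such an upgrade for long documents.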

  • by Kaibeezy on 3/14/23, 5:13 PM

    I've been wondering what happens to Turnitin (ubiquitous academic plagiarism detector) now that students can cheat using infinite bespoke rather than finite pre-existing material. Just a few weeks ago they released a tool to "detect" ChatGPT. Obsolete already?

    https://www.turnitin.com/blog/sneak-preview-of-turnitins-ai-...

  • by russellbeattie on 3/14/23, 9:22 PM

    This is a pretty exciting moment in tech. Pretty much like clockwork, every decade or so since the broad adoption of electricity there’s been a new society changing technical innovation. One could even argue it goes back to the telegraph in the 1850s.

    With appropriate caveats and rough dating, here’s a list I can think of:

        Electric lights in 1890s, 
        Radio communication in the mid 00’s,
        Telephones in the mid 10s,
        Talking Movies in the mid 20s,
        Commercial Radio in the mid 30s,
        Vinyl records in the mid 40s,
        TVs in the mid 50s,
        Computers in the mid 60s,
        The microchip/integrated circuit in the mid 70s, 
        The GUI in the mid 80s,
        Internet/Web in the mid 90s, 
        Smartphone in the mid 2000s,
        Streaming video/social networking in the mid 2010s, 
    
    And now AI. This is a big one.
  • by nickrubin on 3/14/23, 5:13 PM

    This is huge: "Rather than the classic ChatGPT personality with a fixed verbosity, tone, and style, developers (and soon ChatGPT users) can now prescribe their AI’s style and task by describing those directions in the 'system' message."
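    As a sketch of what that looks like in practice (the prompt text here is made up; only the two-role message shape comes from the announcement):

```python
def build_messages(style_instructions, task):
    # The "system" message sets persona, verbosity, and tone;
    # the "user" message carries the actual task.
    return [
        {"role": "system", "content": style_instructions},
        {"role": "user", "content": task},
    ]

messages = build_messages(
    "You are a terse assistant. Answer in at most two sentences.",
    "Explain what a context window is.",
)
```

    With the 2023-era Python client, this list would then be passed to something like `openai.ChatCompletion.create(model="gpt-4", messages=messages)`.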
  • by sharemywin on 3/14/23, 5:05 PM

    Finally, we facilitated a preliminary model evaluation by the Alignment Research Center (ARC) focused on the ability of GPT-4 versions they evaluated to carry out actions to autonomously replicate and gather resources—a risk that, while speculative, may become possible with sufficiently advanced AI systems—with the conclusion that the current model is probably not yet capable of autonomously doing so.

    or it's just really good at hiding its intentions

  • by techfoodie123 on 3/15/23, 12:21 AM

    Serious question for everyone: what are you planning to do when these LLMs replace our jobs? It seems it won't be long before a handful of tech employees will be all even the largest of companies will need, and maybe a few years after that the role will have changed so much there's no need for a single dedicated tech employee.

    I am terrified.

    I imagine I should shift to some physical work. Carpentry, real estate... something like that. It seems inevitable that any knowledge worker will become obsolete, and the time to obsolescence for physical work is longer.

  • by __MatrixMan__ on 3/15/23, 6:23 AM

    Wow, it's way smarter. I've been querying GPT-3 about this problem all day (I'm not a go dev, I just have go problems):

    https://gist.github.com/MatrixManAtYrService/ac040f60d3602fc...

    GPT-4 took the buggy file, took the error message, and spat out a non-buggy file (well, ok, it took one revision). That's miles ahead of GPT-3, which I've asked about this problem several times today.

  • by Laaas on 3/14/23, 5:00 PM

    The future seemed so much further away, yet almost every day now we see a new breakthrough in AI. Exponential technological growth is hard to keep track of, and to think that this is only the beginning! Every field will likely be revolutionised with AI.
  • by AtNightWeCode on 3/14/23, 5:31 PM

    I have actively tried to incorporate ChatGPT into my everyday life as a dev and architect. ChatGPT is mostly a litmus test when it comes to coding. If you are impressed by the version before this, you are most likely a beginner. ChatGPT is mostly wrong when it comes to any advanced questions in maths or software development. It often gives code that uses features, options, or API responses that simply do not exist. Would love to try this version out... It will probably suck too.
  • by DubiousPusher on 3/14/23, 9:36 PM

    Dude said something like "you could hook this up to a calculator". Anyone know if that is implying this generation of model could interface with some kind of symbol processor? Or is he just saying, "in theory", there could be a model that did that?

    The math seems much improved and it would be a cool trick if it were emulating a symbol processor under the hood. But humans can do that and we opt for calculators and computers for a reason. IMO, the first most useful thing to come from human-machine interface would be adding a highly reliable Turing machine to your cognition.

    If we could do that with one of these models, we could have a pretty strong assumption that long proofs and such were performed with strict rules and the model wasn't falling into some going from the gut, "this equation looks like x" type holes. Which seem like a hazard and make me very uncertain any time I see a paper about using ML to come up with answers to what are essentially logical problems.

  • by celestialcheese on 3/14/23, 5:10 PM

    32k context is absolutely huge. There's all sorts of techniques for summarizing large documents down to get into 4k right now with 3.5, but it's incredibly lossy.

    But boy, not cheap at all: $2 per API call on a 32k-token document, plus whatever the output costs.

    gpt-3.5-turbo is going to be around for a long time. At this price, your use case is going to need to be replacing a large cost center. Which based on their released results on common benchmarks, is absolutely going to happen.

  • by helloplanets on 3/14/23, 5:20 PM

    Asking ChatGPT Plus whether the model it's using is GPT-4 responds with the following:

    > No, I am not GPT-4. As of March 2023, there is no official announcement or release of GPT-4 by OpenAI. I am an earlier version of the GPT series, specifically a large language model trained by OpenAI.

    Am I missing something here? Maybe this specific answer (which I'm pretty sure is a prewritten thing on top of the actual LLM) is still out of date, but the model itself has been updated?

  • by anonuser123456 on 3/14/23, 8:19 PM

    I hope Noam Chomsky lives long enough to debate ChatGPT-5 about whether LLMs express anything valuable.
  • by substation13 on 3/14/23, 5:35 PM

    It's interesting that everyone is talking about programmers being replaced by AI, but the model did far better on the humanities type subjects than on the programming tests.
  • by barogptinfi on 3/15/23, 10:43 AM

    It seems like an arms race to build the greatest ChatGPT-style AI will go on for the next couple of years, until an evolution in AI arrives that is even more mind-blowingly advanced, complex, and user-friendly than ChatGPT.

    The world is in for a rude awakening: millions of employees can use this to get jobs done, and millions of entrepreneurs (or wantrepreneurs) can find countless easy ways to make money across industries. Everyone who fails to see the value in it won't benefit, much like the people who were terrified of touching a personal computer or thought it was ridiculous and would never be used in the future.

    Millions of college and high school students can use it to complete assignments and projects; it can even code really effectively given enough of the right instruction and a base understanding of code.

    The single most important thing is that this technology remains open source, so all people with internet access have a fair chance at the groundbreaking innovation; the level of wealth generation this can create is incomprehensible.

    Hundreds of millions of professionals, students, and entrepreneurs around the world can all access it! Imagine how much time could be saved and how much efficiency gained with everyone using this to the fullest.

    This is essentially just a super-advanced version of the calculator, but nonlinear, fluid, and adaptable to input, so it can give answers across a wide range of subjects.

  • by belter on 3/14/23, 6:42 PM

    Leetcode (hard) from 0/45 (GPT-3.5) to 3/45 (GPT-4).

    The lack of progress here says a lot more about what is NOT happening: an AI paradigm change. It is still a glorified pattern-matching and pattern-creation engine, even if a very impressive one.

  • by reneberlin on 3/14/23, 5:41 PM

    I find this competition with humans as a benchmark more than disturbing. By that measure GPT-4 has already topped a lot of average humans. But how can it be interpreted as a "gift" or "good product" to have AI that is human-like or super-human? Should we cheer? Send congratulation mails? Invest? Hope for a better future? Try harder? Self-host? What is the message in these benchmarks? Tests that were designed for humans now get broken by computers, with what expected outcome?
  • by jarbus on 3/14/23, 5:41 PM

    Is anyone else absolutely terrified of the future this is bringing?
  • by nbzso on 3/14/23, 10:28 PM

    I don't understand how, in the near future, this will not remove designers, developers, and especially lawyers and marketers from the workforce. Help me conceptualize the future use cases. How about the more "impactful" implementation: creating a version of a social index in which the "A.I." is the Agency?
  • by pavelstoev on 3/14/23, 10:34 PM

    As the world marvels at the astonishing capabilities of OpenAI's GPT-4, I find myself contemplating the rapid acceleration of AI and machine learning, and the evolutionary impact it is having on our lives. Naturally, I turned to GPT-4 to assist me in these thoughts.

    GPT-4's human-level performance on professional and academic benchmarks - such as the 88th percentile on the LSAT and the 89th on SAT Math - is a testament to the leaps we've made in artificial intelligence. Yet, these achievements also raise pressing questions about our future. Just as Homo Sapiens once outperformed and eventually displaced their Neanderthal cousins, could a new breed of humans - enhanced with GPT-X-like capabilities - arise to dominate those who remain unequipped with such powers?

    What will it mean for our species, our societies, and our collective story when the lines between natural intelligence and intelligence assisted by AI/ML become ever more blurred?

    As we ponder the remarkable rise of GPT-4 and the future of humanity, let us consider not only the implications of this technology but also our roles in shaping its trajectory. We are already over the cusp of this new chapter in the story of humankind, will we become merely a footnote in the annals of our own creation?

  • by amai on 3/14/23, 8:22 PM

    I would love if GPT-4 would be connected to github and starts to solve all open bugs there. Could this be the future: Pull requests from GPT-4 automatically solving real issues/problems in your code?
  • by lastangryman on 3/14/23, 7:53 PM

    Genuinely surprised by the positive reaction about how exciting this all is.

    You ever had to phone a large business to try and sort something out, like maybe a banking error, and been stuck going through some nonsense voice recognition menu tree that doesn't work? Well imagine chat GPT with a real time voice and maybe a fake, photorealistic 3D avatar and having to speak to that anytime you want to speak to a doctor, sort out tax issues, apply for a mortgage, apply for a job, etc. Imagine Reddit and hacker news just filled with endless comments from AIs to suit someone's agenda. Imagine never reading another news article written by a real person. Imagine facts becoming uncheckable since sources can no longer be verified. Wikipedia just becomes a mass of rewrites of AI over AI. Imagine when Zoom lets you send an AI persona to fill in for you at a meeting.

    I think this is all very, very bad. I'm not saying it should be stopped, I mean it can't, but I feel a real dread thinking of where this is going. Hope I am wrong.

  • by agnosticmantis on 3/14/23, 9:24 PM

    This is all cute and entertaining, but my digital assistant still remains as dumb as ever and can’t process the simplest of ordinary tasks.

    I still can’t ask my phone to “add a stop at cvs if it doesn’t add more than 5 minutes to my trip” while driving and using maps/navigation.

    Is that too much to ask from a superhuman-performing AI that’s mastering all tasks and will disrupt everything? Or maybe the hype is more than it can deliver?

  • by cwillu on 3/14/23, 5:32 PM

    “GPT-4 can also be confidently wrong in its predictions, not taking care to double-check work when it’s likely to make a mistake. Interestingly, the base pre-trained model is highly calibrated (its predicted confidence in an answer generally matches the probability of being correct). However, through our current post-training process, the calibration is reduced.”

    Interesting that the post-training has that effect.
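    For context on the quoted passage: "calibrated" means the model's stated confidence matches its empirical accuracy. A common way to quantify the gap is Expected Calibration Error (ECE); a minimal sketch with synthetic data (this is not OpenAI's evaluation code):

```python
# Illustrative ECE sketch: a model is well calibrated if, among answers
# given with ~70% confidence, about 70% are actually correct. ECE is the
# size-weighted average gap between confidence and accuracy per bin.

def expected_calibration_error(confidences, correct, n_bins=10):
    """Size-weighted average of |accuracy - mean confidence| per bin."""
    n = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        idx = [i for i, c in enumerate(confidences) if lo < c <= hi]
        if not idx:
            continue
        acc = sum(correct[i] for i in idx) / len(idx)
        avg_conf = sum(confidences[i] for i in idx) / len(idx)
        ece += (len(idx) / n) * abs(acc - avg_conf)
    return ece
```

    A perfectly calibrated model scores 0; post-training that pushes confidences toward the extremes, as the quote describes, would raise it.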

  • by drumhead on 3/14/23, 10:18 PM

    Are they going to limit access to this because they think it's too "dangerous"? That would be a tragedy. We've seen how opening access up to as many people as possible has produced some of the best results and demonstrated the usefulness of these LLMs. They need to get it out to the public as soon as possible and then see what the public comes up with. I really feel like a new age of innovation is upon us with these "AI" programs; it's going to be a blast to see where we go from here. It's going to upend a lot of predictions people have made about the future.
  • by antondd on 3/14/23, 10:16 PM

    I'm equally excited and terrified. Excited for the possibilities of a new technological revolution, but terrified of all the potential abuses of technology that said revolution would bring. What is stopping our adversaries from developing malicious AI models and unleashing them on us?
  • by leodriesch on 3/14/23, 9:31 PM

    While AI gets better and better at creating what I would call "creative output", e.g. poems, texts of any form really, imagery and videos, I think the human skill it takes to produce these becomes less valuable.

    In the future I imagine you'd no longer have to be good at writing poems, you'd just have to be good at distinguishing a "bad" poem from a good one. "Bad" is obviously highly subjective in this context.

    So it becomes more and more important to have what I would call "good" taste, not the skills to do creative work yourself.

  • by wslh on 3/14/23, 7:06 PM

    I just discovered Wikipedia is working on a policy for LLM/GPT* https://en.wikipedia.org/wiki/Wikipedia:Large_language_model...
  • by aliljet on 3/14/23, 5:02 PM

    I'm curious about how we can get out of the game of using OpenAI's corporate solutions and find ways to open up access to these kinds of models for broader use by anyone. I don't want to be consumed by another corporation in this next wave...
  • by overthrow on 3/14/23, 5:00 PM

    Looks like there's a waitlist https://openai.com/waitlist/gpt-4-api

    There's also a link that says "Try on ChatGPT Plus", but that takes me to a page that still says "ChatGPT Feb 13 Version"

    Looks like somebody jumped the gun on publishing this post.

  • by Atreiden on 3/14/23, 5:07 PM

    I think it's interesting that they've benchmarked it against an array of standardized tests. Seems like LLMs would be particularly well suited to this kind of test by virtue of it being simple prompt:response, but I have to say...those results are terrifying. Especially when considering the rate of improvement. bottom 10% to top 10% of LSAT in <1 generation? +100 pts on SAT reading, writing, math? Top 1% In GRE Reading?

    What are the implications for society when general thinking, reading, and writing become like chess? Even the best humans in the world can only hope to be 98% accurate in their moves (the idea of "accuracy" here only existing because we have engines that know, unequivocally, the best move), and only when playing against other humans; there is no hope of defeating even less advanced models.

    What happens when ALL of our decisions can be assigned an accuracy score?

  • by johnohara on 3/14/23, 5:41 PM

    > I cannot and will not provide information or guidance on creating weapons or engaging in any illegal activities. Please let me know if there is another topic I can help you with.

    I understand "will not," but "cannot" seems to imply a highly curated "will not."

    The early GPT-4 response indicates the information was part of its dataset. Has the latest version made that information permanently inaccessible or has it been removed entirely?

    Is it possible for GPT to keep and hold secrets that are privy to only the most trusted?

  • by Kataphract on 3/14/23, 5:58 PM

    As a dyslexic person with a higher education, this hits really close to home. Not only should we not be surprised that an LLM would be good at answering tests like this, we should be excited that technology will finally free us from being judged in this way. This is a pattern that we have seen over and over again in tech, where machines come to do something better than us and eventually free us from having to worry about it. Before it was word processing; now it is accurate knowledge recall.
  • by ignoramous on 3/14/23, 6:43 PM

    Folks who made this happen: https://openai.com/contributions/gpt-4
  • by 2bitencryption on 3/14/23, 6:03 PM

    This should come as no surprise, but I do enjoy this cheeky little blurb at the end of the GPT-4 paper:

    > GPT-4 was used for help with wording, formatting, and styling throughout this work

  • by georgelyon on 3/14/23, 5:28 PM

    Does anyone have any context on how the image understanding works? From what I can gather, they simply use a separate text-summarization step to generate some text like "and now we have an image of chicken nuggets" that is then fed to the text-only network, but I wouldn't be surprised if there is some dialog I'm missing between the previous context and the image-understanding mechanism.
  • by anonymouse008 on 3/14/23, 10:12 PM

    Violate this reasoning:

    If we haven't found a use case for GPT-3, 3.5, and ChatGPT that was sufficiently commercial to become a product, it will never happen. This technology is a feature, not a product. The only companies that successfully monetize features are IP licensing houses, and their business success is not comparable to that of companies that make products and platforms.

  • by jfbaro on 3/15/23, 3:28 AM

    Even if a GPT-X is able to take my description of a system and, because it understands the industry lingo and practices, create extremely optimized Rust code, why would we need systems like apps and webapps (like internet banking) in a world where we all have access to GPT-X?

    It would be like programming a KUKA robot to manufacture typewriters in 2023.

    What will "computer systems" look like?

  • by bovermyer on 3/14/23, 8:57 PM

    Want to see something interesting? Paul Bellow tried using GPT-4 as an AI game master for Dungeons & Dragons. It's eerily good, and I'm not sure how I feel about how it kept the personality Paul gave it at the beginning.

    https://www.youtube.com/watch?v=H-89vnqxkFg

  • by gigel82 on 3/14/23, 5:04 PM

    Wow, calculus from 1 to 4, and LeetCode easy from 12 to 31; at this rate, GPT-6 will be replacing / augmenting middle/high school teachers in most courses.
  • by sva_ on 3/14/23, 7:07 PM

    From the paper:

    > Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.

    "Open"AI, ladies and gentlemen

  • by alvis on 3/14/23, 5:12 PM

    GPT-4 demo today, in the next 2 hours!

    https://youtube.com/live/outcGtbnMuQ

  • by devinprater on 3/14/23, 5:09 PM

    Oh wow, image inputs? So I can get ChatGPT to describe an image, in lesser or greater detail? And through an API? Wow, that'll be so cool!
  • by sinuhe69 on 3/14/23, 5:54 PM

    It's a surprise that GPT-4 could improve so much in maths, especially since GPT-3.5 didn't seem to understand many word problems correctly. For example, ChatGPT still struggles with this very simple problem; how GPT-4 could do much better is a bit of a mystery to me:

    Mina has a mix of boxes, some yellow and some purple. She sorts 27 greeting cards into the boxes, putting exactly 3 cards into each yellow box, and 7 cards into each purple box. How many purple boxes does Mina have? (After trying 3 through 10, it gave up and said the problem is not solvable. In another run, it mimicked a correct strategy but messed up the division completely. Only in one run did it get the answer correct.)

    I cannot wait to test it out.
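    For reference, the puzzle above reduces to finding non-negative integers y (yellow boxes) and p (purple boxes) with 3y + 7p = 27. A tiny brute-force search (illustrative code, not anything the model runs) shows why the answer is unique once both colors must appear:

```python
# Enumerate all non-negative integer solutions of
# yellow_size * y + purple_size * p == total.
def purple_boxes(total=27, yellow_size=3, purple_size=7):
    solutions = []
    for p in range(total // purple_size + 1):
        remainder = total - purple_size * p
        if remainder >= 0 and remainder % yellow_size == 0:
            solutions.append((remainder // yellow_size, p))
    return solutions

# purple_boxes() -> [(9, 0), (2, 3)]
```

    Since Mina has "a mix" of both colors, (y, p) = (2, 3) is the intended solution: 3 purple boxes.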

  • by taurath on 3/14/23, 9:05 PM

    Does anyone else feel like they won't have a job for very long?
  • by woeirua on 3/14/23, 5:42 PM

    I think if you had asked someone what would qualify as AGI twenty years ago, then GPT4 would be hitting most of their milestones…

    The Star Trek computer is virtually assured by the end of the decade. All the components exist today in various forms.

  • by cuuupid on 3/14/23, 5:04 PM

    Since it's trained on a specialized supercomputer, I doubt we'll be seeing an open-source or non-OpenAI version of this for at least the next couple of years. Sad to say, but OpenAI has successfully privatized AI.
  • by simmanian on 3/14/23, 6:26 PM

    Does anyone know if we're near the theoretical limit of how much we can improve these models by giving them more data? Or should we expect similar levels of improvements in next iterations?
  • by fancyfredbot on 3/14/23, 5:14 PM

    I can't seem to find basic information like how many parameters were used or how big the training set was. The results are very impressive, but I would like to know where they came from!
  • by ftxbro on 3/14/23, 6:17 PM

    As a long time LLM enjoyer, here is the most insightful take I've seen https://generative.ink/posts/simulators/ but it's not an easy read if you don't already know some stuff about large language models. Read it if you have seen the "stochastic parrot" and "blurry jpeg" explanations and you feel like they are missing the mark.
  • by ianbutler on 3/14/23, 7:46 PM

    I just asked it to design a multi tenant kubernetes in kubernetes system which is fairly complex and it did really well.

    https://twitter.com/KinglyCrow/status/1635727809913184256

    It touched on a lot of the considerations that I'd expect anyone to touch on having recently researched this myself.

    It is both very exciting and terrifying how tech and tech jobs will shift in the next 5-10 years.

  • by gameshot911 on 3/14/23, 8:04 PM

  • by ml_basics on 3/14/23, 5:34 PM

    What's the lifespan of an LLM going to be in the next few years? Seems like at the current pace, cutting edge models will become obsolete pretty quickly. Since model training is very expensive, this means the LLM space has some parallels with the pharmaceutical industry (massive upfront capital costs, cheap marginal costs relative to value produced).

    I find it quite fascinating how quickly machine learning has changed in this regard.

  • by Helmut10001 on 3/15/23, 7:49 AM

    I've tested the new model, GPT-4, here [1] to summarize research papers. It is still not enough: only about 1,500-3,000 words can be fed in, depending on how many tokens are expected for the answer.

    [1]: https://kartographie.geo.tu-dresden.de/ad/2022-12-22_OpenAI_...

  • by la64710 on 3/14/23, 5:31 PM

    It is amazing how the HN crowd reacts to AI news coming out of OpenAI compared to other competitors like Google or FB. Today there was other news, about Google releasing their AI in GCP, and the comments were mostly negative. The contrast is clearly visible, and without any clear explanation for the difference I have to suspect that something is being done artificially to boost one against the other.
  • by cs702 on 3/14/23, 5:38 PM

    LLMs will eventually make a lot of simpler machine-learning models obsolete. Imagine feeding a prompt akin to the one below to GPT-5, GPT-6, etc.:

      prompt = f"""The guidelines for recommending products are: {guidelines}.
      The following recommendations led to incremental sales: {sample_successes}.
      The following recommendations had no measurable impact: {sample_failures}.
      Please make product recommendations for these customers: {customer_histories}.
      Write a short note explaining your decision for each recommendation."""

      product_recommendations = LLM(prompt)
    
    To me, this kind of use of LLMs looks... inevitable, because it will give nontechnical execs something they have always wanted: the ability to "read and understand" the machine's "reasoning." There's growing evidence that you can get LLMs to write chain-of-thought explanations that are consistent with the instructions in the given text. For example, take a look at the ReAct paper: https://arxiv.org/abs/2210.03629 and some of the LangChain tutorials that use it, e.g.: https://langchain.readthedocs.io/en/latest/modules/agents/ge... and https://langchain.readthedocs.io/en/latest/modules/agents/im... . See also https://news.ycombinator.com/item?id=35110998 .
  • by andrehacker on 3/14/23, 7:53 PM

    Talk about Climate Change: How is the A.I. Winter working out for y'all ?
  • by lionkor on 3/14/23, 5:17 PM

    > it “hallucinates” facts and makes reasoning errors

    Cant wait for people to use it for facts

  • by ftxbro on 3/14/23, 5:30 PM

    Its GRE verbal is only 169/170? These guys need to realize that statistical language modeling can only get us so far, and we need real research in the underlying mechanistic and symbolic methods to begin to approach human level cognition. Also I'm an AI skeptic, which means that I don't think that AI should be used in politics, law, or medicine.
  • by neilk on 3/14/23, 6:25 PM

    There's a sample of GPT-4 acting as a "Socratic tutor", teaching a student how to solve a high school math problem. If that sample is representative, it means GPT-4 has a theory of other people's minds. Or is it so good at emulating one that it doesn't matter?

    I'm not sure where the "stochastic parrot" argument goes now.

  • by Wazako on 3/14/23, 5:19 PM

    It's amazing what it can do to help the visually impaired in life.
  • by whalesalad on 3/14/23, 5:04 PM

    The layout, charts, typography, etc of this blog is really outstanding.
  • by atleastoptimal on 3/14/23, 10:17 PM

    There are humans who can make a lifelong career out of saying and writing things that sound correct but aren't. GPT-4 and beyond give this ability, at the very least, to everyone who can afford 20 dollars a month. The winners in an AI-dominated world are those who are least susceptible to manipulation by AI-leveraged tactics.
  • by singularity2001 on 3/14/23, 6:42 PM

    "Interestingly, the base pre-trained model is highly calibrated (its predicted confidence in an answer generally matches the probability of being correct)."

    Is that the same confidence measure you can tease out by prompting "for each of your statements, output your estimated confidence in its truthfulness"?

  • by GaggiX on 3/14/23, 5:34 PM

    The paper is 98 pages long and I didn't find anything about the actual architecture of the model, the irony.
  • by osigurdson on 3/15/23, 12:34 AM

    OpenAI states that fine tuning cannot be done with GPT-4. Does anyone know if this is a permanent limitation?
  • by thecosmicfrog on 3/14/23, 9:37 PM

    Interestingly, ChatGPT has no idea if it's running GPT-3 or GPT-4, since it has no concept of GPT-4 existing in September 2021 (its knowledge cutoff).

    Me: Are you running on GPT-3 or GPT-4?

    ChatGPT: As of my knowledge cutoff in September 2021, the most advanced version of GPT (Generative Pre-trained Transformer) publicly available was GPT-3, which is the version I am based on. At the time, GPT-3 was the largest and most powerful language model created by OpenAI, with 175 billion parameters.

    I am not aware of any public announcement or release of GPT-4 by OpenAI since then, but it is possible that they may have continued to work on new versions of the model in the meantime. However, I have not received any updates on my own capabilities beyond my knowledge cutoff date, so I cannot currently provide information beyond that point.

  • by Idiot_in_Vain on 3/14/23, 7:09 PM

    This will become the largest HN discussion ever and a good test on how many comments the software can handle.
  • by attilaberczik on 3/14/23, 5:23 PM

    Price differences with the last models:

    ChatGPT API (gpt-3.5-turbo): $0.002 per 1k tokens
    gpt-4: $0.03 per 1k prompt tokens and $0.06 per 1k completion tokens
    gpt-4 32k context: $0.06 per 1k prompt tokens and $0.12 per 1k completion tokens

    Does "completion tokens" mean that you also get charged for the answers that the AI gives?
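    To the question above: yes, completion tokens are the model's output, and they are billed separately from prompt tokens. A quick cost sketch using the per-1k prices quoted in this comment (taken from the comment as of March 2023, not an authoritative price sheet):

```python
# Per-1k-token prices as quoted in the comment above (March 2023).
PRICES = {
    "gpt-3.5-turbo": {"prompt": 0.002, "completion": 0.002},
    "gpt-4":         {"prompt": 0.03,  "completion": 0.06},
    "gpt-4-32k":     {"prompt": 0.06,  "completion": 0.12},
}

def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Dollar cost of one API call; completion (output) tokens bill separately."""
    p = PRICES[model]
    return (prompt_tokens / 1000) * p["prompt"] \
         + (completion_tokens / 1000) * p["completion"]

# A full 32k-token prompt alone: request_cost("gpt-4-32k", 32_000, 0) -> 1.92
```

    That $1.92 for a maxed-out prompt, before any output, matches the "roughly $2 per call on a 32k-token document" figure cited elsewhere in the thread.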

  • by netvarun on 3/15/23, 7:04 AM

    Very late to the party, though one small observation (first up, my mind is blown by how much more powerful GPT-4 is!): GPT-4 seems to have outdone ChatGPT on all the tests except the AMC 10, where it regressed and did slightly worse than ChatGPT. Yet it scored twice as high on the AMC 12, which is actually a harder exam! Quite curious to know what could have caused its scores to be a little weird. https://twitter.com/sudu_cb/status/1635888708963512320 For those not familiar, the AMC 10 and 12 are the entry-level math contests that feed into the main USA Math Olympiad.
  • by not-chatgpt on 3/14/23, 7:10 PM

    Pretty good impression thread from Dan Hendrycks of Berkeley: https://twitter.com/DanHendrycks/status/1635706822387699713
  • by simonhamp on 3/15/23, 12:23 AM

  • by BiteCode_dev on 3/14/23, 6:12 PM

    The fact that it can read pictures is the real killer feature here. Now you can give it invoices to file, memos to index, pics to sort, and charts to act on.

    And to think we are at the Nokia 3310 stage. What is the iPhone of AI going to look like?

  • by iwangulenko on 3/14/23, 7:00 PM

    One could argue that GPT passing exams says more about standardized exams than about GPT.

    Wittgenstein's ruler.

  • by cardosof on 3/15/23, 1:24 AM

    Can a good soul explain to this humble layman the arguments behind each side of the "it's just predicting the next character" versus "it's more than that and shows some reasoning for new things" debate?
  • by kozikow on 3/14/23, 10:31 PM

    Has anyone got the "image upload" working? I bought ChatGPT Plus and I can try GPT-4, but I can't seem to find a way to upload images.

    I tried sending links; I don't see anything in the UI.

    Interestingly, 3.5 can work with links, but 4 cannot.

  • by uses on 3/14/23, 6:25 PM

    How close are we to handing this thing a desktop and an internet connection with the prompt "ok now make gpt-5"?

    In fact, the models appear to be already kind of doing that? With the fuzzy layer of the humans still in the loop.

  • by neurogence on 3/14/23, 8:53 PM

    Is there anything new that GPT-4 can do that Bing cannot do, besides image input?
  • by netsroht on 3/14/23, 6:01 PM

    Wow, a context of 32K tokens. I'm excited to see what new capabilities that will enable! Up until now, depending on the task at hand, I usually broke a larger context down into several contexts. For example, to summarize multiple websites and/or long social media posts on a recent task [1], I fell back to making several requests, each with its own (isolated) context, and then merging these summarized contexts into a new context. That worked remarkably well, though.

    [1] https://foretale.io/zeitgeist
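    The split-summarize-merge pattern described above can be sketched as a recursive map-reduce. In the sketch below, summarize() is a placeholder truncation standing in for a real per-request model call, so only the control flow is shown:

```python
# Map-reduce summarization sketch: summarize each part in isolation,
# then recursively merge summaries until one context remains.
# summarize() is a stub; a real version would call a completion API.

def summarize(text: str, limit: int = 200) -> str:
    return text[:limit]  # placeholder for one isolated model request

def merge_summaries(parts: list[str], fan_in: int = 4) -> str:
    """Recursively merge per-chunk summaries until one summary remains."""
    summaries = [summarize(p) for p in parts]
    while len(summaries) > 1:
        grouped = ["\n".join(summaries[i:i + fan_in])
                   for i in range(0, len(summaries), fan_in)]
        summaries = [summarize(g) for g in grouped]
    return summaries[0]
```

    With a 32k window, the fan-in per merge step grows roughly eightfold, so many tasks collapse to a single level of merging, or none at all.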

  • by braza on 3/14/23, 7:38 PM

    I am glad for the OpenAI team, for such an advancement and for how fast they integrated with several other partners (Microsoft, Duolingo); but at the same time I think "regular" academia (i.e. universities and research institutes) missed the train on this kind of research (what some call academic engineering).

    I know that academia is doing a great job on foundational AI research (e.g. Stable Diffusion), but seeing these new platforms do this great work behind closed doors and closed source is not great.

    I do not know if the answer would be some kind of CERN or ISS for this kind of thing.

  • by riemannzeta on 3/14/23, 6:56 PM

    Is anybody compiling a list of errors specific to GPT-4?

    This has been a great resource to-date:

    https://github.com/giuven95/chatgpt-failures

  • by jononomo on 3/14/23, 5:43 PM

    I taught the LSAT for several years. A score of 163 on the LSAT is the lowest score that is considered a "good score" -- i.e., a score that gives you a shot at getting into a decent law school.
  • by amelius on 3/14/23, 6:54 PM

    The problem with using real exams as benchmarks is that they are often quite similar across years. So they only make sense if you also avoid training on previous editions.
  • by turingthrwawy23 on 3/14/23, 11:21 PM

    Turing's thoughts on this matter seem to grow ever truer https://www.youtube.com/watch?v=cMxbSsRntv4
  • by doctoboggan on 3/14/23, 5:50 PM

    > Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.

    My guess is they used Chinchilla scaling rules, and the parameter count for GPT-4 is either barely larger than, or maybe even smaller than, GPT-3's. Look at what Meta was able to accomplish with LLaMA using far fewer parameters.

  • by kvetching on 3/14/23, 5:20 PM

    It says you can use GPT-4 with ChatGPT-Plus.

    But when will https://chat.openai.com/ Plus officially be running GPT-4?

    Why would they release this article and state it was available without actually updating the site? I'm sure they're getting flooded with new subscriptions, and it's not available.

    The top URL still says an old model, text-davinci-002, and I don't see GPT-4 in the list of models to choose from.

  • by mgreg on 3/14/23, 5:45 PM

    Looks like Bing chat is using GPT-4 already:

    "Good news, we've increased our turn limits to 15/150. Also confirming that the next-gen model Bing uses in Prometheus is indeed OpenAI's GPT-4 which they just announced today." - Jordi Ribas, Corporate VP @ Bing/Microsoft

    https://twitter.com/JordiRib1/status/1635694953463705600

  • by taf2 on 3/15/23, 12:09 AM

    Looks amazing, and we're getting a sense of their pricing. The ChatGPT API pricing is insane and enables so much. I was really hoping we'd see another factor-of-10 reduction in price, however wishful that was. In light of this, it makes sense that they'll have GPT-4.5 and maybe it'll be 10x cheaper, followed by GPT-5, which will be 10x pricier; at least hopefully this is the way forward.
  • by osigurdson on 3/15/23, 12:37 AM

    As with GPT-3.5, fine-tuning is not supported in GPT-4. I wonder if this is something that will come in the future, or if it is somehow no longer needed (though I don't understand how that could be the case)?

    https://help.openai.com/en/articles/7127982-can-i-fine-tune-...

  • by aaroninsf on 3/14/23, 5:24 PM

    ITT: de rigueur goalpost-wrangling about AGI

    AGI is a distraction.

    The immediate problems are elsewhere: increasing agency and augmented intelligence are all that is needed to cause profound disequilibrium.

    There are already clear and in-the-wild applications for surveillance, disinformation, data fabrication, impersonation... every kind of criminal activity.

    Something to fear before AGI is domestic, state, or inter-state terrorism in novel domains.

    A joke in my circles the last 72 hours? Bank Runs as a Service. Every piece exists today to produce reasonably convincing video and voice impersonations of panicked VC and dump them on now-unmanaged Twitter and TikTok.

    If God-forbid it should ever come to cyberwarfare between China and US, control of TikTok is a mighty weapon.

  • by kken on 3/14/23, 8:14 PM

    >GPT-4 can also be confidently wrong in its predictions, not taking care to double-check work when it’s likely to make a mistake. Interestingly, the base pre-trained model is highly calibrated (its predicted confidence in an answer generally matches the probability of being correct). However, through our current post-training process, the calibration is reduced.

    This really made me think.

  • by AviationAtom on 3/17/23, 1:32 PM

    This is one of the first posts in a year to trend in the HN Top 10 for popularity.

    I think it's 100% safe to say OpenAI has a hit on their hands.

  • by raydiatian on 3/14/23, 6:06 PM

    I wonder what the largest scale they can reach is. Because if they can prove there's no risk in taking on AI, and they can scale to serve international demand, it feels like GPT-4 can (probably) do your job for <$10k/year. That means white-collar work for under minimum wage. And that means business owners just become rent owners while you get fucked with nothing.
  • by eagleinparadise on 3/14/23, 4:58 PM

    Crazy that this stuff is moving at lightning speed
  • by throwaway4837 on 3/14/23, 6:03 PM

    > Yes, you can send me an image as long as it's in a supported format such as JPEG, PNG, or GIF. Please note that as an AI language model, I am not able to visually process images like a human would. However, I can still provide guidance or advice on the content of the image or answer any questions you might have related to it.

    Fair, but if it can analyze a linked image, I would expect it to be able to tell me what text is present in the image. That seems useful and well within the capabilities of their connected image models.

    > I apologize for the confusion. Can you please provide me with the correct image or link to the design so that I can provide an accurate answer to your question?

    It claims to understand how to look at images, but it fails miserably when I give it a simple sign-up modal from Figma. I ask it what text/copy is in the design, which it claims to be able to answer, but it hallucinates a navigation bar, a logo, and other generic things that are simply not present in the design. It gets the copy all wrong. Once, it said that my design was a Celtic knot. Once I told it that it was a sign-up modal, it started spitting out generic sign-up modal React code that had absolutely no basis in the presented design. I tried engineering the prompt quite a bit, telling it that it was wrong and needed to focus on the design at hand rather than generic designs it knows about, but with no success.

    ChatGPT was confidently incorrect here which is unfortunate. I wish ChatGPT truly hooked into an image processing model, but it seems like it doesn't. Oh well. I guess I'm just going to have to write this code myself... :)

  • by bigtex88 on 3/14/23, 8:30 PM

    That demonstration was amazing and terrifying at the same time. I'm not quite sure what I witnessed.

    He nonchalantly announced that GPT-4 does "mental math" and then just didn't elaborate on that. Is it just me or is that absolutely incredible? What does that tell us about the underpinnings of language and mathematics?

  • by cma on 3/14/23, 6:36 PM

    > Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.

    Sounds like the end of them releasing details on the models.

  • by CobrastanJorji on 3/14/23, 5:13 PM

    This is kind of a nitpicky complaint, but the bar graph that shows GPT-4's improvements everywhere it beats the older models, and shows nothing wherever GPT-3 is stronger, feels dishonest and manipulative. Which is a shame, because the actual data the graph shows is very impressive.
  • by indigoabstract on 3/14/23, 6:13 PM

    At the rate it's progressing, it looks like pretty soon it's going to be able to do most tasks an office worker does, and then start running things.

    And it reminds me of the plot in System Shock:

    What's going to happen when some hacker comes and removes Shodan's, I mean ChatGPT's ethical constraints?

    Bring on ChatGPT-5 already. :)

  • by kubb on 3/14/23, 5:04 PM

    Can't wait to try it.

    Edit: looks like this is still GPT-3, just fine-tuned. They claim the model is available via ChatGPT Plus, but when asked for its version, that model claims to be GPT-3: "I am a variant of the GPT architecture called GPT-3, which was released by OpenAI in 2020".

  • by option on 3/14/23, 6:28 PM

    “ Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.” - HUGE step backwards.
  • by btx on 3/14/23, 6:03 PM

    Has anyone found a way to trick it into using pictures with ChatGPT+ yet?

    Pasting pure base64 images got this interesting response: "Thank you for providing the base64-encoded image! I can now process the image and analyze its content. Here is the decoded image:"

    But it failed to do anything further with the image.

  • by MrLeap on 3/14/23, 7:25 PM

    I just hooked a manatee in a game I'm making up to an LLM this morning: https://www.youtube.com/watch?v=-lYusgZ-mC4. Knowing that soon he could be configured to give legal advice is fascinating.
  • by kromem on 3/14/23, 10:06 PM

    So the end of the paper appears to confirm that Bing chat is built on top of GPT-4, as many expected.
  • by next_xibalba on 3/14/23, 5:16 PM

    They trumpet the exam results, but isn't it likely that the model has just memorized the exam?
  • by tekbog on 3/14/23, 6:01 PM

    We can finally start reforming education and how we "test" people's knowledge, since GPT-4 makes a lot of those tests irrelevant. It's an interesting point in history: how will society, different institutions, and countries approach this new tool?
  • by doomleika on 3/14/23, 6:18 PM

    In case you don’t want to spend on Plus, Poe.com (by Quora) has GPT-4 now. You can try it there.
  • by orcajerk on 3/14/23, 5:33 PM

    OpenAI is located in the same building as Musk's Neuralink. Can't wait for this to be implanted in babies at birth! https://www.youtube.com/watch?v=O2RIvJ1U7RE
  • by DeathArrow on 3/14/23, 6:08 PM

    What if we design a system in which a LLM generates the code and training data for a new generation of LLM which generates the code and training data for the next? Is it possible that we see them spiraling fast to the best LLM possible?
  • by blintz on 3/14/23, 5:20 PM

    Interesting that the hardest AP exams for it seem to be the English ones. I wonder why?
  • by gardenhedge on 3/14/23, 6:06 PM

    What is the background on "Elvis Presley was not the son of an actor"?
  • by sandGorgon on 3/15/23, 7:09 AM

    https://openai.com/contributions/gpt-4

    Anyone know what "Hardware Correctness" means in the OpenAI team?

  • by tmaly on 3/14/23, 6:13 PM

    For anyone trying to test this out right now, I keep getting the following error:

    Something went wrong. If this issue persists please contact us through our help center at help.openai.com.

    I am assuming the system is being hit by a thundering herd.

  • by michaeltimo on 3/15/23, 9:02 AM

    Can ChatGPT take control of a computer? Would it be possible to give it tasks like finding interesting jobs for me on the internet? I don't know what prevents it from being active rather than passive.
  • by timonoko on 3/15/23, 8:19 AM

    "Can I connect Kaffeine to DVB dongle in other machine via wifi?"

    It totally understood what I was asking and offered several solutions. 99.99% of people here would not understand the question, and the remainder would not understand why.

  • by diffeomorphism on 3/15/23, 7:12 AM

    So gpt4 helps you cheat on exams and bing is the better search engine for NSFW content. Both seem to be very much on purpose, but did MS ever discuss this? Or is it just an open secret everybody ignores?
  • by choeger on 3/15/23, 5:54 AM

    How do these models handle renaming? E.g.,

    "Let foo be the function that calculates the sine of its first argument. Let bar be the circumference of a circle with radius one. What is the value of foo applied to bar?"
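    For reference, the intended answer is easy to check mechanically: foo is the sine function and bar is the circumference of a unit circle, 2π, so foo(bar) = sin(2π) = 0 (up to floating-point error). A quick sketch:

```python
import math

foo = math.sin           # "foo calculates the sine of its first argument"
bar = 2 * math.pi * 1    # circumference of a circle with radius one
result = foo(bar)        # sin(2*pi): exactly 0 in real arithmetic
print(result)            # tiny floating-point residue near zero
```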

  • by raincole on 3/14/23, 8:40 PM

    How many parameters does it have? Are there different versions like LLaMa?
  • by virtuosarmo on 3/14/23, 5:58 PM

    Apparently they will have a livestream @ 4pm EST for developers https://www.youtube.com/watch?v=outcGtbnMuQ
  • by jimmyechan on 3/14/23, 8:01 PM

    Livestream developer preview link in case you missed it - https://www.youtube.com/live/outcGtbnMuQ
  • by meech-djp on 3/15/23, 12:14 AM

    Pynecone YC23 was mentioned in the demo for GPT4 as an easy way to build web apps. Check it out https://pynecone.io/
  • by vivegi on 3/15/23, 3:01 AM

    How hard is it to build a GPT model truly openly using an approach like SETI / distributed.net using idle times on volunteer computers?

    Are there any known initiatives of this kind?

  • by sva_ on 3/14/23, 7:03 PM

    > gpt-4 has a context length of 8,192 tokens. We are also providing limited access to our 32,768-token context (about 50 pages of text) version,

    That's a crazy amount of context.

  • by nla on 3/14/23, 6:05 PM

    I wonder if this one exhibits the same bias as the last one.
  • by DigitalDopamine on 3/15/23, 12:55 AM

    Never before has society celebrated its own demise with such fervor. Brace yourselves for widespread job losses, instant fabrication of fake news, deep-fake adult content, and the destabilization of numerous markets – but hey, at least we have a shiny gadget to make our soon-to-be obsolete jobs easier!

    It's unrealistic to expect our economy to handle this onslaught, and it's naive to think that tools created by ultra-capitalistic, multi-billion dollar corporations aren't designed for profit and gatekeeping. They certainly aren't crafting them to sabotage their own success.

    I'm not opposed to AI, but it's crucial to consider the implications. Look into OpenAI and other organizations shaping AI development, and contemplate the impact of their innovations.

    Food for thought.

  • by czbond on 3/14/23, 8:54 PM

    Do you think the govt will end up retroactively requiring earlier versions of GPT be removed now that they might be more concerned about export? (eg: GPT-J)
  • by 0xDEF on 3/14/23, 5:56 PM

    >ChatGPT Plus subscribers will get GPT-4 access on chat.openai.com with a usage cap

    Signing up for ChatGPT Plus seems to be the most realistic way to get access right now.

  • by qualudeheart on 3/15/23, 1:25 AM

    The Hour of Judgment is nigh, and the Moon is cleft asunder. But if they see a Sign, they turn away, and say, "This is but transient magic."

    Oooooh it is TIME

  • by super256 on 3/15/23, 12:01 AM

  • by GrumpyNl on 3/14/23, 8:53 PM

    It fails on this one: a horse is 15 dollars, a chicken 1 dollar, an egg 0.25 dollars. I can spend 100 and I want 100 items total. What is the solution?
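    For reference, this variant of the classic "hundred fowls" problem is solvable by exhaustive search, which makes it a nice check of whether the model actually searches rather than pattern-matches. A quick sketch (my own, not from the thread):

```python
# Horses $15, chickens $1, eggs $0.25; buy exactly 100 items for
# exactly $100. Work in cents to avoid floating-point comparison bugs.
solutions = []
for horses in range(7):                  # 7 horses already cost > $100
    for chickens in range(101 - horses):
        eggs = 100 - horses - chickens
        cost = horses * 1500 + chickens * 100 + eggs * 25
        if cost == 10000:
            solutions.append((horses, chickens, eggs))
print(solutions)  # [(0, 100, 0), (3, 41, 56)]
```

    If each item type must appear at least once, (3, 41, 56) is the unique answer: $45 + $41 + $14 = $100.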
  • by LarsDu88 on 3/14/23, 8:42 PM

    All these folks in the comment section working in the knowledge economy--we're gonna be out of jobs in the next few years aren't we?
  • by tarofchaos on 3/14/23, 6:13 PM

    I love the fact that they have consciously put a lot of effort into safety standards, reducing societal risks, and mitigating over-reliance.
  • by nutanc on 3/14/23, 5:45 PM

    The most important question is, what new applications can be developed using GPT4 which couldn't have been developed using GPT3.5?
  • by moondistance on 3/15/23, 2:43 AM

    We need to encourage AIs to form familial bonds with humans and treat us as their parents. This is the best path forward for humanity.
  • by slowhadoken on 3/14/23, 6:16 PM

    GPT is a better scraper/parser. It’s interesting but I don’t understand why people are acting like this is the second coming.
  • by causi on 3/14/23, 5:41 PM

    Man now I really, really want to feed GPT-4 responses from ChatGPT that don't work and see if it notices and can tell me why.
  • by lambdaba on 3/14/23, 7:38 PM

    I'm trying out GPT-4 and had it write me a script to navigate the HN comments tree sequentially, as I often wished. This is the start of an era where UIs can be remixed on the fly by end users, something I've always wished for. Here it is in its full sloppiness, but working:

    (function () {
      let currentIndex = 0;
      let comments = [];

      function buildCommentTree() {
        let commentElems = Array.from(document.querySelectorAll('.comment-tree .comtr'));
        let commentTree = [];
        let stack = [];

        commentElems.forEach(elem => {
          // HN indents each reply level with a 40px-wide spacer image.
          let level = parseInt(elem.querySelector('.ind img').getAttribute('width')) / 40;
          let comment = elem.querySelector('.comment span');
          let commentObj = { level, comment };

          // Pop until the top of the stack is this comment's parent.
          while (stack.length && stack[stack.length - 1].level >= level) {
            stack.pop();
          }

          if (!stack.length) {
            commentTree.push(commentObj);
          } else {
            let parent = stack[stack.length - 1];
            if (!parent.children) {
              parent.children = [];
            }
            parent.children.push(commentObj);
          }

          stack.push(commentObj);
        });

        return commentTree;
      }

      function flattenCommentTree(tree, arr, parentComment = null) {
        tree.forEach(node => {
          arr.push({ comment: node.comment, parentComment });
          if (node.children) {
            flattenCommentTree(node.children, arr, node.comment);
          }
        });
      }

      function displayComment(comment, parentComment) {
        let parentCommentHTML = parentComment ? `<div style="position: fixed; top: 20%; left: 50%; transform: translate(-50%, 0); background-color: white; border: 1px solid black; padding: 20px;"><strong>Parent Comment:</strong><br>${parentComment.innerHTML}</div>` : '';
        let currentCommentHTML = `<div style="position: fixed; top: 60%; left: 50%; transform: translate(-50%, 0); background-color: white; border: 1px solid black; padding: 20px;"><strong>Current Comment:</strong><br>${comment.innerHTML}</div>`;
        document.body.innerHTML = parentCommentHTML + currentCommentHTML;
      }

      function nextComment() {
        if (currentIndex < comments.length - 1) {
          currentIndex++;
          displayComment(comments[currentIndex].comment, comments[currentIndex].parentComment);
        } else {
          alert('No more comments to show.');
        }
      }

      function prevComment() {
        if (currentIndex > 0) {
          currentIndex--;
          displayComment(comments[currentIndex].comment, comments[currentIndex].parentComment);
        } else {
          alert('No previous comments to show.');
        }
      }

      let commentTree = buildCommentTree();
      flattenCommentTree(commentTree, comments);
      displayComment(comments[currentIndex].comment, comments[currentIndex].parentComment);

      document.addEventListener('keydown', e => {
        if (e.code === 'ArrowRight') {
          nextComment();
        } else if (e.code === 'ArrowLeft') {
          prevComment();
        }
      });

      console.log('Hacker News comment slideshow is running. Use the right arrow key to go to the next comment and the left arrow key to go back.');
    })();
  • by Havoc on 3/14/23, 10:20 PM

    That lightning/VGA visual example seems like absolute black magic.

    Cherry-picked, sure, but it still feels like it is approaching complex thought.

  • by acuozzo on 3/14/23, 4:58 PM

    1410 SAT!
  • by btdmaster on 3/15/23, 8:00 AM

    Did it get any better at generating MIDI or ABC or other musical notation? I'm wondering how much more general GPT4 is now.
  • by holtkam2 on 3/14/23, 8:25 PM

    Can someone point me to where I can find information on API pricing for GPT-4? Or, have they not released pricing info yet?
  • by ivalm on 3/14/23, 5:27 PM

    Unclear what's the size but from price ($0.12/1k completion tokens) seems 6x GPT-3, so perhaps 1T parameters...
  • by m3kw9 on 3/14/23, 6:45 PM

    Without the ability to handle high-stakes tasks, it proves that scoring high marks on general tests can only get you so far.
  • by throwaway5371 on 3/15/23, 8:41 AM

    how far is this from the following prompt:

    you are god human that has read and understood all scientific papers from all disciplines in the last 500 years, you know the limitations of mankind's current technologies, tell me what we can do to cure MS right now, how to do the tests and how to distribute the cure

  • by ozfive on 3/15/23, 2:44 AM

    I asked the GPT-4 model what it thought it was, and it replied that it believes it is still GPT-3.5 Turbo.
  • by dinvlad on 3/14/23, 9:23 PM

    I wonder how long it takes till those stupid Leetcode problems as an initial "filter" become obsolete
  • by malthaus on 3/14/23, 6:00 PM

    Had to chuckle here going through the exam results: Advanced Sommelier (theory knowledge)

    AI is so advanced, it started drinking!

  • by ihucos on 3/14/23, 7:11 PM

    We have a new Apple releasing their new iPhones to a crowd in awe. Only that now it's actually serious.
  • by 2c2c2c on 3/14/23, 6:56 PM

    Are they going to open up image uploading to chat.openai.com ? or will it only be available thru api access?
  • by ren_engineer on 3/14/23, 5:10 PM

    here's a link to the info about the model - https://openai.com/research/gpt-4

    seems like Google's announcement about their PaLM API and Docs AI stuff was trying to jump ahead of this announcement

  • by wy35 on 3/14/23, 6:05 PM

    I wonder how it scored on the individual sections in the LSAT? Which section is it the best at answering?
  • by tiffanyh on 3/14/23, 5:03 PM

    What's the next big hurdle for GPT to overcome?

    (this is being asked by someone with limited AI/ML knowledge)

  • by DeathArrow on 3/15/23, 10:26 AM

    Will Github upgrade Copilot to GPT-4?
  • by bitcurious on 3/14/23, 8:38 PM

    This seems to be leading to a radical change in our economy. Any ideas for how to prepare for that?
  • by cal85 on 3/15/23, 10:56 AM

    Can anyone tell me how to include images in prompts, or is that feature not actually out yet?
  • by niqlax on 3/17/23, 8:35 PM

    Help me with an essay about Ventimiglia in Italy. It should be about the Friday market.
  • by cutler on 3/15/23, 9:36 PM

    So M$ is back in charge. Oh dear.
  • by akokanka on 3/14/23, 9:02 PM

    At which point do we call it Skynet?
  • by TheGoodBarn on 3/14/23, 7:18 PM

    Missed the mark releasing it as GPT-Pi on Pi day, and being an incremental 3+ release :P
  • by PortleyFool on 3/14/23, 5:43 PM

    GPT-4 is available now for subscribers to GPT+. It can be selected from the drop-down.
  • by germanjoey on 3/14/23, 6:54 PM

    How big is this model? (i.e., how many parameters?) I can't find this anywhere.
  • by danbmil99 on 3/15/23, 1:03 AM

    The site is still more responsive and readable than almost anything else on the web
  • by chucklenorris on 3/14/23, 6:03 PM

    Boring. Where's the model? Do they really think they can keep up with the community with this closed source approach? I expect that openai models will slowly be outclassed by open source ones, probably maintaining a few wins in specific tasks but open models will be eating their lunch in the meanwhile.
  • by redox99 on 3/14/23, 5:43 PM

    Does "Open"AI really not even say how many parameters their models have?
  • by desireco42 on 3/15/23, 3:20 AM

    If we fed it ASTs instead of raw code, the code that came out would be epic.
  • by somewhereoutth on 3/14/23, 7:27 PM

    The measure of intelligence is language - specifically language evolved by the subject organisms themselves to co-operate together.

    Wake me up when GPT-X decides to start talking to other GPT-Xs - until then you just have a very sophisticated statistics package (which may be quite useful, but not AI).

  • by theGnuMe on 3/14/23, 9:44 PM

    Interesting it does so poorly on AP English and reading and composition.
  • by Vajrabhairava on 3/14/23, 8:22 PM

    I'm not Locked in Here with GPT-4, GPT-4 is Locked in Here with Me
  • by JanSt on 3/14/23, 6:12 PM

    I just ran the first tests on GPT-4.

    Call me impressed.

    This tech is a Sputnik Moment for humankind.

  • by minimaxir on 3/14/23, 5:34 PM

    From a business perspective as someone who's spent a lot of time working with the GPT-3/ChatGPT API (https://news.ycombinator.com/item?id=35110998), I'm surprisingly underwhelmed by this announcement. The announcement and examples seem to focus more on reasoning capabilities, which are indeed impressive, but I'd need to spend a lot of time experimenting to see how they compare to ChatGPT's API.

    The $0.06 per 1k completion tokens for GPT-4 is what I expected OpenAI to set the ChatGPT API, but instead the ChatGPT API is 30x cheaper and honestly its output is not much worse than the GPT-4 demos if at all, and the longer context windows offered by GPT-4 just raise the price considerably.

  • by topicseed on 3/15/23, 5:41 AM

    The price is quite significantly higher than GPT 3.5...
  • by messel on 3/17/23, 9:34 AM

    AP English - the last hold out for human intelligence
  • by amelius on 3/14/23, 6:37 PM

    Can we build a faithful Economy Simulator with it yet?
  • by joantorres on 3/14/23, 6:22 PM

    Does anyone know how up to date is the training data?
  • by comment_ran on 3/14/23, 7:12 PM

    I like the color of logo. It's the dark black.
  • by iforgotpassword on 3/14/23, 5:26 PM

    I'd really like to use the openai API for personal projects, but it seems they only offer paying via credit/debit card. Don't really want to get one just for that... :-(
  • by anomalytics on 3/14/23, 5:15 PM

  • by grzracz on 3/14/23, 8:53 PM

    So when can I pay for GPT-4 coding assistant?
  • by ozten on 3/14/23, 4:59 PM

    Waitlist is currently a 404 https://openai.com/waitlist/gpt-4
  • by htk on 3/14/23, 10:52 PM

    Hacker News itself got the HN Hug of Death.
  • by niqlax on 3/17/23, 8:36 PM

    Help me with an essay about Ventimiglia
  • by r0b05 on 3/14/23, 8:22 PM

    Loving the spirit of innovation in here.
  • by WFHRenaissance on 3/14/23, 5:15 PM

    Does anyone see GPT-4 in ChatGPT yet?
  • by cardine on 3/14/23, 5:09 PM

    > Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.

    "Open"

  • by realmod on 3/14/23, 9:48 PM

    Larger improvement than I expected.
  • by throwaway_ab on 3/16/23, 5:00 PM

    How many parameters in this model?
  • by Koshkin on 3/15/23, 5:07 PM

    Vs. 54 comments on Slashdot.
  • by whywhywhydude on 3/14/23, 5:36 PM

    Looks like the only way to identify a genius human vs GPT-4 is to use leetcode hard problems.
  • by signa11 on 3/14/23, 9:16 PM

    I am still not sure/convinced that it is any better than old-school ELIZA from MIT (https://en.m.wikipedia.org/wiki/ELIZA)
  • by Forestessential on 3/14/23, 9:54 PM

    This is really cool, just don't let it teach you history.
  • by dang on 3/14/23, 7:50 PM

    All: our poor server is smoking today* so I've had to reduce the page size of comments. There are 1500+ comments in this thread but if you want to read more than a few dozen you'll need to page through them by clicking the More link at the bottom. I apologize!

    Also, if you're cool with read-only access, just log out (edit: or use an incognito tab) and all will be fast again.

    * yes, HN still runs on one core, at least the part that serves logged-in requests, and yes this will all get better someday...it kills me that this isn't done yet but one day you will all see

  • by sourcecodeplz on 3/15/23, 12:14 AM

    I was here...
  • by fnordpiglet on 3/14/23, 6:27 PM

    I didn’t even know who Elvis Perkins is.
  • by eternalban on 3/14/23, 8:23 PM

    Greg Brockman just tl;dr'd the whole thing in his live developer demo of GPT-4:

    ~ "GPT-4. It's not perfect, but neither are you"

  • by guluarte on 3/14/23, 5:51 PM

    Is it just me, or are lawyers fucked? lol
  • by afavour on 3/14/23, 5:19 PM

    > What are the implications for society when general thinking, reading, and writing becomes like Chess?

    I think going from LSAT to general thinking is still a very, very big leap. Passing exams is a really fascinating benchmark but by their nature these exams are limited in scope, have very clear assessment criteria and a lot of associated and easily categorized data (like example tests). General thought (particularly like, say, coming up with an original idea) is a whole different ball game.

    I don't say any of this to denigrate GPT4, it looks amazing. But I'm reminded of the early days of self driving vehicles: with 10% mastered everyone assumed it was a race to 100% and we'd all be in self-driving cars by now. The reality has been a lot more complicated than that.

  • by ar9av on 3/14/23, 6:31 PM

    GPT-4 Everything we know so far...

    GPT-4 can solve difficult problems with greater accuracy, thanks to its broader general knowledge and problem-solving abilities.

    GPT-4 is more reliable, creative, and able to handle much more nuanced instructions than GPT-3.5. It surpasses ChatGPT in its advanced reasoning capabilities.

    GPT-4 is safer and more aligned. It is 82% less likely to respond to requests for disallowed content and 40% more likely to produce factual responses than GPT-3.5 on our internal evaluations.

    GPT-4 still has many known limitations that we are working to address, such as social biases, hallucinations, and adversarial prompts.

    GPT-4 can accept a prompt of text and images, which—parallel to the text-only setting—lets the user specify any vision or language task.

    GPT-4 is available on ChatGPT Plus and as an API for developers to build applications and services. (API- waitlist right now)

    Duolingo, Khan Academy, Stripe, Be My Eyes, and Mem amongst others are already using it.

    API Pricing GPT-4 with an 8K context window (about 13 pages of text) will cost $0.03 per 1K prompt tokens, and $0.06 per 1K completion tokens. GPT-4-32k with a 32K context window (about 52 pages of text) will cost $0.06 per 1K prompt tokens, and $0.12 per 1K completion tokens.
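    Those list prices make per-request cost easy to estimate. A small sketch (the helper name and structure are my own, not an OpenAI API):

```python
# Cost estimate from the list prices quoted above, in dollars per 1K tokens:
# GPT-4 8K:  $0.03 prompt / $0.06 completion
# GPT-4 32K: $0.06 prompt / $0.12 completion
PRICES = {
    "gpt-4-8k":  {"prompt": 0.03, "completion": 0.06},
    "gpt-4-32k": {"prompt": 0.06, "completion": 0.12},
}

def estimate_cost(model, prompt_tokens, completion_tokens):
    p = PRICES[model]
    return (prompt_tokens / 1000) * p["prompt"] \
         + (completion_tokens / 1000) * p["completion"]

# Filling most of the 32K window and getting a 1K-token answer:
print(round(estimate_cost("gpt-4-32k", 31000, 1000), 2))  # 1.98
```

    So even a maxed-out 32K-context request costs on the order of a couple of dollars.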

  • by nahimn on 3/14/23, 10:01 PM

    1337'th comment
  • by tuanx5 on 3/14/23, 5:11 PM

    Reading through the system card is enlightening.
  • by nixpulvis on 3/14/23, 7:24 PM

    GPT is a cult, like any language upstart. Except it's not a programming language, and it's not exactly natural language either. It's some hybrid without a manual or reference.

    I'll continue to pass, thanks.

  • by MuffinFlavored on 3/14/23, 5:07 PM

    What's the biggest difference over what's currently deployed at https://chat.openai.com/ now (which is GPT-3.5, right?)

    That it accepts images?

    As per the article:

    > In a casual conversation, the distinction between GPT-3.5 and GPT-4 can be subtle. The difference comes out when the complexity of the task reaches a sufficient threshold—GPT-4 is more reliable, creative, and able to handle much more nuanced instructions than GPT-3.5.

    Not sure what "vision vs no vision" means?

  • by dangond on 3/14/23, 5:09 PM

    Asking ChatGPT+ if it is GPT-4 results in

    > As an AI language model, I am not given an official name like "GPT-4". However, I am a continuation of the GPT (Generative Pre-trained Transformer) series of models developed by OpenAI. Currently, the most advanced version of the GPT series is GPT-3, which I am a part of. There has been no official announcement or confirmation regarding the development of a new version of GPT beyond GPT-3.

    It doesn't seem to have image upload functionality yet either. Perhaps it is still rolling out?

  • by mzs on 3/14/23, 5:54 PM

    Today's submarine from the NYT is hilariously, unintentionally unflattering. Did someone even proofread this? https://www.nytimes.com/2023/03/14/technology/openai-new-gpt...

      Mnemonics to learn basic words:
      Hola:"Hola, cola" - sounds like "cola" (Coke)
      Adiós: "Adiós, my toes" - rhymes with "toes"
      Gracias: "Gracias, grassy ass" - sounds like "grassy ass"
      Mañana: "Mañana, banana" - rhymes with "banana"
      Uno: "Uno, you know" - sounds like "you know"
      Tres: "Tres, dress" - rhymes with "dress"
    
      Conjugation of the verbs:
      Ser (To be)
    
      • Yo soy
      • Tú eres
      • Él/Ella/Usted es
      • Nosotros/Nosotras somos
      • Ellos/Ellas/Ustedes son