from Hacker News

How we got Stable Diffusion XL inference to under 2 seconds

by varunshenoy on 8/31/23, 8:20 PM with 5 comments

  • by cwillu on 9/1/23, 7:32 AM

    Playing around with the CFG technique, I'm finding that turning off guidance at the 40% mark causes requested fine details not to appear in the final image. This sorta implies that switching CFG midway and/or switching prompt vectors might be interesting from a prompting standpoint, but it kinda kills it as a performance optimization.
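
    The trade-off being discussed can be sketched in a few lines. Classifier-free guidance blends an unconditional and a conditional noise prediction each denoising step; the optimization is to stop guiding (and skip the second UNet pass) after some fraction of the steps. This is an illustrative sketch, not any library's API — the function name, `cutoff` parameter, and list-based tensors are all assumptions for demonstration:

    ```python
    def apply_cfg(cond, uncond, step, total_steps, scale=7.5, cutoff=0.4):
        """Guided noise prediction for one denoising step (illustrative).

        Before `cutoff * total_steps`, apply the usual CFG formula:
            uncond + scale * (cond - uncond)
        From the cutoff onward, return the conditional prediction alone,
        which also lets you skip the unconditional forward pass entirely.
        """
        if step < cutoff * total_steps:
            return [u + scale * (c - u) for c, u in zip(cond, uncond)]
        return cond  # guidance disabled: roughly half the compute per step
    ```

    With `cutoff=0.4` the last 60% of steps run unguided, which is where the speedup comes from — and, per the comment above, also where the requested fine details can get lost.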
  • by Tenoke on 9/1/23, 10:17 AM

    It's a bit weird to talk about steps but not about the sampler (20 steps with Euler vs. 20 steps with DPM++ 2M Karras are pretty different beasts in terms of both speed and quality).

    I also see compiling but no AITemplate, which seems to be among the hottest ways to speed up SD recently.
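
    For context on the sampler point: in Hugging Face diffusers, the sampler is the pipeline's scheduler and can be swapped after loading. A hedged sketch (assumes diffusers is installed, a CUDA GPU, and a diffusers version whose `DPMSolverMultistepScheduler` accepts `use_karras_sigmas`); it is configuration, not a benchmark:

    ```python
    import torch
    from diffusers import StableDiffusionXLPipeline, DPMSolverMultistepScheduler

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")

    # Swap the default sampler for DPM++ 2M Karras, reusing the existing
    # scheduler config so timestep spacing etc. stay consistent.
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(
        pipe.scheduler.config,
        use_karras_sigmas=True,  # the "Karras" sigma-schedule variant
    )

    image = pipe("a photo of an astronaut", num_inference_steps=20).images[0]
    ```

    The same 20 steps through Euler (`EulerDiscreteScheduler`) versus this scheduler can differ noticeably in both wall-clock time and output quality, which is why step counts alone don't pin down a benchmark.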

  • by yieldcrv on 9/1/23, 4:02 PM

    This could save a lot of money on Replicate.ai

    Especially if you are charging your users the same 1,000% markup while your own costs have been cut to a third and results are delivered faster

  • by gmerc on 9/1/23, 7:40 AM

    I don’t know man, out of the box on SD-Next it’s about 3-4 secs for a picture at 1024 with UniPC and 20 steps on a 4090