from Hacker News

AudioGen: Textually Guided Audio Generation

by pierre on 9/30/22, 7:02 PM with 16 comments

  • by solardev on 9/30/22, 11:29 PM

    The last thing you'll hear before the AI eats you: https://felixkreuk.github.io/text2audio_arxiv_samples/large_...
  • by iamthemonster on 10/1/22, 8:19 AM

    It would be very interesting indeed to have an ebook reader paired with Bluetooth earphones that simultaneously feeds the words into this to make an ambient soundtrack, perhaps also choosing music appropriate to the word choice on the page.
  • by nudpiedo on 9/30/22, 9:11 PM

    That could be another missing piece of generative art for video games: sound effects, and soon soundtracks.
  • by kevmo314 on 9/30/22, 9:55 PM

    The speech samples are really funny. Very Sims-esque.
  • by karmasimida on 9/30/22, 9:20 PM

    It would be more useful if it could narrate text along with those background effects.
  • by youssefabdelm on 10/1/22, 11:18 AM

    -__- I wish researchers would train a stereo 44.1 kHz version... why always 16 kHz? I know, I know, 16 kHz saves compute, but come ooooon, you're Meta.
  • by fragmede on 10/3/22, 4:54 AM

    Text2audio is impressive, but I wanna see dance2audio. Just need a million dollars in funding to pay for cameras and dancers.
  • by fuzzythinker on 9/30/22, 8:39 PM

    [code] redirects to the same page
  • by uwagar on 10/1/22, 3:20 AM

    s/textually/sexually

    i giggled :)