by ericjang on 7/30/22, 4:16 PM with 3 comments
by ilaksh on 7/30/22, 7:25 PM
For example, translating the visual sampling into a 3D model first, or maybe using neural representations that can generate the 3D models, and then training the movement on that rather than on raw pixels.
Similarly, for textual prompts describing interactions, first create a model that relates the word embeddings to the same 3D modeling and physics interactions (a rough sketch of that kind of pipeline follows below).
Obviously much easier said than done.
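A minimal sketch of what such a pipeline might look like, assuming a PyTorch-style setup. Everything here (module names, latent size, action dimension, the MLP stand-in for a neural 3D scene encoder) is a placeholder for illustration, not anything described in the post:

    # Hypothetical sketch: a policy that acts on a learned scene latent plus a
    # text-goal latent, instead of raw pixels. All names/dimensions are assumptions.
    import torch
    import torch.nn as nn

    LATENT_DIM = 256
    ACTION_DIM = 7  # e.g. a 7-DoF arm; arbitrary choice for this sketch


    class SceneEncoder(nn.Module):
        """Stand-in for a neural 3D scene representation; here just an MLP over
        flattened pixels so the sketch stays runnable."""
        def __init__(self, image_shape=(3, 64, 64)):
            super().__init__()
            in_dim = image_shape[0] * image_shape[1] * image_shape[2]
            self.net = nn.Sequential(
                nn.Flatten(), nn.Linear(in_dim, 512), nn.ReLU(),
                nn.Linear(512, LATENT_DIM),
            )

        def forward(self, images):
            return self.net(images)


    class GoalEncoder(nn.Module):
        """Projects precomputed word embeddings into the same latent space, so
        language goals and scene state share one representation."""
        def __init__(self, word_dim=300):
            super().__init__()
            self.proj = nn.Linear(word_dim, LATENT_DIM)

        def forward(self, word_embeddings):
            # Mean-pool token embeddings into a single goal vector.
            return self.proj(word_embeddings.mean(dim=1))


    class Policy(nn.Module):
        """Maps (scene latent, goal latent) to an action, never touching raw pixels."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(2 * LATENT_DIM, 256), nn.ReLU(),
                nn.Linear(256, ACTION_DIM), nn.Tanh(),
            )

        def forward(self, scene_z, goal_z):
            return self.net(torch.cat([scene_z, goal_z], dim=-1))


    if __name__ == "__main__":
        images = torch.randn(4, 3, 64, 64)   # batch of camera frames
        words = torch.randn(4, 8, 300)       # batch of 8-token goal embeddings
        scene_z = SceneEncoder()(images)
        goal_z = GoalEncoder()(words)
        actions = Policy()(scene_z, goal_z)
        print(actions.shape)                 # torch.Size([4, 7])

The point of the split is that the policy only ever sees the shared latent space; swapping the pixel MLP for an actual 3D or implicit scene model would not change the policy's interface.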