Currently we have LSD-SLAM working, and that’s cool for us humans to see stuff, but having an object mesh to work with makes more sense. I don’t know if there’s really any difference, but at least in terms of simulator integration, this makes sense. I’m thinking, there’s object detection, semantic segmentation, etc, etc, and in the end, I want the robot to have a relative coordinate system, in a way. But robots will probably get by with just pixels and stochastic magic.
But the big idea for me, here, is transform monocular camera images into mesh objects. Those .obj files or whatever, could be imported into the physics engine, for training in simulation.
arxiv: https://arxiv.org/pdf/1809.05910v2.pdf
github: https://ranahanocka.github.io/MeshCNN/
The PhD candidate: https://www.cs.tau.ac.il/~hanocka/ – In the Q&A at the end, she mentions AtlasNet https://arxiv.org/abs/1802.05384 as only being able to address local structures. Latest research looks interesting too https://arxiv.org/pdf/2003.13326.pdf
ShapeNET https://arxiv.org/abs/1512.03012 seems to be a common resource, and https://arxiv.org/pdf/2004.15004v2.pdf and these obj files might be interesting https://www.dropbox.com/s/w16st84r6wc57u7/shrec_16.tar.gz