Elon Musk’s AI startup training robots in VR

OpenAI, the research company set up by Elon Musk to build safe AI, has revealed a new method for teaching robots using VR.

The non-profit company, which is currently made up of 60 researchers and engineers, has released research showing that it trained a self-learning algorithm to complete a task using VR demonstrations.

The company wanted to teach the machine how to stack blocks in particular orders.

Once the algorithm had been shown how to complete the task once in a VR simulation by a human, it was deployed in a physical robot. The machine was then able to solve the task from an arbitrary stating configuration of coloured blocks.

The company is calling the new algorithm an example of ‘one-shot imitation learning’.

The aim of this kind of machine learning is to allow robots to learn from a limited number of demonstrations but then, crucially, be able to generalise what it has learnt to new situations of the same task.  

Two neural networks

The system makes use of two distinct kinds of neural networks, one relating to vision and another relating to imitation.

The vision network takes images from the camera to represent the position of the objects laid out before it. This network is trained with large amounts of simulated images with different examples of lighting, texture and objects.

Interestingly, the vision network is never trained using a ‘real’ image.

The imitation network, on the other hand, views a demonstration and then processes it to try and ascertain the basic intent of the task. The network must then try to generalise the intent of the task to new settings.

With regards to block stacking, the machine was shown demonstrations that stacked the blocks into the same order, but from different starting configurations. The network then tries to match the order of coloured blocks from different starting positions.  

You can learn more about this project here

Related Post

Leave a Reply

Your email address will not be published. Required fields are marked *