I've continued working on my lightfields project, and with some help I've had relative success, which I'm happily publishing here. A warning: I'm 100% amateur, so expect a lot of errors and weird things, and the code is not that clean either (lack of time, or too lazy?). I've published both a GitHub repo and a Colab notebook with the code to train the DeepView model on both the Spaces dataset and the Real Estate 10K (RE10k) dataset. The RE10k dataset is very large and needs pre-processing (downloading all the YouTube videos and extracting specific frames), so I've also published it in a more readily consumable format, as a set of 39 GitLab repos. Here's the first repo (change the number, up to 39, for the other ones).
Here's the viewer for the MPI for the first Spaces scene, after some training. I've used 200x200 px tiles and just 10 depth layers. NOTE: drag your mouse or click on the buttons to change the camera POV!
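For anyone wondering how a stack of depth layers turns into one image: an MPI is rendered by blending its RGBA layers back to front with the standard "over" operation. Here's a minimal single-pixel sketch in plain Python. It is not taken from the viewer's code; the function name `composite_over` and the non-premultiplied color convention are my own assumptions for illustration.

```python
from typing import List, Tuple

RGBA = Tuple[float, float, float, float]  # all values in [0, 1]


def composite_over(layers: List[RGBA]) -> Tuple[float, float, float]:
    """Blend one pixel's MPI layers, ordered from farthest to nearest.

    Assumption: colors are NOT premultiplied by alpha.
    """
    r = g = b = 0.0  # start from a black background
    for lr, lg, lb, alpha in layers:
        # "Over" operator: the nearer layer covers what is behind it
        # in proportion to its alpha.
        r = lr * alpha + r * (1.0 - alpha)
        g = lg * alpha + g * (1.0 - alpha)
        b = lb * alpha + b * (1.0 - alpha)
    return (r, g, b)


# Example: an opaque red far layer behind a half-transparent blue near layer.
pixel = composite_over([(1.0, 0.0, 0.0, 1.0), (0.0, 0.0, 1.0, 0.5)])
```

A real renderer does this for every pixel of every tile, after warping each layer to the target camera, but the blending rule per pixel is the same.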
The first conclusion, and the most important one for me, is that the system is highly sensitive to the camera parameters. If the inferred camera positions and rotations are not accurate enough, the model just won't be able to create a working MPI. So my takeaway is that I need to work harder on establishing or inferring the camera extrinsics if I want to create MPIs from images or videos recorded by myself or others. I already have some ideas about how to go about it. The camera rig I'm using will sit on a planar surface, which is a strong prior, and since the cameras are fixed I could even calculate the extrinsics somewhat manually. Before doing that, I'm going to try that “manual” method to infer the extrinsics for the cameras of the first Spaces scene.
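To make the "manual" idea concrete, here's a rough sketch of what computing extrinsics by hand could look like for a fixed planar rig. Everything specific in it is an assumption on my part, not something I've validated: the cameras sit on a regular grid in the z = 0 plane, they all point the same way with no relative rotation (so the rotation is the identity), and the helper names `grid_centers`, `planar_rig_extrinsics` and `world_to_camera` are made up for this example. With those assumptions the world-to-camera matrix for a camera centered at C is simply [I | -C].

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Matrix3x4 = List[List[float]]


def grid_centers(rows: int, cols: int, spacing_m: float) -> List[Vec3]:
    """Camera centers on the z = 0 plane, with the grid centered at the origin.

    Assumption: a regular grid with equal spacing (in meters) on both axes.
    """
    centers = []
    for r in range(rows):
        for c in range(cols):
            x = (c - (cols - 1) / 2.0) * spacing_m
            y = (r - (rows - 1) / 2.0) * spacing_m
            centers.append((x, y, 0.0))
    return centers


def planar_rig_extrinsics(centers: List[Vec3]) -> List[Matrix3x4]:
    """One 3x4 world-to-camera matrix [R | t] per camera.

    Assumption: all cameras share the world orientation, so R = I
    and t = -R * C = -C.
    """
    return [
        [
            [1.0, 0.0, 0.0, -cx],
            [0.0, 1.0, 0.0, -cy],
            [0.0, 0.0, 1.0, -cz],
        ]
        for cx, cy, cz in centers
    ]


def world_to_camera(extrinsic: Matrix3x4, point: Vec3) -> Vec3:
    """Apply [R | t] to a world point, giving camera-space coordinates."""
    x, y, z = point
    cam = tuple(row[0] * x + row[1] * y + row[2] * z + row[3] for row in extrinsic)
    return (cam[0], cam[1], cam[2])


# Example: a hypothetical 2x2 rig with 10 cm between neighboring cameras.
centers = grid_centers(2, 2, 0.1)
extrinsics = planar_rig_extrinsics(centers)
```

A real rig will have small rotations and offsets that this ignores, so at best this gives a starting point that would still need to be checked against what the model actually tolerates.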
However, apart from that, in my book I've been successful. I've trained the model and it works well enough both on the Spaces dataset it was trained on and on other datasets such as RE10k, even though it was not trained on them. Not bad for an amateur's first ML project!
Furthermore, there are a number of considerations or improvements that can be explored.
This adventure has not ended yet, so expect a follow-up at some point in the future. I still need to get to a point where I can produce MPIs from my own images/videos. Also, I welcome comments and advice!