I'm using the 'vitl' encoder with the 'vkitti' (outdoor) model. I ran it on a random outdoor image containing cars, and the depth was not very accurate.
The image I used is:
I viewed the image with OpenCV, got the (x, y) coordinates of the point whose depth I wanted, and read the model's output at that coordinate: it gave me 6.5 meters. Looking at the car in the image, it is clearly at least 80 or 100 meters away.
Can this model only predict CORRECTLY up to 80 meters?
If you refer to the metric depth estimation demo code, it is mentioned at the end that the output is a depth map in meters as a numpy array: depth = model.infer_image(raw_img) # HxW depth map in meters in numpy
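For reference, here is a minimal sketch of looking up the depth at one pixel, following the metric-depth README; the checkpoint path, config values, and the example (x, y) are placeholders, so treat them as assumptions. One thing worth checking: the returned array is HxW, so it must be indexed as depth[y, x] (row first), not depth[x, y].

```python
import cv2
import torch
from depth_anything_v2.dpt import DepthAnythingV2

# Config as in the metric-depth README for the 'vitl' encoder (assumed values).
model_configs = {
    'vitl': {'encoder': 'vitl', 'features': 256, 'out_channels': [256, 512, 1024, 1024]},
}
encoder, dataset, max_depth = 'vitl', 'vkitti', 80  # outdoor model, 80 m cap

model = DepthAnythingV2(**{**model_configs[encoder], 'max_depth': max_depth})
model.load_state_dict(torch.load(
    f'checkpoints/depth_anything_v2_metric_{dataset}_{encoder}.pth', map_location='cpu'))
model = model.eval()

raw_img = cv2.imread('outdoor_scene.jpg')   # BGR image, as expected by infer_image
depth = model.infer_image(raw_img)          # HxW depth map in meters (numpy)

x, y = 640, 360                             # pixel picked in the OpenCV window (example)
print(f'depth at (x={x}, y={y}): {depth[y, x]:.2f} m')  # note: row index (y) comes first
```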
Unfortunately, no. I think this model gives good metric results for indoor environments. Depth estimation is a very complex task in itself that depends on many factors to be accurate. If you want to find the depth of outdoor points, try learning about camera calibration, where you'll understand how depth is actually computed.
A stereo vision setup is far better for finding actual depth than these models. Maybe in the future these models will produce great results, but even then a stereo setup will stay on top (at least in my opinion). The datasets these models train on are themselves produced by stereo camera rigs. Setting up stereo vision is a hassle, but the outputs are accurate. Learn how to mount two cameras at a fixed baseline, capture frames from the left and right cameras, find correspondences, and compute their disparities. Once you have the disparity for every pixel, even one or two known depth reference points in the scene let you calculate accurate depth for the whole image.
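As a rough sketch of that pipeline using OpenCV's built-in block matcher: with rectified left/right frames, a disparity d at a pixel gives depth via the pinhole relation Z = f * B / d, where f is the focal length in pixels and B is the baseline in meters. The focal length, baseline, matcher parameters, and file paths below are example values, not something from this repo.

```python
import cv2
import numpy as np

# Rectified frames from a calibrated stereo pair (placeholder paths).
left = cv2.imread('left.png', cv2.IMREAD_GRAYSCALE)
right = cv2.imread('right.png', cv2.IMREAD_GRAYSCALE)

# Semi-global block matcher; numDisparities and blockSize need tuning per scene.
matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # SGBM output is fixed-point *16

# Depth from the standard relation Z = f * B / d (example calibration values).
focal_px = 720.0      # focal length in pixels, from calibration
baseline_m = 0.12     # distance between the two cameras in meters
valid = disparity > 0
depth = np.zeros_like(disparity)
depth[valid] = focal_px * baseline_m / disparity[valid]

x, y = 640, 360
print(f'stereo depth at (x={x}, y={y}): {depth[y, x]:.2f} m')
```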
Nonetheless, DepthAnythingV2 is a very good model for indoor depth maps; for outdoor scenes, from what I tried, it only seems reliable up to some meters.