Hello,
I am interested in using the Kitti360 dataset and I have some questions.
Firstly, what is the structure of the 3D bounding boxes? This is how I understand it: for every object, there is a matrix of vertices (basically the corner points of the bounding box). This gives us a bounding box at the origin of the coordinate frame (camera). The box is then translated and rotated using the transform matrix. Is this correct?
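The understanding described above (local vertices mapped to world coordinates by a 4x4 pose) can be sketched as follows. The vertex values and the identity-plus-translation pose here are hypothetical placeholders, not actual KITTI-360 annotation data; in practice both come from the per-object annotation.

```python
import numpy as np

# Hypothetical unit box centred at the local origin (8 corners).
vertices = np.array([
    [-0.5, -0.5, -0.5], [ 0.5, -0.5, -0.5],
    [ 0.5,  0.5, -0.5], [-0.5,  0.5, -0.5],
    [-0.5, -0.5,  0.5], [ 0.5, -0.5,  0.5],
    [ 0.5,  0.5,  0.5], [-0.5,  0.5,  0.5],
])

# Placeholder 4x4 pose (rotation R in the top-left 3x3, translation t in
# the last column); a real annotation supplies this matrix per object.
transform = np.eye(4)
transform[:3, 3] = [10.0, 2.0, 0.0]

# Append a homogeneous 1 to each vertex, then apply the pose.
hom = np.hstack([vertices, np.ones((len(vertices), 1))])
world = (transform @ hom.T).T[:, :3]

print(world[0])  # first corner shifted by the translation: [9.5 1.5 -0.5]
```

If that reading is right, the vertex ordering is fixed in the local frame and the transform alone carries the object's pose.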
Are the 3D bounding boxes created using a convention? For example, in 2D the DOTA dataset contains 4 points, where the first point represents the front left of the object (at least in most cases). Does the order of vertices matter in this sense, or does the transform matrix contain this information? (I am mostly interested in whether the bounding boxes are symmetrical; that is, does it matter if the rotation is 0 or 180 degrees?)
Thanks
I made a few tests:
Basically, you can get the rotation angle about the Z axis of each object from its rotation matrix. This angle represents the direction the front of the object faces (at least for most categories) in the global coordinate system. The cam2world rotation matrix then gives you the camera's rotation angle in the current frame (offset by 90 degrees), also in the global coordinate system. Subtracting one from the other gives the object's direction angle relative to the current camera position.
This works for most vehicles (cars, riders, bicycles, trucks, etc.), but not so much for pedestrians. Buildings seem to be oriented towards the street. There are a few mistakes (e.g. some motorcycles are wrongly annotated, especially if under a blanket).
Waiting for a confirmation from the authors on my results.
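The yaw-subtraction procedure described in the reply can be sketched as below. The rotation matrices are synthetic stand-ins built from known angles (real ones would come from the object annotation and the cam2world poses), and the 90-degree camera offset mentioned above is left out for clarity.

```python
import numpy as np

def yaw_from_rotation(R):
    """Rotation angle about the Z axis, assuming a Z-up world frame."""
    return np.arctan2(R[1, 0], R[0, 0])

def rot_z(angle):
    """Helper to build a synthetic Z-axis rotation matrix for the example."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Hypothetical headings in the global frame: object faces 120 deg,
# camera faces 30 deg.
obj_R = rot_z(np.deg2rad(120))
cam_R = rot_z(np.deg2rad(30))

# Object heading relative to the camera, wrapped to (-180, 180].
rel = np.rad2deg(yaw_from_rotation(obj_R) - yaw_from_rotation(cam_R))
rel = (rel + 180.0) % 360.0 - 180.0
print(rel)  # 90.0
```

The wrap step matters for the symmetry question in the original post: without it, a 0-vs-180-degree ambiguity in the annotation would show up as a sign flip in the relative angle.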