Using Multifocal Data (distinct cameras and resolutions) for Cultural Heritage #11

dberga · 2024-04-10T09:44:01Z

Issues posted in:

Nerfstudio
nerfstudio-project#3059
nerfstudio-project#3057

SDFstudio
autonomousvision/sdfstudio#307

dberga · 2024-04-16T16:04:32Z

Nerfstudio fails when the input images have different sizes, even if they were processed correctly by COLMAP or hloc.

When the COLMAP output is converted, the resulting transforms.json assumes a single camera:
https://github.com/dberga/nerfstudio/blob/main/nerfstudio/process_data/colmap_utils.py#L461
As a result, any transforms.json that uses only the first camera's intrinsics while the images have different sizes will crash during training (matrix size error):

RuntimeError: Error(s) in loading state_dict for NerfactoModel:
        size mismatch for field.embedding_appearance.embedding.weight: copying a param with shape torch.Size([10, 32]) from checkpoint, the shape in current model is torch.Size([17, 32]).
        size mismatch for camera_optimizer.pose_adjustment: copying a param with shape torch.Size([10, 6]) from checkpoint, the shape in current model is torch.Size([17, 6]).

A possible solution is to add a padding mechanism: choose a single target size for all images and zero-pad the smaller ones up to it. This can be dataset-agnostic.
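A minimal sketch of that padding idea, assuming the images are already loaded as NumPy arrays of shape (H, W, C). The function name `pad_to_common_size` is hypothetical and not part of nerfstudio. It also returns a validity mask per image so the padded pixels could be excluded from the loss.

```python
import numpy as np


def pad_to_common_size(images):
    """Zero-pad images on the bottom/right to the largest height and width.

    Hypothetical helper, not a nerfstudio API.
    Returns (padded_images, masks); a mask is True where the pixel is original.
    """
    max_h = max(img.shape[0] for img in images)
    max_w = max(img.shape[1] for img in images)
    padded, masks = [], []
    for img in images:
        h, w = img.shape[:2]
        # Allocate a zero canvas with the common size and the same channels/dtype.
        out = np.zeros((max_h, max_w) + img.shape[2:], dtype=img.dtype)
        out[:h, :w] = img
        mask = np.zeros((max_h, max_w), dtype=bool)
        mask[:h, :w] = True
        padded.append(out)
        masks.append(mask)
    return padded, masks
```

Padding only on the bottom and right leaves the pixel coordinates of the original content unchanged, so the focal lengths and principal point would not need to be shifted; only `w` and `h` would change.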

dberga · 2024-04-16T16:14:27Z

Here are example transforms.json layouts from LandMark: https://github.com/InternLandMark/LandMark

single focal example

{
    "camera_model": "SIMPLE_PINHOLE",
    "fl_x": 427,
    "fl_y": 427,
    "w": 547,
    "h": 365,
    "frames": [
        {
            "file_path": "./images/image_0.png",
            "transform_matrix": []
        }
    ]
}

multi focal example

{
    "camera_model": "SIMPLE_PINHOLE",
    "frames": [
        {
            "fl_x": 1116,
            "fl_y": 1116,
            "w": 1420,
            "h": 1065,
            "file_path": "./images/image_0.png",
            "transform_matrix": []
        }
    ]
}
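The two layouts above differ only in where the intrinsics live. As a rough sketch, a single-focal dictionary could be rewritten into the multi-focal layout by copying the top-level intrinsics into every frame. The helper names `to_multifocal` and `frame_intrinsics` are hypothetical, and the key list is limited to the keys shown in the examples; other intrinsics keys would need to be added if present.

```python
# Intrinsics keys taken from the examples above (assumption: extend as needed).
INTRINSIC_KEYS = ("fl_x", "fl_y", "w", "h")


def frame_intrinsics(meta, frame):
    """Return one frame's intrinsics, preferring per-frame values
    and falling back to the top-level ones."""
    out = {}
    for key in INTRINSIC_KEYS:
        if key in frame:
            out[key] = frame[key]
        elif key in meta:
            out[key] = meta[key]
    return out


def to_multifocal(meta):
    """Move top-level intrinsics into each frame (single focal -> multi focal)."""
    shared = {k: meta[k] for k in INTRINSIC_KEYS if k in meta}
    # Keep every other top-level field (e.g. camera_model) untouched.
    result = {
        k: v for k, v in meta.items()
        if k not in INTRINSIC_KEYS and k != "frames"
    }
    # Per-frame values, when present, override the shared ones.
    result["frames"] = [{**shared, **frame} for frame in meta.get("frames", [])]
    return result
```

After conversion, the `w` and `h` of an individual frame can be edited to match its actual image size without affecting the other frames.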

dberga · 2024-04-25T14:27:05Z

Merged in nerfstudio-project@db93476

dberga · 2024-04-26T20:10:42Z

Another solution (padding + mask path) nerfstudio-project#1465

dberga · 2024-05-23T13:43:23Z

An issue regarding colmap to transforms
nerfstudio-project#2784

dberga closed this as completed Apr 25, 2024

dberga reopened this Apr 25, 2024
