From bc21e853f26136189cc94ae63e09406ca3b70a5c Mon Sep 17 00:00:00 2001
From: yyyyzy1 <1191867281@qq.com>
Date: Thu, 1 Aug 2024 16:53:14 +0800
Subject: [PATCH] 3.0

---
 index.html | 92 ++++++++++++++++++++++++++----------------------------
 1 file changed, 45 insertions(+), 47 deletions(-)

diff --git a/index.html b/index.html
index a46dee3..f2851e4 100644
--- a/index.html
+++ b/index.html
@@ -2,25 +2,25 @@
- - + + - - + + - - + + -detect:Cup"
+Seg "Cup"
-detect:"Broom"
+Seg "Broom"
-detect:"Toy"
+Seg "Toy"
-detect:"Cookie""
+Seg "Cookie""
-detect:"Chocolate"
+Seg "Chocolate"
-detect:"Mitts"
+Seg "Mitts"
- Dynamic 3D semantic Gaussians can be used for reconstructing and understanding dynamic scenes captured from a monocular camera,
+ Semantic 4D Gaussians can be used for reconstructing and understanding dynamic scenes captured from a monocular camera,
  resulting in a better handling of target information with temporal variations than static sences.
- However, most current work focuses on static scenes, directly applying static methods for dynamic scenes is impractical,
- as static methods fail to capture the temporal behaviors and features of dynamic targets.
- To the best of our knowledge, only one existing work focuses on semantic comprehension of dynamic scenes based on 3DGS.
- While this work demonstrates promising capabilities in simple scenes,
- it struggles to achieve high-fidelity rendering and accurate semantic features in scenarios where the background contains significant noise and the dynamic foreground exhibits substantial deformation and intricate textures.
- Because it simply combines dynamic reconstruction and understanding without considering the difference between static and dynamic Gaussians, leading to the mixture of static background and dynamic foreground features.
- To address these limitations, we propose SDHD-G,consists of hierarchical Gaussian flows and hierarchical rendering weights. The former realizes effective separation of static and dynamic rendering and their features.
- The former realizes effective separation of static and dynamic rendering and their features.
- The latter is employed in scenes with complex background noise (e.g. the “broom” scene in Hypernerf) to enhance the rendering quality of dynamic foregrounds.
- Extensive experiments show that our method consistently outperforms previous method on synthetic and real-world datasets.
-
+ However, most recent work focuses on the semantics of static scenes. Directly applying them to understand dynamic scenes is impractical,
+ which fail to capture the temporal behaviors and features of dynamic targets.
+ To the best of our knowledge, few existing works focus on semantic comprehension of dynamic scenes based on 3DGS.
+ While demonstrating promising capabilities in simple scenes, it struggles to achieve high-fidelity rendering and accurate semantic features in scenarios where the static background contains significant noise and the dynamic foreground exhibits substantial deformation with intricate textures.
+ Because a uniform update strategy is applied to all Gaussians, overlooking the distinctions and interaction between dynamic and static distributions.
+ This leads to artifacts and noise during semantic segmentation, especially between dynamic foreground and static background.
+ To address these limitations, we propose the Dual-Hierarchical Optimization(DHO),
+ which consists hierarchical Gaussian flow and hierarchical rendering guidance. The former implements effective separation of static and dynamic rendering and their features.
+ The latter helps mitigate the issue of dynamic foreground rendering distortion in scenes where the static background has complex noise (e.g. the “broom” scene in HyperNeRF dataset).
+ Extensive experiments show that our method consistently outperforms previous method on both synthetic and real-world datasets.
- Dynamic 3D Gaussians Distillation utilizes 3D Gaussian representation and optimizes
- spatial parameters of the Gaussians and their deformation, concurrently with
- appearance properties with a semantic feature per Gaussian. Our learned representation
- enables efficient semantic understanding and manipulation of dynamic 3D scenes.
+ The overall pipeline of our model. We add semantic properties to each Gaussian and obtain the geometric deformation of the Gaussian at each timestamp t through the deformation field.
+ In the coarse stage, Gaussians are subjected to geometric constraints, while in the fine stage, geometric constraints are relaxed and semantic feature constraints are introduced.
+ We utilize dynamic foreground masks obtained from scene priors for hierarchical weighted rendering of the scene, enhancing the rendering quality of dynamic foreground in complex backgrounds.
- The following results show the novel view rendering views and the extracted semantic feature maps using our method,
- evaluated on both the real-world HyperNeRF dataset and the synthetic D-NeRF dataset. The visualization of the feature maps is displayed using PCA for dimensionality reduction.
+ The following results show the novel rendering views and the extracted semantic feature maps using our method,
+ evaluated on both the real-world HyperNeRF dataset and the synthetic D-NeRF dataset. The visualization of the feature maps is displayed using PCA for dimension reduction.
-remove "Cookie"
+Remove "Cookie"
-remove "Lemon"
+Remove "Lemon"