I am currently fine-tuning the Metric3D-small model on a drone perspective dataset and have several technical questions about the training process.
Questions
1. Learning Rate Configuration
Is the initial learning rate of 1e-6 (mentioned in the paper) applied equally to both encoder and decoder?
Do I need to scale the learning rate based on the batch size?
Are there recommended parameter settings specific to the small model?
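To make the batch-size question concrete, the rule I have in mind is the standard linear scaling rule; a minimal sketch (the reference batch size of 8 is my own placeholder, not a value from the paper):

```python
import math

# Linear LR scaling rule sketch: lr = base_lr * (batch_size / base_batch_size).
# base_batch=8 is an assumed reference, not a confirmed Metric3D setting.
def scaled_lr(base_lr: float, base_batch: int, batch: int) -> float:
    return base_lr * batch / base_batch

# e.g. the paper's 1e-6 at an assumed reference batch of 8, training at 32:
print(scaled_lr(1e-6, 8, 32))  # 4e-06
```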
2. Normal Branch Loss
In my training dataset, some data includes normal annotations. Should I enable NormalBranchLoss?
When I enable NormalBranchLoss, the loss becomes negative after training for some time. What could be causing this?
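To illustrate what I mean by a negative loss: my assumption (not verified against the repo) is that the normal loss contains an unshifted cosine term, which is bounded below by -1 rather than 0. A toy sketch of the two variants:

```python
import math

# Hypothetical sketch: an angular loss written as -cos(angle) goes negative as
# predictions align with ground truth; the shifted form 1 - cos stays >= 0.
def cos_loss(pred, gt, shifted=False):
    dot = sum(p * g for p, g in zip(pred, gt))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(g * g for g in gt))
    cos = dot / norm
    return 1.0 - cos if shifted else -cos

# Perfectly aligned normals:
print(cos_loss([0.0, 0.0, 1.0], [0.0, 0.0, 1.0]))                # -1.0
print(cos_loss([0.0, 0.0, 1.0], [0.0, 0.0, 1.0], shifted=True))  # 0.0
```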
3. Depth Scaling and Max Value
Because absolute depth values from a drone perspective are relatively large, max_val in RAFTDepthNormalDPT5 needs to be set to a larger value. Does this mean regress_scale needs a corresponding adjustment? What is the relationship between these two settings?
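My mental model of the relationship, which may well be wrong and is the thing I'd like confirmed: the head regresses a value on the order of regress_scale, and max_val maps it to metric depth, so enlarging max_val alone stretches each regressed unit over more metres. A toy sketch of that assumed mapping:

```python
# Toy sketch of my *assumed* relationship (not verified against the repo):
# the head outputs r roughly in [0, regress_scale]; metric depth is then
# r / regress_scale * max_val. Growing max_val (drone altitudes) with a fixed
# regress_scale means each unit of regressed output spans more metres.
def to_metric(r: float, regress_scale: float = 1.0, max_val: float = 200.0) -> float:
    return r / regress_scale * max_val

print(to_metric(0.5, regress_scale=1.0, max_val=200.0))  # 100.0
```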
4. Depth Normalization Challenges
After applying the LabelScaleCononical and RandomResize transforms, depth labels can exceed the original depth values, so max_val (depth_normalize) cannot be set from the original dataset's depth range. Is the only solution to pre-calculate max_val based on the transform pipeline?
I've tried adaptive max_val, but it leads to NaN gradients. Is an adaptive approach feasible?
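The pre-calculation I'm considering looks like the following; the scale-factor values are purely illustrative of my own config, since the actual factor from LabelScaleCononical depends on the canonical camera setup:

```python
# Illustrative worst-case max_val pre-calculation over the augmentation
# pipeline. The factor values are assumptions for my config, not repo defaults.
def pipeline_max_val(raw_max_depth: float,
                     canonical_scale_max: float,
                     resize_scale_max: float) -> float:
    # Worst case: both transforms scale the depth labels up simultaneously.
    return raw_max_depth * canonical_scale_max * resize_scale_max

# e.g. raw depths up to 300 m, canonical scaling up to 1.5x, resize up to 1.2x:
print(pipeline_max_val(300.0, 1.5, 1.2))  # 540.0
```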
Additional Context
Model: Metric3D-small
Dataset: Drone perspective datasets such as WildUAV and Mid-Air.
I would appreciate insights into these technical challenges. If my understanding is incorrect, please provide guidance. Thank you in advance!