Zhenyu Li¹, Mykola Lavreniuk², Jian Shi¹, Shariq Farooq Bhat¹, Peter Wonka¹
¹KAUST, ²Space Research Institute NASU-SSAU
- 2024-12-01: Initial release of the project page, paper, code, and offline demo.
Install the environment using environment.yaml:
Using mamba (fastest):
mamba env create -n amodaldepth --file environment.yaml
mamba activate amodaldepth
Using conda:
conda env create -n amodaldepth --file environment.yaml
conda activate amodaldepth
Install pix2gestalt:
git clone https://github.com/cvlab-columbia/pix2gestalt.git
mv ./pix2gestalt ./pix2gestalt_raw
mv ./pix2gestalt_raw/pix2gestalt ./pix2gestalt
rm -rf ./pix2gestalt_raw
Install taming-transformers and CLIP:
git clone https://github.com/CompVis/taming-transformers.git
pip install -e taming-transformers/
git clone https://github.com/openai/CLIP.git
pip install -e CLIP/
Model | Checkpoint | Description |
---|---|---|
Base Depth-Anything-V2 Model | Download | Base depth model. Save to work_dir/ckp/amodal_depth_anything_base.pth |
Amodal-Depth-Anything Model | HuggingFace Model | Amodal depth model. Downloaded automatically |
SAM | Download | Segmentation model. Save to work_dir/ckp/pix2gestalt/sam_vit_h.pth |
pix2gestalt | Download | Amodal segmentation model. Save to work_dir/ckp/pix2gestalt/epoch=000005.ckpt |
wget -c -P ./work_dir/ckp/ https://huggingface.co/zhyever/Amodal-Depth-Anything/resolve/main/base_depth_model/amodal_depth_anything_base.pth
wget -c -P ./work_dir/ckp/pix2gestalt/ https://gestalt.cs.columbia.edu/assets/sam_vit_h.pth
wget -c -P ./work_dir/ckp/pix2gestalt/ https://gestalt.cs.columbia.edu/assets/epoch=000005.ckpt
Before executing the code, make sure the folder structure is as follows:
Amodal-Depth-Anything
├── assets
├── CLIP
├── config
├── data_split
├── pix2gestalt
│ ├── configs
│ ├── ldm
│ ├── ... other files
├── src
├── taming-transformers
├── work_dir
│ ├── ckp
│ │ ├── amodal_depth_anything_base.pth
│ │ ├── pix2gestalt
│ │ │ ├── epoch=000005.ckpt
│ │ │ ├── sam_vit_h.pth
├── app.py
├── ... other files
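Before running anything, it can help to confirm that the three manually downloaded checkpoints are where the tree above expects them. The following is a minimal sketch, not part of the repository: the helper name `missing_checkpoints` and the constant `EXPECTED_CHECKPOINTS` are hypothetical, and the paths are copied from the table and folder structure above.

```python
from pathlib import Path

# Relative checkpoint paths copied from the table and folder tree above.
EXPECTED_CHECKPOINTS = (
    "work_dir/ckp/amodal_depth_anything_base.pth",
    "work_dir/ckp/pix2gestalt/sam_vit_h.pth",
    "work_dir/ckp/pix2gestalt/epoch=000005.ckpt",
)


def missing_checkpoints(repo_root="."):
    """Return the expected checkpoint paths that are not files under repo_root.

    Hypothetical helper for illustration; it is not shipped with the project.
    """
    root = Path(repo_root)
    return [rel for rel in EXPECTED_CHECKPOINTS if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints(".")
    if missing:
        print("Missing checkpoints:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected checkpoints found.")
```

Run it from the repository root. An empty result only means the files exist at those paths; it does not verify that the downloads are complete or uncorrupted.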
Run infer.py to estimate amodal depth from an input image and its amodal mask:
python ./infer.py --input_image_path ./assets/inference_examples/case1.jpg --input_mask_path ./assets/inference_masks/case1_mask.png --output_folder ./assets/results/
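To process several images, the command above can be wrapped in a small loop. This is a sketch under stated assumptions: it uses only the three flags shown in the command above, and it assumes each mask is named `<image stem>_mask.png`, a convention inferred from the single `case1.jpg` / `case1_mask.png` example rather than documented by the project. The helper names `build_infer_command` and `paired_inputs` are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def build_infer_command(image_path, mask_path, output_folder):
    """Build the infer.py invocation using the three flags shown above."""
    return [
        sys.executable, "./infer.py",
        "--input_image_path", str(image_path),
        "--input_mask_path", str(mask_path),
        "--output_folder", str(output_folder),
    ]


def paired_inputs(image_dir, mask_dir):
    """Yield (image, mask) pairs.

    Assumption: masks are named <image stem>_mask.png, as in the case1 example.
    Images without a matching mask are skipped.
    """
    for image in sorted(Path(image_dir).glob("*.jpg")):
        mask = Path(mask_dir) / f"{image.stem}_mask.png"
        if mask.is_file():
            yield image, mask


if __name__ == "__main__":
    pairs = paired_inputs("./assets/inference_examples", "./assets/inference_masks")
    for image, mask in pairs:
        # check=True stops the loop at the first failing run.
        subprocess.run(build_infer_command(image, mask, "./assets/results/"), check=True)
```

Each image is handled by a separate process, so the model is reloaded on every run. That keeps the sketch independent of infer.py internals at the cost of speed.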
Not sure how to get amodal masks? Try our offline demo, which implements both the Model Heuristics mode (powered by pix2gestalt) and the Human Heuristics mode (drawing masks manually).
Run app.py to start the offline demo:
python ./app.py
After that, you should see output like the following:
... tons of logs
Running on local URL: http://127.0.0.1:7860
Running on public URL: https://xxxx.gradio.live
Simply open the local URL in your browser to start the demo.
- 2024-12-01: Online Gradio Demo
- 2024-12-01: Dataset Preparation and Training Docs