# GS-CPR

**Repository Path**: junhao_wei/gs-cpr

## Basic Information

- **Project Name**: GS-CPR
- **Default Branch**: master
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-09-25
- **Last Updated**: 2025-09-25

## README

# GS-CPR: Efficient Camera Pose Refinement via 3D Gaussian Splatting
**[Changkun Liu](https://lck666666.github.io/),
[Shuai Chen](https://scholar.google.com/citations?user=c0xTh_YAAAAJ&hl=en), 
[Yash Bhalgat](https://scholar.google.com/citations?user=q0VSEHYAAAAJ&hl=en),
[Siyan Hu](https://scholar.google.com/citations?user=S56rLU4AAAAJ&hl=en), 
[Ming Cheng](https://scholar.google.com/citations?user=MPyUxv4AAAAJ&hl=en),
[Zirui Wang](https://scholar.google.com/citations?user=zCBKqa8AAAAJ&hl=en), 
[Victor Prisacariu](https://scholar.google.com/citations?user=GmWA-LoAAAAJ&hl=en) 
and [Tristan BRAUD](https://scholar.google.com/citations?user=ZOZtoQUAAAAJ&hl=en)**

**International Conference on Learning Representations (ICLR) 2025**

**[Project Page](https://xrim-lab.github.io/GS-CPR/) | [Paper](https://openreview.net/forum?id=mP7uV59iJM) | [Video](https://youtu.be/9xhcpLLu7Kg)**

[![GS-CPR](framework_imgs/Method.jpg)](https://openreview.net/forum?id=mP7uV59iJM)
[![GS-CPR_rel](framework_imgs/Method_rel.jpg)](https://openreview.net/forum?id=mP7uV59iJM)

## Installation
### ACT Scaffold-GS environment
We tested our code with CUDA 12.1, PyTorch 2.5.1, and Python 3.11+.

Clone this repo:

```
git clone https://github.com/XRIM-Lab/GS-CPR.git
cd GS-CPR
```
### Install dependencies for ACT Scaffold-GS rendering

```
cd ACT_Scaffold_GS
conda create -n scaffold_act python=3.11
conda activate scaffold_act
conda install pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia 
pip install -r requirements.txt

# install Tiny-cuda-nn
pip install ninja
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch

# install depth rendering for 3DGS
git clone git@github.com:leo-frank/diff-gaussian-rasterization-depth.git
cd diff-gaussian-rasterization-depth
python setup.py install
```
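
After installation, it can help to confirm that the active environment matches the tested versions listed above. The following is a minimal sketch, not part of the repository: `describe_environment` is a hypothetical helper name, and it only relies on standard PyTorch attributes (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`).

```python
import importlib.util
import sys


def describe_environment():
    """Collect Python/PyTorch/CUDA versions without failing if PyTorch is absent."""
    info = {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "torch": None,
        "cuda": None,
        "cuda_available": False,
    }
    if importlib.util.find_spec("torch") is not None:
        import torch

        info["torch"] = torch.__version__
        info["cuda"] = torch.version.cuda  # None for CPU-only builds
        info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as `False`, the CUDA extensions installed above will most likely not work in that environment.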

### Pretrained 3DGS models, COLMAP files and ACT weights
You can download the pretrained 3DGS models from the provided [link](https://hkustconnect-my.sharepoint.com/:f:/g/personal/cliudg_connect_ust_hk/ElfOnz0vRm9Ot6j47CDzFaoBJrGKoqKGLfb6xYSuMwf7WQ?e=Rrc98i) and unzip them into the folder `GS-CPR/ACT_Scaffold_GS/data/`. The pretrained ACT MLP models are available from the same [link](https://hkustconnect-my.sharepoint.com/:f:/g/personal/cliudg_connect_ust_hk/ElfOnz0vRm9Ot6j47CDzFaoBJrGKoqKGLfb6xYSuMwf7WQ?e=Rrc98i); put them in the folder `GS-CPR/ACT_Scaffold_GS/logs/`. The resulting layout should look like this:
```
ACT_Scaffold_GS
├── data
│   ├── cambridge
│   ├── 7scenes
│   ├── 12scenes
│   ├── Cambridge_semantic
├── logs
│   ├── paper_models
```
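
Before rendering, a quick check that the downloads landed in the right place can save a failed run. This is a small sketch under the assumption that the folder names are exactly those in the tree above; `missing_dirs` and `EXPECTED_DIRS` are hypothetical names, not part of the repository.

```python
from pathlib import Path

# Folder names copied from the tree above.
EXPECTED_DIRS = [
    "data/cambridge",
    "data/7scenes",
    "data/12scenes",
    "data/Cambridge_semantic",
    "logs/paper_models",
]


def missing_dirs(root):
    """Return the expected sub-folders that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    # Run from the GS-CPR repository root.
    missing = missing_dirs("ACT_Scaffold_GS")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All expected folders are present.")
```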


Then run the following commands to render the synthetic images from the `coarse_poses`.
```
# generate rendered images based on coarse poses for 7Scenes
bash script_render_pred_7s.sh
# generate rendered images based on coarse poses for 12Scenes
bash script_render_pred_12s.sh
```

For the Cambridge Landmarks dataset, rendering also requires a calibrated camera intrinsics file for each image. These files are prepared in the [Datasets](#datasets-raw-images--poses--intrinsics) section.
```
# generate rendered images based on coarse poses for Cambridge Landmarks
bash script_render_pred_cam.sh
```

NOTE: For the 7Scenes COLMAP files, we improve the accuracy of the [sparse point cloud](https://github.com/cvg/Hierarchical-Localization/tree/master/hloc/pipelines/7Scenes) (courtesy of Torsten Sattler) using the rendered dense depth maps in the [HLoc](https://github.com/cvg/Hierarchical-Localization/tree/master/hloc/pipelines/7Scenes) toolbox (courtesy of Eric Brachmann for [DSAC*](https://github.com/vislearn/dsacstar)). We then align all poses in `sparse/0/images.txt` to the SfM poses from [ICCV 2021](https://github.com/tsattler/visloc_pseudo_gt_limitations). For the 12Scenes COLMAP files, we use the SfM models provided by [ICCV 2021](https://github.com/tsattler/visloc_pseudo_gt_limitations). For Cambridge Landmarks, we use the SfM models from the [HLoc](https://github.com/cvg/Hierarchical-Localization/tree/master/hloc/pipelines/Cambridge) toolbox, courtesy of Torsten Sattler. **All of these COLMAP files and pretrained 3DGS models are included in the download link above.**
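
To inspect the poses stored in `sparse/0/images.txt`, the sketch below parses the standard COLMAP text format, in which each image takes two lines: a pose line (`IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME`) followed by a line of 2D points. It is an illustration only, not the alignment code used to prepare the files; `read_colmap_images` and `quat_to_rotmat` are hypothetical helper names.

```python
def quat_to_rotmat(qw, qx, qy, qz):
    """Rotation matrix (nested lists) for a unit quaternion (w, x, y, z)."""
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qw * qz), 2 * (qx * qz + qw * qy)],
        [2 * (qx * qy + qw * qz), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qw * qx)],
        [2 * (qx * qz - qw * qy), 2 * (qy * qz + qw * qx), 1 - 2 * (qx * qx + qy * qy)],
    ]


def read_colmap_images(text):
    """Map image name -> (quaternion, translation, camera center)."""
    # Drop comment lines but keep blank ones: an image without 2D points
    # still occupies two lines, the second of which is empty.
    lines = [ln for ln in text.splitlines() if not ln.startswith("#")]
    poses = {}
    for pose_line in lines[0::2]:  # every second line holds 2D points
        fields = pose_line.split(maxsplit=9)
        if len(fields) < 10:
            continue  # skip blank or malformed lines
        quat = tuple(float(v) for v in fields[1:5])
        trans = tuple(float(v) for v in fields[5:8])
        rot = quat_to_rotmat(*quat)
        # The stored pose is world-to-camera, so the camera center is -R^T t.
        center = tuple(
            -sum(rot[j][i] * trans[j] for j in range(3)) for i in range(3)
        )
        poses[fields[9].strip()] = (quat, trans, center)
    return poses
```

A typical call would be `read_colmap_images(open("sparse/0/images.txt").read())`, with the path adjusted to the scene folder of interest.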

## Train Scaffold-GS models

To train new Scaffold-GS models, you need a COLMAP-format `sparse/` folder (see the examples and data structure in the pretrained models). When training new models, also remember to comment out the line `if len(cam_infos) <= 1:` in `GS-CPR/ACT_Scaffold_GS/scene/dataset_readers.py` so that the camera info for every image is loaded.
```python
def readColmapCameras(cam_extrinsics, cam_intrinsics, images_folder):
    cam_infos = []
    for idx, key in enumerate(cam_extrinsics):
        #if len(cam_infos) <= 1:                #comment when training new 3DGS models
            sys.stdout.write('\r')
            # the exact output you're looking for:
            sys.stdout.write("Reading camera {}/{}".format(idx+1, len(cam_extrinsics)))
            sys.stdout.flush()
```

Then run:
```
# train Scaffold-GS models with ACT modules
bash train_scaffold_act.sh

# train Scaffold-GS models without ACT modules
bash train_scaffold.sh
```

## Install dependencies for GS-CPR refinement
Create the environment by following the [MASt3R](https://github.com/naver/mast3r#demo) instructions, then activate it:
```
cd GS-CPR
conda activate mast3r
```

## Datasets (raw images + poses + intrinsics)
This paper uses three public datasets:
- [Microsoft 7-Scenes](https://www.microsoft.com/en-us/research/project/rgb-d-dataset-7-scenes/)
- [Cambridge Landmarks](https://www.repository.cam.ac.uk/handle/1810/251342/)
- [Stanford 12-Scenes](https://graphics.stanford.edu/projects/reloc/)

Following [ACE](https://github.com/nianticlabs/ace), we utilize the same scripts in the `datasets` folder to automatically download and extract the data in a consistent format.

> **Important: make sure you have checked the license terms of each dataset before using it.**

#### {7, 12}-Scenes:

You can use the `datasets/setup_{7,12}scenes.py` scripts to download the data.
As mentioned in our paper, we experimented with _Pseudo Ground Truth (PGT)_ camera poses obtained by running SfM on the scenes (see the [ICCV 2021 paper](https://openaccess.thecvf.com/content/ICCV2021/html/Brachmann_On_the_Limits_of_Pseudo_Ground_Truth_in_Visual_Camera_ICCV_2021_paper.html)
and the [associated code](https://github.com/tsattler/visloc_pseudo_gt_limitations/) for details).

To download and prepare the datasets using the PGT poses:

```shell
cd datasets
# Downloads the data to datasets/pgt_7scenes_{chess, fire, ...}
./setup_7scenes.py --poses pgt
# Downloads the data to datasets/pgt_12scenes_{apt1_kitchen, ...}
./setup_12scenes.py --poses pgt
``` 
To try the DSLAM poses instead, follow the [ACE](https://github.com/nianticlabs/ace) instructions to download them.

#### Cambridge Landmarks

Simply run:

```shell
cd datasets
# Downloads the data to datasets/Cambridge_{GreatCourt, KingsCollege, ...}
./setup_cambridge.py
```

## GS-CPR Refinement Evaluation
```
#For 7Scenes
python gs_cpr_7s.py --pose_estimator ace --scene chess #for a specific scene
python gs_cpr_7s.py --pose_estimator ace --test_all #for the whole dataset

#For 12Scenes
python gs_cpr_12s.py --pose_estimator ace --scene apt1_kitchen
python gs_cpr_12s.py --pose_estimator ace --test_all #for the whole dataset

#For Cambridge Landmarks
python gs_cpr_cam.py --pose_estimator ace --scene ShopFacade
python gs_cpr_cam.py --pose_estimator ace --test_all #for the whole dataset
```
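
When only a handful of scenes is needed, the per-scene commands above can be generated from a short Python driver. The sketch below only assembles and prints the command lines (a dry run); it uses the script names and the `--pose_estimator`, `--scene` and `--test_all` flags shown above, while `build_command` and the `RUNS` list are hypothetical names chosen for illustration.

```python
import shlex


def build_command(script, pose_estimator, scene=None):
    """Build the argument list for one run; no scene means the whole dataset."""
    cmd = ["python", script, "--pose_estimator", pose_estimator]
    cmd += ["--scene", scene] if scene else ["--test_all"]
    return cmd


# (script, pose estimator, scene) triples taken from the commands above.
RUNS = [
    ("gs_cpr_7s.py", "ace", "chess"),
    ("gs_cpr_12s.py", "ace", "apt1_kitchen"),
    ("gs_cpr_cam.py", "ace", "ShopFacade"),
]

if __name__ == "__main__":
    for script, estimator, scene in RUNS:
        # Print instead of executing, so the list can be reviewed first.
        print(shlex.join(build_command(script, estimator, scene)))
```

To execute the runs instead of printing them, each list returned by `build_command` can be passed to `subprocess.run(cmd, check=True)`.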

## GS-CPR_rel Refinement Evaluation
```
#For 7Scenes
python gs_cpr_7s_rel.py --pose_estimator dfnet --scene chess #for a specific scene
python gs_cpr_7s_rel.py --pose_estimator dfnet --test_all #for the whole dataset

#For Cambridge Landmarks
python gs_cpr_cam_rel.py --pose_estimator dfnet --scene ShopFacade
python gs_cpr_cam_rel.py --pose_estimator dfnet --test_all #for the whole dataset
```

The refined pose for each query is written to `txt` files, and the summary statistics are written to `log` files, both under `GS-CPR/outputs`.
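
Localization benchmarks commonly report the median translation and rotation error between estimated and reference poses. The sketch below shows how such errors can be computed for poses given as a position and a `(w, x, y, z)` quaternion. It is a generic illustration: it does not read the output files (their exact layout is not described here), and all function names are hypothetical.

```python
import math
from statistics import median


def translation_error(t_est, t_gt):
    """Euclidean distance between two camera positions (same unit as the input)."""
    return math.dist(t_est, t_gt)


def rotation_error_deg(q_est, q_gt):
    """Smallest rotation angle, in degrees, between two quaternions (w, x, y, z)."""
    norm = math.sqrt(sum(a * a for a in q_est)) * math.sqrt(sum(b * b for b in q_gt))
    # abs() treats q and -q as the same rotation; min() guards against rounding.
    dot = abs(sum(a * b for a, b in zip(q_est, q_gt))) / norm
    return math.degrees(2.0 * math.acos(min(1.0, dot)))


def summarize(errors):
    """Median translation and rotation error over (t_err, r_err) pairs."""
    t_errs, r_errs = zip(*errors)
    return median(t_errs), median(r_errs)
```

For example, `rotation_error_deg((1, 0, 0, 0), (1, 0, 0, 0))` is zero, and two positions 3 and 4 units apart along perpendicular axes give a translation error of 5.
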
## Citation
If you find our work helpful, please consider citing:

```bibtex
@inproceedings{
liu2025gscpr,
title={{GS}-{CPR}: Efficient Camera Pose Refinement via 3D Gaussian Splatting},
author={Changkun Liu and Shuai Chen and Yash Sanjay Bhalgat and Siyan HU and Ming Cheng and Zirui Wang and Victor Adrian Prisacariu and Tristan Braud},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=mP7uV59iJM}
}
```
## Star History
If you find our work helpful, please consider starring 🌟 this repo:
[![Star History Chart](https://api.star-history.com/svg?repos=XRIM-Lab/GS-CPR&type=Timeline)](https://www.star-history.com/#XRIM-Lab/GS-CPR&Timeline)

## Acknowledgements
This project is developed based on several fantastic repos: [Scaffold-GS](https://github.com/city-super/Scaffold-GS), [MASt3R](https://github.com/naver/mast3r), [DUSt3R](https://github.com/naver/dust3r), [NeFeS](https://github.com/ActiveVisionLab/NeFeS), [ACE](https://github.com/nianticlabs/ace) and [Depth for 3DGS](https://github.com/leo-frank/diff-gaussian-rasterization-depth). We thank the original authors for their excellent work.