Achieves more than 96% accuracy on the ModelNet40 test dataset with a lightweight custom model pipeline. The 3D point cloud dataset is projected into 2D images (with Gaussian splatting added), and the images are fed to CLIP ViT-B/16.
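The projection step can be illustrated with a small sketch. This is not the repository's implementation: it assumes an orthographic projection along the z axis and draws every point as an isotropic 2D Gaussian, which is a much simpler footprint than full 3D Gaussian splatting. The function name `splat_points` and the parameters `image_size` and `sigma` are hypothetical.

```python
import numpy as np


def splat_points(points, image_size=64, sigma=1.0):
    """Render an (N, 3) point cloud as a 2D image in [0, 1].

    Assumptions (illustrative only): orthographic projection along z,
    one isotropic Gaussian of width `sigma` pixels per point.
    """
    points = np.asarray(points, dtype=np.float64)

    # Center the cloud and scale it into the unit sphere.
    centered = points - points.mean(axis=0)
    scale = np.linalg.norm(centered, axis=1).max()
    if scale > 0:
        centered = centered / scale

    # Drop z and map x, y from [-1, 1] to pixel coordinates.
    px = (centered[:, 0] + 1.0) * 0.5 * (image_size - 1)
    py = (centered[:, 1] + 1.0) * 0.5 * (image_size - 1)

    # Squared distance from every pixel to every point: shape (N, H, W).
    ys, xs = np.mgrid[0:image_size, 0:image_size]
    d2 = (xs[None] - px[:, None, None]) ** 2 + (ys[None] - py[:, None, None]) ** 2

    # Accumulate the Gaussian footprints and normalize the peak to 1.
    image = np.exp(-d2 / (2.0 * sigma ** 2)).sum(axis=0)
    peak = image.max()
    if peak > 0:
        image = image / peak
    return image


# Example: a random cloud rendered to a 32x32 image, then copied to three
# channels because CLIP image encoders take RGB input.
rng = np.random.default_rng(0)
gray = splat_points(rng.normal(size=(128, 3)), image_size=32)
rgb = np.repeat(gray[..., None], 3, axis=-1)
print(gray.shape, rgb.shape)
```

The intermediate array holds one value per point per pixel, so memory grows quickly with `image_size`. For the 224x224 input that CLIP ViT-B/16 expects, it is usually cheaper to render at a lower resolution and resize, or to process the points in chunks.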
- python==3.10
- git clone this repo
- cd 3d-vlm-gaussian-splatting-classification-on-modelnet40
- pip install -r requirements.txt
- python inference.py
- if this is the first time you run the model, set exist_flag = True in config.py
- to run inference on more samples, set max_samples in config.py to the number of samples you want
- the torch entry in requirements.txt is the CPU-only build; change it if you want to use CUDA
- to use the CUDA build, delete torch from requirements.txt and install it with the following command
- pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu<version> (replace <version> with your CUDA version without the dot, e.g. cu118 for CUDA 11.8)
- sample test accuracy is 70%, and overall accuracy on the full test dataset is 95% for this .pth checkpoint
- the model also runs in a CPU-only environment
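For the CUDA install note above, the cu<version> suffix is the CUDA version with the dot removed. The sketch below builds the pip command from a version string. The helper names `torch_index_url` and `pip_command` are hypothetical and not part of this repo, and PyTorch does not publish a wheel index for every CUDA version, so check that the resulting URL exists before relying on it.

```python
def torch_index_url(cuda_version: str) -> str:
    """Build the PyTorch wheel index URL for a CUDA version such as "11.8"."""
    parts = cuda_version.strip().split(".")
    if len(parts) < 2 or not (parts[0].isdigit() and parts[1].isdigit()):
        raise ValueError(f"expected a version like '11.8', got {cuda_version!r}")
    # "11.8" -> "cu118", "12.1" -> "cu121"
    return f"https://download.pytorch.org/whl/cu{parts[0]}{parts[1]}"


def pip_command(cuda_version: str) -> str:
    """Return the install command from the note above for the given CUDA version."""
    return (
        "pip install torch torchvision torchaudio --index-url "
        + torch_index_url(cuda_version)
    )


print(pip_command("11.8"))
# prints: pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```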
Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang and J. Xiao. 3D ShapeNets: A Deep Representation for Volumetric Shapes. Proceedings of the 28th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015). Oral Presentation · 3D Deep Learning Project Webpage


