3D Classification

A collection of classifiers for 3D mesh objects, built on several vision and language models.

Quick Start

1. Install Dependencies

pip install -r requirements.txt

2. Download Models

The classifiers require four pre-trained models. Download them into a models/ directory that sits next to the 3D_classification/ base directory (that is, in its parent folder):

.
├── 3D_classification/        (base directory)
│   ├── README.md
│   ├── ImageClassifier.py
│   ├── VLLMClassifier.py
│   ├── LlamaMeshClassifier.py
│   └── ...
└── models/                   (create this directory)
    ├── convnextv2-large-22k-384/
    ├── Qwen3-VL-8B-Instruct/
    ├── LLaMA-Mesh-model/
    └── clip-vit-large-patch14/
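Before loading any classifier, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name missing_models is hypothetical, and the "../models" path assumes the script is run from inside 3D_classification/.

```python
from pathlib import Path

# Directory names copied from the layout shown above.
REQUIRED_MODELS = (
    "convnextv2-large-22k-384",
    "Qwen3-VL-8B-Instruct",
    "LLaMA-Mesh-model",
    "clip-vit-large-patch14",
)


def missing_models(models_dir):
    """Return the required model directories that are absent from models_dir."""
    root = Path(models_dir)
    return [name for name in REQUIRED_MODELS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Assumption: the current working directory is 3D_classification/.
    missing = missing_models("../models")
    if missing:
        print("Missing model directories:", ", ".join(missing))
    else:
        print("All four model directories found.")
```

This only checks that the directories exist; it does not verify that the model weights inside them are complete.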

Available Classifiers

VLLMClassifier (recommended)

Uses the Qwen3-VL vision-language model for multi-modal classification of rendered mesh views.

from VLLMClassifier import VLLMClassifier

classifier = VLLMClassifier(device="cuda")
label = classifier.classify_one("path/to/mesh.obj")

# Command-line usage
python VLLMClassifier.py --file path/to/mesh.obj
python VLLMClassifier.py --dir path/to/meshes/ --output results.json
python VLLMClassifier.py --dir path/to/meshes/ --num-views 8 --limit 100 --output results.json

CLI arguments (python VLLMClassifier.py ...):

--file (str): Path to a single mesh file to classify. Mutually exclusive with --dir.
--dir (str): Path to a directory of mesh files to classify. Mutually exclusive with --file.
--output (str, optional, default: None): Output filename for batch classifications. Saved under classifications/.
--limit (int, optional, default: None): Limit number of files in batch mode.
--device (str, default: cuda:0): Device to use for inference.
--model (str, default: ../models/Qwen3-VL-8B-Instruct): Path to the VLM model directory.
--num-views (int, default: 12): Number of rendered views per mesh.
--resolution (int, default: 1024): Rendering resolution in pixels.
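For readers who want to wrap or reproduce this interface, the flags listed above could be declared with argparse roughly as follows. This is an illustrative sketch built only from the documented names and defaults, not the actual source of VLLMClassifier.py. In particular, requiring one of --file or --dir is an assumption; the list above only states that they are mutually exclusive.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented CLI flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="Classify 3D meshes with a VLM.")

    # --file and --dir are documented as mutually exclusive.
    # Assumption: exactly one of them must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--file", type=str, help="Single mesh file to classify.")
    source.add_argument("--dir", type=str, help="Directory of mesh files to classify.")

    parser.add_argument("--output", type=str, default=None,
                        help="Output filename for batch classifications.")
    parser.add_argument("--limit", type=int, default=None,
                        help="Limit number of files in batch mode.")
    parser.add_argument("--device", type=str, default="cuda:0",
                        help="Device to use for inference.")
    parser.add_argument("--model", type=str, default="../models/Qwen3-VL-8B-Instruct",
                        help="Path to the VLM model directory.")
    parser.add_argument("--num-views", type=int, default=12,
                        help="Number of rendered views per mesh.")
    parser.add_argument("--resolution", type=int, default=1024,
                        help="Rendering resolution in pixels.")
    return parser


if __name__ == "__main__":
    # Parse a fixed example command line rather than sys.argv.
    args = build_parser().parse_args(["--dir", "path/to/meshes/", "--num-views", "8"])
    print(args)
```

Note that argparse exposes --num-views as args.num_views, replacing the hyphen with an underscore.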

Constructor arguments (VLLMClassifier(...)):

model_name (str, optional): Path to Qwen VL model. Defaults to ../models/Qwen3-VL-8B-Instruct
device (str): Device to use
num_views (int): Number of rendered views
resolution (int): Resolution of rendered views

ImageClassifier

Uses ConvNeXt V2 to classify rendered mesh views.

from ImageClassifier import ImageClassifier

classifier = ImageClassifier(device="cuda")
label = classifier.classify_one("path/to/mesh.obj")

# Or batch classification
results = classifier.classify_batch(
    folder_path="path/to/meshes/",
    save_path="results.json"
)

Arguments:

model_name (str, optional): Path to ConvNeXt model. Defaults to ../models/convnextv2-large-22k-384
device (str): Device to use ("cuda", "cpu", etc.)
num_views (int): Number of rendered views per mesh (default: 12)
resolution (int): Resolution of rendered views (default: 1024)

LlamaMeshClassifier (breaks for complex meshes)

Uses LLaMA-Mesh for direct mesh understanding.

from LlamaMeshClassifier import LlamaMeshClassifier

classifier = LlamaMeshClassifier(device="cuda")
label = classifier.classify_one("path/to/mesh.obj")

results = classifier.classify_batch(
    folder_path="path/to/meshes/",
    save_path="results.json"
)

Arguments:

model_name (str, optional): Path to LLaMA-Mesh model. Defaults to ../models/LLaMA-Mesh-model
device (str): Device to use
max_new_tokens (int): Maximum generation tokens
max_input_tokens (int): Maximum input tokens

LabelProcessor

Computes CLIP embeddings for labels to support evaluation.

from LabelProcessor import LabelProcessor

processor = LabelProcessor()

# Embed two labels, then compare them
embedding1 = processor.compute_embedding("cat")
embedding2 = processor.compute_embedding("kitten")
similarity = processor.compute_similarity(embedding1, embedding2)
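The README does not say which similarity measure compute_similarity uses. Cosine similarity is the usual choice for comparing CLIP embeddings, so the sketch below shows that measure on plain Python lists. It is an assumption about the general technique, not the LabelProcessor implementation, and the function name cosine_similarity is hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy 2D vectors stand in for real CLIP embeddings.
    print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
    print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Under this measure, identical directions score 1.0 and orthogonal directions score 0.0, which is why a threshold such as 0.8 can serve as a cutoff for "close enough" labels.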

Evaluation

Use EvaluationManager to evaluate classifier results:

from EvaluationManager import EvaluationManager

evaluator = EvaluationManager()

# Overall accuracy
accuracy = evaluator.accuracy("predictions.json", similarity_threshold=0.8)

# Per-class accuracy
evaluator.class_accuracy("predictions.json", similarity_threshold=0.8)
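How similarity_threshold is applied is not spelled out here. One plausible reading is that a prediction counts as correct when the similarity between the predicted label and the ground-truth label reaches the threshold. The sketch below illustrates that reading on a list of precomputed similarity scores; the function name thresholded_accuracy and the use of >= (rather than >) are assumptions, not the EvaluationManager implementation.

```python
def thresholded_accuracy(similarities, similarity_threshold=0.8):
    """Fraction of similarity scores that meet or exceed the threshold."""
    if not similarities:
        # No predictions to score.
        return 0.0
    hits = sum(1 for score in similarities if score >= similarity_threshold)
    return hits / len(similarities)


if __name__ == "__main__":
    # Hypothetical scores, one per (predicted label, true label) pair.
    scores = [0.95, 0.81, 0.42, 0.80]
    print(thresholded_accuracy(scores, similarity_threshold=0.8))
```

Raising the threshold makes the metric stricter: near-synonyms that pass at 0.8 may be counted as errors at 0.9.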

Supported File Formats

The classifiers do not currently handle unsupported formats gracefully, so check that each mesh is in one of the formats listed below before classifying it.

Supported out of the box

These formats work with a minimal trimesh install (trimesh + numpy):

glb, gltf
stl
ply
obj
off
dxf (ASCII only, 2D geometry)
xyz (point clouds)

Requires optional dependencies

Some formats need additional packages installed:

Format	Extra Dependencies
`3mf`	`lxml`, `networkx`
`3dxml`	`lxml`, `networkx`, `Pillow`
`dae`, `zae`	`lxml`, `Pillow`, `pycollada`
`step`, `stp`	`cascadio`
`xaml`	`lxml`
`svg`	`svg.path`
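Because unsupported formats are not handled gracefully, it may be worth screening files by extension before passing them to a classifier. The sketch below encodes the two lists above; the helper name format_support and its return values are hypothetical and not part of the repository.

```python
from pathlib import Path

# Extensions taken from the two lists above.
OUT_OF_THE_BOX = {"glb", "gltf", "stl", "ply", "obj", "off", "dxf", "xyz"}
NEEDS_EXTRAS = {"3mf", "3dxml", "dae", "zae", "step", "stp", "xaml", "svg"}


def format_support(path):
    """Return "builtin", "extras", or "unsupported" based on the file extension."""
    ext = Path(path).suffix.lower().lstrip(".")
    if ext in OUT_OF_THE_BOX:
        return "builtin"
    if ext in NEEDS_EXTRAS:
        return "extras"
    return "unsupported"


if __name__ == "__main__":
    for name in ["chair.obj", "part.step", "scene.blend"]:
        print(name, "->", format_support(name))
```

The check looks only at the extension. It does not confirm that the optional dependencies are installed or that the file contents are valid.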
