SecretLabOU/3D_classification

3D Classification

A collection of classifiers for 3D mesh objects, built on several vision and language models.

Quick Start

1. Install Dependencies

pip install -r requirements.txt

2. Download Models

The classifiers require four pre-trained models. Download them into a models/ directory that sits next to the repository (in the repository's parent folder), so that the default ../models/... paths resolve:

.
├── 3D_classification/        (base directory)
│   ├── README.md
│   ├── ImageClassifier.py
│   ├── VLLMClassifier.py
│   ├── LlamaMeshClassifier.py
│   └── ...
└── models/                   (create this directory)
    ├── convnextv2-large-22k-384/
    ├── Qwen3-VL-8B-Instruct/
    ├── LLaMA-Mesh-model/
    └── clip-vit-large-patch14/
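
Before running any classifier it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name missing_models is hypothetical, and the only facts it relies on are the four directory names shown in the tree.

```python
from pathlib import Path

# Directory names taken from the layout above.
REQUIRED_MODELS = (
    "convnextv2-large-22k-384",
    "Qwen3-VL-8B-Instruct",
    "LLaMA-Mesh-model",
    "clip-vit-large-patch14",
)


def missing_models(models_dir):
    """Return the required model folders that are absent from models_dir.

    Hypothetical helper: it only checks that each folder exists, not that
    the weights inside it are complete.
    """
    root = Path(models_dir)
    return [name for name in REQUIRED_MODELS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Run from inside 3D_classification/, so ../models is the sibling folder.
    missing = missing_models("../models")
    if missing:
        print("Missing model directories:", ", ".join(missing))
    else:
        print("All four model directories found.")
```

An empty list means every expected folder is present.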

Available Classifiers

VLLMClassifier (recommended)

Uses the Qwen3-VL vision-language model for multi-modal classification.

from VLLMClassifier import VLLMClassifier

classifier = VLLMClassifier(device="cuda")
label = classifier.classify_one("path/to/mesh.obj")

Command-line usage:

python VLLMClassifier.py --file path/to/mesh.obj
python VLLMClassifier.py --dir path/to/meshes/ --output results.json
python VLLMClassifier.py --dir path/to/meshes/ --num-views 8 --limit 100 --output results.json

CLI arguments (python VLLMClassifier.py ...):

  • --file (str): Path to a single mesh file to classify. Mutually exclusive with --dir.
  • --dir (str): Path to a directory of mesh files to classify. Mutually exclusive with --file.
  • --output (str, optional, default: None): Output filename for batch classifications. Saved under classifications/.
  • --limit (int, optional, default: None): Limit number of files in batch mode.
  • --device (str, default: cuda:0): Device to use for inference.
  • --model (str, default: ../models/Qwen3-VL-8B-Instruct): Path to the VLM model directory.
  • --num-views (int, default: 12): Number of rendered views per mesh.
  • --resolution (int, default: 1024): Rendering resolution in pixels.
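
For readers who want to wrap or reproduce this interface, the flags above could be declared with argparse roughly as follows. This is a sketch built only from the list above, not the parser in VLLMClassifier.py: the function name build_parser is hypothetical, and making one of --file / --dir mandatory is an assumption.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(description="Classify 3D meshes with a VLM.")

    # --file and --dir are documented as mutually exclusive.
    # Assumption: exactly one of them has to be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--file", type=str, help="Single mesh file to classify.")
    source.add_argument("--dir", type=str, help="Directory of mesh files to classify.")

    parser.add_argument("--output", type=str, default=None,
                        help="Output filename for batch classifications.")
    parser.add_argument("--limit", type=int, default=None,
                        help="Limit number of files in batch mode.")
    parser.add_argument("--device", type=str, default="cuda:0",
                        help="Device to use for inference.")
    parser.add_argument("--model", type=str, default="../models/Qwen3-VL-8B-Instruct",
                        help="Path to the VLM model directory.")
    parser.add_argument("--num-views", type=int, default=12,
                        help="Number of rendered views per mesh.")
    parser.add_argument("--resolution", type=int, default=1024,
                        help="Rendering resolution in pixels.")
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list so the sketch runs without a real CLI call.
    args = build_parser().parse_args(["--dir", "path/to/meshes/", "--num-views", "8"])
    print(args.dir, args.num_views, args.resolution)
```

argparse exposes --num-views as args.num_views, and any flag not passed keeps its documented default.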

Constructor arguments:

  • model_name (str, optional): Path to Qwen VL model. Defaults to ../models/Qwen3-VL-8B-Instruct
  • device (str): Device to use
  • num_views (int): Number of rendered views
  • resolution (int): Resolution of rendered views

ImageClassifier

Uses ConvNeXt V2 to classify rendered mesh views.

from ImageClassifier import ImageClassifier

classifier = ImageClassifier(device="cuda")
label = classifier.classify_one("path/to/mesh.obj")

# Or batch classification
results = classifier.classify_batch(
    folder_path="path/to/meshes/",
    save_path="results.json"
)

Constructor arguments:

  • model_name (str, optional): Path to ConvNeXt model. Defaults to ../models/convnextv2-large-22k-384
  • device (str): Device to use ("cuda", "cpu", etc.)
  • num_views (int): Number of rendered views per mesh (default: 12)
  • resolution (int): Resolution of rendered views (default: 1024)

LlamaMeshClassifier (breaks for complex meshes)

Uses LLaMA-Mesh for direct mesh understanding.

from LlamaMeshClassifier import LlamaMeshClassifier

classifier = LlamaMeshClassifier(device="cuda")
label = classifier.classify_one("path/to/mesh.obj")

results = classifier.classify_batch(
    folder_path="path/to/meshes/",
    save_path="results.json"
)

Constructor arguments:

  • model_name (str, optional): Path to LLaMA-Mesh model. Defaults to ../models/LLaMA-Mesh-model
  • device (str): Device to use
  • max_new_tokens (int): Maximum generation tokens
  • max_input_tokens (int): Maximum input tokens

LabelProcessor

Computes CLIP embeddings for labels to support evaluation.

from LabelProcessor import LabelProcessor

processor = LabelProcessor()

# Embed two labels, then compare them
cat_embedding = processor.compute_embedding("cat")
dog_embedding = processor.compute_embedding("dog")
similarity = processor.compute_similarity(cat_embedding, dog_embedding)
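
Embedding similarity of this kind is commonly measured as cosine similarity. Whether compute_similarity uses exactly this measure is an assumption; the sketch below only illustrates the idea on plain Python lists, and the function name cosine_similarity is not taken from the repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy 3-dimensional vectors standing in for real label embeddings.
    print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # parallel vectors
    print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # orthogonal vectors
```

Parallel vectors score close to 1 and orthogonal vectors score 0, which is why a threshold near 1 accepts only labels with nearly identical embeddings.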

Evaluation

Use EvaluationManager to evaluate classifier results:

from EvaluationManager import EvaluationManager

evaluator = EvaluationManager()

# Overall accuracy
accuracy = evaluator.accuracy("predictions.json", similarity_threshold=0.8)

# Per-class accuracy
evaluator.class_accuracy("predictions.json", similarity_threshold=0.8)
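
A similarity_threshold suggests that a prediction counts as correct when its label is similar enough to the expected label, rather than only on an exact string match. The sketch below illustrates that reading under stated assumptions: the (predicted, expected) pair format, the function names, and toy_similarity are all hypothetical, and the layout of predictions.json is not modelled here.

```python
def threshold_accuracy(pairs, similarity, threshold=0.8):
    """Fraction of (predicted, expected) pairs whose similarity meets the threshold."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = sum(
        1 for predicted, expected in pairs
        if similarity(predicted, expected) >= threshold
    )
    return hits / len(pairs)


def per_class_accuracy(pairs, similarity, threshold=0.8):
    """The same measure, grouped by the expected label."""
    grouped = {}
    for predicted, expected in pairs:
        grouped.setdefault(expected, []).append((predicted, expected))
    return {
        label: threshold_accuracy(items, similarity, threshold)
        for label, items in grouped.items()
    }


# Stand-in for an embedding similarity: 1.0 for identical labels,
# 0.9 for hand-picked near-synonyms, 0.0 otherwise.
NEAR_SYNONYMS = {frozenset(("kitten", "cat"))}


def toy_similarity(a, b):
    if a == b:
        return 1.0
    return 0.9 if frozenset((a, b)) in NEAR_SYNONYMS else 0.0


if __name__ == "__main__":
    pairs = [("cat", "cat"), ("kitten", "cat"), ("dog", "cat"), ("car", "car")]
    print(threshold_accuracy(pairs, toy_similarity, threshold=0.8))   # 0.75
    print(threshold_accuracy(pairs, toy_similarity, threshold=0.95))  # 0.5
    print(per_class_accuracy(pairs, toy_similarity, threshold=0.8))
```

Raising the threshold makes the measure stricter: at 0.8 the near-synonym "kitten" counts as a hit for "cat", at 0.95 it does not.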

Supported File Formats

The classifiers do not currently handle unsupported formats gracefully, so only pass in files in one of the formats listed below.

Supported out of the box

These formats work with a minimal trimesh install (trimesh + numpy):

  • glb, gltf
  • stl
  • ply
  • obj
  • off
  • dxf (ASCII only, 2D geometry)
  • xyz (point clouds)

Requires optional dependencies

Some formats need additional packages installed:

Format       Extra dependencies
-----------  ------------------------
3mf          lxml, networkx
3dxml        lxml, networkx, Pillow
dae, zae     lxml, Pillow, pycollada
step, stp    cascadio
xaml         lxml
svg          svg.path
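
Because unsupported formats are not handled gracefully, one option is to filter files by extension before handing them to a classifier. This is a minimal sketch using only the format lists above; the names split_by_support, BUILTIN_FORMATS and OPTIONAL_FORMATS are hypothetical, and a matching extension does not guarantee that the file loads (a binary dxf, for example, would still fail).

```python
from pathlib import Path

# Extensions copied from the two lists above.
BUILTIN_FORMATS = {"glb", "gltf", "stl", "ply", "obj", "off", "dxf", "xyz"}
OPTIONAL_FORMATS = {"3mf", "3dxml", "dae", "zae", "step", "stp", "xaml", "svg"}


def split_by_support(paths, allow_optional=False):
    """Split paths into (supported, skipped) based on file extension only.

    Set allow_optional=True once the extra dependencies are installed.
    """
    allowed = (BUILTIN_FORMATS | OPTIONAL_FORMATS) if allow_optional else BUILTIN_FORMATS
    supported, skipped = [], []
    for path in map(Path, paths):
        ext = path.suffix.lower().lstrip(".")
        if ext in allowed:
            supported.append(path)
        else:
            skipped.append(path)
    return supported, skipped


if __name__ == "__main__":
    ok, skipped = split_by_support(["chair.OBJ", "lamp.fbx", "gear.step"])
    print("classify:", [p.name for p in ok])
    print("skip:", [p.name for p in skipped])
```

Only the paths in the first list would then be passed on to classify_one or a batch run.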
