
NTD 2026 || Joint 2D-3D Self-Supervised Learning || SRKR Engineering College, Bhimavaram

Link: https://fusion-model.vercel.app/
GitHub: https://github.com/Charanyedida/fusion-model

Our project, “Joint 2D-3D Self-Supervised Learning for 3D Perception (Spatial-SynergyNet),” builds a computer vision system that segments and classifies 3D indoor environments without relying on large, manually labeled 3D datasets.
The system is built with Python, PyTorch, PyVista, Open3D, and Hugging Face transformer architectures. Indoor point cloud scenes from the ScanNet dataset are passed through a data augmentation pipeline that includes Z-axis rotation and geometric jitter. From each scene, geometric and visual features, namely spatial coordinates (XYZ) and color values (RGB), are extracted, resampled to a consistent point density, and passed to the model.
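The augmentation and resampling steps described above can be sketched as follows. This is a minimal illustration in plain Python so that it stays dependency-free, not the project's own code: the function names, the jitter parameters `sigma` and `clip`, and the target of 4096 points per scene are all assumptions that the description does not specify.

```python
import math
import random


def rotate_z(points, angle):
    """Rotate (x, y, z, r, g, b) points about the Z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z, r, g, b)
            for x, y, z, r, g, b in points]


def jitter(points, sigma=0.01, clip=0.05, rng=random):
    """Add clipped Gaussian noise to XYZ only; colors are left unchanged.

    sigma and clip are assumed values, not taken from the project.
    """
    def noise():
        return max(-clip, min(clip, rng.gauss(0.0, sigma)))

    return [(x + noise(), y + noise(), z + noise(), r, g, b)
            for x, y, z, r, g, b in points]


def resample(points, n, rng=random):
    """Return exactly n points: subsample if too many, pad by repeating if too few."""
    if len(points) >= n:
        return rng.sample(points, n)
    return list(points) + [rng.choice(points) for _ in range(n - len(points))]


def augment(points, n=4096, rng=random):
    """Random Z rotation, then jitter, then resampling to a fixed point count."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return resample(jitter(rotate_z(points, angle), rng=rng), n, rng)
```

Rotating only about the Z axis keeps floors and walls upright, which is a common choice for indoor scans where gravity defines a fixed vertical direction.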
A joint teacher-student model is trained to fuse 2D semantic knowledge with 3D spatial geometry. A frozen 2D Vision Transformer (DINOv2) acts as the semantic teacher, while a 3D PointNet student backbone learns the geometric structure through a Cross-Modal Synergy mechanism based on cosine similarity between student and teacher features. Once this geometric foundation is established, a Multi-Layer Perceptron (MLP) task head with LayerNorm is trained with a dynamic Focal Loss to classify objects, including those with complex boundaries. Because Focal Loss down-weights examples the model already classifies confidently, training cannot succeed by predicting only the dominant classes, and the network is pushed to recognize challenging minority classes such as individual furniture pieces.
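The two training signals described above, cosine-similarity alignment and Focal Loss, can be illustrated with a short sketch. It is written in plain Python rather than PyTorch to stay self-contained, and it rests on assumptions: the alignment term is taken to be one minus the cosine similarity, the focusing parameter `gamma` defaults to 2.0, and no per-class weighting is applied. The description does not state which exact forms the project uses.

```python
import math


def cosine_alignment_loss(student_feats, teacher_feats, eps=1e-8):
    """Mean of (1 - cosine similarity) over paired student/teacher vectors."""
    total = 0.0
    for s, t in zip(student_feats, teacher_feats):
        dot = sum(a * b for a, b in zip(s, t))
        norm = math.sqrt(sum(a * a for a in s)) * math.sqrt(sum(b * b for b in t))
        total += 1.0 - dot / max(norm, eps)
    return total / len(student_feats)


def softmax(logits):
    """Numerically stable softmax over one list of class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    z = sum(exps)
    return [e / z for e in exps]


def focal_loss(logits, targets, gamma=2.0):
    """Mean focal loss: -(1 - p_t)^gamma * log(p_t).

    p_t is the predicted probability of the true class. gamma=2.0 is an
    assumed default; gamma=0 reduces this to ordinary cross-entropy.
    """
    total = 0.0
    for row, y in zip(logits, targets):
        p_t = max(softmax(row)[y], 1e-12)  # guard against log(0)
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(targets)
```

With `gamma` greater than zero, a confidently correct prediction contributes almost nothing to the loss, so the gradient is dominated by hard or rare examples.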
The project aims to improve 3D spatial perception, reduce dependency on costly human-annotated data, and mitigate the memory and class-imbalance bottlenecks common in deep learning. By combining self-supervised training, the network optimizations described above, and a React-powered interactive split-screen UI, the system offers an automated and scalable approach to 3D scene understanding, with autonomous navigation as a target application.
Submission from SRKR Engineering College, Department of Computer Science & Engineering.

Video “NTD 2026 || Joint 2D-3D Self-Supervised Learning || SRKR Engineering College, Bhimavaram” from the channel Charan Yedida