Precision beats power. Timing beats speed.
AI Geeks is a research group of enthusiastic researchers who are passionate about cutting-edge computer vision and machine learning. The group's research spans multiple disciplines, including AI for Health, Multimodal LLM Agents, 3D Generative Modeling, and Robotics, and its papers have been accepted at prestigious international conferences and journals. If you're interested in joining us, please fill out this form, or contact us with any questions.
(07/30/2024) 🎉 Our paper XLIP has been highlighted by CVer!
(07/19/2024) 🎉 Our paper Motion Avatar has been accepted to BMVC 2024!
(06/18/2024) 🎉 Our paper JointViT has been selected for oral presentation at MIUA 2024!
(05/23/2024) 🎉 Our paper Motion Avatar has been highlighted by AI Bites!
(05/22/2024) 🎉 Our paper Motion Avatar has been highlighted by Language Model Digest!
(05/21/2024) 🎉 Our paper Motion Avatar has been highlighted by CSVisionPapers!
(05/14/2024) 🎉 Our paper JointViT has been accepted to MIUA 2024!
(03/02/2024) 🎉 Our paper A Deep Learning Approach to Diabetes Diagnosis has been accepted to ACIIDS 2024!
(02/10/2024) 🎉 Our paper SegReg has been accepted to ISBI 2024!
(11/16/2023) 🎉 Our paper SegReg has been highlighted by CVer!
(09/29/2023) 🎉 Our paper BHSD has been accepted to MLMI 2023!
MedDet: Generative Adversarial Distillation for Efficient Cervical Disc Herniation Detection
Cervical disc herniation (CDH) is a common disorder that requires expert analysis. Current automated detection methods face two challenges: high computational demands and MRI noise. We propose MedDet for efficient detection, leveraging knowledge distillation, generative adversarial training, and nmODE2. Our model improves mAP by 5%, reduces parameters by 67.8%, and speeds inference fivefold.
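As a rough illustration of the knowledge distillation idea mentioned above, the sketch below computes a generic soft-target distillation loss: the KL divergence between temperature-softened teacher and student class distributions. This is a textbook formulation and not MedDet's actual objective; the function names, the temperature value, and the single-teacher setup are assumptions made for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Hypothetical helper for illustration only; a real detector would add
    task losses (and, in MedDet's case, adversarial terms) on top.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0
    )
    return temperature ** 2 * kl


if __name__ == "__main__":
    # The loss is zero when the student matches the teacher and positive otherwise.
    print(distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
    print(distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0]))
```

A higher temperature flattens both distributions, so the student also learns from the teacher's relative ranking of the less likely classes.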
SegStitch: Multidimensional Transformer for Robust and Efficient Medical Imaging Segmentation
Medical imaging segmentation, crucial for lesion analysis, has advanced with the use of transformers for 3D segmentation. Despite their scalability, transformers struggle to capture local features and carry high computational complexity. We propose SegStitch, which combines transformers with denoising ODE blocks, improving mDSC by up to 11.48% and reducing parameters by 36.7%, which suggests potential for adaptation to real-world clinical practice.
XLIP: Cross-modal Attention Masked Modelling for Medical Language-Image Pre-Training
Vision-and-language pretraining (VLP) in the medical field faces two challenges: reconstructing key features from limited data, and making use of both paired and unpaired data. Our proposed XLIP framework improves learning by introducing attention-masked image modeling (AttMIM) and entity-driven masked language modeling (EntMLM), which enhance the learned medical features. XLIP achieves state-of-the-art performance in classification tasks.
Motion Avatar: Generate Human and Animal Avatars with Arbitrary Motion
In recent years, creating 3D avatars and motions has garnered significant interest due to applications in film, video games, AR/VR, and human-robot interaction. Current efforts focus on either generating 3D avatars or motion sequences, with integration remaining a challenge. Moreover, extending these techniques to animals is difficult due to inadequate training data. Our paper addresses these gaps with three key contributions. First, we introduce Motion Avatar, an agent-based approach for generating high-quality human and animal avatars with motions from text queries. Second, we present an LLM planner that coordinates motion and avatar generation in a customizable Q&A format. Finally, we offer Zoo-300K, an animal motion dataset with 300,000 text-motion pairs across 65 animal categories, created using our ZooGen pipeline.
JointViT: Modeling Oxygen Saturation Levels with Joint Supervision on Long-Tailed OCTA
The oxygen saturation level in the blood (SaO2) is crucial for health, especially regarding sleep-related breathing disorders. However, continuous SaO2 monitoring is time-consuming, and its results vary with patient conditions. Optical coherence tomography angiography (OCTA) has recently shown promise in rapidly screening eye-related lesions, potentially aiding in diagnosing sleep-related disorders. Our paper presents three contributions. First, we propose JointViT, a Vision Transformer-based model with a joint loss function for supervision. Second, we introduce a balancing augmentation technique to improve performance on long-tail distributions within the OCTA dataset. Finally, our method significantly outperforms state-of-the-art methods, achieving up to 12.28% improvement in accuracy. This advancement paves the way for using OCTA in diagnosing sleep-related disorders.
SegReg: Segmenting OARs by Registering MR Images and CT Annotations
Organ at risk (OAR) segmentation is vital in radiotherapy treatment planning, especially for head and neck tumors. However, radiation oncologists often manually segment OARs on CT scans, a time-consuming and costly process that limits patient access to timely radiotherapy. While MRI provides better soft-tissue contrast, its lengthy process is impractical for real-time planning. To address this, we propose SegReg, a method using Elastic Symmetric Normalization to register MRI for OAR segmentation. SegReg outperforms the CT-only baseline by 16.78% in mDSC and 18.77% in mIoU, combining CT's geometric accuracy with MRI's superior contrast for accurate automated OAR segmentation.
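The mDSC and mIoU figures quoted above are the mean Dice similarity coefficient and the mean intersection-over-union. The following is a minimal sketch of how these overlap metrics are commonly computed for binary masks, averaged over classes. The function names, the flat-list mask representation, and the convention for two empty masks are assumptions for illustration, not taken from the SegReg code.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equally sized binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_sum = sum(1 for p in pred if p)
    target_sum = sum(1 for t in target if t)
    union = pred_sum + target_sum - intersection
    if union == 0:
        # Both masks are empty: treated here as perfect agreement (assumed convention).
        return 1.0, 1.0
    dice = 2 * intersection / (pred_sum + target_sum)
    iou = intersection / union
    return dice, iou


def mean_scores(preds, targets):
    """Average Dice and IoU over per-class mask pairs, giving (mDSC, mIoU)."""
    scores = [dice_and_iou(p, t) for p, t in zip(preds, targets)]
    n = len(scores)
    mean_dice = sum(d for d, _ in scores) / n
    mean_iou = sum(i for _, i in scores) / n
    return mean_dice, mean_iou


if __name__ == "__main__":
    # One overlapping voxel out of two predicted and two annotated voxels.
    print(dice_and_iou([1, 1, 0, 0], [1, 0, 1, 0]))
```

For the same pair of masks, Dice is never lower than IoU, which is why the two numbers are usually reported side by side rather than compared with each other.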
BHSD: A 3D Multi-class Brain Hemorrhage Segmentation Dataset
Intracranial hemorrhage (ICH) is a pathological condition involving bleeding inside the skull or brain, which can have various causes. Accurate identification, localization, and quantification of ICH are crucial for clinical outcomes. Deep learning techniques are often used in medical image segmentation, but current public ICH datasets do not support multi-class segmentation. To address this, we developed the Brain Hemorrhage Segmentation Dataset (BHSD), a 3D multi-class ICH dataset with 192 volumes having pixel-level annotations and 2200 volumes with slice-level annotations across five ICH categories. We also provide benchmarks using state-of-the-art models for supervised and semi-supervised segmentation tasks on this dataset.
A Deep Learning Approach to Diabetes Diagnosis
Diabetes, caused by inadequate insulin production or utilization, causes extensive harm. Existing diagnostics are often invasive and costly. Current machine learning models like Classwise k Nearest Neighbor (CkNN) and General Regression Neural Network (GRNN) struggle with imbalanced data. We propose a non-invasive diagnosis method using a Back Propagation Neural Network (BPNN) with batch normalization and data re-sampling for class balancing. This approach improves accuracy, sensitivity, and specificity. Experimental results show 89.81% accuracy on the Pima diabetes dataset, 75.49% on the CDC BRFSS2015 dataset, and 95.28% on the Mesra Diabetes dataset, highlighting the potential of advanced deep learning models for robust diagnosis.
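Accuracy, sensitivity, and specificity, the three metrics named above, all derive from the confusion matrix of a binary classifier. The sketch below shows the standard definitions on made-up labels; the function name and the example data are illustrative assumptions and do not come from the paper or its datasets.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, and specificity for 0/1 labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        # Sensitivity (recall): share of actual positives that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of actual negatives correctly ruled out.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical labels: 3 positive and 5 negative cases.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
    print(binary_metrics(y_true, y_pred))
```

On imbalanced data, accuracy alone can look strong while sensitivity stays low, which is one reason to report all three values together.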