Under review
Vision-Language
Blind to Position, Biased in Language: Failure of Vision-Language Models on Spatial Reasoning
Na-min An*, Inha Kang*, Minhyun Lee, H Shim
Investigates why VLMs fail at spatial reasoning, revealing positional blindness and linguistic bias, and proposes a diagnostic benchmark with a training-free intervention.
Spatial Reasoning
Positional Bias
Training-Free
ICLR 2026
Vision-Language
What “Not” to Detect: Negation-Aware VLMs via Structured Reasoning and Token Merging
Inha Kang, Y Lim, S Lee, J Choi, J Choe, H Shim
Tackles the affirmative bias of VLMs when they encounter negation, combining a token-merging module with a reasoning-aware data pipeline.
Negation Understanding
Reasoning
Data-centric AI
CVPR 2026
Forecasting
Real-Time Long Horizon Air Quality Forecasting via Group-Relative Policy Optimization
Inha Kang, E Kim, W Ryu, J Shin, S Yu, YH Kang, S Jeong, E Kim, S Kim, H Shim
Uses GRPO with asymmetric rewards to reduce false alarms and enable reliable five-day air quality forecasts for East Asia.
GRPO
Foundation Model Fine-Tuning
Public Health
EMNLP 2025
3D VLM
3D-Aware Vision-Language Models Fine-Tuning with Geometric Distillation
S Lee*, J Choi*, Inha Kang, J Kim, J Park, H Shim
Transfers structural knowledge from 3D models into 2D VLMs to improve zero-shot 3D understanding without massive 3D datasets.
Knowledge Distillation
3D Spatial Understanding
Zero-Shot
MICCAI 2022
Medical AI
Joint Embedding of 2D and 3D Networks for Medical Image Anomaly Detection
Inha Kang, J Park
Winning solution for the MICCAI MOOD Challenge, combining local 2D detail and global 3D context for robust anomaly detection.
Challenge Winner
2D/3D Fusion
OOD Detection