Publications and research outputs

Vision Language Model, Multimodal

Selected papers across vision-language models, multimodal learning, medical AI, robust perception, forecasting, and explainable systems.

11 Papers and preprints
7 First or co-first author
5 Research themes

2026

3 papers
Blind to Position 2026
Under review, Co-first author

Blind to Position, Biased in Language: Failure of Vision-Language Models on Spatial Reasoning

Na-min An*, Inha Kang*, Minhyun Lee, H Shim

Investigates why VLMs fail at spatial reasoning, revealing positional blindness and linguistic bias, and proposes a diagnostic benchmark with a training-free intervention.

Spatial Reasoning, Positional Bias, Training-Free
What Not to Detect 2026
ICLR 2026 First author

What “Not” to Detect: Negation-Aware VLMs via Structured Reasoning and Token Merging

Inha Kang, Y Lim, S Lee, J Choi, J Choe, H Shim

Identifies the affirmative bias of VLMs when processing negation and improves detection accuracy with a token-merging module and a reasoning-aware data pipeline.

Negation Understanding, Reasoning, Data-centric AI
Air Quality Forecasting 2026
CVPR 2026 First author

Real-Time Long Horizon Air Quality Forecasting via Group-Relative Policy Optimization

Inha Kang, E Kim, W Ryu, J Shin, S Yu, YH Kang, S Jeong, E Kim, S Kim, H Shim

Applies GRPO with asymmetric rewards to reduce false alarms and deliver reliable five-day air quality forecasts for East Asia.

Forecasting, GRPO, Foundation Model Fine-Tuning

2025

3 papers
Robust LiDAR Segmentation 2025
CVPR 2025 Robust perception

No Thing, Nothing: Highlighting Safety-Critical Classes for Robust LiDAR Semantic Segmentation in Adverse Weather

J Park, H Lee, Inha Kang, H Shim

Addresses safety-critical failures in weather-degraded LiDAR by emphasizing object geometry through physics-inspired augmentation.

LiDAR Segmentation, Autonomous Driving, Domain Generalization
3D-Aware Vision-Language Models 2025
EMNLP Findings 2025 Multimodal

3D-Aware Vision-Language Models Fine-Tuning with Geometric Distillation

S Lee*, J Choi*, Inha Kang, J Kim, J Park, H Shim

Transfers structural knowledge from 3D models into 2D VLMs, improving zero-shot 3D classification without requiring large 3D datasets.

3D Spatial Understanding, Knowledge Distillation, Zero-Shot
CoPatch 2025
Under review (2025), Preprint

CoPatch: Zero-Shot Referring Image Segmentation by Leveraging Untapped Spatial Knowledge in CLIP

NM An, Inha Kang, M Lee, H Shim

Unlocks intrinsic spatial correlations in CLIP to improve training-free zero-shot referring image segmentation on RefCOCO benchmarks.

Training-Free, Spatial Reasoning, Referring Segmentation

2023

2 papers
Why is the winner the best? 2023
CVPR 2023 Meta-research

Why is the winner the best?

M Eisenmann, A Reinke, V Weru, …, Inha Kang, et al.

Studies the reliability of ranking systems in biomedical AI challenges and shows how unstable metrics can misrepresent performance differences.

Benchmarking, Ranking Stability, Statistical Significance
PM2.5 Interactions 2023
J. Hazardous Materials 2023 Co-first author

Three-Dimensional Label-Free Visualization of the Interactions of PM2.5 with Macrophages and Epithelial Cells Using Optical Diffraction Tomography

WS Lee*, Inha Kang*, SJ Yoon, et al.

Uses optical diffraction tomography for 3D, label-free visualization of fine dust uptake, enabling quantitative analysis without phototoxicity.

Optical Diffraction Tomography, 3D Bio-imaging, Quantitative Phase Imaging

2022

2 papers
Joint embedding of 2D and 3D networks 2022
MICCAI 2022 1st place challenge

Joint Embedding of 2D and 3D Networks for Medical Image Anomaly Detection

Inha Kang, J Park

Combines 2D texture cues and 3D volumetric context to detect subtle anomalies in brain MRI and abdominal CT, winning the MICCAI MOOD Challenge.

Challenge Winner, 2D/3D Fusion, OOD Detection
Vertebra CT Segmentation 2022
KTCP 2022 First author

End-to-End Vertebra CT Image Segmentation Network with the 3D Surface-Enhanced Module and the Trainable Preprocessing Method

Inha Kang, JH Cho, J Park

Integrates trainable preprocessing and 3D surface enhancement into an end-to-end segmentation network for vertebra CT analysis.

Segmentation, Trainable Preprocessing, 3D Surface Enhancement

2021

1 paper
Pseudoanomaly generation 2021
MICCAI 2021 Co-first author

Self-Supervised 3D Out-of-Distribution Detection via Pseudoanomaly Generation

JH Cho*, Inha Kang*, J Park

Introduces pseudoanomaly generation so 3D anomaly detectors can learn from normal data only, winning the MICCAI MOOD Challenge.

Self-Supervised Learning, 3D Anomaly Detection, Challenge Winner