SangEun
Publications Projects Posts CV

vision-language 9

  • Vision-Flan: Scaling Human-Labeled Tasks in Visual Instruction Tuning Mar 30, 2026
  • InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning Mar 30, 2026
  • [MiniGPT-4] Enhancing Vision-Language Understanding with Advanced Large Language Models Feb 22, 2026
  • Flamingo: a Visual Language Model for Few-Shot Learning Feb 22, 2026
  • BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models Feb 22, 2026
  • Improved Baselines with Visual Instruction Tuning (LLaVA-1.5) Feb 10, 2026
  • Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens Jan 31, 2026
  • Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning Jan 5, 2026
  • LLaVA: Large Language and Vision Assistant Aug 29, 2025

Recently Updated

  • GenRecon: Bridging Generative Priors for Multi-View 3D Scene Reconstruction
  • EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning
  • LLaVA: Large Language and Vision Assistant
  • Why We Feel: Breaking Boundaries in Emotional Reasoning with Multimodal Large Language Models
  • Hunyuan3D 2.1: From Images to High-Fidelity 3D Assets with Production-Ready PBR Material

Trending Tags

long mllm vision-language 3d 3d-generation short emotion-mllm

© 2026 SangEun Lee. Some rights reserved.

Using the Chirpy theme for Jekyll.
