Jinheon Baek (백진헌) (jinheon.baek [at] kaist [dot] ac [dot] kr), and here is my CV (Curriculum Vitae)
I'm a Ph.D. student in the Graduate School of AI at KAIST (MLAI Lab), where I am fortunate to be advised by Prof. Sung Ju Hwang. Before that, I received an M.S. degree in Artificial Intelligence from KAIST in 2022. Prior to studying at KAIST, I received my B.S. degree in Computer Science and Engineering from Korea University in 2020, where I studied machine learning under the guidance of Prof. Jaewoo Kang.
Currently, I am a research intern at Google Gemini. Previously, I worked as a research intern at IBM Research in 2024, a research intern at Microsoft Research in 2023, and an applied scientist II intern at Alexa AI, Amazon in 2022, collaborating with wonderful mentors.
My primary research interest lies in machine learning for languages, knowledge, and their intersections at scale. Previous work includes modeling interconnected data structures (texts and graphs) and retrieving them to augment language models for practical natural language applications. If interested, you can refer to my Research Statement: Augmenting Large Language Models with External Knowledge.
Publications
(* denotes equal contribution)
- ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models
  Jinheon Baek, Sujay Kumar Jauhar, Silviu Cucerzan, and Sung Ju Hwang
  arXiv preprint
- Unified Multi-Modal Interleaved Document Representation for Information Retrieval
  Jaewoo Lee*, Joonho Ko*, Jinheon Baek*, Soyeong Jeong, and Sung Ju Hwang
  arXiv preprint
- Database-Augmented Query Representation for Information Retrieval
  Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, and Jong C. Park
  arXiv preprint
- The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models
  Seungone Kim, Juyoung Suk, Ji Yong Cho, Shayne Longpre, Chaeeun Kim, Dongkeun Yoon, Guijin Son, Yejin Cho, Sheikh Shafayat, Jinheon Baek, …, Graham Neubig, Moontae Lee, Kyungjae Lee, and Minjoon Seo
  arXiv preprint
- CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark
  David Romero, Chenyang Lyu, Haryo Akbarianto Wibowo, Teresa Lynn, Injy Hamed, Aditya Nanda Kishore, …, Jinheon Baek, …, Soyeong Jeong, …, Thamar Solorio, and Alham Fikri Aji
  Conference on Neural Information Processing Systems (NeurIPS), 2024 (Oral Presentation)
- Retrieval-Augmented Data Augmentation for Low-Resource Domain Tasks
  Minju Seo*, Jinheon Baek*, James Thorne, and Sung Ju Hwang
  Adaptive Foundation Models Workshop at NeurIPS (AFM @ NeurIPS), 2024
- An Empirical Study of Multilingual Reasoning Distillation for Question Answering
  Patomporn Payoungkhamdee, Peerat Limkonchotiwat, Jinheon Baek, Potsawee Manakul, Can Udomcharoenchaikit, Ekapol Chuangsuwanich, and Sarana Nutanong
  Empirical Methods in Natural Language Processing (EMNLP), 2024
- Rethinking Code Refinement: Learning to Judge Code Efficiency
  Minju Seo, Jinheon Baek, and Sung Ju Hwang
  Findings of Empirical Methods in Natural Language Processing (Findings of EMNLP), 2024
- Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity
  Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, and Jong C. Park
  Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024
- Knowledge-Augmented Large Language Models for Personalized Contextual Query Suggestion
  Jinheon Baek, Nirupama Chandrasekaran, Silviu Cucerzan, Allen Herring, and Sujay Kumar Jauhar
  The Web Conference (WWW), 2024
- Knowledge-Augmented Language Model Verification
  Jinheon Baek, Soyeong Jeong, Minki Kang, Jong C. Park, and Sung Ju Hwang
  Empirical Methods in Natural Language Processing (EMNLP), 2023
- Test-Time Self-Adaptive Small Language Models for Question Answering
  Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, and Jong C. Park
  Findings of Empirical Methods in Natural Language Processing (Findings of EMNLP), 2023
- Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks
  Minki Kang, Seanie Lee, Jinheon Baek, Kenji Kawaguchi, and Sung Ju Hwang
  Conference on Neural Information Processing Systems (NeurIPS), 2023
- Direct Fact Retrieval from Knowledge Graphs without Entity Linking
  Jinheon Baek, Alham Fikri Aji, Jens Lehmann, and Sung Ju Hwang
  Annual Meeting of the Association for Computational Linguistics (ACL), 2023
- Phrase Retrieval for Open-Domain Conversational Question Answering with Conversational Dependency Modeling via Contrastive Learning
  Soyeong Jeong, Jinheon Baek, Sung Ju Hwang, and Jong C. Park
  Findings of the Association for Computational Linguistics (Findings of ACL), 2023
- Knowledge-Augmented Language Model Prompting for Zero-Shot Knowledge Graph Question Answering
  Jinheon Baek, Alham Fikri Aji, and Amir Saffari
  Natural Language Reasoning and Structured Explanations Workshop at ACL (NLRSE @ ACL), 2023 (Best Paper)
  Matching from Unstructured and Structured Data Workshop at ACL (MATCHING @ ACL), 2023 (Oral Presentation)
- Personalized Subgraph Federated Learning
  Jinheon Baek*, Wonyong Jeong*, Jiongdao Jin, Jaehong Yoon, and Sung Ju Hwang
  International Conference on Machine Learning (ICML), 2023
- Realistic Conversational Question Answering with Answer Selection based on Calibrated Confidence and Uncertainty Measurement
  Soyeong Jeong, Jinheon Baek, Sung Ju Hwang, and Jong C. Park
  Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
- Graph Self-supervised Learning with Accurate Discrepancy Learning
  Dongki Kim*, Jinheon Baek*, and Sung Ju Hwang
  Conference on Neural Information Processing Systems (NeurIPS), 2022
- Object Detection in Aerial Images with Uncertainty-Aware Graph Network
  Jongha Kim, Jinheon Baek, and Sung Ju Hwang
  Visual Object-oriented Learning meets Interaction Workshop at ECCV (VOLI Workshop @ ECCV), 2022
- Knowledge Graph-Augmented Language Models for Knowledge-Grounded Dialogue Generation
  Minki Kang*, Jin Myung Kwak*, Jinheon Baek*, and Sung Ju Hwang
  Knowledge Retrieval and Language Models Workshop at ICML (KRLM Workshop @ ICML), 2022
- KALA: Knowledge-Augmented Language Model Adaptation
  Minki Kang*, Jinheon Baek*, and Sung Ju Hwang
  Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2022 (Oral Presentation)
- Augmenting Document Representations for Dense Retrieval with Interpolation and Perturbation
  Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, and Jong C. Park
  Annual Meeting of the Association for Computational Linguistics (ACL), 2022 (Oral Presentation)
- Toward Accurate Learning of Graph Representations
  Jinheon Baek
  Master's Thesis, KAIST, 2022
- Edge Representation Learning with Hypergraphs
  Jaehyeong Jo*, Jinheon Baek*, Seul Lee*, Dongki Kim, Minki Kang, and Sung Ju Hwang
  Conference on Neural Information Processing Systems (NeurIPS), 2021
- Task-Adaptive Neural Network Retrieval with Meta-Contrastive Learning
  Wonyong Jeong*, Hayeon Lee*, Gun Park*, Eunyoung Hyung, Jinheon Baek, and Sung Ju Hwang
  Conference on Neural Information Processing Systems (NeurIPS), 2021 (Spotlight Presentation)
- Unsupervised Document Expansion for Information Retrieval with Stochastic Text Generation
  Soyeong Jeong, Jinheon Baek, ChaeHun Park, and Jong C. Park
  Scholarly Document Processing Workshop at NAACL (SDP Workshop @ NAACL), 2021 (Oral Presentation)
- Accurate Learning of Graph Representations with Graph Multiset Pooling
  Jinheon Baek*, MinKi Kang*, and Sung Ju Hwang
  International Conference on Learning Representations (ICLR), 2021
- Exploring The Spatial Reasoning Ability of Neural Models in Human IQ Tests
  Hyunjae Kim*, Yookyung Koh*, Jinheon Baek, and Jaewoo Kang
  Neural Networks, 2021
- Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Prediction
  Jinheon Baek, Dong Bok Lee, and Sung Ju Hwang
  Conference on Neural Information Processing Systems (NeurIPS), 2020
Honors and Awards
- Awarded the Presidential Science Scholarship for Graduate Study, 2024-2026
- Awarded the Travel Grant from KAIST-Google Partnership Program for WWW 2024
- Received the Best Poster Presentation Award at Samsung AI Forum 2023
- Received the Best Paper Award at NLRSE Workshop in ACL 2023
- Awarded the ICML Travel Grant for ICML 2023
- Awarded the Google Travel Grant for NeurIPS 2022
- Selected as One of the Top Reviewers (Top 10%) of NeurIPS 2022
- Selected as One of the Highlighted Reviewers (Top 10%) of ICLR 2022
- Selected as One of the Best Reviewers (Top 10%) of ICML 2021
- Received the Best Paper Award at CKAIA 2020
- Awarded the Samsung Dream Scholarship, 2016-2020
- Received the First Prize in the Graduation Project Competition at Korea University, 2019
- Received the Academic Excellence Award (highest GPA) at Korea University, 2019
- Received the Second Prize for Excellence in the Microsoft Student Partners Activities, 2018
- Nominated as the Representative of Korea for Excellence in the Microsoft Student Partners Activities, 2017