Publications
I'm interested in reinforcement learning, robot learning, optimization, and representation learning.
|
|
Spectral-Risk Safe Reinforcement Learning with Convergence Guarantees
Dohyeong Kim, Taehyun Cho, Seungyub Han, Hojun Chung, Kyungjae Lee, Songhwai Oh
NeurIPS, 2024
arxiv /
We propose a safe RL algorithm with spectral risk constraints and prove its convergence to an optimum in tabular settings.
|
|
D2NAS: Efficient Neural Architecture Search with Performance Improvement and Model Size Reduction for Diverse Tasks
Jungeun Lee, Seungyub Han, Jungwoo Lee
IEEE Access, 2024
paper /
We introduce D2NAS (Differential and Diverse NAS), which leverages techniques such as Differentiable ARchiTecture Search (DARTS) and Diverse-task Architecture SearcH (DASH) for architecture discovery.
|
|
Pitfall of Optimism: Distributional Reinforcement Learning by Randomizing Risk Criterion
Taehyun Cho, Seungyub Han, Heesoo Lee, Kyungjae Lee, Jungwoo Lee
NeurIPS, 2023
paper /
arxiv /
We introduce a perturbed distributional Bellman optimality operator that distorts the risk measure, and we prove the convergence and optimality of the proposed method under a weaker contraction property.
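As a rough illustrative sketch (not the paper's actual algorithm or code), risk-randomized greedy action selection over learned return quantiles might look like the following; the function names, the Dirichlet perturbation, and all parameters here are my own assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def randomized_risk_value(quantiles):
    """Score an action from its return quantiles with a randomly
    distorted risk measure (illustrative only).

    quantiles: array of shape (n_quantiles,), sorted return quantiles.
    """
    n = len(quantiles)
    # Uniform weights would give the risk-neutral mean; sampling random
    # weights distorts the risk criterion for this evaluation.
    w = rng.dirichlet(np.ones(n))
    return float(np.dot(w, quantiles))

def select_action(quantile_table):
    """Greedy action w.r.t. randomly distorted risk values.

    quantile_table: (n_actions, n_quantiles) learned return quantiles.
    """
    values = [randomized_risk_value(q) for q in quantile_table]
    return int(np.argmax(values))
```

Because the random weights form a convex combination, each distorted value stays between the smallest and largest quantile of that action's return distribution.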
|
|
SPQR: Controlling Q-ensemble Independence with Spiked Random Model for Reinforcement Learning
Dohyeok Lee, Seungyub Han, Taehyun Cho, Jungwoo Lee
NeurIPS, 2023
paper /
arxiv /
code /
By introducing a novel regularization loss for Q-ensemble independence based on random matrix theory, we propose spiked Wishart Q-ensemble independence regularization (SPQR) for reinforcement learning.
|
|
On the Convergence of Continual Learning with Adaptive Methods
Seungyub Han, Yeongmo Kim, Taehyun Cho, Jungwoo Lee
UAI, 2023
paper /
arxiv /
In this paper, we provide a convergence analysis of memory-based continual learning with stochastic gradient descent, along with empirical evidence that training on the current task causes cumulative degradation of previous tasks.
|
|
Adaptive Methods for Nonconvex Continual Learning
Seungyub Han, Yeongmo Kim, Taehyun Cho, Jungwoo Lee
NeurIPS Optimization for Machine Learning Workshop, 2022
paper /
We propose an adaptive method for nonconvex continual learning (NCCL), which adjusts the step sizes for both previous and current tasks using their gradients.
|
|
Perturbed Quantile Regression for Distributional Reinforcement Learning
Taehyun Cho, Seungyub Han, Heesoo Lee, Kyungjae Lee, Jungwoo Lee
NeurIPS Deep RL Workshop, 2022
paper /
We provide a perturbed distributional Bellman optimality operator that distorts the risk measure in action selection, and we prove the convergence and optimality of the proposed method under a weaker contraction property.
|
|
Learning to Learn Unlearned Feature for Brain Tumor Segmentation
Seungyub Han, Yeongmo Kim, Seokhyeon Ha, Jungwoo Lee, Seunghong Choi
Medical Imaging meets NeurIPS Workshop, 2018
paper /
arxiv /
One of the difficulties in medical image segmentation is the lack of datasets with proper annotations. To alleviate this problem, we propose active meta-tune, which achieves balanced parameters for both the glioma and brain-metastasis domains within a few steps.
|
|
Tractable and Provably Efficient Distributional Reinforcement Learning with General Value Function Approximation
Taehyun Cho, Seungyub Han, Kyungjae Lee, Seokhun Ju, Dohyeong Kim, Jungwoo Lee
arXiv preprint, 2024
arxiv /
First, we show that approximating the infinite-dimensional return distribution with a finite number of moment functionals is the only way to learn statistical information, including nonlinear statistical functionals, without bias. Second, we propose a provably efficient algorithm, SF-LSVI, which achieves a regret bound of \(\tilde{O}(d_E H^{\frac{3}{2}}\sqrt{K})\).
|
|
Generative Adversarial Trainer: Defense to Adversarial Perturbations with GAN
Hyeungill Lee, Seungyub Han, Jungwoo Lee
arXiv preprint, 2017
arxiv /
We propose a novel technique that makes neural networks robust to adversarial examples using a generative adversarial network. We alternately train the classifier and generator networks: the generator produces adversarial perturbations that can easily fool the classifier by using the gradient of each image.
|
|