Applicant 김선우 Remarks
Host 박형빈 Host email
Date Aug 23 (Fri), 2024 (15:00 ~ 18:00) Room Building 27, Room 220
Seminar type Applied Mathematics
Other seminar type
Seminar title Recent Progress on Improving Regret Bounds for Reinforcement Learning in Markov Decision Processes
Abstract We study online reinforcement learning (RL) in Markov decision processes. In particular, we consider the notion of regret of a learning algorithm defined with respect to an optimal policy, which is fundamental in online learning theory. In this talk, we discuss three different settings: safe RL, RL with linear function approximation, and RL with multinomial logistic function approximation. We develop several algorithms and optimization frameworks, by which we improve upon the best known regret upper bounds for the three regimes. Moreover, we provide improved regret lower bounds for RL with multinomial logistic function approximation.

* Talk time: 4:00 PM ~ 5:00 PM
Speaker 이다빈
Affiliation KAIST
Files
