True Online Temporal-Difference Learning

Recently, new versions of TD(\lambda) and Sarsa(\lambda) were introduced, called true online TD(\lambda) and true online Sarsa(\lambda), respectively (van Seijen & Sutton, ...
