Conclusions
In this study, we proposed SiLU and dSiLU as activation functions for neural network function approximation in reinforcement learning. We demonstrated in stochastic SZ-Tetris that SiLUs significantly outperformed ReLUs, and that dSiLUs significantly outperformed sigmoid units. The best agent, the dSiLU network agent, achieved new state-of-the-art results in stochastic SZ-Tetris and in 10×10 Tetris. In the Atari 2600 domain, a deep Sarsa(λ) agent with SiLUs in the convolutional layers and dSiLUs in the fully-connected hidden layer outperformed DQN and double DQN, as measured by mean and median DQN-normalized scores.

An additional purpose of this study was to demonstrate that a more traditional approach, on-policy learning with eligibility traces and softmax action selection (i.e., essentially a "textbook" reinforcement learning agent, but with non-linear neural network function approximators), can be competitive with the approach used by DQN. This suggests there is considerable room for further improvement, for example by using a separate target network, as DQN does, or by adopting more recent advances such as the dueling architecture (Wang et al., 2016) for more accurate action-value estimates and asynchronous learning by multiple agents in parallel (Mnih et al., 2016).
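For reference, the sketch below restates the two proposed activations in NumPy. It is a minimal illustration, not the training code used in the experiments, and it assumes the definitions given earlier in the paper: SiLU(z) = zσ(z), and dSiLU(z) = σ(z)(1 + z(1 − σ(z))), the derivative of the SiLU, where σ is the logistic sigmoid.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid: sigma(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def silu(z):
    """Sigmoid-weighted linear unit: the input weighted by its own sigmoid."""
    return z * sigmoid(z)

def dsilu(z):
    """Derivative of the SiLU, used as an activation function in its own right."""
    s = sigmoid(z)
    return s * (1.0 + z * (1.0 - s))
```

As discussed in the paper, the SiLU acts as a smooth, non-monotonic counterpart to the ReLU, and the dSiLU as a steeper alternative to the sigmoid, which motivates the pairwise comparisons summarized above.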