Publisher: Regional Information Center for Science and Technology (مرکز منطقه ای اطلاع رسانی علوم و فناوری)
Journal: Iranian Journal of Electrical and Computer Engineering (فصلنامه مهندسی برق و مهندسی کامپيوتر ايران)
ISSN: 1682-3745
Volume: 16
Issue: 2
Publication date: 2018-09-30
Pages: 131-141
Language: fa
Title: Proposing a New Method for Acquiring Skills in Reinforcement Learning with the Help of Graph Clustering
Original title (Persian): ارائه روشی جدید برای کسب مهارت در یادگیری تقویتی با کمک خوشه‌بندی گراف
Authors: Marzieh Davoodabadi Farahani (مرضیه داودآبادی فراهانی), Nasser Mozayani (ناصر مزینی)
Additional date in record: 2017-03-04

Abstract: Reinforcement learning is a type of machine learning in which an agent uses its interactions with the environment to learn about that environment and improve its behavior. One of the main problems of standard reinforcement learning algorithms such as Q-learning is that they cannot solve large-scale problems in a reasonable time. Automatic skill acquisition helps decompose a problem into a set of smaller subproblems and solve it hierarchically. Despite the promising results of using skills in hierarchical reinforcement learning, some previous studies have shown that, depending on the task, the effect of skills on learning performance can be either strongly positive or strongly negative: if skills are not properly selected, they can increase the complexity of problem solving. Hence, one weakness of previous methods for automatic skill acquisition is that they do not evaluate each acquired skill. In this paper, we propose new methods based on graph clustering for extracting subgoals and acquiring skills. We also present new criteria for evaluating skills, with which skills that are inappropriate for solving the problem are eliminated. Applying these methods in several experimental environments shows a significant increase in learning speed.
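The abstract does not say which graph clustering algorithm the authors use, so the following is only a minimal sketch of the general idea of extracting subgoals from a state-transition graph. It assumes a hypothetical two-room environment whose states are integers, and it uses a deliberately simple stand-in for clustering: edges whose removal disconnects the graph (bridges) are cut, the remaining connected components are treated as clusters, and the states on either side of a cut edge are treated as subgoal candidates. The names `EDGES`, `find_bridges`, and `extract_subgoals` are illustrative and do not come from the paper.

```python
from collections import defaultdict

# Hypothetical state-transition graph: two 4-state rooms joined by one doorway edge (3, 4).
EDGES = [(0, 1), (0, 2), (1, 3), (2, 3),   # room A
         (3, 4),                           # doorway between the rooms
         (4, 5), (4, 6), (5, 7), (6, 7)]   # room B


def build_graph(edges):
    """Build an undirected adjacency map from a list of (state, state) pairs."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return graph


def components(graph, removed=frozenset()):
    """Return connected components, ignoring the edges listed in `removed`."""
    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, comp = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            comp.add(node)
            for nxt in graph[node]:
                if nxt in seen or frozenset((node, nxt)) in removed:
                    continue
                seen.add(nxt)
                stack.append(nxt)
        result.append(comp)
    return result


def find_bridges(graph, edges):
    """Brute-force bridge search: an edge is a bridge if removing it adds a component."""
    base = len(components(graph))
    return [e for e in edges if len(components(graph, {frozenset(e)})) > base]


def extract_subgoals(edges):
    """Cut all bridges, report the resulting clusters and the states touching a cut edge."""
    graph = build_graph(edges)
    cut = find_bridges(graph, edges)
    clusters = components(graph, {frozenset(e) for e in cut})
    subgoals = sorted({state for edge in cut for state in edge})
    return clusters, subgoals


if __name__ == "__main__":
    clusters, subgoals = extract_subgoals(EDGES)
    print("clusters:", clusters)
    print("subgoal candidates:", subgoals)
```

In this toy graph the only bridge is the doorway edge, so the two rooms become the two clusters and the doorway states 3 and 4 become the subgoal candidates. A skill in the hierarchical sense would then be a sub-policy for reaching such a state from inside its cluster; the evaluation criteria mentioned in the abstract are not specified there and are not modeled here.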

http://ijece.org/ar/Article/Download/28329