AI Agents | AGI_and_RL

Meta-Model-Based Meta-Policy Optimization: https://arxiv.org/abs/2006.02608

Learning Memory-Based Control for Human-Scale Bipedal Locomotion: https://arxiv.org/abs/2006.02402

https://blog.tensorflow.org/

https://github.com/openai/multiagent-particle-envs

Ceres Solver: http://ceres-solver.org/index.html

https://offline-rl.github.io/

Monte Carlo Gradient Estimation in Machine Learning: https://arxiv.org/pdf/1906.10652.pdf

A classic: Proximal Policy Optimization Algorithms https://arxiv.org/pdf/1707.06347.pdf

Reinforcement Learning: An Introduction, second edition, 2020. Richard S. Sutton and Andrew G. Barto ht...

https://github.com/edwardhdlu/q-trader