diff --git "a/\345\237\272\344\272\216\346\267\261\345\272\246\345\274\272\345\214\226\345\255\246\344\271\240\347\256\227\346\263\225\347\232\204\344\273\277\347\234\237\345\210\260\345\256\236\350\267\265\346\225\231\347\250\213" "b/\345\237\272\344\272\216\346\267\261\345\272\246\345\274\272\345\214\226\345\255\246\344\271\240\347\256\227\346\263\225\347\232\204\344\273\277\347\234\237\345\210\260\345\256\236\350\267\265\346\225\231\347\250\213"
new file mode 100644
index 0000000000000000000000000000000000000000..080c89a8d9752c06ffa3ed8a832bc781be5bf364
--- /dev/null
+++ "b/\345\237\272\344\272\216\346\267\261\345\272\246\345\274\272\345\214\226\345\255\246\344\271\240\347\256\227\346\263\225\347\232\204\344\273\277\347\234\237\345\210\260\345\256\236\350\267\265\346\225\231\347\250\213"
@@ -0,0 +1,246 @@
+# Sim-to-Real Tutorial Based on Deep Reinforcement Learning Algorithms
+
+## ubuntu18.04 + pytorch+ ros-melodic+gazebo11
+
+# Environment setup:
+
+Installing Ubuntu 18.04 is not covered here. Either a virtual machine or a dual-boot setup works.
+
+Virtual machine: [Installing Ubuntu 18.04 in a virtual machine - Jianshu (jianshu.com)](https://www.jianshu.com/p/c743aaa847de)
+
+Dual boot: [Installing Ubuntu 18.04 alongside Windows 10 as a dual-boot system (detailed tutorial), Ycitus's blog on CSDN](https://blog.csdn.net/qq_43106321/article/details/105361644)
+
+## Install ROS Melodic:
+
+```shell
+wget http://fishros.com/install -O fishros && . fishros
+```
+
+## rosdep:
+
+```
+wget http://fishros.com/install -O fishros && . fishros
+```
+
+## Download and install Anaconda:
+
+https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/Anaconda3-2021.11-Linux-x86_64.sh
+
+
+bash Anaconda3-2021.11-Linux-x86_64.sh
+
+## Create the virtual environment:
+
+```shell
+git clone https://gitee.com/fangxiaosheng666/PPO-SAC-DQN-DDPG
+cd PPO-SAC-DQN-DDPG
+conda env create -f py2.yaml
+```
+
+## Create the workspace:
+
+```shell
+mkdir -p ws/src
+cd ws/src
+git clone https://github.com/ROBOTIS-GIT/turtlebot3.git
+git clone https://github.com/ROBOTIS-GIT/turtlebot3_simulations.git
+git clone https://github.com/ROBOTIS-GIT/turtlebot3_msgs.git
+cd ..
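+```
+
+## Change the number of lidar samples:
+
+Reference: [TurtleBot3 (robotis.com)](https://emanual.robotis.com/docs/en/platform/turtlebot3/machine_learning/#machine-learning)
+
+```
+roscd turtlebot3_description/urdf/
+gedit turtlebot3_burger.gazebo.xacro
+# To visualize the lidar, change the value below to true
+<xacro:arg name="laser_visual" default="false"/>
+# Change the number of lidar samples to 24
+<scan>
+  <horizontal>
+    <samples>24</samples>            # The number of sample. Modify it to 24
+    <resolution>1</resolution>
+    <min_angle>0.0</min_angle>
+    <max_angle>6.28319</max_angle>
+  </horizontal>
+</scan>
+
+```
+
+## From the workspace root, install all ROS package dependencies and build:
+
+```shell
+# Run from the directory that contains ws; skip this cd if you are already inside ws
+cd ws
+rosdep install --from-paths src --ignore-src -r -y
+catkin_make
+source devel/setup.bash
+```
+
+## Paths in the code that need to be changed:
+
+### Model save path:
+
+```python
+def save_model(self,dir):
+    # `e` is the current episode counter; it is assumed to be defined by the training loop
+    state = {'target_net':self.target_net.state_dict(),'eval_net':self.eval_net.state_dict(), 'optimizer':self.optimizer.state_dict(), 'epoch':e}
+    # Change this directory to a path that exists on your machine
+    torch.save(state,"/home/ffd/QDN/model/"+ dir+"a.pt")
+```
+
+### Loading a model to resume training. A saved model must already exist, and models are not interchangeable between algorithms because the network architectures differ:
+
+To load a model, change self.load_models=False to True, then point the load path at a model trained with the same algorithm.
+
+    self.load_models=True
+    if self.load_models:
+        self.epsilon= 0
+        self.start_epoch=self.load_ep
+        checkpoint = torch.load("/home/ffd/QDN/model/"+str(self.load_ep)+"a.pt")
+
+### Changes to respawnGoal.py
+
+Change the name of the map that is loaded.
+
+Change the goal points (adjust them to suit your own world). If you load your own map, change self.stage = 2 to 4, then edit the coordinates below.
+
+```python
+        self.modelPath = os.path.dirname(os.path.realpath(__file__))
+        self.modelPath = self.modelPath.replace('/home/ffd/DRL/PPO',
+                                                '/home/ffd/DRL/PPO/model.sdf')
+```
+
+```python
+        self.stage = 2
+```
+
+```python
+            while position_check:
+                goal_x_list = [0.6, 1.9, 0.5, 0.2, -0.8, -1, -1.9, 0.5, 2, 0.5, 0, -0.1, -2]
+                goal_y_list = [0, -0.5, -1.9, 1.5, -0.9, 1, 1.1, -1.5, 1.5, 1.8, -1, 1.6, -0.8]
+
+                self.index = random.randrange(0, 13)
+                print(self.index, self.last_index)
+                if self.last_index == self.index:
+                    position_check = True
+                else:
+                    self.last_index = self.index
+                    position_check = False
+```
+
+These coordinates were chosen to match the Gazebo map.
+
+## How to load your own robot and world
+
+```xml
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+```
+
+## Launch the simulation:
+
+roslaunch turtlebot3_gazebo turtlebot3_stage_2.launch
+Then run DQN2.py from VS Code.
+
+
+
+
+## Simulation results
+PPO:
+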
+[Video: PPO in the ROS TurtleBot3 simulation](https://player.bilibili.com/player.html?aid=549434701)
+
+
+DQN:
+
+[Video: DQN after 200 episodes](https://player.bilibili.com/player.html?aid=891843299)
+
+SAC:
+
+[Video: SAC algorithm](https://player.bilibili.com/player.html?aid=553110855)
+
+
+## Real-world testing:
+
+Code repository:
+
+```shell
+git clone https://gitee.com/fangxiaosheng666/PPO-SAC
+```
+
+PPO with discrete actions: [video](https://www.bilibili.com/video/BV1N44y1G7WJ/)
+
+
+[Video: Using deep reinforcement learning for robot navigation](https://player.bilibili.com/player.html?aid=255683208)
+
+
+
+SAC with continuous actions: [video](https://www.bilibili.com/video/BV1LY411j7jW/)
+
+
+[Video: SAC continuous control](https://player.bilibili.com/player.html?aid=980761350)
+
+
+## QQ discussion group: 877273841 (study materials exchange group)
+
+## Visualizing training data:
+
+Use TensorBoard with PyTorch. [Reference](https://zhuanlan.zhihu.com/p/103630393)
+
+```shell
+tensorboard --logdir C:\Users\26503\Desktop\毕业设计\训练数据\DQN
+```
+![Training data visualization](https://img-blog.csdnimg.cn/03888d9cc4514684a59b224f33a3c4ae.png)
+
+
+# Recommended up-to-date Chinese resources for learning deep reinforcement learning:
+
+## WeChat official accounts:
+
+### 深度学习实验室 (Deep Learning Lab)
+
+### RLCN
+
+### OpenDILab
+
+# Good courses:
+
+Theory in Chinese: [Wang Shusen (王树森)](https://www.bilibili.com/video/BV1YK4y1G7jw?spm_id_from=333.337.search-card.all.click)
+
+Course in English: [MIT](https://www.bilibili.com/video/BV1ZL411M7Cv?spm_id_from=333.337.search-card.all.click)
+
+## [How I built my robot car](https://blog.csdn.net/qq2650326396/article/details/122161688?spm=1001.2014.3001.5502)
+
+You may run into problems along the way. Most can be solved by searching Bing or Google, or by asking in the QQ group.
+
+The next post will cover how to transfer the algorithms.
+
+
+
+
+
+
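Note on the goal-selection loop in the respawnGoal.py excerpt: it keeps drawing a random index until the index differs from the previous one, so the same goal is never spawned twice in a row. A minimal standalone sketch of that logic follows. The function name `pick_goal` and the module-level lists are hypothetical and are not part of the repository; only the coordinates are copied from the excerpt.

```python
import random

# Goal coordinates copied from the respawnGoal.py excerpt in this diff.
GOAL_X = [0.6, 1.9, 0.5, 0.2, -0.8, -1, -1.9, 0.5, 2, 0.5, 0, -0.1, -2]
GOAL_Y = [0, -0.5, -1.9, 1.5, -0.9, 1, 1.1, -1.5, 1.5, 1.8, -1, 1.6, -0.8]


def pick_goal(last_index, rng=random):
    """Return (index, x, y) for a random goal whose index differs from last_index.

    Hypothetical helper: it mirrors the while-loop in the excerpt but derives
    the upper bound from the list length instead of hard-coding 13.
    """
    while True:
        index = rng.randrange(len(GOAL_X))
        if index != last_index:
            return index, GOAL_X[index], GOAL_Y[index]


if __name__ == "__main__":
    last = -1  # no previous goal yet
    for _ in range(5):
        last, x, y = pick_goal(last)
        print(last, x, y)
```

Deriving the bound from `len(GOAL_X)` means that adding or removing goal points for a custom world does not also require updating the `random.randrange(0, 13)` call.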