This paper introduces a method for automatic robot programming of industrial robots using reinforcement learning on a Software-in-the-loop simulation. The focus of the the paper is on the higher levels of a hierarchical robot programming problem. While the lower levels the skills are stored as domain specific program code, the combination of the skills into a robot control program to solve a specific task is automated. The reinforcement learning learning approach allows the shopfloor workers and technicians just to define the end result of the manufacturing process through a reward function. The programming and process optimization is done within the learning procedure. The Software-in-the-loop simulation with the robot control software makes it possible to to interpret the real program code and generate the exact motion. The exact motion of the robot is needed in order to find not just an optimal but also a collision-free policy.