The test problems used for the evaluation of the proposed algorithm were also used in [33, 35, 36] and are available at the Scheduling Research website (http://www.schedulingResearch.com). Each benchmark problem has two components, namely the processing times and the setup times, which are randomly generated from a discrete uniform distribution over the range U(50, 100) or U(125, 175). For the balanced processing and setup times experiment, both the processing and setup times are drawn from U(50, 100). For the dominant processing times experiment, the processing times are drawn from U(125, 175) and the setup times from U(50, 100). For the dominant setup times experiment, the processing times are drawn from U(50, 100) and the setup times from U(125, 175). The test problem sizes are defined by the number of jobs n and the number of machines m, where n takes the values 20, 40, 60, 80, 100 and 120 jobs, while m takes the values 2, 4, 6, 8, 10 and 12 machines. Previous research, such as the work presented by Rabadi et al. [34] and Arnaout et al. [36], generated 15 instances for each combination of machine number, job number and processing/setup time distribution, giving a total of 6 × 6 × 3 × 15 = 1620 test problem instances. The same test instances were used to evaluate the performance of the FA in this paper. In the analysis of the different methods described in this paper, the firefly and other existing algorithms' parameters are configured as shown in Table 1 (experimentation based on the number of generations) and Table 2 (experimentation based on the number of fitness evaluations consumed). However, the parameter settings for some existing algorithms, such as PH, TS and Meta-RaPS, were not provided in the literature and are therefore not included in these tables.
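The following is a minimal sketch of how instances with the structure described above could be generated, assuming machine-dependent processing times p[i][j] and sequence-dependent setup times s[i][j][k] (machine i, from job j to job k). The function names, seeding scheme and data layout are illustrative assumptions; the actual benchmark instances should be downloaded from the Scheduling Research website rather than regenerated.

```python
import itertools
import random

# Discrete uniform ranges for the three experiment categories:
# (processing-time range, setup-time range), as described in the text.
CATEGORIES = {
    "balanced":            ((50, 100), (50, 100)),
    "dominant_processing": ((125, 175), (50, 100)),
    "dominant_setup":      ((50, 100), (125, 175)),
}

JOB_SIZES = [20, 40, 60, 80, 100, 120]
MACHINE_SIZES = [2, 4, 6, 8, 10, 12]
INSTANCES_PER_COMBINATION = 15


def generate_instance(n, m, proc_range, setup_range, seed=None):
    """Generate one unrelated-parallel-machine instance (assumed layout):
    p[i][j]    - processing time of job j on machine i
    s[i][j][k] - setup time on machine i when job k follows job j."""
    rng = random.Random(seed)
    p = [[rng.randint(*proc_range) for _ in range(n)] for _ in range(m)]
    s = [[[rng.randint(*setup_range) for _ in range(n)]
          for _ in range(n)] for _ in range(m)]
    return p, s


# 6 job sizes x 6 machine sizes x 3 categories x 15 instances = 1620.
total = 0
for n, m, cat in itertools.product(JOB_SIZES, MACHINE_SIZES, CATEGORIES):
    proc_range, setup_range = CATEGORIES[cat]
    for rep in range(INSTANCES_PER_COMBINATION):
        generate_instance(n, m, proc_range, setup_range,
                          seed=hash((n, m, cat, rep)))
        total += 1
print(total)  # 1620
```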