In security screening, X-ray baggage inspection is a form of nondestructive testing used to detect threat objects. It is a common protocol for inspecting passenger baggage, particularly at airports. Unfortunately, the accuracy of human inspection is only around 80–90%, even under optimal operator conditions. Human inspectors therefore need to be assisted by computer vision algorithms. This work proposes a deep learning-based methodology for detecting threat objects in single-spectrum X-ray baggage scan images. Our framework simulates a large number of X-ray images by combining PGGAN (Karras et al. in International conference on learning representations, 2018. https://openreview.net/forum?id=Hk99zCeAb) with a superimposition strategy (Mery and Katsaggelos in 2017 IEEE conference on computer vision and pattern recognition workshops (CVPRW), 2017. https://doi.org/10.1109/CVPRW.2017.37). The simulated images are then used to train state-of-the-art detection models such as YOLO (Redmon et al. in You only look once: unified, real-time object detection. CoRR abs/1506.02640, 2015. http://arxiv.org/abs/1506.02640), SSD (Liu et al. in SSD: single shot multibox detector. CoRR abs/1512.02325, 2015. http://arxiv.org/abs/1512.02325) and RetinaNet (Lin et al. in Focal loss for dense object detection. CoRR abs/1708.02002, 2017. http://arxiv.org/abs/1708.02002). The method was tested on real X-ray images for the detection of four categories of threat objects: guns, knives, razor blades and shuriken (ninja stars). In our experiments, YOLOv3 (Redmon and Farhadi in Yolov3: An incremental improvement. CoRR abs/1804.02767, 2018. http://arxiv.org/abs/1804.02767) obtained the best mean average precision (mAP), with 96.3% for guns, 76.2% for knives, 86.9% for razor blades and 93.7% for shuriken, while the average mAP for all threat objects was 80.0%.
We believe that the effectiveness of our method in detecting threat objects makes its use at checkpoints feasible. Moreover, our methodology is scalable and can easily be extended to detect other object categories automatically.