DIGERv2: Single-Stage Simultaneous Construction Worker and Equipment Activity Recognition Through Multi-Scale Path Aggregation
Subject (OSZKAR)
automated site monitoring
computer vision
multi-scale activity recognition
- https://doi.org/10.3311/CCC2024-044
Abstract
Vision-based monitoring of construction worker and equipment activities is an active research area with applications such as enhancing safety and productivity. These applications require activity recognition methods that are both accurate and real-time. One such method, DIGER, recently developed by the authors, achieved this goal for construction equipment activities by utilizing temporal gradient data and knowledge distillation. However, when DIGER is applied to simultaneously detect the activities of workers and equipment, which may differ greatly in scale in the captured video, it suffers from low activity recognition and localization accuracy. To address this limitation, this paper integrates a modified multi-scale Path Aggregation Network (PANet) into the existing DIGER architecture. The integrated PANet combines the outputs of feature maps at different scales through top-down and bottom-up pathways to enhance information propagation across all layers of the network. The resulting network, called DIGERv2, benefits from rich semantic information and accurate localization information at various scales, as evidenced by activity recognition and localization accuracies of 88.4% and 74.36%, which correspond to improvements of 12.26% and 14.94% over DIGER, respectively.
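To make the top-down and bottom-up pathways concrete, the following is a minimal sketch of the general path-aggregation idea, not the modified PANet used in DIGERv2. It assumes single-channel feature maps stored as nested lists, each level half the size of the previous one, and it substitutes plain element-wise addition, nearest-neighbour upsampling, and 2x2 max pooling for the learned convolutions of a real network. The function names (`upsample2x`, `downsample2x`, `add`, `path_aggregate`) and the toy inputs are hypothetical.

```python
# Illustrative sketch only: pure-Python stand-in for a path-aggregation neck.
# Assumption: each feature map is a square nested list with even side length
# (except the coarsest), and each level is half the size of the previous one.

def upsample2x(fm):
    """Nearest-neighbour upsampling: every cell becomes a 2x2 block."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def downsample2x(fm):
    """Max pooling over non-overlapping 2x2 blocks (stride 2)."""
    return [
        [max(fm[r][c], fm[r][c + 1], fm[r + 1][c], fm[r + 1][c + 1])
         for c in range(0, len(fm[0]), 2)]
        for r in range(0, len(fm), 2)
    ]


def add(a, b):
    """Element-wise sum of two equally sized maps (stands in for learned fusion)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def path_aggregate(features):
    """Fuse maps ordered from finest (largest) to coarsest (smallest)."""
    n = len(features)

    # Top-down pathway: push coarse, semantically rich maps into finer levels.
    top_down = [None] * n
    top_down[-1] = features[-1]
    for i in range(n - 2, -1, -1):
        top_down[i] = add(features[i], upsample2x(top_down[i + 1]))

    # Bottom-up pathway: push fine, well-localized maps back into coarser levels.
    outputs = [top_down[0]]
    for i in range(1, n):
        outputs.append(add(top_down[i], downsample2x(outputs[-1])))
    return outputs


# Toy inputs: a 4x4, a 2x2, and a 1x1 map with easily traceable values.
c3 = [[1] * 4 for _ in range(4)]
c4 = [[10] * 2 for _ in range(2)]
c5 = [[100]]
outs = path_aggregate([c3, c4, c5])
```

After both passes, every output level carries contributions from every input level, which is the property the abstract attributes to the integrated PANet: small objects such as workers and large objects such as equipment can each be handled at a scale that has both semantic and localization information.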