Műegyetemi Digitális Archívum
 

DIGERv2: Single-Stage Simultaneous Construction Worker and Equipment Activity Recognition Through Multi-Scale Path Aggregation

Date

Type

Book chapter

Language

en

Reading access rights:

Open access

Rights Holder

Author

Conference Date

2024.06.29-2024.07.02

Conference Place

Praha, Czech Republic

Conference Title

Creative Construction Conference 2024

ISBN, e-ISBN

978-615-5270-78-9

Container Title

Proceedings of the Creative Construction Conference 2024

Department

Department of Construction Technology and Management

Version

Online

Faculty

Faculty of Architecture

Subject Area

Engineering sciences

Subject Field

Architectural engineering sciences

Subject (OSZKAR)

activity recognition
automated site monitoring
computer vision
multi-scale activity recognition

Genre

Conference paper

University

Budapest University of Technology and Economics

OOC works

Abstract

Developing vision-based methods for monitoring the activities of construction workers and equipment is an active area of research, with applications that include improving safety and productivity. These applications require accurate, real-time activity recognition. One such method, DIGER, recently developed by the authors, achieved this goal for construction equipment activities by utilizing temporal gradient data and knowledge distillation. However, when DIGER is applied to simultaneous detection of worker and equipment activities, which may appear at very different scales in the captured video, it suffers from low activity recognition and localization accuracy. To address this limitation, this paper integrates a modified multi-scale Path Aggregation Network (PANet) into the existing DIGER architecture. The integrated PANet combines the outputs of the different feature maps through top-down and bottom-up pathways to further enhance information propagation across all layers of the network. The resulting network, called DIGERv2, benefits from high-level semantic information and accurate localization information at various scales, as evidenced by activity recognition and localization accuracies of 88.4% and 74.36%, which correspond to improvements of 12.26% and 14.94% over the DIGER method, respectively.
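
The top-down and bottom-up fusion mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the DIGERv2 implementation: the abstract does not specify how the authors modified PANet, so the function names (`upsample2x`, `downsample2x`, `path_aggregate`), the use of plain element-wise addition, and the choice of nearest-neighbour upsampling and max pooling are all assumptions made for illustration. A real PANet uses learned convolutions at each fusion step.

```python
import numpy as np


def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def downsample2x(x):
    """2x2 max pooling of a (C, H, W) feature map (H and W must be even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))


def path_aggregate(features):
    """Fuse feature maps ordered from finest to coarsest resolution.

    Hypothetical simplification: each level is assumed to have the same
    channel count and exactly half the spatial size of the previous one,
    and fusion is a plain sum instead of learned convolutions.
    """
    # Top-down pass: carry coarse, semantically rich features to finer levels.
    top_down = [features[-1]]
    for feat in reversed(features[:-1]):
        top_down.insert(0, feat + upsample2x(top_down[0]))

    # Bottom-up pass: carry fine, well-localized features back to coarser levels.
    fused = [top_down[0]]
    for feat in top_down[1:]:
        fused.append(feat + downsample2x(fused[-1]))
    return fused


if __name__ == "__main__":
    # Three scales, e.g. small workers (fine) up to large equipment (coarse).
    pyramid = [np.ones((4, 32, 32)), np.ones((4, 16, 16)), np.ones((4, 8, 8))]
    for level in path_aggregate(pyramid):
        print(level.shape)
```

Each output level keeps its input resolution but now contains contributions from every other scale, which is the property the abstract credits for better recognition of both small (worker) and large (equipment) targets.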