Named Entity Recognition For Construction Documents Based on Fine-Tuning of Large Language Models
| Zhou, Junyu | ||
| Ma, Zhiliang | ||
| 2024-10-07T09:13:01Z | ||
| 2024-10-07T09:13:01Z | ||
| 2024 | ||
Abstract: Named Entity Recognition (NER) is a necessary task for the automatic processing of construction documents. Traditional methods use machine learning, but they rely on large, high-quality, manually built datasets that are costly to obtain. Therefore, this paper proposes an NER method based on fine-tuning Large Language Models (LLMs) for information extraction from construction documents. First, low-quality datasets are semi-automatically generated from national standards, professional qualification textbooks, and input method editor lexicons; they comprise a generation-type dataset, a tagging-type dataset, and a question-answering dataset. Then, these datasets are used to fine-tune an LLM for NER of structural elements in order to determine the optimal fine-tuning parameter settings. Finally, the LLM is fine-tuned under the optimal settings and evaluated manually against an established dataset and evaluation rules. Accuracy and completeness improve significantly compared with the LLM before fine-tuning, showing that the method works well. The research contributes a more efficient method for the automatic processing of construction documents. | ||
| http://hdl.handle.net/10890/57803 | ||
| en | ||
| book chapter | ||
| Open access | ||
| Author | ||
| 2024.06.29.-2024.07.02 | ||
| Praha, Czech Republic | ||
| Creative Construction Conference 2024 | ||
| 2024.09.01 | ||
| 978-615-5270-78-9 | ||
| Budapest University of Technology and Economics | ||
| Online | ||
| Proceedings of the Creative Construction Conference 2024 | ||
| Department of Construction Technology and Management | ||
| Online | ||
| Faculty of Architecture | ||
| 10.3311/CCC2024-175 | ||
| Engineering sciences | ||
| Engineering sciences - architectural engineering sciences | ||
| architectural engineering sciences | ||
| construction documents | ||
| large language model | ||
| named entity recognition | ||
| Conference paper | ||
| Budapest University of Technology and Economics |