Abstract
In engineering disciplines, leveraging generative language models requires specialized datasets for training or for fine-tuning pre-trained models. Compiling these domain-specific datasets is a complex endeavor, demanding significant human effort and resources. To address the problem of domain-specific dataset scarcity, this study investigates the potential of generative large language models (LLMs) to create synthetic domain-specific textual datasets for engineering design domains. Harnessing the advanced capabilities of LLMs such as GPT-4, a systematic methodology was developed to create high-fidelity datasets from carefully designed prompts; the resulting datasets were evaluated against a manually labeled benchmark dataset using computational measures that require no human intervention. Findings suggest that well-designed prompts can significantly enhance the quality of domain-specific synthetic datasets while reducing manual effort. The research highlights the importance of prompt design in eliciting precise, domain-relevant information and discusses the balance between dataset robustness and richness. It is demonstrated that a language model trained on the synthetic datasets can achieve performance comparable to that of a model trained on human-labeled, domain-specific datasets, offering a strategic solution to the limitations imposed by dataset shortages in engineering domains. The implications for design thinking processes are particularly noteworthy, with the potential to assist designers through GPT-4's structured reasoning capabilities. This work presents a complete guide to domain-specific dataset generation, automated evaluation metrics, and insights into the interplay between data robustness and comprehensiveness.