Exploring Human and Language Model Alignment in Perceived Design Similarity Using Ordinal Embeddings


Date

2025-10

Publication Type

Journal Article

ETH Bibliography

yes

Abstract

In recent years, large language models (LLMs) and vision language models (VLMs) have excelled at tasks requiring human-like reasoning, inspiring researchers in engineering design to use language models (LMs) as surrogate evaluators of design concepts. But do these models actually evaluate designs like humans? While recent work has shown that LM evaluations sometimes fall within human variance on Likert-scale grading tasks, those tasks often obscure the reasoning and biases behind the scores. To address this limitation, we compare LM word embeddings (trained to capture semantic similarity) with human-rated similarity embeddings derived from triplet comparisons (“is A closer to B than C?”) on a dataset of design sketches and descriptions. We assess alignment via local tripletwise similarity and embedding distances, allowing for deeper insights than raw Likert-scale scores provide. We also explore whether describing the designs to LMs through text or images improves alignment with human judgments. Our findings suggest that text alone may not fully capture the nuances humans key into, yet text-based embeddings outperform their multimodal counterparts on satisfying local triplets. On the basis of these insights, we offer recommendations for effectively integrating LMs into design evaluation tasks.
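The triplet-based alignment check described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each triplet (a, b, c) records a human judgment that design a is closer to design b than to design c, that Euclidean distance is the measure of closeness in the embedding space, and that the function name `triplet_agreement` and the toy data are hypothetical.

```python
import numpy as np


def triplet_agreement(embeddings, triplets):
    """Return the fraction of triplets (a, b, c) an embedding satisfies.

    A triplet counts as satisfied when item a lies strictly closer to
    item b than to item c. Euclidean distance is an assumption here;
    the paper may use a different metric.
    """
    emb = np.asarray(embeddings, dtype=float)
    t = np.asarray(triplets, dtype=int)
    # Distances from each anchor a to its two comparison items b and c.
    d_ab = np.linalg.norm(emb[t[:, 0]] - emb[t[:, 1]], axis=1)
    d_ac = np.linalg.norm(emb[t[:, 0]] - emb[t[:, 2]], axis=1)
    return float(np.mean(d_ab < d_ac))


# Hypothetical toy data: four designs embedded in two dimensions.
embeddings = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [2.0, 1.0]]
# Hypothetical human judgments, each read as "a is closer to b than to c".
triplets = [(0, 1, 2), (0, 2, 1), (3, 1, 2), (1, 0, 3)]

score = triplet_agreement(embeddings, triplets)
print(score)
```

On this toy data the embedding satisfies three of the four triplets. Running the same function on a text-based and a multimodal embedding of the same designs would give one simple way to compare them on local triplet satisfaction.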

Publication status

published

Volume

147 (10)

Pages / Article No.

101402

Publisher

American Society of Mechanical Engineers

Subject

Conceptual design; Creativity and concept generation; Design evaluation; Design process; Design theory and methodology; Design visualization; Machine learning

Organisational unit

09828 - Fuge, Mark / Fuge, Mark