• Author(s): Sukmin Yun, Haokun Lin, Rusiru Thushara, Mohammad Qazim Bhat, Yongxin Wang, Zutao Jiang, Mingkai Deng, Jinhong Wang, Tianhua Tao, Junbo Li, Haonan Li, Preslav Nakov, Timothy Baldwin, Zhengzhong Liu, Eric P. Xing, Xiaodan Liang, Zhiqiang Shen

“Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs” introduces Web2Code, a comprehensive dataset and evaluation framework designed to advance the capabilities of multimodal large language models (MLLMs) in converting webpage screenshots into HTML code. This research addresses the growing need for automated tools that can accurately interpret complex webpage layouts and translate them into functional code, a task of increasing relevance in web development and digital content creation.

Web2Code is built to provide a robust benchmark for evaluating the performance of MLLMs on webpage-to-code translation tasks. The dataset consists of a diverse collection of webpage images, each paired with code that captures the underlying structure and content of the page. This pairing allows models to learn the relationship between the visual elements of a webpage and their code representation, enabling more accurate and efficient code generation.
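To make the pairing concrete, here is a minimal sketch of how such an image–code record might be represented and loaded in Python. The schema (`screenshot_path`, `html`, `instruction`) and the JSON-lines manifest format are illustrative assumptions, not the dataset's actual layout.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class WebpageCodePair:
    """One training example: a webpage screenshot paired with its code.

    Field names are illustrative, not the dataset's actual schema.
    """
    screenshot_path: Path  # rendered image of the page
    html: str              # code that reproduces the page
    instruction: str       # e.g. "Generate the HTML code for this webpage."


def load_pairs(manifest_path: str) -> List[WebpageCodePair]:
    """Load image-code pairs from a hypothetical JSON-lines manifest."""
    pairs = []
    with open(manifest_path, "r", encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            pairs.append(
                WebpageCodePair(
                    screenshot_path=Path(record["screenshot_path"]),
                    html=record["html"],
                    instruction=record.get(
                        "instruction",
                        "Generate the HTML code for this webpage.",
                    ),
                )
            )
    return pairs
```

A record of this shape gives a model both the visual input (the screenshot) and the target output (the code), which is the supervision signal the paragraph above describes.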

One of the key strengths of Web2Code is its scale and diversity. The dataset encompasses a wide range of webpage types, including static pages, dynamic content, and interactive elements. This diversity helps models trained and evaluated on Web2Code generalize to varied real-world scenarios, making them more versatile and practical for web development tasks.

The evaluation framework provided by Web2Code includes standardized metrics and benchmarks that let researchers systematically assess and compare models. These cover complementary aspects of the task, from a model's understanding of webpage content to the quality and fidelity of the code it generates. By providing a comprehensive evaluation framework, Web2Code enables researchers to identify the strengths and weaknesses of their models and make informed improvements.
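As a rough illustration of the kind of automatic check such a framework might include, the sketch below compares the tag structure of generated HTML against a reference page using only the Python standard library. The `structural_similarity` function and the choice of tag-sequence matching are assumptions for illustration, not the benchmark's actual metrics.

```python
import difflib
from html.parser import HTMLParser
from typing import List


class _TagCollector(HTMLParser):
    """Collects the sequence of opening tags in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: List[str] = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_sequence(html: str) -> List[str]:
    """Return the flat sequence of element tags, ignoring text and attributes."""
    collector = _TagCollector()
    collector.feed(html)
    return collector.tags


def structural_similarity(generated_html: str, reference_html: str) -> float:
    """Rough structural similarity between two pages:
    how closely the generated tag sequence matches the reference (1.0 = identical)."""
    return difflib.SequenceMatcher(
        None, tag_sequence(generated_html), tag_sequence(reference_html)
    ).ratio()


if __name__ == "__main__":
    ref = "<html><body><h1>Title</h1><p>Hello</p></body></html>"
    gen = "<html><body><h1>Title</h1><div>Hello</div></body></html>"
    print(f"structural similarity: {structural_similarity(gen, ref):.2f}")
```

Comparing tag sequences ignores text and styling, so a full evaluation framework would combine several such signals; this sketch only shows the general shape of an automated, reproducible metric.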

The paper presents extensive experimental results to demonstrate the effectiveness of Web2Code. The authors evaluate several state-of-the-art MLLMs on the dataset and compare their performance. The results show that Web2Code is effective at exposing the capabilities and limitations of different models, providing valuable insights for further development. Additionally, the paper includes qualitative examples that illustrate the practical applications of Web2Code. These examples show how the dataset can be used to train models that generate high-quality code from webpage designs, streamlining the web development process and reducing the need for manual coding.

“Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs” presents a significant advancement in the field of automated web development. By introducing a comprehensive dataset and evaluation framework, the authors provide a valuable resource for improving the accuracy and efficiency of webpage-to-code translation models. This research has important implications for web development, making it easier to create and maintain complex web pages with the help of advanced AI tools.