Dachuan Zhang
Publications 1 - 10 of 19
- Domain knowledge, just evaluation, and robust data standards are required to advance AI in food science
  Item type: Journal Article
  Trends in Food Science & Technology
  Zhang, Dachuan; Liu, Meihui; Yu, Zhaoshuo; et al. (2025)
  Background: Artificial intelligence (AI) has shown transformative potential across many scientific fields, including food science. Applications span nutrition, safety, flavor, and sustainability. However, current AI implementations in food science often lack integration with domain expertise, face reproducibility challenges, and are hindered by fragmented datasets and limited benchmarking. Scope and approach: This perspective outlines key challenges and proposes five strategic initiatives to guide the effective and responsible integration of AI in food science. These include embedding domain knowledge into models, establishing transparent and reproducible workflows, adopting benchmarking practices, promoting practical validation, and developing robust data standards and infrastructure. Key findings and conclusions: To fully unlock AI's potential in food science, future research must prioritize domain-aware model development, open science practices, and practical validation. These efforts are critical to enabling reliable, generalizable, and impactful AI tools that address real-world challenges in food systems.
- What's behind the façade? Mapping building materials using artificial intelligence and architectural history
  Item type: Other Conference Item
  SBE Conference Series ~ Sustainable Built Environment Conference 2025 Zurich - Extended Abstracts
  Schmid, Carlo; Kastner, Fabian; Zhang, Dachuan; et al. (2025)
- Spatiotemporal mapping of Swiss exterior wall material stock using a large language model and architectural history
  Item type: Journal Article
  Journal of Industrial Ecology
  Schmid, Carlo; Kastner, Fabian; Zhang, Dachuan; et al. (2025)
  Building material stock studies are essential for advancing the circular economy in construction. However, existing models often lack both accuracy and scalability. While machine learning has demonstrated significant potential to enhance predictive accuracy, its adoption has been hindered by a shortage of high-quality training data. In this study, we introduce a novel methodology leveraging a large language model to extract previously untapped building material data from building energy performance certificates with a focus on exterior walls. This approach enabled us to create a dataset of over 20,000 buildings, significantly larger than those used in previous studies. Leveraging this dataset, we developed a machine learning model to predict material composition based on building characteristics such as construction year, use, and location. Furthermore, we integrated knowledge of construction history to estimate the material stock of walls in terms of volume, mass, and associated CO2 emissions for each building in the dataset. Our analysis revealed significant regional variations in material use patterns, emphasizing the critical role of location, a parameter often overlooked in existing building material stock models. These findings provide valuable insights for improving building stock modeling and highlight the importance of regionally tailored policies in advancing the circular economy in the construction sector.
- High-throughput prediction of enzyme promiscuity based on substrate-product pairs
  Item type: Journal Article
  Briefings in Bioinformatics
  Xing, Huadong; Cai, Pengli; Liu, Dongliang; et al. (2024)
  The screening of enzymes for catalyzing specific substrate-product pairs is often constrained in the realms of metabolic engineering and synthetic biology. Existing tools based on substrate and reaction similarity predominantly rely on prior knowledge, demonstrating limited extrapolative capabilities and an inability to incorporate custom candidate-enzyme libraries. Addressing these limitations, we have developed the Substrate-product Pair-based Enzyme Promiscuity Prediction (SPEPP) model. This innovative approach utilizes transfer learning and transformer architecture to predict enzyme promiscuity, thereby elucidating the intricate interplay between enzymes and substrate-product pairs. SPEPP exhibited robust predictive ability, eliminating the need for prior knowledge of reactions and allowing users to define their own candidate-enzyme libraries. It can be seamlessly integrated into various applications, including metabolic engineering, de novo pathway design, and hazardous material degradation. To better assist metabolic engineers in designing and refining biochemical pathways, particularly those without programming skills, we also designed EnzyPick, an easy-to-use web server for enzyme screening based on SPEPP. EnzyPick is accessible at http://www.biosynther.com/enzypick/.
- Analysis of public opinion on food safety in Greater China with big data and machine learning
  Item type: Journal Article
  Current Research in Food Science
  Zhang, Haoyang; Zhang, Dachuan; Wei, Zhisheng; et al. (2023)
  The Internet contains a wealth of public opinion on food safety, including views on food adulteration, food-borne diseases, agricultural pollution, irregular food distribution, and food production issues. To systematically collect and analyze public opinion on food safety in Greater China, we developed IFoodCloud, which automatically collects data from more than 3,100 public sources. Meanwhile, we constructed sentiment classification models using multiple lexicon-based and machine learning-based algorithms integrated with IFoodCloud that provide an unprecedented rapid means of understanding the public sentiment toward specific food safety incidents. Our best model achieved an F1 score of 0.9737, demonstrating its great predictive ability and robustness. Using IFoodCloud, we analyzed public sentiment on food safety in Greater China and the changing trend of public opinion at the early stage of the 2019 Coronavirus Disease pandemic, demonstrating the potential of big data and machine learning for promoting risk communication and decision-making.
- Enhanced Deep-Learning Model for Carbon Footprints of Chemicals
  Item type: Journal Article
  ACS Sustainable Chemistry & Engineering
  Zhang, Dachuan; Wang, Zhanyun; Oberschelp, Christopher; et al. (2024)
  Millions of chemicals have been designed; however, their product carbon footprints (PCFs) are largely unknown, leaving questions about their sustainability. This general lack of PCF data is because the data needed for comprehensive environmental analyses are typically not available in the early molecular design stages. Several predictive tools have been developed to estimate the PCF of chemicals, which are applicable to only a narrow range of common chemicals and have limited predictive ability. Here, we propose FineChem 2, which is based on a novel transformer framework and first-hand industry data, for accurately predicting the PCF of chemicals. Compared to previous tools, FineChem 2 demonstrates significantly better predictive power, and its applicability domains are improved by ~75% on a diverse set of chemicals on the global market, including the high-production-volume chemicals identified by regulators, daily chemicals, and chemical additives in food and plastics. In addition, through better interpretability from the attention mechanism, FineChem 2 may successfully identify PCF-intensive substructures and critical raw materials of chemicals, providing insights into the design of more sustainable molecules and processes. Therefore, we highlight FineChem 2 for estimating the PCF of chemicals, contributing to advancements in the sustainable transition of the global chemical industry.
- Coassembly of hybrid microscale biomatter for robust, water-processable, and sustainable bioplastics
  Item type: Journal Article
  Science Advances
  Qiu, Yijin; Zhang, Dachuan; Long, Min; et al. (2025)
  Unlike conventional methods that typically involve extracting biopolymers/monomers from biomass using lots of hazardous chemicals and high energy, the direct utilization of biological matter (biomatter) without extraction offers a more sustainable alternative for bioplastic production. However, it often suffers from insufficient mechanical performances or limited processabilities. Herein, we proposed a hybrid microscale biomatter coassembly strategy that leverages the interactions between the inherent microarchitectures of waste cotton fiber and pollen particles. With minimal preprocessing, they form a castable slurry that can spontaneously organize into a dense fiber-laminate bioplastic network, exhibiting high mechanical properties (52.22 megapascals and 2.24 gigapascals) without using toxic organic chemicals or heavy machinery. The resulting bioplastic features controlled hydration-induced microstructural disassembly/reassembly, enabling water-based processability into complex, dynamic architectural systems. In addition, it demonstrates good biodegradability, closed-loop recyclability, and satisfactory environmental benefits, outperforming most common plastics. This study provides an instant nature-derived paradigm for bioplastics’ sustainable production, processing, and recycling, offering a promising solution for facilitating eco-friendly advanced applications.
- Food recommendation towards personalized wellbeing
  Item type: Review Article
  Trends in Food Science & Technology
  Qiao, Guanhua; Zhang, Dachuan; Zhang, Nana; et al. (2025)
  Background: The intersection of nutrition and technology gave birth to research on food recommendation systems (FRS), which marked the transformation of traditional diet to a more personalized and healthy direction. The FRS uses advanced data analysis and machine learning technology to provide customized dietary advice according to users' personal preferences and nutritional needs, which plays a vital role in promoting public health and reducing disease risks. Scope and approach: This review presents the architecture of FRS and discusses in depth various recommendation algorithms, including the content-based method, collaborative filtering method, knowledge graph-based method, and hybrid methods. The review further introduces existing data resources and evaluation metrics, and highlights key technologies in user profiling and food analysis. In addition, the wide application of personalized FRS is summarized, and the importance of these systems in satisfying users' dietary preferences and maintaining balanced nutrition is emphasized. Finally, the key challenges and development trends of FRS are analyzed in depth at the data level, model level, and user experience level. Key findings and conclusions: Personalized FRS shows great potential in helping users make healthier dietary decisions. Although many challenges remain, such as dealing with heterogeneous data and interpretability, the progress of technology will bring broader development in the future. For example, the powerful data processing ability of deep learning will effectively improve the accuracy of the system. In addition, the application of interactive recommendation systems and large language models will also provide strong support for satisfying user experience and improving acceptance.
- RDBridge: a knowledge graph of rare diseases based on large-scale text mining
  Item type: Journal Article
  Bioinformatics
  Xing, Huadong; Zhang, Dachuan; Cai, Pengli; et al. (2023)
  Motivation: Despite low prevalence, rare diseases affect 300 million people worldwide. Research on pathogenesis and drug development lags due to limited commercial potential, insufficient epidemiological data, and a dearth of publications. The unique characteristics of rare diseases, including limited annotated data, intricate processes for extracting pertinent entity relationships, and difficulties in standardizing data, represent challenges for text mining. Results: We developed a rare disease data acquisition framework using text mining and knowledge graphs and constructed the most comprehensive rare disease knowledge graph to date, Rare Disease Bridge (RDBridge). RDBridge offers search functions for genes, potential drugs, pathways, literature, and medical imaging data that will support mechanistic research, drug development, diagnosis, and treatment for rare diseases. Availability and implementation: RDBridge is freely available at http://rdb.lifesynther.com/.
- MCF2Chem: A manually curated knowledge base of biosynthetic compound production
  Item type: Journal Article
  Biotechnology for Biofuels and Bioproducts
  Cai, Pengli; Liu, Sheng; Zhang, Dachuan; et al. (2023)
  Background: Microbes have been used as cell factories to synthesize various chemical compounds. Recent advances in synthetic biological technologies have accelerated the increase in the number and capacity of microbial cell factories; the variety and number of synthetic compounds produced via these cell factories have also grown substantially. However, no database is available that provides detailed information on the microbial cell factories and the synthesized compounds. Results: In this study, we established MCF₂Chem, a manually curated knowledge base on the production of biosynthetic compounds using microbial cell factories. It contains 8888 items of production records related to 1231 compounds that were synthesizable by 590 microbial cell factories, including the production data of compounds (titer, yield, productivity, and content), strain culture information (culture medium, carbon source/precursor/substrate), fermentation information (mode, vessel, scale, and condition), and other information (e.g., strain modification method). The database contains statistical analyses data of compounds and microbial species. The data statistics of MCF2Chem showed that bacteria accounted for 60% of the species and that “fatty acids”, “terpenoids”, and “shikimates and phenylpropanoids” accounted for the top three chemical products. Escherichia coli, Saccharomyces cerevisiae, Yarrowia lipolytica, and Corynebacterium glutamicum synthesized 78% of these chemical compounds. Furthermore, we constructed a system to recommend microbial cell factories suitable for synthesizing target compounds and vice versa by combining MCF₂Chem data, additional strain- and compound-related data, the phylogenetic relationships between strains, and compound similarities. Conclusions: MCF₂Chem provides a user-friendly interface for querying, browsing, and visualizing detailed statistical information on microbial cell factories and their synthesizable compounds. It is publicly available at https://mcf.lifesynther.com. This database may serve as a useful resource for synthetic biologists.