Chengyu Zhang


Loading...

Last Name

Zhang

First Name

Chengyu

Organisational unit

Search Results

Publications1 - 10 of 12
  • Xu, Junjielong; Yang, Ruichun; Huo, Yintong; et al. (2024)
    ICSE '24: Proceedings of the IEEE/ACM 46th International Conference on Software Engineering
    Log parsing, which involves log template extraction from semi-structured logs to produce structured logs, is the first and the most critical step in automated log analysis. However, current log parsers suffer from limited effectiveness for two reasons. First, traditional data-driven log parsers solely rely on heuristics or handcrafted features designed by domain experts, which may not consistently perform well on logs from diverse systems. Second, existing supervised log parsers require model tuning, which is often limited to fixed training samples and causes sub-optimal performance across the entire log source. To address this limitation, we propose DivLog, an effective log parsing framework based on the in-context learning (ICL) ability of large language models (LLMs). Specifically, before log parsing, DivLog samples a small amount of offline logs as candidates by maximizing their diversity. Then, during log parsing, DivLog selects five appropriate labeled candidates as examples for each target log and constructs them into a prompt. By mining the semantics of examples in the prompt, DivLog generates a target log template in a training-free manner. In addition, we design a straightforward yet effective prompt format to extract the output and enhance the quality of the generated log templates. We conducted experiments on 16 widely-used public datasets. The results show that DivLog achieves (1) 98.1% Parsing Accuracy, (2) 92.1% Precision Template Accuracy, and (3) 92.9% Recall Template Accuracy on average, exhibiting state-of-the-art performance.
  • Park, Jiwon; Winterer, Dominik; Zhang, Chengyu; et al. (2021)
    Proceedings of the ACM on Programming Languages
    We propose Generative Type-Aware Mutation, an effective approach for testing SMT solvers. The key idea is to realize generation through the mutation of expressions rooted with parametric operators from the SMT-LIB specification. Generative Type-Aware Mutation is a hybrid of mutation-based and grammar-based fuzzing and features an infinite mutation space—overcoming a major limitation of OpFuzz, the state-of-the-art fuzzer for SMT solvers. We have realized Generative Type-Aware Mutation in a practical SMT solver bug hunting tool, TypeFuzz. During our testing period with TypeFuzz, we reported over 237 bugs in the state-of-the-art SMT solvers Z3 and CVC4. Among these, 189 bugs were confirmed and 176 bugs were fixed. Most notably, we found 18 soundness bugs in CVC4’s default mode alone. Several of them were two years latent (7/18). CVC4 has been proved to be a very stable SMT solver and has resisted several fuzzing campaigns.
  • Zhang, Chengyu; Su, Zhendong (2024)
    Proceedings of the ACM on Programming Languages
    One of the primary challenges in software testing is generating high-quality test inputs and obtaining corresponding test oracles. This paper introduces a novel methodology to mitigate this challenge in testing program verifiers by employing SMT (Satisfiability Modulo Theories) formulas as a universal test case generator. The key idea is to transform SMT formulas into programs and link the satisfiability of the formulas with the safety property of the programs, allowing the satisfiability of the formulas to act as a test oracle for program verifiers. This method was implemented as a framework named SMT2Test, which enables the transformation of SMT formulas into Dafny and C programs. An intermediate representation was designed to augment the flexibility of this framework, streamlining the transformation for other programming languages and fostering modular transformation strategies. We evaluated the effectiveness of SMT2Test by finding defects in two program verifiers: the Dafny verifier and CPAchecker. Utilizing the SMT2Test framework with the SMT formulas from the SMT competition and SMT solver fuzzers, we discovered and reported a total of 14 previously unknown defects in these program verifiers that were not found by previous methods. After reporting, all of them have been confirmed, and 6 defects have been fixed. These findings show the effectiveness of our method and imply its potential application in testing other programming language infrastructures.
  • Xiao, Shengping; Zhang, Chengyu; Li, Jianwen; et al. (2023)
    Lecture Notes in Computer Science ~ Tools and Algorithms for the Construction and Analysis of Systems. TACAS 2023
    We present FuzzBtor2, a fuzzer to generate random word-level model checking problems in Btor2 format. Btor2 is one of the mainstream input formats for word-level hardware model checking and was used in the most recent hardware model checking competition. Compared to bit-level one, word-level model checking is a more complex research field at an earlier stage of development. Therefore, it is necessary to develop a tool that can produce a large number of test cases in Btor2 format to test either existing or under-developed word-level model checkers. To evaluate the practicality of FuzzBtor2, we tested the state-of-the-art word-level model checkers AVR and Pono with the generated benchmarks. Experimental results show that both tools are buggy and not mature enough, which reflects the practical value of FuzzBtor2.
  • Validating SMT solvers via semantic fusion
    Item type: Conference Paper
    Winterer, Dominik; Zhang, Chengyu; Su, Zhendong (2020)
    Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation
    We introduce Semantic Fusion, a general, effective methodology for validating Satisfiability Modulo Theory (SMT) solvers. Our key idea is to fuse two existing equisatisfiable (i.e., both satisfiable or unsatisfiable) formulas into a new formula that combines the structures of its ancestors in a novel manner and preserves the satisfiability by construction. This fused formula is then used for validating SMT solvers. We realized Semantic Fusion as YinYang, a practical SMT solver testing tool. During four months of extensive testing, YinYang has found 45 confirmed, unique bugs in the default arithmetic and string solvers of Z3 and CVC4, the two state-of-the-art SMT solvers. Among these, 41 have already been fixed by the developers. The majority (29/45) of these bugs expose critical soundness issues. Our bug reports and testing effort have been well-appreciated by SMT solver developers. © 2020 ACM.
  • Su, Ting; Zhang, Chengyu; Yan, Yichen; et al. (2018)
    arXiv
  • Thorgeirsson, Sverrir; Zhang, Chengyu; Weidmann, Theo B.; et al. (2024)
    ICER '24: Proceedings of the 2024 ACM Conference on International Computing Education Research
    This paper presents a comparative study of Algot, a visual programming language designed to bridge the syntax-semantics gap via liveness and programming by demonstration, and the textual programming language Python. We conducted an experimental, within-subjects study with 24 undergraduate computer science students who performed recursion-based tasks in each language while their cognitive load was measured using an electroencephalogram and a validated survey instrument. The students received a brief introduction to Algot, but were all familiar with Python. The students performed significantly better when programming in Algot, but the cognitive load levels were similar according to both instruments. Our results provide evidence that within the domain that was tested, Algot can be quickly learned, and that students do not find it more cognitively demanding than working in a familiar language.
  • Winterer, Dominik; Zhang, Chengyu; Su, Zhendong (2020)
    Proceedings of the ACM on Programming Languages
    We propose type-aware operator mutation, a simple, but unusually effective approach for testing SMT solvers. The key idea is to mutate operators of conforming types within the seed formulas to generate well-typed mutant formulas. These mutant formulas are then used as the test cases for SMT solvers. We realized type-aware operator mutation within the OpFuzz tool and used it to stress-test Z3 and CVC4, two state-of-the-art SMT solvers. Type-aware operator mutations are unusually effective: During one year of extensive testing with OpFuzz, we reported 1092 bugs on Z3’s and CVC4’s respective GitHub issue trackers, out of which 819 unique bugs were confirmed and 685 of the confirmed bugs were fixed by the developers. The detected bugs are highly diverse — we found bugs of many different types (soundness bugs, invalid model bugs, crashes, etc.), logics and solver configurations. We have further conducted an in-depth study of the bugs found by OpFuzz. The study results show that the bugs found by OpFuzz are of high quality. Many of them affect core components of the SMT solvers’ codebases, and some required major changes for the developers to fix. Among the 819 confirmed bugs found by OpFuzz,184 were soundness bugs, the most critical bugs in SMT solvers,and 489 were in the default modes of the solvers. Notably, OpFuzz found 27 critical soundness bugs in CVC4, which has proved to be a very stable SMT solver.
  • Kamm, Matteo; Rigger, Manuel; Zhang, Chengyu; et al. (2023)
    ISSTA 2023: Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis
    Graph Database Management Systems (GDBMSs) store data as graphs and allow the efficient querying of nodes and their relationships. Logic bugs are bugs that cause a GDBMS to return an incorrect result for a given query (e.g., by returning incorrect nodes or relationships). The impact of such bugs can be severe, as they often go unnoticed. The core insight of this paper is that Query Partitioning, a test oracle that has been proposed to test Relational Database Systems, is applicable to testing GDBMSs as well. The core idea of Query Partitioning is that, given a query, multiple queries are derived whose results can be combined to reconstruct the given query's result. Any discrepancy in the result indicates a logic bug. We have implemented this approach as a practical tool named GDBMeter and evaluated GDBMeter on three popular GDBMSs and found a total of 40 unique, previously unknown bugs. We consider 14 of them to be logic bugs, the others being error or crash bugs. Overall, 27 of the bugs have been fixed, and 35 confirmed. We compared our approach to the state-of-the-art approach to testing GDBMS, which relies on differential testing; we found that it results in a high number of false alarms, while Query Partitioning reported actual logic bugs without any false alarms. Furthermore, despite the previous efforts in testing Neo4j and JanusGraph, we found 18 additional bugs. The developers appreciate our work and plan to integrate GDBMeter into their testing process. We expect that this simple, yet effective approach and the practical tool will be used to test other GDBMSs.
  • Wang, Shuai; Zhang, Chengyu; Su, Zhendong (2019)
    Proceedings of the ACM on Programming Languages
    The term “smart contracts” has become ubiquitous to describe an enormous number of programs uploaded to the popular Ethereum blockchain system. Despite rapid growth of the smart contract ecosystem, errors and exploitations have been constantly reported from online contract systems, which has put financial stability at risk with losses totaling millions of US dollars. Most existing research focuses on pinpointing specific types of vulnerabilities using known patterns. However, due to the lack of awareness of the inherent nondeterminism in the Ethereum blockchain system and how it affects the funds transfer of smart contracts, there can be unknown vulnerabilities that may be exploited by attackers to access numerous online smart contracts. In this paper, we introduce a methodical approach to understanding the inherent nondeterminism in the Ethereum blockchain system and its (unwanted) influence on contract payments. We show that our new focus on nondeterminism-related smart contract payment bugs captures the root causes of many common vulnerabilities without relying on any known patterns and also encompasses recently disclosed issues that are not handled by existing research. To do so, we introduce techniques to systematically model components in the contract execution context and to expose various nondeterministic factors that are not yet fully understood. We further study how these nondeterministic factors impact contract funds transfer using information flow tracking. The technical challenge of detecting nondeterministic payments lies in discovering the contract global variables subtly affected by read-write hazards because of unpredictable transaction scheduling and external callee behavior. We show how to augment and instrument a contract program into a representation that simulates the execution of a large subset of the contract behavior. The instrumented code is then analyzed to flag nondeterministic global variables using off-the-shelf model checkers. We implement the proposed techniques as a practical tool named NPChecker (Nondeterministic Payment Checker) and evaluate it on 30K online contracts (3,075 distinct) collected from the Ethereum mainnet. NPChecker has successfully detected nondeterministic payments in 1,111 online contracts with reasonable cost. Further investigation reports high precision of NPChecker (only four false positives in a manual study of 50 contracts). We also show that NPChecker unveils contracts vulnerable to recently-disclosed attack vectors. NPChecker can identify all six new vulnerabilities or variants of common smart contract vulnerabilities that are missed by existing research relying on a “contract vulnerability checklist.”
Publications1 - 10 of 12