Search Results
Orion: Interference-aware, Fine-grained GPU Sharing for ML Applications
(2024) EuroSys '24: Proceedings of the Nineteenth European Conference on Computer Systems
GPUs are critical for maximizing the throughput-per-Watt of deep neural network (DNN) applications. However, DNN applications often underutilize GPUs, even when using large batch sizes and eliminating input data processing or communication stalls. DNN workloads consist of data-dependent operators, with different compute and memory requirements. While an operator may saturate GPU compute units or memory bandwidth, it often leaves other GPU ...
Conference Paper

Timely Identification of Victim Addresses in DeFi Attacks
(2024) Lecture Notes in Computer Science: Computer Security – ESORICS 2023
Over the past years, Decentralized Finance (DeFi) protocols have suffered from several attacks. As a result, multiple solutions have been proposed to prevent such attacks. Most solutions rely on identifying malicious transactions before they are included in blocks. However, with the emergence of private pools, attackers can now conceal their exploit transactions from attack detection. This poses a significant challenge for existing security ...
Conference Paper

Syntax-Aware Mutation for Testing the Solidity Compiler
(2024) Lecture Notes in Computer Science: Computer Security – ESORICS 2023
We introduce fuzzol, the first syntax-aware mutation fuzzer for systematically testing the security and reliability of solc, the standard Solidity compiler. fuzzol addresses a challenge of existing fuzzers when dealing with structured inputs: the generation of inputs that get past the parser checks of the system under test. To do so, fuzzol introduces a novel syntax-aware mutation that breaks into three strategies, each of them making ...
Conference Paper

Efficient Auditing of Event-driven Web Applications
(2024) EuroSys '24: Proceedings of the Nineteenth European Conference on Computer Systems
When a deployer of a web application puts that application on a server (on-prem or cloud), how can they be sure that the application is executing as intended? This paper studies how the deployer can efficiently check that the execution is faithful. We seek mechanisms that: (i) work with web applications that are built with modern event-driven web frameworks, (ii) impose tolerable computation and communication overheads on the web server, ...
Conference Paper

Arrow Matrix Decomposition: A Novel Approach for Communication-Efficient Sparse Matrix Multiplication
(2024) PPoPP '24: 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming
We propose a novel approach to iterated sparse matrix dense matrix multiplication, a fundamental computational kernel in scientific computing and graph neural network training. In cases where matrix sizes exceed the memory of a single compute node, data transfer becomes a bottleneck. An approach based on dense matrix multiplication algorithms leads to suboptimal scalability and fails to exploit the sparsity in the problem. To address these ...
Conference Paper

POSTER: RELAX: Durable Data Structures with Swift Recovery
(2024) PPoPP '24: 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming
Recent non-volatile main memory technology gave rise to an abundance of research on building persistent data structures, whose content can be recovered after a system crash. While there has been significant progress in making durable data structures efficient, shortening the length of the recovery phase after a crash has not received much attention. In this paper we present the RELAX general transformation. RELAX generates lock-free durable ...
Conference Paper

Optimus: Warming Serverless ML Inference via Inter-Function Model Transformation
(2024) EuroSys '24: Proceedings of the Nineteenth European Conference on Computer Systems
Serverless ML inference is an emerging cloud computing paradigm for low-cost, easy-to-manage inference services. In serverless ML inference, each call is executed in a container; however, the cold start of containers results in long inference delays. Unfortunately, most existing works do not perform well because they still need to load models into containers from scratch, which is the bottleneck based on our observations. Therefore, this ...
Conference Paper

The Impact of Knowledge Structures on Firm Performance in the Biopharmaceutical Industry
(2010)
Other Conference Item

Collective Lunch: An Empirical Test of Fair and Efficient Equilibrium
(2010)
Other Conference Item

Do Self-driving Cars Swallow Public Transport? A Game-theoretical Perspective on Transportation Systems
(2019) Program Book: INFORMS Annual Meeting 2019
Other Conference Item