Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Loading...
Abbreviation
Publisher
AAAI
66 results
Search Results
Publications 1 - 10 of 66
- MFES-HB: Efficient Hyperband with Multi-Fidelity Quality MeasurementsItem type: Conference Paper
Proceedings of the AAAI Conference on Artificial IntelligenceLi, Yang; Shen, Yu; Jiang, Jiawei; et al. (2021)Hyperparameter optimization (HPO) is a fundamental problem in automatic machine learning (AutoML). However, due to the expensive evaluation cost of models (e.g., training deep learning models or training models on large datasets), vanilla Bayesian optimization (BO) is typically computationally infeasible. To alleviate this issue, Hyperband (HB) utilizes the early stopping mechanism to speed up configuration evaluations by terminating those badly-performing configurations in advance. This leads to two kinds of quality measurements: (1) many low-fidelity measurements for configurations that get early-stopped, and (2) few high-fidelity measurements for configurations that are evaluated without being early stopped. The state-of-the-art HB-style method, BOHB, aims to combine the benefits of both BO and HB. Instead of sampling configurations randomly in HB, BOHB samples configurations based on a BO surrogate model, which is constructed with the high-fidelity measurements only. However, the scarcity of high-fidelity measurements greatly hampers the efficiency of BO to guide the configuration search. In this paper, we present MFES-HB, an efficient Hyperband method that is capable of utilizing both the high-fidelity and low-fidelity measurements to accelerate the convergence of HPO tasks. Designing MFES-HB is not trivial as the low-fidelity measurements can be biased yet informative to guide the configuration search. Thus we propose to build a Multi-Fidelity Ensemble Surrogate (MFES) based on the generalized Product of Experts framework, which can integrate useful information from multi-fidelity measurements effectively. The empirical studies on the real-world AutoML tasks demonstrate that MFES-HB can achieve 3.3-8.9x speedups over the state-of-the-art approach --- BOHB. - Majority Opinion Diffusion in Social Networks: An Adversarial ApproachItem type: Conference Paper
Proceedings of the AAAI Conference on Artificial IntelligenceZehmakan, Ahad N. (2021)We introduce and study a novel majority based opinion diffusion model. Consider a graph G, which represents a social network. Assume that initially a subset of nodes, called seed nodes or early adopters, are colored either black or white, which correspond to positive or negative opinion regarding a consumer product or a technological innovation. Then, in each round an uncolored node, which is adjacent to at least one colored node, chooses the most frequent color among its neighbors. Consider a marketing campaign which advertises a product of poor quality and its ultimate goal is that more than half of the population believe in the quality of the product at the end of the opinion diffusion process. We focus on three types of attackers which can select the seed nodes in a deterministic or random fashion and manipulate almost half of them to adopt a positive opinion toward the product (that is, to choose black color). We say that an attacker succeeds if a majority of nodes are black at the end of the process. Our main purpose is to characterize classes of graphs where an attacker cannot succeed. In particular, we prove that if the maximum degree of the underlying graph is not too large or if it has strong expansion properties, then it is fairly resilient to such attacks. Furthermore, we prove tight bounds on the stabilization time of the process (that is, the number of rounds it needs to end) in both settings of choosing the seed nodes deterministically and randomly. We also provide several hardness results for some optimization problems regarding stabilization time and choice of seed nodes. - SAM-Aware Graph Prompt Reasoning Network for Cross-Domain Few-Shot SegmentationItem type: Conference Paper
Proceedings of the AAAI Conference on Artificial Intelligence ~ AAAI-25 Technical Tracks 6Peng, Shi-Feng; Sun, Guolei; Li, Yong; et al. (2025)The primary challenge of cross-domain few-shot segmentation (CD-FSS) is the domain disparity between the training and inference phases, which can exist in either the input data or the target classes. Previous models struggle to learn feature representations that generalize to various unknown domains from limited training domain samples. In contrast, the large-scale visual model SAM, pre-trained on tens of millions of images from various domains and classes, possesses excellent generalizability. In this work, we propose a SAM-aware graph prompt reasoning network (GPRN) that fully leverages SAM to guide CD-FSS feature representation learning and improve prediction accuracy. Specifically, we propose a SAM-aware prompt initialization module (SPI) to transform the masks generated by SAM into visual prompts enriched with high-level semantic information. Since SAM tends to divide an object into many sub-regions, this may lead to visual prompts representing the same semantic object having inconsistent or fragmented features. We further propose a graph prompt reasoning (GPR) module that constructs a graph among visual prompts to reason about their interrelationships and enable each visual prompt to aggregate information from similar prompts, thus achieving global semantic consistency. Subsequently, each visual prompt embeds its semantic information into the corresponding mask region to assist in feature representation learning. To refine the segmentation mask during testing, we also design a non-parameter adaptive point selection module (APS) to select representative point prompts from query predictions and feed them back to SAM to refine inaccurate segmentation results. Experiments on four standard CD-FSS datasets demonstrate that our method establishes new state-of-the-art results. - Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and BaselinesItem type: Conference Paper
Proceedings of the AAAI Conference on Artificial IntelligenceMurugesan, Keerthiram; Atzeni, Mattia; Kapanipathi, Pavan; et al. (2021)Text-based games have emerged as an important test-bed for Reinforcement Learning (RL) research, requiring RL agents to combine grounded language understanding with sequential decision making. In this paper, we examine the problem of infusing RL agents with commonsense knowledge. Such knowledge would allow agents to efficiently act in the world by pruning out implausible actions, and to perform look-ahead planning to determine how current actions might affect future world states. We design a new text-based gaming environment called Text World Commonsense (TWC) for training and evaluating RL agents with a specific kind of commonsense knowledge about objects, their attributes, and affordances. We also introduce several baseline RL agents which track the sequential context and dynamically retrieve the relevant commonsense knowledge from ConceptNet. We show that agents which incorporate commonsense knowledge in TWC perform better, while acting more efficiently. We conduct user-studies to estimate human performance on TWC and show that there is ample room for future improvement. - On the Expressiveness and Length Generalization of Selective State-Space Models on Regular LanguagesItem type: Conference Paper
Proceedings of the AAAI Conference on Artificial Intelligence ~ AAAI-25 Technical Tracks 19Terzic, Aleksandar; Hersche, Michael; Camposampiero, Giacomo; et al. (2025)Selective state-space models (SSMs) are an emerging alternative to the Transformer, offering the unique advantage of parallel training and sequential inference. Although these models have shown promising performance on a variety of tasks, their formal expressiveness and length generalization properties remain underexplored. In this work, we provide insight into the workings of selective SSMs by analyzing their expressiveness and length generalization performance on regular language tasks, i.e., finite-state automaton (FSA) emulation. We address certain limitations of modern SSM-based architectures by introducing the Selective Dense State-Space Model (SD-SSM), the first selective SSM that exhibits perfect length generalization on a set of various regular language tasks using a single layer. It utilizes a dictionary of dense transition matrices, a softmax selection mechanism that creates a convex combination of dictionary matrices at each time step, and a readout consisting of layer normalization followed by a linear map. We then proceed to evaluate variants of diagonal selective SSMs by considering their empirical performance on commutative and non-commutative automata. We explain the experimental results with theoretical considerations. - Facility Location Problem with Capacity Constraints: Algorithmic and Mechanism Design PerspectivesItem type: Conference Paper
Proceedings of the AAAI Conference on Artificial Intelligence ~ AAAI-20 Technical Tracks 2Aziz, Haris; Chan, Hau; Lee, Barton; et al. (2020)We consider the facility location problem in the one-dimensional setting where each facility can serve a limited number of agents from the algorithmic and mechanism design perspectives. From the algorithmic perspective, we prove that the corresponding optimization problem, where the goal is to locate facilities to minimize either the total cost to all agents or the maximum cost of any agent is NP-hard. However, we show that the problem is fixed-parameter tractable, and the optimal solution can be computed in polynomial time whenever the number of facilities is bounded, or when all facilities have identical capacities. We then consider the problem from a mechanism design perspective where the agents are strategic and need not reveal their true locations. We show that several natural mechanisms studied in the uncapacitated setting either lose strategyproofness or a bound on the solution quality %on the returned solution for the total or maximum cost objective. We then propose new mechanisms that are strategyproof and achieve approximation guarantees that almost match the lower bounds. - FIRM: Flexible Interactive Reflection ReMovalItem type: Conference Paper
Proceedings of the AAAI Conference on Artificial Intelligence ~ AAAI-25 Technical Tracks 2Chen, Xiao; Jiang, Xudong; Tao, Yunkang; et al. (2025)Removing reflection from a single image is challenging due to the absence of general reflection priors. Although existing methods incorporate extensive user guidance for satisfactory performance, they often lack the flexibility to adapt user guidance in different modalities, and dense user interactions further limit their practicality. To alleviate these problems, this paper presents FIRM, a novel framework for Flexible Interactive image Reflection reMoval with various forms of guidance, where users can provide sparse visual guidance (e.g., points, boxes, or strokes) or text descriptions for better reflection removal. Firstly, we design a novel user guidance conversion module (UGC) to transform different forms of guidance into unified contrastive masks. The contrastive masks provide explicit cues for identifying reflection and transmission layers in blended images. Secondly, we devise a contrastive mask-guided reflection removal network that comprises a newly proposed contrastive guidance interaction block (CGIB). This block leverages a unique cross-attention mechanism that merges contrastive masks with image features, allowing for precise layer separation. The proposed framework requires only 10% of the guidance time needed by previous interactive methods, which makes a step-change in flexibility. Extensive results on public real-world reflection removal datasets validate that our method demonstrates state-of-the-art reflection removal performance. - Computationally Tractable Riemannian Manifolds for Graph EmbeddingsItem type: Conference Paper
Proceedings of the AAAI Conference on Artificial IntelligenceCruceru, Calin; Bécigneul, Gary; Ganea, Octavian-Eugen (2021)Representing graphs as sets of node embeddings in certain curved Riemannian manifolds has recently gained momentum in machine learning due to their desirable geometric inductive biases (e.g., hierarchical structures benefit from hyperbolic geometry). However, going beyond embedding spaces of constant sectional curvature, while potentially more representationally powerful, proves to be challenging as one can easily lose the appeal of computationally tractable tools such as geodesic distances or Riemannian gradients. Here, we explore two computationally efficient matrix manifolds, showcasing how to learn and optimize graph embeddings in these Riemannian spaces. Empirically, we demonstrate consistent improvements over Euclidean geometry while often outperforming hyperbolic and elliptical embeddings based on various metrics that capture different graph properties. Our results serve as new evidence for the benefits of non-Euclidean embeddings in machine learning pipelines. - Improving Neural Relation Extraction with Positive and Unlabeled LearningItem type: Conference Paper
Proceedings of the AAAI Conference on Artificial IntelligenceHe, Zhengqiu; Chen, Wenliang; Wang, Yuyi; et al. (2020)We present a novel approach to improve the performance of distant supervision relation extraction with Positive and Unlabeled (PU) Learning. This approach first applies reinforcement learning to decide whether a sentence is positive to a given relation, and then positive and unlabeled bags are constructed. In contrast to most previous studies, which mainly use selected positive instances only, we make full use of unlabeled instances and propose two new representations for positive and unlabeled bags. These two representations are then combined in an appropriate way to make bag-level prediction. Experimental results on a widely used real-world dataset demonstrate that this new approach indeed achieves significant and consistent improvements as compared to several competitive baselines. - Extreme k-Center ClusteringItem type: Conference Paper
Proceedings of the AAAI Conference on Artificial IntelligenceBateni, MohammadHossein; Esfandiari, Hossein; Fischer, Manuela; et al. (2021)Metric clustering is a fundamental primitive in machine learning with several applications for mining massive datasets. An important example of metric clustering is the k-center problem. While this problem has been extensively studied in distributed settings, all previous algorithms use Omega(k) space per machine and Omega(nk) total work. In this paper, we develop the first highly scalable approximation algorithm for k-center clustering, with (O) over tilde (n(epsilon)) space per machine and (O) over tilde (n(1+epsilon)) total work, for arbitrary small constant epsilon. It produces an O(log log log n)-approximate solution with k(1+o(1)) centers in O(log log n) rounds of computation.
Publications 1 - 10 of 66