
Proceedings Track: Accepted Papers
Presentation Format
Accepted Proceedings Track papers are presented as posters at CPAL 2025. A select number are also presented as orals; these are labeled (Oral) below. See the full program for the precise time and location of each oral and poster session.
Towards Vector Optimization on Low-Dimensional Vector Symbolic Architecture
Shijin Duan, Yejia Liu, Gaowen Liu, Ramana Rao Kompella, Shaolei Ren, Xiaolin Xu
Keywords: Vector Symbolic Architecture, Batch Normalization, Knowledge Distillation
SGD with Weight Decay Secretly Minimizes the Ranks of Your Neural Networks
Tomer Galanti, Zachary S Siegel, Aparna Gupte, Tomaso A Poggio
Keywords: Low-Rank, SGD, Implicit Bias, Rank, Rank Minimization, Weight Decay
Explaining and Mitigating the Modality Gap in Contrastive Multimodal Learning
Can Yaras, Siyi Chen, Peng Wang, Qing Qu
Keywords: multimodal learning, modality gap, contrastive learning
Collaborative and Efficient Personalization with Mixtures of Adaptors
Abdulla Jasem Almansoori, Samuel Horváth, Martin Takáč
Keywords: federated learning, personalization, multi-task learning, clustering, parameter-efficient
Are all layers created equal: A neural collapse perspective
Jinxin Zhou, Jiachen Jiang, Zhihui Zhu
Keywords: Deep Learning, Neural Collapse, Robustness, Generalization, Memorization, Understanding
White-box Error Correction Code Transformer
Ziyan Zheng, Chin Wa Lau, Nian Guo, Xiang Shi, Shao-Lun Huang
Keywords: Error Correction Codes, Neural Decoder, White-box Transformer, Sparse Rate Reduction, Tanner Graph
On How Iterative Magnitude Pruning Discovers Local Receptive Fields in Fully Connected Neural Networks
William T Redman, Zhangyang Wang, Alessandro Ingrosso, Sebastian Goldt
Keywords: iterative magnitude pruning, lottery tickets, sparse machine learning, gaussian statistics
Hamiltonian Mechanics of Feature Learning: Bottleneck Structure in Leaky ResNets (Oral)
Arthur Jacot, Alexandre Kaiser
Keywords: Low-rank bias, NeuralODE, Hamiltonian, Bottleneck structure
Streaming Kernel PCA Algorithm With Small Space
Yichuan Deng, Jiangxuan Long, Zhao Song, Zifan Wang, Han Zhang
Keywords: Principal Component Analysis, Kernel Method, Streaming Algorithm
Sufficient and Necessary Explanations (and What Lies in Between) (Oral)
Beepul Bharti, Paul Yi, Jeremias Sulam
Keywords: interpretability, explainability
Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers (Oral)
Abhimanyu Rajeshkumar Bambhaniya, Amir Yazdanbakhsh, Suvinay Subramanian, Sheng-Chun Kao, Shivani Agrawal, Utku Evci, Tushar Krishna
Keywords: N:M structured sparsity, sparsity, model compression, attention-based models, sparse training recipe
AgentHPO: Large Language Model Agent for Hyper-Parameter Optimization
Siyi Liu, Chen Gao, Yong Li
Keywords: Large Language Models, Agent, Hyperparameter Optimization
Sparse MoE as a New Treatment: Addressing Forgetting, Fitting, Learning Issues in Multi-Modal Multi-Task Learning
Jie Peng, Sukwon Yun, Kaixiong Zhou, Ruida Zhou, Thomas Hartvigsen, Yanyong Zhang, Zhangyang Wang, Tianlong Chen
Keywords: transformer, sparse mixture-of-experts, multi-modal learning, multi-task learning
Exact and Rich Feature Learning Dynamics of Two-Layer Linear Networks
Wei Huang, Wuyang Chen, Zhiqiang Xu, Zhangyang Wang, Taiji Suzuki
Keywords: Neural Network Dynamics, Feature Learning, Optimization
Vanishing Feature: Diagnosing Model Merging and Beyond (Oral)
Xingyu Qu, Samuel Horváth
Keywords: Model Merging, Efficiency, Deep Learning, Efficient Deep Learning
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients
Zhenyu Zhang, Ajay Kumar Jaiswal, Lu Yin, Shiwei Liu, Jiawei Zhao, Yuandong Tian, Zhangyang Wang
Keywords: Large Language Models, Memory Efficient Training, Low Rank
Enhancing Video Representation Learning with Temporal Differentiation
Siyi Chen, Minkyu Choi, Zesen Zhao, Kuan Han, Qing Qu, Zhongming Liu
Keywords: video representation learning, physics-inspired
FedOSAA: Improving Federated Learning with One-Step Anderson Acceleration
Xue Feng, M. Paul Laiu, Thomas Strohmer
Keywords: federated learning, quasi-Newton methods, Anderson acceleration
Closure Discovery for Coarse-Grained Partial Differential Equations Using Grid-based Reinforcement Learning (Oral)
Jan-Philipp von Bassewitz, Sebastian Kaltenbach, Petros Koumoutsakos
Keywords: Closure Discovery, Inductive Bias, Multi-Agent Reinforcement Learning
Fast and Efficient Matching Algorithm with Deadline Instances
Zhao Song, Weixin Wang, Chenbo Yin, Junze Yin
Keywords: online weighted matching problem, sketching
Learning Effective Dynamics across Spatio-Temporal Scales of Complex Flows
Han Gao, Sebastian Kaltenbach, Petros Koumoutsakos
Keywords: Learned Effective Dynamics, Reduced-Order Modeling, Multiscale Systems, Turbulent Flows
RecCrysFormer: Refined Protein Structural Prediction from 3D Patterson Maps via Recycling Training Runs
Tom Pan, Evan Dramko, Mitchell D. Miller, George N Phillips Jr., Anastasios Kyrillidis
Keywords: Protein Structural Prediction, Transformers, Patterson Maps
Taming Sensitive Weights: Noise Perturbation Fine-tuning for Robust LLM Quantization
Dongwei Wang, Huanrui Yang
Keywords: LLM quantization, Hessian trace, Noise-aware finetuning
Adversarially Robust Spiking Neural Networks with Sparse Connectivity
Mathias Schmolli, Maximilian Baronig, Robert Legenstein, Ozan Ozdenizci
Keywords: adversarial robustness, spiking neural networks, ANN-to-SNN conversion, sparsity, robust pruning
Quantum EigenGame for excited state calculation
David A. Quiroga, Jason Han, Anastasios Kyrillidis
Keywords: variational quantum algorithms, PCA, EigenGame, eigensolvers
Improving Neuron-level Interpretability with White-box Language Models (Oral)
Hao Bai, Yi Ma
Keywords: White-box models, deep learning architectures, neuron-level interpretation
You Only Debias Once: Towards Flexible Accuracy-Fairness Trade-offs at Inference Time (Oral)
Xiaotian Han, Tianlong Chen, Kaixiong Zhou, Zhimeng Jiang, Zhangyang Wang, Xia Hu
Keywords: fairness, weight space, neural network subspace
Grouped Sequential Optimization Strategy - the Application of Hyperparameter Importance Assessment in Deep Learning
Ruinan Wang, Ian T. Nabney, Mohammad Golbabaee
Keywords: Optimization, Hyperparameter Optimization, Hyperparameter Importance Assessment, Model Efficiency, Search Space Exploration, Resource Allocation
The Computational Limits of State-Space Models and Mamba via the Lens of Circuit Complexity (Oral)
Yifang Chen, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song
Keywords: State-Space Models, Mamba, Circuit Complexity, Computational Limits
Curse of Attention: A Kernel-Based Perspective for Why Transformers Fail to Generalize on Time Series Forecasting and Beyond
Yekun Ke, Yingyu Liang, Zhenmei Shi, Zhao Song, Chiwun Yang
Keywords: Time Series Forecasting, Transformer Generalization, Kernel Methods
Asymptotic Behavior of the Coordinate Ascent Variational Inference in Singular Models
Sean C Plummer, Anirban Bhattacharya, Debdeep Pati, Yun Yang
Keywords: Coordinate Ascent Variational Inference, Singular Models, Dynamical Systems
Theoretical and Empirical Advances in Forest Pruning
Albert Dorador
Keywords: Regression, Decision Trees, Ensemble Learning, Pruning, Interpretable Machine Learning
Bridging Domain Adaptation and Graph Neural Networks: A Tensor-Based Framework for Effective Label Propagation
Tao Wen, Elynn Chen, Yuzhou Chen, Qi Lei
Keywords: Graph Classification, Domain Adaptation, Label Propagation
Unlock the Theory behind Scaling 1-bit Neural Networks
Majid Daliri, Zhao Song, Chiwun Yang
Keywords: 1-bit neural network, neural tangent kernel, scaling law theory
MoXCo: How I learned to stop exploring and love my local minima?
Esha Singh, Shoham Sabach, Yu-Xiang Wang
Keywords: optimization, deep learning, adaptive methods
Greedy Output Approximation: Towards Efficient Structured Pruning for LLMs Without Retraining
Jianwei Li, Yijun Dong, Qi Lei
Keywords: Efficient, Structured Pruning, LLMs
A unified framework for Sparse plus Low-Rank Matrix Decomposition for LLMs (Oral)
Mehdi Makni, Kayhan Behdin, Zheng Xu, Natalia Ponomareva, Rahul Mazumder
Keywords: model compression, sparse plus low-rank, optimization, inference acceleration, 2:4 sparsity, hardware and system co-design
FedPeWS: Personalized Warmup via Subnetworks for Enhanced Heterogeneous Federated Learning
Nurbek Tastan, Samuel Horváth, Martin Takáč, Karthik Nandakumar
Keywords: federated learning, heterogeneous federated learning, personalized warmup, subnetworks
Concept Bottleneck Model with Zero Performance Loss
Zhenzhen Wang, Aleksander Popel, Jeremias Sulam
Keywords: interpretability, explainability, concept bottleneck model, concept explanations
Meta ControlNet: Enhancing Task Adaptation via Meta Learning
Junjie Yang, Jinze Zhao, Peihao Wang, Zhangyang Wang, Yingbin Liang
Keywords: Meta Learning, Diffusion Models, Generalization
Provable Model-Parallel Distributed Principal Component Analysis with Parallel Deflation
Fangshuo Liao, Wenyi Su, Anastasios Kyrillidis
Keywords: Principal Component Analysis, Distributed Learning
Dimension Mixer: Group Mixing of Input Dimensions for Efficient Function Approximation
Suman Sapkota, Binod Bhattarai
Keywords: Sparse Architectures, Structured Sparsity, Butterfly Sparsity, Butterfly MLP, Butterfly Attention, Long Range Arena (LRA), Solving Pathfinder-X, Patch Only MLP-Mixer, Dimension Mixer
Dual Reasoning: A GNN-LLM Collaborative Framework for Knowledge Graph Question Answering
Guangyi Liu, Yongqi Zhang, Yong Li, Quanming Yao
Keywords: Large Language Model, Knowledge Graph, Question Answering
A Validation Approach to Over-parameterized Matrix and Image Recovery
Lijun Ding, Zhen Qin, Liwei Jiang, Jinxin Zhou, Zhihui Zhu
Keywords: Matrix recovery, low-rank, validation, gradient descent, nonconvex optimization
Revisiting the Initial Steps in Adaptive Gradient Descent Optimization
Abulikemu Abuduweili, Changliu Liu
Keywords: Optimization, Adam, Adaptive Gradient Descent, Neural Networks
Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
Tim Tsz-Kit Lau, Weijian Li, Chenwei Xu, Han Liu, Mladen Kolar
Keywords: Distributed training, adaptive batch size, data parallelism, model parallelism
Heterogeneous Decision Making in Mixed Traffic: Uncertainty-aware Planning and Bounded Rationality
Hang Wang, Qiaoyi Fang, Junshan Zhang
Keywords: Mixed Traffic, Reinforcement Learning, Planning, Bounded Rationality
Do Global and Local Perform Cooperatively or Adversarially in Heterogeneous Federated Learning?
Huiwen Wu, Shuo Zhang
Keywords: federated learning, multilevel optimization, learning dynamics
A Case Study of Low Ranked Self-Expressive Structures in Neural Network Representations (Oral)
Uday Singh Saini, William Shiao, Yahya Sattar, Yogesh Dahiya, Samet Oymak, Evangelos E. Papalexakis
Keywords: Subspace Clustering, Centered Kernel Alignment, Representation Similarity Measures
AdaProx: A Novel Method for Bilevel Optimization under Pessimistic Framework
Ziwei Guan, Daouda Sow, Sen Lin, Yingbin Liang
Keywords: pessimistic bilevel optimization, convergence analysis, nonconvex, gradient-based method
HSR-Enhanced Sparse Attention Acceleration
Bo Chen, Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song
Keywords: Half-Space Reporting, Attention Acceleration, Sparse Attention
Learning of Patch-Based Smooth-Plus-Sparse Models for Image Reconstruction
Stanislas Ducotterd, Sebastian Neumayer, Michael Unser
Keywords: Image reconstruction, sparsity, dictionary learning, deep equilibrium
Large-Scale Multiway Clustering with Seeded Clustering
Jiaxin Hu
Keywords: scalable algorithm, time complexity, space complexity, large-scale data, tensor clustering, seeded clustering
Fast John Ellipsoid Computation with Differential Privacy Optimization (Oral)
Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song, Junwei Yu
Keywords: Fast Optimization, Differential Privacy, John Ellipsoid Computation
Approximate Nullspace Augmented Finetuning for Robust Vision Transformers (Oral)
Haoyang Liu, Aditya Singh, Yijiang Li, Haohan Wang
Keywords: Robustness, Vision Transformer, Invariance