
Spotlight Track: Accepted Papers
Accepted Spotlight Track papers are presented as posters at CPAL 2025. See the full program for the precise time and location of each poster session.
Stable Minima Cannot Overfit in Univariate ReLU Networks: Generalization by Large Step Sizes
Dan Qiao, Kaiqi Zhang, Esha Singh, Daniel Soudry, Yu-Xiang Wang
Keywords: Minima Stability, Edge-of-Stability, Generalization, Flat Local Minima, Curvature
Principal Component Trees and their Persistent Homology
Ben Kizaric, Daniel L. Pimentel-Alarcón
Keywords: subspace clustering, low-rank decomposition, unsupervised learning, manifold learning, dimensionality reduction, topological data analysis
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN
Pengxiang Li, Lu Yin, Shiwei Liu
Keywords: LayerNorm, LLM, Transformer
Geometric Algebra Planes: Convex Implicit Neural Volumes
Irmak Sivgin, Sara Fridovich-Keil, Gordon Wetzstein, Mert Pilanci
Keywords: Volume representation, tensor decomposition, convex optimization, geometric algebra, NeRF
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing
Peihao Wang, Ruisi Cai, Yuehao Wang, Jiajun Zhu, Pragya Srivastava, Zhangyang Wang, Pan Li
Keywords: State Space Models, Large Language Models, Recency, Over-smoothing
Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding
Jiajun Zhu, Peihao Wang, Ruisi Cai, Jason D. Lee, Pan Li, Zhangyang Wang
Keywords: Positional Encoding, Equivariant Machine Learning, Large Language Models
WHOMP: Optimizing Randomized Controlled Trials via Wasserstein Homogeneity
Shizhou Xu, Thomas Strohmer
Keywords: randomized controlled trial, Wasserstein homogeneity, anti-clustering, diverse K-means, control/test group splitting, cross-validation
Diffusion models learn low-dimensional distributions via subspace clustering
Peng Wang, Huijie Zhang, Zekai Zhang, Siyi Chen, Yi Ma, Qing Qu
Keywords: diffusion models, mixture of low-rank Gaussians, phase transition, subspace clustering
Generative Learning for Solving Non-Convex Problem with Multi-Valued Input-Solution Mapping
Enming Liang, Minghua Chen
Keywords: Non-convex Optimization, Generative Modeling, Flow, ODE
Attention-Only Transformers via Unrolled Subspace Denoising
Peng Wang, Yifu Lu, Yaodong Yu, Druv Pai, Qing Qu, Yi Ma
Keywords: transformer, self-attention, unrolled optimization, subspace denoising
Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence
Berfin Simsek, Amire Bendjeddou, Daniel Hsu
Keywords: time complexity, gradient flow dynamics, hardness
On Generalization Bounds for Neural Networks with Low Rank Layers
Andrea Pinto, Akshay Rangamani, Tomaso A Poggio
Keywords: Gaussian Complexity, Low Rank, Neural Collapse
Simplifying DINO by Coding Rate Regularization
Ziyang Wu, Jingyuan Zhang, Druv Pai, Yi Ma
Keywords: Representation Learning, Self Supervised Learning, Coding Rate
Knowledge-aware Parsimony Learning: A Perspective from Relational Graphs
Quanming Yao, Yongqi Zhang, Yaqing Wang, Nan Yin, James Kwok, Qiang Yang
Keywords: scaling law, Parsimony Learning, Graph Learning
Understanding Diffusion-based Representation Learning via Low-Dimensional Modeling
Xiao Li, Zekai Zhang, Xiang Li, Siyi Chen, Zhihui Zhu, Peng Wang, Qing Qu
Keywords: diffusion representation learning, representation learning, diffusion model
Geometry of Concepts in Next-token Prediction: Neural-Collapse Meets Semantics
Yize Zhao, Christos Thrampoulidis
Keywords: Large Language Models (LLMs), Neural Embeddings, Word Embeddings, Neural-Collapse, Interpretability, Optimization
FlowDAS: A Flow-Based Framework for Data Assimilation
Siyi Chen, Yixuan Jia, Qing Qu, He Sun, Jeffrey A Fessler
Keywords: Data Assimilation, Stochastic Dynamic System, Flow matching, Stochastic Interpolants, Inverse Problem
What’s in a Prior? Learned Proximal Networks for Inverse Problems
Zhenghan Fang, Sam Buchanan, Jeremias Sulam
Keywords: Inverse problems, Proximal operators, Plug-and-play, Explicit regularizer, Convergent PnP, Input convex neural networks
Pruning neural network models for gene regulatory dynamics using data and domain knowledge
Intekhab Hossain, Jonas Fischer, Rebekka Burkholz, John Quackenbush
Keywords: sparsification, pruning, lottery tickets, explainability, gene regulation, domain knowledge, neural architecture design, NeuralODEs
Certified Robustness against Sparse Adversarial Perturbations via Data Localization
Ambar Pal, Rene Vidal, Jeremias Sulam
Keywords: Adversarial Robustness, Certified Robustness, Sparse Perturbations, Data Localization
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
Weixin Liang, LILI YU, Liang Luo, Srini Iyer, Ning Dong, Chunting Zhou, Gargi Ghosh, Mike Lewis, Wen-tau Yih, Luke Zettlemoyer, Xi Victoria Lin
Keywords: Sparse architecture, Efficient deep architecture, Multi-modal foundation models, Mixture-of-Experts, Transformer
Provable Probabilistic Imaging using Score-based Generative Priors
Yu Sun, Zihui Wu, Yifan Chen, Berthy Feng, Katherine Bouman
Keywords: Diffusion models, inverse problem, image reconstruction, Langevin dynamics, Markov processes, plug-and-play priors, posterior sampling, regularized inversion, score-based generative models, uncertainty quantification
DyVal: Dynamic Evaluation of Large Language Models for Reasoning Tasks
Kaijie Zhu, Jiaao Chen, Jindong Wang, Neil Zhenqiang Gong, Diyi Yang, Xing Xie
Keywords: Large Language Models, Evaluation, Data Contamination
Sparse Training from Random Initialization: Aligning Lottery Ticket Masks using Weight Symmetry
Mohammed Adnan, Rohan Jain, Ekansh Sharma, Yani Ioannou
Keywords: Lottery Ticket Hypothesis, sparse training, linear mode connectivity, weight symmetry, deep learning, deep neural networks, random initialization, git re-basin, optimization
Shallow Diffuse: Robust and Invisible Watermarking through Low-Dimensional Subspaces in Diffusion Models
Wenda Li, Huijie Zhang, Qing Qu
Keywords: diffusion model, watermark, low-dimensional subspace, consistency, robustness
A Robust Kernel Statistical Test of Invariance: Detecting Subtle Asymmetries
Ashkan Soleymani, Behrooz Tahmasebi, Stefanie Jegelka, Patrick Jaillet
Keywords: Invariance, Hypothesis Testing, Kernel Methods
WAGLE: Strategic Weight Attribution for Effective and Modular Unlearning in Large Language Models
Jinghan Jia, Jiancheng Liu, Yihua Zhang, Parikshit Ram, Nathalie Baracaldo, Sijia Liu
Keywords: Machine Unlearning, LLMs
Masks, Signs, And Learning Rate Rewinding
Advait Gadhikar, Rebekka Burkholz
Keywords: sparsity, pruning, lottery tickets, learning rate rewinding, iterative magnitude pruning
Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs
Tianyu Guo, Druv Pai, Yu Bai, Jiantao Jiao, Michael I. Jordan, Song Mei
Keywords: attention sink, mechanistic interpretability, language models, transformers
Geometry of Neural Reinforcement Learning in Continuous State and Action Spaces
Saket Tiwari, Omer Gottesman, George Konidaris
Keywords: reinforcement learning, continuous control, geometry
Out-of-distribution generalization via composition: a lens through induction heads in Transformers
Jiajun Song, Zhuoyan Xu, Yiqiao Zhong
Keywords: out-of-distribution generalization, low-dimensional subspace, composition, large language models, emergent ability, in-context learning
Dynamic Rescaling for Training GNNs
Nimrah Mustafa, Rebekka Burkholz
Keywords: graph neural network, rescale invariance, generalization, network balance
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Tianzhe Chu, Yuexiang Zhai, Jihan Yang, Shengbang Tong, Saining Xie, Sergey Levine, Yi Ma
Keywords: foundation model post-training
Prior Mismatch and Adaptation in PnP-ADMM with a Nonconvex Convergence Analysis
Shirin Shoushtari, Jiaming Liu, Edward P. Chandler, M. Salman Asif, Ulugbek S. Kamilov
Keywords: Computational Imaging, Plug-and-Play Priors, Imaging Inverse Problems, Mismatched Priors, Domain Adaptation
Relaxed Contrastive Learning for Federated Learning
Seonguk Seo, Jinkyu Kim, Geeho Kim, Bohyung Han
Keywords: dimensional collapse, transferability, federated learning, local deviation
Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity
Weixin Liang, Junhong Shen, Genghan Zhang, Ning Dong, Luke Zettlemoyer, LILI YU
Keywords: Sparse architecture, Efficient deep architecture, Multi-modal foundation models, Mixture-of-Experts, State Space Model
Training Bayesian Neural Networks with Sparse Subspace Variational Inference
Junbo Li, Zichen Miao, Qiang Qiu, Ruqi Zhang
Keywords: Bayesian neural networks, sparse Bayesian learning, variational inference
SITCOM: Step-wise Triple-Consistent Diffusion Sampling for Inverse Problems
Ismail Alkhouri, Shijun Liang, Cheng-Han Huang, Jimmy Dai, Qing Qu, Saiprasad Ravishankar, Rongrong Wang
Keywords: Image Restoration, Diffusion Models, Inverse Problems
Learning with Exact Invariances in Polynomial Time
Ashkan Soleymani, Behrooz Tahmasebi, Stefanie Jegelka, Patrick Jaillet
Keywords: Learning with Invariances, Kernels, Spectral Theory
Dependence Induced Representations
Xiangxiang Xu, Lizhong Zheng
Keywords: representation learning, statistical dependence, maximal correlation, minimal sufficiency, neural collapse
Unlocking Global Optimality in Bilevel Optimization: A Pilot Study
Quan Xiao, Tianyi Chen
Keywords: bilevel optimization, global convergence
Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations
Yize Zhao, Tina Behnia, Vala Vakilian, Christos Thrampoulidis
Keywords: language models, neural embeddings, optimization, implicit regularization, low-rank matrix factorization, support-vector machines
CompeteAI: Understanding the Competition Dynamics of Large Language Model-based Agents
Qinlin Zhao, Jindong Wang, Yixuan Zhang, Yiqiao Jin, Kaijie Zhu, Hao Chen, Xing Xie
Keywords: LLM-based Agent, Agent Based Modeling, Competition
Image Reconstruction Via Autoencoding Sequential Deep Image Prior
Ismail Alkhouri, Shijun Liang, Evan Bell, Qing Qu, Rongrong Wang, Saiprasad Ravishankar
Keywords: Image Reconstruction, Deep Image Prior, Generative Models
Understanding How Nonlinear Networks Create Linearly Separable Features for Low-Dimensional Data
Alec S Xu, Can Yaras, Peng Wang, Qing Qu
Keywords: union of subspaces, shallow nonlinear networks, random feature model
On the Crucial Role of Initialization for Matrix Factorization
Bingcong Li, Liang Zhang, Aryan Mokhtari, Niao He
Keywords: nonconvex optimization, initialization, quadratic rate, low rank adapter, lora
Non-convex matrix sensing: Breaking the quadratic rank barrier in the sample complexity
Dominik Stöger, Yizhe Zhu
Keywords: non-convex optimization, factorized gradient descent, matrix sensing, sample complexity, virtual sequences
Deep Neural Regression Collapse
Akshay Rangamani, Altay Unal
Keywords: Neural Collapse, Regression, Low Rank
Visual Prompting Reimagined: The Power of Activation Prompts
Yihua Zhang, Hongkang Li, Yuguang Yao, Aochuan Chen, Shuai Zhang, Pin-Yu Chen, Meng Wang, Sijia Liu
Keywords: visual prompt, parameter efficient finetuning, learning theory, generalization analysis
Primal-Dual Spectral Representation for Off-policy Evaluation
Yang Hu, Tianyi Chen, Na Li, Kai Wang, Bo Dai
Keywords: reinforcement learning, off-policy evaluation, spectral representation, primal-dual representation
Learning Dynamics of Deep Matrix Factorization Beyond the Edge of Stability
Avrajit Ghosh, Soo Min Kwon, Rongrong Wang, Saiprasad Ravishankar, Qing Qu
Keywords: edge of stability, deep linear networks
Characterizing ResNet’s Universal Approximation Capability
Chenghao Liu, Enming Liang, Minghua Chen
Keywords: universal approximation, ResNet, optimal approximation rate