Conference on Parsimony and Learning (CPAL)
January 2024, HKU

Spotlight Track: Accepted Papers

Presentation Format

Accepted papers will be presented in one of two spotlight poster sessions during the conference.

Sessions are numbered in chronological order. See the full program for the precise time and location of each spotlight presentation session.

Spotlight Poster Session 1

Time: Day 2 (Jan 4) – Thursday – 5:00 PM to 6:30 PM

Low-Rank Matrix Completion Theory via Plücker Coordinates

Manolis C. Tsakiris

Keywords: algebraic geometry, Grassmannian, low-rank matrix completion, non-random observation patterns, Plücker coordinates

Variational Information Pursuit for Interpretable Predictions

Aditya Chattopadhyay, Kwan Ho Ryan Chan, Benjamin David Haeffele, Donald Geman, Rene Vidal

Keywords: Interpretable ML, Explainable AI, Information Pursuit

Classification Bias on a Data Diet

Tejas Pote, Mohammed Adnan, Yigit Yargic, Yani Ioannou

Keywords: data diet, model bias, classification bias, data pruning

Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity

Lu Yin, Shiwei Liu, Ajay Kumar Jaiswal, Souvik Kundu, Zhangyang Wang

Keywords: Junk DNA Hypothesis, low-magnitude weights, large-scale language models

FedNAR: Federated Optimization with Normalized Annealing Regularization

Junbo Li, Ang Li, Chong Tian, Qirong Ho, Eric Xing, Hongyi Wang

Keywords: Federated learning, weight decay, adaptive hyperparameters

Model Compression Beyond Size Reduction

Mubarek Mohammed

Keywords: Knowledge Distillation, Pruning, Model Compression, Neural Networks

The Lazy Neuron Phenomenon: On Emergence of Activation Sparsity in Transformers

Zonglin Li, Chong You, Srinadh Bhojanapalli, Daliang Li, Ankit Singh Rawat, Sashank Reddi, Ke Ye, Felix Chern, Felix Yu, Ruiqi Guo, Sanjiv Kumar

Keywords: Transformer efficiency, activation sparsity, robustness, calibration

Block Coordinate Descent on Smooth Manifolds: Convergence Theory and Twenty-One Examples

Liangzu Peng, Rene Vidal

Keywords: Block Coordinate Descent, Alternating Minimization, Non-Convex Optimization, Manifold Optimization, Convergence Analysis

Stochastic Collapse: How Gradient Noise Attracts SGD Dynamics Towards Simpler Subnetworks

Feng Chen, Daniel Kunin, Atsushi Yamamura, Surya Ganguli

Keywords: Implicit Bias, sparsity, SGD Dynamics, Implicit regularization, Learning rate schedule, Stochastic Gradient Descent, Invariant set, Attractive saddle points, Stochastic collapse, Permutation invariance, Simplicity bias, Teacher-student

Benign Overfitting and Grokking in ReLU Networks for XOR Cluster Data

Zhiwei Xu, Yutong Wang, Spencer Frei, Gal Vardi, Wei Hu

Keywords: Grokking, benign overfitting, deep learning

Neural Dependencies Emerging from Learning Massive Categories

Ruili Feng, Deli Zhao, Zheng-Jun Zha

Keywords: Deep Learning Theory, Interpretability

SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular Value Penalization

Xixu Hu

Keywords: Vision Transformer, Adversarial Robustness, Lipschitz Continuity, Computer Vision

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

Zichang Liu, Jue Wang, Tri Dao, Tianyi Zhou, Binhang Yuan, Zhao Song, Anshumali Shrivastava, Ce Zhang, Yuandong Tian, Christopher Re, Beidi Chen

Keywords: Large Language Models, Efficient Inference, Sparsity

Towards a Better Theoretical Understanding of Independent Subnetwork Training

Egor Shulgin, Peter Richtárik

Keywords: Optimization, Distributed Learning, Independent Subnetwork Training, Federated Learning

Sparsity-aware generalization theory for deep neural networks

Ramchandran Muthukumar, Jeremias Sulam

Keywords: Generalization, Sparsity, Sensitivity, PAC-Bayes

GMRLNet: A graph-based manifold regularization learning framework for placental insufficiency diagnosis on incomplete multimodal ultrasound data

Jing Jiao, Yi Huang, Xiaokang Li, Yi Guo

Keywords: Manifold regularization learning, Incomplete multimodal learning, graph neural network, knowledge transfer, prenatal diagnosis

Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity

Lu Yin, You Wu, Zhenyu Zhang, Cheng-Yu Hsieh, Yaqing Wang, Yiling Jia, Mykola Pechenizkiy, Yi Liang, Zhangyang Wang, Shiwei Liu

Keywords: Large language model, pruning, sparsity

Profiling and Pairing Catchments and Hydrological Models With Latent Factor Model

Yang Yang, Ting Fong May Chui

Keywords: Hydrological modeling, latent factor model, recommender system, machine learning

Model Sparsity Can Simplify Machine Unlearning

Jinghan Jia, Jiancheng Liu, Parikshit Ram, Yuguang Yao, Gaowen Liu, Yang Liu, Pranay Sharma, Sijia Liu

Keywords: Trustworthy AI, Sparsity, Privacy

Alternating Updates for Efficient Transformers

Cenk Baykal, Dylan J Cutler, Nishanth Dikkala, Nikhil Ghosh, Rina Panigrahy, Xin Wang

Keywords: efficiency, efficient transformers

Invariant Low-Dimensional Subspaces in Gradient Descent for Learning Deep Linear Networks

Can Yaras, Peng Wang, Wei Hu, Zhihui Zhu, Laura Balzano, Qing Qu

Keywords: implicit bias, low dimensional structures, deep linear networks

Dynamic Sparsity Is Channel-Level Sparsity Learner

Lu Yin, Gen Li, Meng Fang, Li Shen, Tianjin Huang, Zhangyang Wang, Vlado Menkovski, Xiaolong Ma, Mykola Pechenizkiy, Shiwei Liu

Keywords: dynamic sparsity, dynamic sparse training, channel-level sparsity

Three-way trade-off in multi-objective learning: Optimization, generalization and conflict-avoidance

Lisha Chen, Heshan Devaka Fernando, Yiming Ying, Tianyi Chen

Keywords: multi-objective learning, generalization, algorithm stability, stochastic optimization

Sparse MoE with Language Guided Routing for Multilingual Machine Translation

Xinyu Zhao, Xuxi Chen, Yu Cheng, Tianlong Chen

Keywords: Sparse Mixture-of-Experts, Multilingual Machine Translation, Language Guided Routing

The Emergence of Reproducibility and Consistency in Diffusion Models

Huijie Zhang, Jinfan Zhou, Yifu Lu, Minzhe Guo, Liyue Shen, Qing Qu

Keywords: Diffusion model, consistent model reproducibility, phenomenon, uniquely identifiable encoding

Graph Neural Networks Provably Benefit from Structural Information: A Feature Learning Perspective

Wei Huang, Yuan Cao, Haonan Wang, Xin Cao, Taiji Suzuki

Keywords: Graph Neural Network, Feature Learning, Graph Convolution, Deep Learning Theory, Benign Overfitting

Nonparametric Classification on Low Dimensional Manifolds using Overparameterized Convolutional Residual Networks

Kaiqi Zhang, Zixuan Zhang, Minshuo Chen, Yuma Takeda, Mengdi Wang, Tuo Zhao, Yu-Xiang Wang

Keywords: Nonparametric Classification, Low Dimensional Manifolds, Overparameterized ResNets, Function Approximation

H₂O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Re, Clark Barrett, Zhangyang Wang, Beidi Chen

Keywords: Large Language Models, Efficient Generative Inference

Sparse Mixture-of-Experts are Domain Generalizable Learners

Bo Li, Yifei Shen, Jingkang Yang, Yezhen Wang, Jiawei Ren, Tong Che, Jun Zhang, Ziwei Liu

Keywords: domain generalization, mixture-of-experts, algorithmic alignment, visual attributes

Principled and Efficient Transfer Learning of Deep Models via Neural Collapse

Xiao Li, Sheng Liu, Jinxin Zhou, Xinyu Lu, Carlos Fernandez-Granda, Zhihui Zhu, Qing Qu

Keywords: representation learning, neural collapse, transfer learning

Spotlight Poster Session 2

Time: Day 3 (Jan 5) – Friday – 5:00 PM to 6:30 PM

Sparsity Enhances Non-Gaussian Data Statistics During Local Receptive Field Formation

William T Redman, Zhangyang Wang, Alessandro Ingrosso, Sebastian Goldt

Keywords: iterative magnitude pruning, sparse machine learning, statistics of internal representations, learning local receptive fields

Efficient Low-Dimensional Compression of Overparameterized Networks

Soo Min Kwon, Zekai Zhang, Dogyoon Song, Laura Balzano, Qing Qu

Keywords: overparameterization, deep networks, low-dimensional modeling

Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer

Yuandong Tian, Yiping Wang, Beidi Chen, Simon Shaolei Du

Keywords: transformer, training dynamics, theoretical analysis, self-attention, interpretability, neural network understanding

High Probability Guarantees for Random Reshuffling

Hengxu Yu, Xiao Li

Keywords: random reshuffling, shuffled SGD, high-probability sample complexity, stopping criterion, last-iterate result

A Linearly Convergent GAN Inversion-based Algorithm for Reverse Engineering of Deceptions

Darshan Thaker, Paris Giampouras, Rene Vidal

Keywords: reverse engineering deceptions, GAN inversion, optimization, adversarial attacks, generative models, inverse problems

Simultaneous linear connectivity of neural networks modulo permutation

Ekansh Sharma, Devin Kwok, Tom Denton, Daniel M. Roy, David Rolnick, Gintare Karolina Dziugaite

Keywords: linear mode connectivity, loss landscape, permutation symmetry, iterative magnitude pruning, lottery ticket

Accurate Neural Network Pruning Requires Rethinking Sparse Optimization

Denis Kuznedelev, Eldar Kurtic, Eugenia Iofinova, Elias Frantar, Alexandra Peste, Dan Alistarh

Keywords: Efficient ML, Pruning, optimization, sparsity

Dynamic Sparse Training with Structured Sparsity

Mike Lasby, Anna Golubeva, Utku Evci, Mihai Nica, Yani Ioannou

Keywords: Machine Learning, dynamic sparse training, structured sparsity, N:M sparsity, efficient deep learning, RigL, SRigL, constant fan-in, dynamic neuron ablation, neuron ablation, structured and fine-grained sparsity, online inference, accelerating inference

Ultrafast Neural Estimation of Mutual Information

Zhengyang Hu, Song Kang, Qunsong Zeng, Kaibin Huang, Yanchao Yang

Keywords: Deep Learning, Efficient Mutual Information Estimation, Real-Time Correlation Computation, Maximum Correlation Coefficient

Understanding Hierarchical Representations in Deep Networks via Feature Compression and Discrimination

Peng Wang, Xiao Li, Can Yaras, Zhihui Zhu, Laura Balzano, Wei Hu, Qing Qu

Keywords: representation learning, neural collapse, deep linear networks

Deep Neural Network Initialization with Sparsity Inducing Activations

Ilan Price, Nicholas Daultry Ball, Adam Christopher Jones, Samuel Chun Hei Lam, Jared Tanner

Keywords: Deep neural network, random initialisation, sparsity, gaussian process

How Structured Data Guides Feature Learning: A Case Study of Sparse Parity Problem

Atsushi Nitanda, Kazusato Oko, Taiji Suzuki, Denny Wu

Keywords: neural network optimization, representation learning, mean-field Langevin dynamics

The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter

Ajay Kumar Jaiswal, Shiwei Liu, Tianlong Chen, Zhangyang Wang

Keywords: Pre-trained Models, Sparsity, Emergence, Transformers, Pruning

Divided Attention: Unsupervised Multiple-object Discovery and Segmentation with Interpretable Contextually Separated Slots

Dong Lao, Zhengyang Hu, Francesco Locatello, Yanchao Yang, Stefano Soatto

Keywords: Moving object segmentation, Slot attention, Unsupervised object discovery

Compressing LLMs: The Truth is Rarely Pure and Never Simple

Ajay Kumar Jaiswal, Zhe Gan, Xianzhi Du, Bowen Zhang, Zhangyang Wang, Yinfei Yang

Keywords: Compression, Large Language Models, Pruning, Quantization

On Bias-Variance Alignment in Deep Models

Lin Chen, Michal Lukasik, Wittawat Jitkrittum, Chong You, Sanjiv Kumar

Keywords: bias-variance decomposition, ensemble, deep learning

Canonical Factors for Hybrid Neural Fields

Brent Yi, Weijia Zeng, Sam Buchanan, Yi Ma

Keywords: 3d representation learning, neural fields, NeRF, voxel grids, invariance, non-convex optimization

On Separability of Covariance in Multiway Data Analysis

Dogyoon Song, Alfred Hero

Keywords: Multiway data, Separable covariance, Kronecker PCA, Low-rank covariance model, Tensor decomposition, Frank-Wolfe method

Low Complexity Homeomorphic Projection to Ensure Neural-Network Solution Feasibility for Optimization over (Non-)Convex Set

Enming Liang, Minghua Chen, Steven Low

Keywords: Constraint optimization, Feasibility, Neural Network, Homeomorphism, Invertible Neural Network, Projection

The Cost of Down-Scaling Language Models: Fact Recall Deteriorates before In-Context Learning

Tian Jin, Nolan Clement, Xin Dong, Vaishnavh Nagarajan, Michael Carbin, Jonathan Ragan-Kelley, Gintare Karolina Dziugaite

Keywords: large language model, scaling, pruning, sparsity

Generalized Neural Collapse for A Large Number of Classes

Jiachen Jiang, Jinxin Zhou, Peng Wang, Qing Qu, Dustin G. Mixon, Chong You, Zhihui Zhu

Keywords: Neural Collapse, Tammes Problem, Sphere Packing, Deep Learning

Sparse MoE as a New Treatment: Addressing Forgetting, Fitting, Learning Issues in Multi-Modal Multi-Task Learning

Jie Peng, Kaixiong Zhou, Ruida Zhou, Thomas Hartvigsen, Yanyong Zhang, Zhangyang Wang, Tianlong Chen

Keywords: multi-task learning, multimodal learning, transformer

Solving Inverse Problems with Latent Diffusion Models via Hard Data Consistency

Bowen Song, Soo Min Kwon, Zecheng Zhang, Xinyu Hu, Qing Qu, Liyue Shen

Keywords: inverse problems, latent diffusion models

Unsupervised Manifold Linearizing and Clustering

Tianjiao Ding, Shengbang Tong, Kwan Ho Ryan Chan, Xili Dai, Yi Ma, Benjamin David Haeffele

Keywords: Clustering, Manifold Embedding, Manifold Clustering

Masked Completion via Structured Diffusion with White-Box Transformers

Druv Pai, Ziyang Wu, Sam Buchanan, Tianzhe Chu, Yaodong Yu, Yi Ma

Keywords: masked autoencoding, white-box transformers, coding rate reduction, representation learning

Approximately Equivariant Graph Networks

Ningyuan Teresa Huang, Ron Levie, Soledad Villar

Keywords: graph neural networks, equivariant machine learning, symmetry, generalization, statistical learning

Neural Collapse meets Differential Privacy: Curious behaviors of NoisySGD with Near-Perfect Representation Learning

Chendi Wang, Yuqing Zhu, Weijie J Su, Yu-Xiang Wang

Keywords: Neural collapse, differential privacy, representation learning

JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention

Yuandong Tian, Yiping Wang, Zhenyu Zhang, Beidi Chen, Simon Shaolei Du

Keywords: transformer, training dynamics, theoretical analysis, self-attention, interpretability, neural network understanding

Learning in the Presence of Low-dimensional Structure: A Spiked Random Matrix Perspective

Jimmy Ba, Murat A Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu

Keywords: random matrix theory, high-dimensional statistics, neural network, kernel method, representation learning

How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization

Nuoya Xiong, Lijun Ding, Simon Shaolei Du

Keywords: non-convex optimization, random initialization, global convergence, matrix recovery, matrix sensing

Robust Physics-based Deep MRI Reconstruction Via Diffusion Purification

Ismail Alkhouri, Shijun Liang, Rongrong Wang, Qing Qu, Saiprasad Ravishankar

Keywords: Robust MRI reconstruction, model-based deep learning, diffusion purification, computational imaging, machine learning

Neural Collapse in Multi-label Learning with Pick-all-label Loss

Pengyu Li, Yutong Wang, Xiao Li, Qing Qu

Keywords: Multi-label learning, Neural Collapse, Representation Learning