Conference on Parsimony and Learning (CPAL)
March 2026, Tübingen

Poster Sessions at CPAL 2026

Presentation Format

All accepted papers at CPAL 2026, from both the Proceedings and Spotlight tracks, will be presented as posters at the conference. A select number of Proceedings-track papers will also be presented as orals, as listed on the orals page.

See the full program for an aggregated view of the precise times and locations of each poster and oral session.

  1. Poster Session I
    1. Time: Day 2 (Mar 24) – Tuesday – 4:00 PM to 6:00 PM
  2. Poster Session II
    1. Time: Day 3 (Mar 25) – Wednesday – 3:30 PM to 5:30 PM

Poster Session I

Time: Day 2 (Mar 24) – Tuesday – 4:00 PM to 6:00 PM

1. Optimal $k$-Discretization Learning

Tong Wang, Zhangyang Wang

Keywords: Clustering

2. AlphaFormer: End-to-End Symbolic Regression of Alpha Factors with Transformers

Haotong Huang, Jie Peng, Zezhen Ding, Pingzhi Li, Tianlong Chen

Keywords: Symbolic Regression, Alpha Mining, Time Series Generative Modeling

3. From Sparse Recovery to Plug-and-Play Priors: Understanding Trade-offs for Stable Recovery with Generalized Projected Gradient Descent

Ali Joundi, Yann Traonmilin, Jean-François Aujol

Keywords: Inverse Problems, Sparse Recovery, Plug-and-Play, Deep Prior, Optimization

4. Generalized Radius and Integrated Codebook Transforms for Differentiable Vector Quantization

Haochen You, Heng Zhang, Hongyang He, Yuqi Li, Baojing Liu

Keywords: Vector Quantization, Discrete Representation Learning, Radius Surrogate, Codebook Transform, Gradient Coupling

5. Beyond Greedy Decoding: Model-Specific Strategy Selection via Multi-faceted Uncertainty Decomposition

Kwangje Baeg, Yubin Lim

Keywords: Uncertainty Decomposition, Adaptive Decoding, Model Heterogeneity, Behavioral Clustering, Instruction-Tuned Models

6. Parameter-Efficient Distributional RL via Normalizing Flows and a Geometry-Aware Cramér Surrogate

Simo Alami Chehboune, Rim Kaddah, Marie-Paule Cani, Jesse Read

Keywords: Distributional Reinforcement Learning, Generative models, Deep Learning, Optimal Transport

7. Cannistraci-Hebb Training with N:M Semi-Structured Sparsity for Pre-Training and Re-Training

Jiaqing Lyu, Ruijie Wang, Kangyou Bao, Yingtao Zhang, Carlo Vittorio Cannistraci

Keywords: Dynamic Sparse Training, Semi-Structured Sparsity, LLM, ViT

8. Effective Learning for Small Reasoning Models: An Empirical Study on 0.5B Reasoning LLMs

Xialie Zhuang, Peixian Ma, Zhikai Jia, Zane Cao, Shiwei Liu

Keywords: Small Reasoning Model, Reasoning, Reinforcement Learning

9. ShapLoRA: Allocation of Low-rank Adaption on Large Language Models via Shapley Value Inspired Importance Estimation

Colin Zhao, Qinghua Yao, Xinyuan Song, Wei Zhu

Keywords: LLM, LoRA

10. Simplex Deep Linear Discriminant Analysis

Maxat Tezekbayev, Arman Bolatov, Zhenisbek Assylbekov

Keywords: Deep LDA, Maximum likelihood, Simplex-constrained embeddings

11. Matrix Sensing with Kernel Optimal Loss: Robustness and Optimization Landscape

Xinyuan Song, Ziye Ma

Keywords: Matrix sensing, kernel loss function, optimization

12. Improving Medical Visual Reinforcement Fine-Tuning via Perception and Reasoning Augmentation

Guangjing Yang, ZhangYuan Yu, Ziyuan Qin, Xinyuan Song, Huahui Yi, Qingbo Kang, Jun Gao, Yiyue Li, Chenlin Du, Qicheng Lao

Keywords: Reinforcement Fine-Tuning (RFT), Medical Vision-Language Models, Reward Design, Perception-Reasoning Augmentation, Visual Reinforcement Learning, Medical Image Understanding

13. Token-Aware Representation Augmentation for Fine-Grained Semi-Supervised Learning

Hongyang He, Yan Zhong, Xinyuan Song, Daizong Liu, Victor Sanchez

Keywords: Semi-supervised learning, FixMatch, consistency regularization, token-aware masking, token-level augmentation, high-confidence token suppression, feature diversity

14. Enhancing Long-Context Inference with Context-Position Duo-Mixture

Zhenyu Zhang, Sharath Nittur Sridhar, Zhangyang Wang, Souvik Kundu

Keywords: Long-Context, LLM, Efficiency

15. Can Less Be More? Benchmarking Lightweight Models Against State-of-the-Art Deep Learning Architectures for Deployable Seizure Detection

Isaiah Essien, Donna-lee Ginsberg, Jesse Thornburg

Keywords: Parsimonious Learning, Mobile Health, Seizure Detection, TensorFlow Lite, Deep Learning, Resource-Constrained Deployment, Global Health Equity

16. ERC-SVD: Error-Controlled SVD for Large Language Model Compression

Haolei Bai, Siyong Jian, Tuo Liang, Yu Yin, Huan Wang

Keywords: Model Compression, SVD, Large Language Models

17. Sparsity-Aware Prompt Tuning: A Simple and Effective Way to Fine-tune High-Sparsity LLMs

Yuxin Zhang, Weizhong Huang, Yuexiao Ma, Yunshan Zhong, Xiawu Zheng, Rongrong Ji

Keywords: Large language models, Network Pruning

18. Teaching LLMs According to Their Aptitude: Adaptive Switching Between CoT and TIR for Mathematical Problem Solving

Xin Xu, Yan Xu, Tianhao Chen, Yuchen Yan, Chengwu Liu, Zaoyu Chen, Yufei Wang, Yichun Yin, Yasheng Wang, Qun Liu, Lu Yin

Keywords: Large Language Models, math QA, chain-of-thought, tool-integrated reasoning, fine-tuning

19. Scalable LLM Reasoning Acceleration with Low-rank Distillation

Harry Dong, Bilge Acun, Beidi Chen, Yuejie Chi

Keywords: large language model, efficiency, distillation, reasoning, scaling, low-rank, inference

20. Sparse Mixture-of-Experts for Compositional Generalization: Empirical Evidence and Theoretical Foundations of Optimal Sparsity

Jinze Zhao, Peihao Wang, Junjie Yang, Ruisi Cai, Gaowen Liu, Jayanth Srinivasa, Ramana Rao Kompella, Yingbin Liang, Zhangyang Wang

Keywords: Compositional Generalization, Sparsity, Mixture of Experts

21. Pruned Adaptation Modules: A Simple yet Strong Baseline for Continual Foundation Models

Elif Ceren Gok Yildirim, Murat Onur Yildirim, Joaquin Vanschoren

Keywords: continual learning, parameter efficient, foundation models

22. Prompt Stability Matters: Evaluating and Optimizing Auto-Generated Prompt in General-Purpose Systems

Ke Chen, Xucheng Yu, Yufei Zhou, Haohan Wang

Keywords: Prompt Stability, Prompt Evaluation, Multi-Agent System, General-Purpose System, Prompt Auto-Generation, Prompt Optimization

23. Dynamic SFT with Structured Measurements: Fast Queries, Fast Updates, Provable Guarantees

Yang Cao, Zhao Song

Keywords: sparse Fourier transform

24. Panza: Investigating the Feasibility of Fully-Local Personalized Text Generation

Armand Mihai Nicolicioiu, Eugenia Iofinova, Andrej Jovanovic, Eldar Kurtic, Mahdi Nikdan, Andrei Panferov, Ilia Markov, Nir N Shavit, Dan Alistarh

Keywords: LLMs, PEFT, LoRA, personalization, efficient ML

25. (PASS) Visual Prompt Locates Good Structure Sparsity through a Recurrent HyperNetwork

Tianjin Huang, Yong Tao, Meng Fang, Li Shen, Fan Liu, Yulong Pei, Mykola Pechenizkiy, Tianlong Chen

Keywords: Structure Pruning, Visual Prompt, Recurrent HyperNetwork

26. Concept based Ambiguity Resolution in LLMs

Zhibo Hu, Chen Wang, Yanfeng Shu, Hye-young Paik, Liming Zhu

Keywords: Language Ambiguity, Large Language Model, Sparse Autoencoder, Path Kernel

27. Data-Efficient and Robust Trajectory Generation through Pathlet Dictionary Learning

Yuanbo Tang, Yan Tang, Zihui Zhao, Zixuan Zhang, Yang Li

Keywords: trajectory generative model, dictionary learning, sparse representation

28. Sign-In to the Lottery: Reparameterized Sparse Training

Advait Gadhikar, Tom Jacobs, Chao Zhou, Rebekka Burkholz

Keywords: pruning at initialization, sparse training, lottery ticket hypothesis, mirror flow, reparameterization, sign flips

29. PoLAR: Polar-Decomposed Low-Rank Adapter Representation

Kai Lion, Liang Zhang, Bingcong Li, Niao He

Keywords: low-rank adaptation, architecture-optimizer co-design, large language models, lora, low-rank adapter, fine-tuning

30. Kernel von Mises Formula of the Influence Function

Yaroslav Mukhin

Keywords: Influence function, Fisher-Rao gradient, first variation, tangential interpolation, semiparametric estimation, distributional robustness, kernel PCA, Mercer kernel

31. Dual-Kernel Adapter: Expanding Spatial Horizons for Data-Constrained Medical Image Analysis

Ziquan Zhu, Hanruo Zhu, Si-Yuan Lu, Xiang Li, Yanda Meng, Yunxiao Zhang, Gaojie Jin, Lu Yin, Lijie Hu, Di Wang, Lu Liu, Tianjin Huang

Keywords: Adapter, Medical Image Analysis, Data-Limited Training

32. Adversarial generalization of unfolding (model-based) networks

Vicky Kouni

Keywords: unfolding networks, adversarial generalization, adversarial Rademacher complexity

33. Revisiting Glorot Initialization for Long-Range Linear Recurrences

Noga Bar, Mariia Seleznova, Yotam Alexander, Gitta Kutyniok, Raja Giryes

Keywords: Recurrent Networks, Initialization, Signal Propagation, Joint Scaling Limits

34. Mask in the Mirror: Implicit Sparsification

Tom Jacobs, Rebekka Burkholz

Keywords: Sparse Training, Continuous sparsification, Implicit bias, Mirror flow, Time-dependent Bregman function, Regularization, Rich regime

35. Dimension-free error estimate for diffusion model and optimal scheduling

Valentin De Bortoli, Romuald Elie, Anna Kazeykina, Zhenjie Ren, Jiacheng Zhang

Keywords: diffusion model, score-matching, statistical error, optimal scheduling

36. E2ENet: Dynamic Sparse Feature Fusion for Accurate and Efficient 3D Medical Image Segmentation

Boqian Wu, Qiao Xiao, Shiwei Liu, Lu Yin, Mykola Pechenizkiy, Decebal Constantin Mocanu, Maurice van Keulen, Elena Mocanu

Keywords: Medical Image Segmentation, Sparse Training, Feature Fusion

37. Leave it to the Specialist: Repair Sparse LLMs with Sparse Fine-Tuning via Sparsity Evolution

Qiao Xiao, Alan Ansell, Boqian Wu, Lu Yin, Mykola Pechenizkiy, Shiwei Liu, Decebal Constantin Mocanu

Keywords: Large Language Models, Fine-Tuning, Sparse Training

38. The Curse of Depth in Large Language Models

Wenfang Sun, Xinyuan Song, Pengxiang Li, Lu Yin, Yefeng Zheng, Shiwei Liu

Keywords: Curse of Depth, Large Language Models, Pre-Layer Normalization

39. Ringleader ASGD: The First Asynchronous SGD with Optimal Time Complexity under Data Heterogeneity

Arto Maranjyan, Peter Richtárik

Keywords: asynchronous SGD, data heterogeneity, optimal time complexity, nonconvex optimization, parallel methods, stochastic optimization

40. FOSL: A Foldable Sparse-and-Low-Rank Method for Efficient LLM Pre-training

Dong Wang, Francesco Corti, Yun Cheng, Olga Saukh

Keywords: efficient pre-training, low-rank adaption, structured sparsity, large language models, model folding

41. LIFT the Veil for the Truth: Principal Weights Emerge after Rank Reduction for Reasoning-Focused Supervised Fine-Tuning

Zihang Liu, Tianyu Pang, Oleg Balabanov, Chaoqun Yang, Tianjin Huang, Lu Yin, Yaoqing Yang, Shiwei Liu

Keywords: Reasoning, Sparse Fine-tuning, Low-Rank Approximation, Memory Efficiency

42. Johnson-Lindenstrauss Lemma Beyond Euclidean Geometry

Chengyuan Deng, Jie Gao, Kevin Lu, Feng Luo, Cheng Xin

Keywords: Dimension Reduction, Geometry

43. Stackelberg Control in Combinatorial Congestion Games without Differentiating Through Equilibria

Saeed Masiha, Sepehr Elahi, Negar Kiyavash, Patrick Thiran

Keywords: Stackelberg games, zeroth-order optimization, congestion games, zero-suppressed decision diagrams (ZDDs), compact combinatorial representations

Poster Session II

Time: Day 3 (Mar 25) – Wednesday – 3:30 PM to 5:30 PM

1. Stochastic Unrolled Neural Networks

Samar Hadou, Navid NaderiAlizadeh, Alejandro Ribeiro

Keywords: unrolled optimization, learning to learn, deep unfolding, interpretable deep architecture, constrained learning

2. Symbiotic Cooperation for Web Agents: Harnessing Complementary Strengths of Large and Small LLMs

Ruichen Zhang, Mufan Qiu, Zhen Tan, Mohan Zhang, Xiaopeng Lu, Jie Peng, Kaidi Xu, Leandro Z. Agudelo, Peter Zhenghao Qian, Tianlong Chen

Keywords: LLM, Agent, Knowledge Distillation, Web Agent, Symbiotic Cooperation, Privacy Preservation, Hybrid Mode

3. FocusDC: Real-World Scene Infusion for Robust Dataset Condensation

Youbing Hu, Yun Cheng, Olga Saukh, Firat Ozdemir, Anqi Lu, Zhiqiang Cao, Min Zhang, Zhijun Li

Keywords: Dataset Distillation and Condensation, Vision Transformer

4. Enhancing Low-Cost Video Editing with Lightweight Adaptors and Temporal-Aware Inversion

Yangfan He, Sida Li, Jianhui Wang, Xinyuan Song, Kun Li, Xinhang Yuan, Kuan Lu, Menghao Huo, Jingqun Tang, Yi Xin, Jiaqi Chen, Keqin Li, Miao Zhang, Xueqian Wang

Keywords: Text-to-Image (T2I) Generation, Diffusion Models, Text-to-Video (T2V) Editing, Temporal Consistency, Spatial Consistency

5. MMA: Benchmarking Multi-Modal Large Language Models in Ambiguity Contexts

Ru Wang, Selena Song, Yuquan Wang, Liang Ding, Mingming Gong, Yusuke Iwasawa, Yutaka Matsuo, Jiaxian Guo

Keywords: Multi-Modal Large Language Model, Ambiguity, Benchmark, Dataset

6. Byzantine-Robust Optimization under $(L_0,L_1)$-Smoothness

Arman Bolatov, Samuel Horváth, Martin Takáč, Eduard Gorbunov

Keywords: byzantine-robust optimization, federated learning, generalized smoothness, normalized SGD

7. ROSE: Reordered SparseGPT for More Accurate One-Shot Large Language Models Pruning

Mingluo Su, Huan Wang

Keywords: Large language models, Unstructured pruning, Pruning order

8. Selective Collaboration for Robust Federated Learning

Nazarii Tupitsa, Samuel Horváth, Martin Takáč, Eduard Gorbunov

Keywords: federated learning, robust aggregation

9. Trainable Bitwise Soft Quantization for Input Feature Compression

Karsten Schrödter, Jan Stenkamp, Nina Herrmann, Fabian Gieseke

Keywords: Soft Quantization, Trainable Quantization, Input Compression, Tiny Machine Learning, Split Inference

10. Efficient Temporal Consistency in Diffusion-Based Video Editing with Adaptor Modules: A Theoretical Framework

Xinyuan Song, Yangfan He, Sida Li, Jianhui Wang, Hongyang He, Xinhang Yuan, Ruoyu Wang, Jiaqi Chen, Keqin Li, Kuan Lu, Menghao Huo, Ziqian Bi, Binxu Li, Pei Liu

Keywords: Adapter-based Methods, Diffusion Models, Video Editing, Temporal Consistency, DDIM Inversion, Prompt Learning, Theoretical Analysis

11. Learning of Discretized LSTMs

Nikolaus Kopp, Franz Pernkopf

Keywords: probabilistic, QAT, discrete LSTM, Gumbel-Softmax

12. FLIPR: FLexible and Interpretable Prediction Regions for time series

Eshant English, Christoph Lippert

Keywords: interpretable regions, time series, conformal prediction

13. SPIKE: Sparse Koopman Regularization for Physics-Informed Neural Networks

Jose Marie Antonio Miñoza

Keywords: Physics-Informed Neural Networks, Koopman Operator, Out-Of-Distribution Generalization, Dynamical Systems

14. GRAIL: Post-hoc Compensation by Linear Reconstruction for Compressed Networks

Wenwu Tang, Dong Wang, Lothar Thiele, Olga Saukh

Keywords: Model Compression, Model Pruning, Model Folding, Model Compensation, LLM, Model Efficiency

15. What Scalable Second-Order Information Knows for Pruning at Initialization

Ivo Gollini Navarrete, Nicolas Mauricio Cuadrado, Martin Takáč, Samuel Horváth

Keywords: Pruning, Hessian, One-shot, Initialization, Hutchinson, Fisher

16. Superclass-Guided Representation Disentanglement for Spurious Correlation Mitigation

Chenruo Liu, Hongjun Liu, Zeyu Lai, Yiqiu Shen, Chen Zhao, Qi Lei

Keywords: Spurious Correlation, Group Robustness, Domain Generalization

17. Lattice-Based Vector Quantization for Low-Bit Quantization-Aware Training

Rishika Kohli, Soma S Dhavala, Shaifu Gupta, Manoj Singh Gaur

Keywords: compression, quantization, pruning, deep learning, vector quantization, quantization aware training, post training quantization, BERT

18. Deep Neural Regression Collapse

Akshay Rangamani, Altay Unal

Keywords: Neural Collapse, Low Rank, Neural Regression Collapse

19. Learning in the Null Space: Small Singular Values for Continual Learning

Cuong Anh Pham, Praneeth Vepakomma, Samuel Horváth

Keywords: continual learning, singular value decomposition, small singular values, null space

20. A Stein identity for $q$-Gaussians with bounded support

Sophia Sklaviadis, Thomas Möllenhoff, Mario A. T. Figueiredo, Andre Martins, Mohammad Emtiyaz Khan

Keywords: Generalized Stein identities, elliptical families, bounded-support q-Gaussians

21. LLMQ: Efficient Lower-Precision LLM Training for Consumer GPUs

Erik Schultheis, Dan Alistarh

Keywords: consumer GPU, quantized training

22. Beyond In-Distribution Success: Scaling Curves of CoT Granularity for Language Model Generalization

Ru Wang, Wei Huang, Selena Song, Haoyu Zhang, Qian Niu, Yusuke Iwasawa, Yutaka Matsuo, Jiaxian Guo

Keywords: Chain of Thought, Scaling Curve, Out-of-Distribution Generalization, Sample Efficiency

23. Analyzing and Mitigating Model Collapse in Reflow Methods

Huminhao Zhu, Fangyikang Wang, Tianyu Ding, Qing Qu, Zhihui Zhu

Keywords: Model Collapse, Self-training, Synthetic Data, Reflow, Rectified Flow

24. SonoEdit: Null-Space Constrained Knowledge Editing for Pronunciation Correction in LLM-Based TTS

Ayush Pratap Singh, Harshit Singh, Nityanand Mathur, Akshat Mandloi, Sudarshan Kamath

Keywords: Knowledge Editing, Text to Speech, LLMs, Parameter Efficiency

25. KNIGHT: Knowledge Graph-Driven Multiple-Choice Question Generation with Adaptive Hardness Calibration

Mohammad Amanlou, Erfan Shafiee Moghaddam, Mahdi Nouri, Yasaman Amou Jafary, Farhan Farsi, Behnam Bahrak

Keywords: Multiple-Choice Question Generation, Knowledge Graph, Difficulty Calibration, Question Answering Dataset

26. Emergence of Auditory Receptive Fields based on Surprise

Yashaswini, Sneha Dash, Sharba Bandyopadhyay

Keywords: Auditory receptive fields, Bayesian surprise, sparse coding, Oddball paradigm, predictive inference, Autoregressive generative modeling, efficient sensory coding, biologically inspired learning

27. Semantic Homogeneity As Demonstration: Batch-Structured Semi-Supervised In-Context Learning for Natural Language Understanding

Cheng Chen, Yuangang Pan, Ivor Tsang

Keywords: In-Context Learning, Natural Language Understanding, Prompt Engineering / Prompting, Aggregate Ranking

28. GNNs Getting ComFy: Community and Feature Similarity Guided Rewiring

Celia Rubio-Madrigal, Adarsh Jamadandi, Rebekka Burkholz

Keywords: graph neural networks, over-squashing, graph rewiring, community structure, homophily, feature similarity

29. Beyond Scores: Proximal Diffusion Models

Zhenghan Fang, Mateo Diaz, Sam Buchanan, Jeremias Sulam

Keywords: Generative models, Diffusion models, Proximal operators, Backward discretization

30. AdaBoost.SDM: Similarity and dissimilarity-based manifold regularized adaptive boosting algorithm

Azamat Mukhamediya, Amin Zollanvari

Keywords: Ensemble learning, Adaptive boosting, Manifold regularization

31. Pay Attention to Small Weights

Chao Zhou, Tom Jacobs, Advait Gadhikar, Rebekka Burkholz

Keywords: large model, finetuning, efficiency, catastrophic forgetting

32. Dynamic Sparse Training versus Dense Training: The Unexpected Winner in Image Corruption Robustness

Boqian Wu, Qiao Xiao, Shunxin Wang, Nicola Strisciuglio, Mykola Pechenizkiy, Maurice van Keulen, Decebal Constantin Mocanu, Elena Mocanu

Keywords: Dynamic Sparse Training, Image Corruption Robustness

33. Stable Minima of ReLU Neural Networks Suffer from the Curse of Dimensionality: The Neural Shattering Phenomenon

Tongtong Liang, Dan Qiao, Yu-Xiang Wang, Rahul Parhi

Keywords: Generalization bound, minima stability, gradient descent, large learning rate, ReLU neural network, minimax rate

34. HyperINR: Ensuring Semantics in Weights with Implicit Function Theorem

Tianming Qiu, Christos Sonis, Hao Shen

Keywords: Implicit Function Theorem, Semantics in Weights, Weight Space Learning, Implicit Neural Representations, Hypernetworks

35. Connectivity determines the capability of sparse neural network quantum states

Brandon Barton, Juan Felipe Carrasquilla Alvarez, Christopher Robert Roth, Agnes Valenti

Keywords: Neural network quantum states, Pruning, Lottery ticket hypothesis

36. Computational Algebra with Attention: Transformer Oracles for Border Basis Algorithms

Hiroshi Kera, Nico Pelleriti, Yuki Ishihara, Max Zimmer, Sebastian Pokutta

Keywords: Polynomial System Solving, Border Bases, Transformer, Computational Algebra, AI4Science, AI4Math

37. REAP the Experts: Why Pruning Prevails for One-Shot MoE Compression

Mike Lasby, Ivan Lazarevich, Nish Sinnadurai, Sean Lie, Yani Ioannou, Vithursan Thangarasa

Keywords: mixture-of-experts, moe, compression, expert pruning, expert merging, merging, pruning, LLM, evaluation

38. Hyperbolic Aware Minimization: Implicit Bias for Sparsity

Tom Jacobs, Advait Gadhikar, Celia Rubio-Madrigal, Rebekka Burkholz

Keywords: Sparsity, Implicit bias, Sign flip, Exponential update, Training dynamics, Bregman function

39. Beyond the Ideal: Analyzing the Inexact Muon Update

Egor Shulgin, Sultan AlRashed, Francesco Orabona, Peter Richtárik

Keywords: Optimization, Muon

40. The Graphon Limit Hypothesis: Understanding Neural Network Pruning via Infinite Width Analysis

Hoang Pham, The-Anh Ta, Tom Jacobs, Rebekka Burkholz, Long Tran-Thanh

Keywords: Pruning Network, Graphon, Neural Tangent Kernel

41. Fixed Aggregation Features Can Rival GNNs

Celia Rubio-Madrigal, Rebekka Burkholz

Keywords: deep learning, graph neural networks, node classification, kolmogorov-arnold representation, tabular learning, non-trainable aggregation

42. SparseOpt: Addressing Normalization-induced Gradient Skew in Sparse Training

Mohammed Adnan, Rohan Jain, Tom Jacobs, Ekansh Sharma, Rahul G Krishnan, Rebekka Burkholz, Yani Ioannou

Keywords: sparse training, dynamic sparse training, training dynamics, normalization layers

43. SALAAD: Sparse and Low-Rank Adaptation via ADMM for Large Language Model Inference

Hao Ma, Melis Ilayda Bal, Liang Zhang, Bingcong Li, Niao He, Melanie Zeilinger, Michael Muehlebach

Keywords: Sparse and Low-Rank Learning, Large Language Models, Structured Optimization, Model Compression, Elastic Inference

44. LOST: Low-rank and Sparse Pre-training for Large Language Models

Jiaxi Li, Lu Yin, Li Shen, Jinjin Xu, Adarsh Kappiyath, LiWu Xu, Tianjin Huang, Wenwu Wang, Shiwei Liu, Xilu Wang

Keywords: Large language models, Low-rank, Sparse, Singular value decomposition, Pre-training