abstract={Finetuning large pretrained neural networks is known to be resource-intensive, both in terms of memory and computational cost. To mitigate this, a common approach is to restrict training to a subset of the model parameters. By analyzing the relationship between gradients and weights during finetuning, we observe a notable pattern: large gradients are often associated with small-magnitude weights. This correlation is more pronounced in finetuning settings than in training from scratch. Motivated by this observation, we propose NANOADAM, which dynamically updates only the small-magnitude weights during finetuning and offers several practical advantages: first, this criterion is gradient-free -- the parameter subset can be determined without gradient computation; second, it preserves large-magnitude weights, which are likely to encode critical features learned during pretraining, thereby reducing the risk of catastrophic forgetting; thirdly, it permits the use of larger learning rates and consistently leads to better generalization performance in experiments. We demonstrate this for both NLP and vision tasks.},
+img={smallweights.png}
}
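As a rough illustration of the selection rule described in the abstract above (update only small-magnitude weights, chosen without any gradient information), a minimal PyTorch sketch could look like the following. The function names, the `keep_frac` parameter, and the mask-the-gradients mechanism are illustrative assumptions, not the NANOADAM implementation.

```python
import torch

def small_magnitude_masks(model, keep_frac=0.1):
    """Gradient-free selection: keep the smallest-magnitude fraction of each weight tensor."""
    masks = {}
    for name, p in model.named_parameters():
        k = max(1, int(keep_frac * p.numel()))
        thresh = p.detach().abs().flatten().kthvalue(k).values
        masks[name] = (p.detach().abs() <= thresh).float()
    return masks

def masked_step(model, optimizer, masks):
    """Update only the selected small-magnitude weights by zeroing all other gradients.
    Caveat: decoupled weight decay (e.g. AdamW) would still shrink the frozen weights;
    a faithful implementation would mask the parameter update itself."""
    for name, p in model.named_parameters():
        if p.grad is not None:
            p.grad.mul_(masks[name])
    optimizer.step()
    optimizer.zero_grad()

# Usage sketch: compute masks once with small_magnitude_masks(model), then call
# masked_step(model, opt, masks) after each backward pass with a plain torch.optim.Adam.
```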
@inproceedings{ Gadhikar2025SignInTT,
@@ -29,6 +30,7 @@ @inproceedings{ pham2025the
url={https://openreview.net/forum?id=EEZLBhyer1},
pdf={https://openreview.net/pdf?id=EEZLBhyer1},
abstract={Sparse neural networks promise efficiency, yet training them effectively remains a fundamental challenge. Despite advances in pruning methods that create sparse architectures, understanding why some sparse structures are better trainable than others with the same level of sparsity remains poorly understood. Aiming to develop a systematic approach to this fundamental problem, we propose a novel theoretical framework based on the theory of graph limits, particularly graphons, that characterizes sparse neural networks in the infinite-width regime. Our key insight is that connectivity patterns of sparse neural networks induced by pruning methods converge to specific graphons as networks' width tends to infinity, which encodes implicit structural biases of different pruning methods. We postulate the Graphon Limit Hypothesis and provide empirical evidence to support it. Leveraging this graphon representation, we derive a Graphon Neural Tangent Kernel (Graphon NTK) to study the training dynamics of sparse networks in the infinite width limit. Graphon NTK provides a general framework for the theoretical analysis of sparse networks. We empirically show that the spectral analysis of Graphon NTK correlates with observed training dynamics of sparse networks, explaining the varying convergence behaviours of different pruning methods. Our framework provides theoretical insights into the impact of connectivity patterns on the trainability of various sparse network architectures.},
+img={graphons.png}
}
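To make the connectivity-pattern claim in the abstract above a little more tangible, the toy NumPy sketch below block-averages a layer's binary pruning mask after sorting units by degree, one simple way to visualize an empirical step-function estimate of its connectivity pattern. The function name and the block-averaging scheme are illustrative assumptions, not the paper's graphon framework.

```python
import numpy as np

def empirical_connectivity_pattern(mask, num_blocks=20):
    """Block-averaged (step-function) view of a layer's binary pruning mask:
    sort rows/columns by degree, then average over a regular grid of blocks.
    Assumes the mask has at least `num_blocks` rows and columns."""
    mask = np.asarray(mask, dtype=float)          # shape (n_out, n_in), entries in {0, 1}
    rows = np.argsort(-mask.sum(axis=1))          # sort output units by degree
    cols = np.argsort(-mask.sum(axis=0))          # sort input units by degree
    sorted_mask = mask[np.ix_(rows, cols)]
    row_blocks = np.array_split(np.arange(mask.shape[0]), num_blocks)
    col_blocks = np.array_split(np.arange(mask.shape[1]), num_blocks)
    return np.array([[sorted_mask[np.ix_(r, c)].mean() for c in col_blocks]
                     for r in row_blocks])
```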
@inproceedings{jacobs2025mirror,
@@ -62,7 +64,7 @@ @inproceedings{
year={2025},
url={https://openreview.net/forum?id=g6v09VxgFw},
pdf={https://openreview.net/pdf?id=g6v09VxgFw},
-img={gnns-getting-comfy.png},
+img={small-comfy.png},
abstract={Maximizing the spectral gap through graph rewiring has been proposed to enhance the performance of message-passing graph neural networks (GNNs) by addressing over-squashing. However, as we show, minimizing the spectral gap can also improve generalization. To explain this, we analyze how rewiring can benefit GNNs within the context of stochastic block models. Since spectral gap optimization primarily influences community strength, it improves performance when the community structure aligns with node labels. Building on this insight, we propose three distinct rewiring strategies that explicitly target community structure, node labels, and their alignment: (a) community structure-based rewiring (ComMa), a more computationally efficient alternative to spectral gap optimization that achieves similar goals; (b) feature similarity-based rewiring (FeaSt), which focuses on maximizing global homophily; and (c) a hybrid approach (ComFy), which enhances local feature similarity while preserving community structure to optimize label-community alignment. Extensive experiments confirm the effectiveness of these strategies and support our theoretical insights.},
code={https://github.com/RelationalML/ComFy}
}
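For intuition only, the sketch below implements a naive feature-similarity-based rewiring step in the spirit of the FeaSt strategy mentioned in the abstract: it adds edges between the most cosine-similar node pairs that are not yet connected. It assumes a PyTorch Geometric style `edge_index` tensor of shape (2, E); it is not the released ComFy code (see the repository linked above).

```python
import torch
import torch.nn.functional as F

def similarity_rewire(edge_index, x, num_add=10):
    """Add `num_add` undirected edges between the most cosine-similar, currently
    unconnected node pairs. `edge_index`: LongTensor (2, E); `x`: node feature matrix."""
    n = x.size(0)
    z = F.normalize(x, dim=1)
    sim = z @ z.t()
    sim.fill_diagonal_(-1.0)                      # never propose self-loops
    existing = set(map(tuple, edge_index.t().tolist()))
    new_edges = []
    for idx in torch.argsort(sim.flatten(), descending=True).tolist():
        i, j = divmod(idx, n)
        if i != j and (i, j) not in existing:
            new_edges.append((i, j))
            existing.update({(i, j), (j, i)})
            if len(new_edges) == num_add:
                break
    if not new_edges:
        return edge_index
    added = torch.tensor(new_edges, dtype=torch.long, device=edge_index.device).t()
    return torch.cat([edge_index, added, added.flip(0)], dim=1)   # keep the graph undirected
```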
@@ -75,7 +77,7 @@ @inproceedings{
year={2024},
url={https://openreview.net/forum?id=EMkrwJY2de},
pdf={https://openreview.net/pdf?id=EMkrwJY2de},
-img={spectral-graph-pruning.png},
+img={small-spectral.png},
abstract={Message Passing Graph Neural Networks are known to suffer from two problems that are sometimes believed to be diametrically opposed: over-squashing and over-smoothing. The former results from topological bottlenecks that hamper the information flow from distant nodes and are mitigated by spectral gap maximization, primarily, by means of edge additions. However, such additions often promote over-smoothing that renders nodes of different classes less distinguishable. Inspired by the Braess phenomenon, we argue that deleting edges can address over-squashing and over-smoothing simultaneously. This insight explains how edge deletions can improve generalization, thus connecting spectral gap optimization to a seemingly disconnected objective of reducing computational resources by pruning graphs for lottery tickets. To this end, we propose a computationally effective spectral gap optimization framework to add or delete edges and demonstrate its effectiveness on the long range graph benchmark and on larger heterophilous datasets.},
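As a small, brute-force illustration of the spectral quantity discussed in this abstract, the sketch below computes the spectral gap (the second-smallest eigenvalue of the normalized Laplacian) with NetworkX and scores single-edge deletions by how much they increase it, in the spirit of the Braess phenomenon. It is a naive per-edge eigendecomposition loop, not the computationally efficient optimization framework proposed in the paper.

```python
import networkx as nx
import numpy as np

def spectral_gap(G):
    """Second-smallest eigenvalue of the symmetric normalized Laplacian."""
    L = nx.normalized_laplacian_matrix(G).toarray()
    return np.sort(np.linalg.eigvalsh(L))[1]

def best_edge_deletion(G):
    """Brute force: find the single edge whose removal increases the spectral gap
    the most while keeping the graph connected (a Braess-style deletion)."""
    base = spectral_gap(G)
    best_edge, best_gain = None, 0.0
    for u, v in list(G.edges()):
        H = G.copy()
        H.remove_edge(u, v)
        if nx.is_connected(H):
            gain = spectral_gap(H) - base
            if gain > best_gain:
                best_edge, best_gain = (u, v), gain
    return best_edge, best_gain

# e.g. best_edge_deletion(nx.karate_club_graph())
```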
_data/news.yml: 8 additions & 2 deletions
@@ -1,6 +1,12 @@
- date: 10. November 2025
headline: "Celia and Tom are presenting at the Workshop on Geometry, Topology, and Machine Learning ([GTML](https://www.mis.mpg.de/events/series/workshop-on-geometry-topology-and-machine-learning-gtml-2025)) in Leipzig."
+- date: 8. October 2025
+headline: "Baraah is presenting her work on game-aware optimization at the [Shocklab](https://shocklab.net/seminars/) online seminar."
+
+- date: 6. October 2025
+headline: "Welcome to Adnan (University of Calgary) and Levy (University of Campinas) for research stays. Furthermore, Prof. [Yani Ioannou](https://yani.ai/) (University of Calgary) is visiting us for a talk on sparse training."
+
- date: 18. September 2025
headline: "Three papers
[(1)](https://openreview.net/forum?id=XKnOA7MhCz)
@@ -12,7 +18,7 @@
headline: "Rebekka and Celia are presenting at the Workshop on Mining and Learning with Graphs ([MLG](https://mlg-europe.github.io/2025/)) in Porto with a keynote and two posters, respectively."
- date: 14. August 2025
-headline: "Tom is [presenting](https://cohere.com/events/Cohere-Labs-Tom-Jacobs-2025) his work on implicit regularization at Cohere Labs: Open Science Community ([video](/outreach#tom-jacobs--cohere-labs-aug-14-2025))."
+headline: "Tom is presenting his work on implicit regularization at [Cohere Labs](https://cohere.com/events/Cohere-Labs-Tom-Jacobs-2025): Open Science Community ([video](/outreach#tom-jacobs--cohere-labs-aug-14-2025))."
- date: 12. June 2025
headline: "Tom is attending the AI & Mathematics Workshop ([AIM](https://aimath.nl/index.php/2025/03/13/4th-aim-cluster-event-tilburg/)) at Tilburg University."
@@ -36,7 +42,7 @@
headline: "Rebekka is at [CPAL](https://cpal.cc/spotlight_track/) presenting three [papers](/publications) as recent spotlights."
- date: 13. February 2025
-headline: "Celia is presenting her work on graph rewiring at Cohere Labs: Open Science Community ([video](/outreach#celia-rubio-madrigal--cohere-labs-feb-13-2025))."
+headline: "Celia is presenting her work on graph rewiring at [Cohere Labs](https://cohere.com/events/cohere-for-ai-Celia-Rubio-Madrigal-2025): Open Science Community ([video](/outreach#celia-rubio-madrigal--cohere-labs-feb-13-2025))."
description: "„KI-Kompass für Krisenzeiten“ – Echte Hilfe oder Black-Box-Gefahr? Dr. Rebekka Burkholz, CISPA Helmholtz-Zentrum für Informationssicherheit im Gespräch mit Christian Seel, Landesbeauftragter für zivil-militärische Zusammenarbeit und Bevölkerungsschutz."
description: "After coffee break we had Rebekka Burkholz discussing current challenges when modelling gene regulation and how to fix them. Her approach is innovative and allows us to infer biological processes with both scalability and interpretability."
#description: "After coffee break we had Rebekka Burkholz discussing current challenges when modelling gene regulation and how to fix them. Her approach is innovative and allows us to infer biological processes with both scalability and interpretability."
date: "2025-06-02"
+embeds: '<blockquote class="bluesky-embed" data-bluesky-uri="at://did:plc:6azaynaddykjpnn6a2gj7rms/app.bsky.feed.post/3lqmzjkcks22l" data-bluesky-cid="bafyreid4sijbp7ftpkc4bam7e4mvxem4nyi7idji3vvuh2zc5mnexkzhpi" data-bluesky-embed-color-mode="system"><p lang="en">After coffee break we had Rebekka Burkholz discussing current challenges when modelling gene regulation and how to fix them. Her approach is innovative and allows us to infer biological processes with both scalability and interpretability. #NetBioMed2025 #NetSci2025<br><br><a href="https://bsky.app/profile/did:plc:6azaynaddykjpnn6a2gj7rms/post/3lqmzjkcks22l?ref_src=embed">[image or embed]</a></p>— NetBioMed 2025 (<a href="https://bsky.app/profile/did:plc:6azaynaddykjpnn6a2gj7rms?ref_src=embed">@netbiomed2025.bsky.social</a>) <a href="https://bsky.app/profile/did:plc:6azaynaddykjpnn6a2gj7rms/post/3lqmzjkcks22l?ref_src=embed">June 2, 2025 at 4:49 PM</a></blockquote>'
_pages/home.md: 1 addition & 1 deletion
@@ -14,6 +14,6 @@ permalink: /
Welcome! We are the Relational ML research group.
We are part of the [CISPA Helmholtz Center for Information Security](https://cispa.de) in Saarbrücken and St. Ingbert, Germany, and are grateful to [Saarland University (UdS)](https://www.uni-saarland.de) for granting us supervision rights.
-Our research is supported by an [ERC starting grant](https://cispa.de/en/research/grants/sparse-ml)and Apple Research to improve the **efficiency of deep learning**. The aim is to design smaller-scale neural networks, which excel in noisy and potentially changing environments and require minimal sample sizes for learning. This is of particular interest in the sciences and application domains where data is scarce.
+Our research is supported by an [ERC starting grant](https://cispa.de/en/research/grants/sparse-ml) to improve the **efficiency of deep learning**. The aim is to design smaller-scale neural networks, which excel in noisy and potentially changing environments and require minimal sample sizes for learning. This is of particular interest in the sciences and application domains where data is scarce.
We care deeply about solving real world problems in collaboration with domain experts. Of special interest to us are problems related to gene regulation and its alterations during cancer progression, drug design, and international food trade.
From a methodological point of view, we combine robust algorithm design with complex network science to advance deep learning theory and efficiency in general and in various applications ranging from biomedicine to pharmacy, physics, and economics.