_bibliography/references.bib: 10 additions & 0 deletions
@@ -1,5 +1,15 @@
 ---
 ---
+@inproceedings{jacobs2025mirror,
+  title={Mirror, Mirror of the Flow: How Does Regularization Shape Implicit Bias?},
+  author={Tom Jacobs and Chao Zhou and Rebekka Burkholz},
+  booktitle={Forty-second International Conference on Machine Learning},
+  year={2025},
+  url={https://arxiv.org/abs/2504.12883},
+  pdf={https://arxiv.org/pdf/2504.12883},
+  img={mirror-mirror.jpg},
+  abstract={Implicit bias plays an important role in explaining how overparameterized models generalize well. Explicit regularization like weight decay is often employed in addition to prevent overfitting. While both concepts have been studied separately, in practice, they often act in tandem. Understanding their interplay is key to controlling the shape and strength of implicit bias, as it can be modified by explicit regularization. To this end, we incorporate explicit regularization into the mirror flow framework and analyze its lasting effects on the geometry of the training dynamics, covering three distinct effects: positional bias, type of bias, and range shrinking. Our analytical approach encompasses a broad class of problems, including sparse coding, matrix sensing, single-layer attention, and LoRA, for which we demonstrate the utility of our insights. To exploit the lasting effect of regularization and highlight the potential benefit of dynamic weight decay schedules, we propose to switch off weight decay during training, which can improve generalization, as we demonstrate in experiments.},
+}
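The proposal at the end of this abstract, switching weight decay off during training, is straightforward to prototype. Below is a minimal sketch of such a dynamic weight-decay schedule in PyTorch; it is not the paper's code, and the model, data, and switch epoch are placeholder assumptions.

```python
import torch
import torch.nn as nn

# Placeholder model and data; the paper analyzes settings such as sparse
# coding, matrix sensing, single-layer attention, and LoRA instead.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, weight_decay=1e-4)

SWITCH_EPOCH = 50  # hypothetical point at which weight decay is disabled

for epoch in range(100):
    if epoch == SWITCH_EPOCH:
        # Switch off weight decay for the rest of training; the bias the
        # regularization has induced so far persists in the dynamics.
        for group in optimizer.param_groups:
            group["weight_decay"] = 0.0
    x, y = torch.randn(32, 10), torch.randn(32, 1)  # stand-in batch
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
```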
_data/alumni_members.yml: 10 additions & 0 deletions
@@ -15,6 +15,16 @@
       end_date: Jul 24
       url: https://nelaturuharsha.github.io/
 
+- role: Research engineers
+  members:
+    - name: Nikita (Nik) Vedeneev
+      last_name: Vedeneev
+      photo: c01mive.jpg
+      start_date: Dec 24
+      end_date: May 25
+      email: mikita.vedzeneyeu@cispa.de
+      description: "I am interested in making modern AI models efficient. In particular, I work on discovering and exploiting structure in Neural Networks (sparsity, low-dimensional representations and similar) for efficient training, fine-tuning and inference. I am a former full-time core developer for [PyTorch](https://github.com/pytorch/pytorch) and [Lightning Thunder](https://github.com/Lightning-AI/lightning-thunder). Check my [GitHub](https://github.com/nikitaved) to see what I work on now."
_data/news.yml: 11 additions & 2 deletions
@@ -1,11 +1,20 @@
+- date: 2. June 2025
+  headline: "Rebekka and Celia are presenting at the International Network Science Conference ([NetSci](https://netsci2025.github.io/)) in Maastricht."
+
+- date: 1. June 2025
+  headline: Welcome Baraah!
+
+- date: 1. May 2025
+  headline: "Our paper on [implicit bias](https://arxiv.org/pdf/2504.12883) has been accepted at ICML 2025."
+
 - date: 24. March 2025
   headline: "Rebekka is at [CPAL](https://cpal.cc/spotlight_track/) in Stanford presenting three of our [papers](/publications) as recent spotlights."
 
 - date: 22. January 2025
   headline: "Two papers
     [(1)](https://openreview.net/forum?id=g6v09VxgFw)
     [(2)](https://openreview.net/forum?id=U47ymTS3ut)
-    have been accepted at ICLR 2025 (see [publications](/publications))."
+    have been accepted at ICLR 2025."
 
 - date: 1. December 2024
   headline: "Welcome to Gowtham and Nik!"
@@ -21,7 +30,7 @@
   headline: "Welcome to Chao, Rahul, and Dong!"
 
 - date: 14. June 2024
-  headline: "Celia, Advait and Adarsh are presenting at the Helmholtz AI Conference: AI for Science ([HAICON](https://eventclass.it/haic2024/scientific/external-program/session?s=S-05a))."
+  headline: "Celia, Advait and Adarsh are presenting at the Helmholtz AI Conference: AI for Science ([HAICON](https://eventclass.it/haic2024/scientific/external-program/session?s=S-05a)) in Düsseldorf."
 
 - date: 1. May 2024
   headline: "Our paper on [improving GATs](https://openreview.net/forum?id=Sjv5RcqfuH) has been accepted at ICML 2024."
_data/team_members.yml: 6 additions & 9 deletions
@@ -74,12 +74,9 @@
       url: https://cispa.de/en/people/c01dosu
       description: "My current research focuses on theoretically elucidating the superior performance of Mixture of Experts models, with an emphasis on their generalization performance, sample complexity, training dynamics, and robustness to adversarial noises."
 
-- role: Research engineers
-  members:
-    - name: Nikita (Nik) Vedeneev
-      last_name: Vedeneev
-      photo: c01mive.jpg
-      start_date: Dec 2024
-      email: mikita.vedzeneyeu@cispa.de
-      url: https://cispa.de/en/people/c01mive
-      description: "I am interesting in making modern AI models efficient. In particular, I work on discovering and exploiting structure in Neural Networks (sparsity, low-dimensional representations and similar) for efficient training, fine-tuning and inference. I am a former full-time core developer for [PyTorch](https://github.com/pytorch/pytorch) and [Lightning Thunder](https://github.com/Lightning-AI/lightning-thunder). Check my [GitHub](https://github.com/nikitaved) to see what I work on now."
_site/index.html: 10 additions & 10 deletions
@@ -120,28 +120,28 @@ <h1 id="relational-ml-lab">Relational ML Lab</h1>
 <h3>News</h3>
 <div class="well">
 
-<b>24 Mar 2025</b>
-<p>Rebekka is at <a href="https://cpal.cc/spotlight_track/">CPAL</a> in Stanford presenting three of our <a href="/publications">papers</a> as recent spotlights.</p>
+<b>02 Jun 2025</b>
+<p>Rebekka and Celia are presenting at the International Network Science Conference (<a href="https://netsci2025.github.io/">NetSci</a>) in Maastricht.</p>
 
 <hr/>
 
-<b>22 Jan 2025</b>
-<p>Two papers <a href="https://openreview.net/forum?id=g6v09VxgFw">(1)</a> <a href="https://openreview.net/forum?id=U47ymTS3ut">(2)</a> have been accepted at ICLR 2025 (see <a href="/publications">publications</a>).</p>
+<b>01 Jun 2025</b>
+<p>Welcome Baraah!</p>
 
 <hr/>
 
-<b>01 Dec 2024</b>
-<p>Welcome to Gowtham and Nik!</p>
+<b>01 May 2025</b>
+<p>Our paper on <a href="https://arxiv.org/pdf/2504.12883">implicit bias</a> has been accepted at ICML 2025.</p>
 
 <hr/>
 
-<b>25 Sep 2024</b>
-<p>Three papers <a href="https://openreview.net/forum?id=EMkrwJY2de">(1)</a> <a href="https://openreview.net/forum?id=IfZwSRpqHl">(2)</a> <a href="https://openreview.net/forum?id=FNtsZLwkGr">(3)</a> have been accepted at NeurIPS 2024.</p>
+<b>24 Mar 2025</b>
+<p>Rebekka is at <a href="https://cpal.cc/spotlight_track/">CPAL</a> in Stanford presenting three of our <a href="/publications">papers</a> as recent spotlights.</p>
 
 <hr/>
 
-<b>01 Jul 2024</b>
-<p>Welcome to Chao, Rahul, and Dong!</p>
+<b>22 Jan 2025</b>
+<p>Two papers <a href="https://openreview.net/forum?id=g6v09VxgFw">(1)</a> <a href="https://openreview.net/forum?id=U47ymTS3ut">(2)</a> have been accepted at ICLR 2025.</p>
_site/news.html: 17 additions & 2 deletions
@@ -71,13 +71,28 @@ <h1 id="news">News</h1>
 
 <hr/>
 
+<h4>02 June 2025</h4>
+<p>Rebekka and Celia are presenting at the International Network Science Conference (<a href="https://netsci2025.github.io/">NetSci</a>) in Maastricht.</p>
+
+<hr/>
+
+<h4>01 June 2025</h4>
+<p>Welcome Baraah!</p>
+
+<hr/>
+
+<h4>01 May 2025</h4>
+<p>Our paper on <a href="https://arxiv.org/pdf/2504.12883">implicit bias</a> has been accepted at ICML 2025.</p>
+
+<hr/>
+
 <h4>24 March 2025</h4>
 <p>Rebekka is at <a href="https://cpal.cc/spotlight_track/">CPAL</a> in Stanford presenting three of our <a href="/publications">papers</a> as recent spotlights.</p>
 
 <hr/>
 
 <h4>22 January 2025</h4>
-<p>Two papers <a href="https://openreview.net/forum?id=g6v09VxgFw">(1)</a> <a href="https://openreview.net/forum?id=U47ymTS3ut">(2)</a> have been accepted at ICLR 2025 (see <a href="/publications">publications</a>).</p>
+<p>Two papers <a href="https://openreview.net/forum?id=g6v09VxgFw">(1)</a> <a href="https://openreview.net/forum?id=U47ymTS3ut">(2)</a> have been accepted at ICLR 2025.</p>
 
 <hr/>
 
@@ -97,7 +112,7 @@ <h4>01 July 2024</h4>
 <hr/>
 
 <h4>14 June 2024</h4>
-<p>Celia, Advait and Adarsh are presenting at the Helmholtz AI Conference: AI for Science (<a href="https://eventclass.it/haic2024/scientific/external-program/session?s=S-05a">HAICON</a>).</p>
+<p>Celia, Advait and Adarsh are presenting at the Helmholtz AI Conference: AI for Science (<a href="https://eventclass.it/haic2024/scientific/external-program/session?s=S-05a">HAICON</a>) in Düsseldorf.</p>
@@ (truncated: further hunks from the generated _site pages) @@
+<p>Implicit bias plays an important role in explaining how overparameterized models generalize well. Explicit regularization like weight decay is often employed in addition to prevent overfitting. While both concepts have been studied separately, in practice, they often act in tandem. Understanding their interplay is key to controlling the shape and strength of implicit bias, as it can be modified by explicit regularization. To this end, we incorporate explicit regularization into the mirror flow framework and analyze its lasting effects on the geometry of the training dynamics, covering three distinct effects: positional bias, type of bias, and range shrinking. Our analytical approach encompasses a broad class of problems, including sparse coding, matrix sensing, single-layer attention, and LoRA, for which we demonstrate the utility of our insights. To exploit the lasting effect of regularization and highlight the potential benefit of dynamic weight decay schedules, we propose to switch off weight decay during training, which can improve generalization, as we demonstrate in experiments.</p>
+<span class="na">title</span><span class="p">=</span><span class="s">{Mirror, Mirror of the Flow: How Does Regularization Shape Implicit Bias?}</span><span class="p">,</span>
+<span class="na">author</span><span class="p">=</span><span class="s">{Jacobs, Tom and Zhou, Chao and Burkholz, Rebekka}</span><span class="p">,</span>
+<span class="na">booktitle</span><span class="p">=</span><span class="s">{Forty-second International Conference on Machine Learning}</span><span class="p">,</span>
+<p>I am interested in making modern AI models efficient. In particular, I work on discovering and exploiting structure in Neural Networks (sparsity, low-dimensional representations and similar) for efficient training, fine-tuning and inference. I am a former full-time core developer for <a href="https://github.com/pytorch/pytorch">PyTorch</a> and <a href="https://github.com/Lightning-AI/lightning-thunder">Lightning Thunder</a>. Check my <a href="https://github.com/nikitaved">GitHub</a> to see what I work on now.</p>