Commit e01db20

tom cohere
1 parent df7e64e commit e01db20

9 files changed: +80 -35 lines changed

_bibliography/references.bib

Lines changed: 2 additions & 2 deletions
@@ -5,8 +5,8 @@ @inproceedings{jacobs2025mirror
 author={Tom Jacobs and Chao Zhou and Rebekka Burkholz},
 booktitle={Forty-second International Conference on Machine Learning},
 year={2025},
-url={https://arxiv.org/abs/2504.12883},
-pdf={https://arxiv.org/pdf/2504.12883},
+url={https://openreview.net/forum?id=MLiR9LS5PW},
+pdf={https://openreview.net/pdf?id=MLiR9LS5PW},
 img={mirror-mirror.jpg},
 abstract={Implicit bias plays an important role in explaining how overparameterized models generalize well. Explicit regularization like weight decay is often employed in addition to prevent overfitting. While both concepts have been studied separately, in practice, they often act in tandem. Understanding their interplay is key to controlling the shape and strength of implicit bias, as it can be modified by explicit regularization. To this end, we incorporate explicit regularization into the mirror flow framework and analyze its lasting effects on the geometry of the training dynamics, covering three distinct effects: positional bias, type of bias, and range shrinking. Our analytical approach encompasses a broad class of problems, including sparse coding, matrix sensing, single-layer attention, and LoRA, for which we demonstrate the utility of our insights. To exploit the lasting effect of regularization and highlight the potential benefit of dynamic weight decay schedules, we propose to switch off weight decay during training, which can improve generalization, as we demonstrate in experiments.},
 }

_data/alumni_members.yml

Lines changed: 12 additions & 13 deletions
@@ -1,17 +1,4 @@
 
-
-- role: Research engineers
-members:
-- name: Nikita Vedeneev
-last_name: Vedeneev
-photo: c01mive.jpg
-start_date: Dec 24
-end_date: May 25
-email: mikita.vedzeneyeu@cispa.de
-url: https://github.com/nikitaved
-description: "I am interesting in making modern AI models efficient. In particular, I work on discovering and exploiting structure in Neural Networks (sparsity, low-dimensional representations and similar) for efficient training, fine-tuning and inference. I am a former full-time core developer for [PyTorch](https://github.com/pytorch/pytorch) and [Lightning Thunder](https://github.com/Lightning-AI/lightning-thunder). Check my [GitHub](https://github.com/nikitaved) to see what I work on now."
-next: Senior Engineer at NVIDIA
-
 - role: Research assistants
 members:
 - name: Ben Horvath
@@ -42,6 +29,18 @@
 start_date: Dec 21
 end_date: Oct 22
 
+- role: Research engineers
+members:
+- name: Nikita Vedeneev
+last_name: Vedeneev
+photo: c01mive.jpg
+start_date: Dec 24
+end_date: May 25
+email: mikita.vedzeneyeu@cispa.de
+url: https://github.com/nikitaved
+description: "I am interesting in making modern AI models efficient. In particular, I work on discovering and exploiting structure in Neural Networks (sparsity, low-dimensional representations and similar) for efficient training, fine-tuning and inference. I am a former full-time core developer for [PyTorch](https://github.com/pytorch/pytorch) and [Lightning Thunder](https://github.com/Lightning-AI/lightning-thunder). Check my [GitHub](https://github.com/nikitaved) to see what I work on now."
+next: Senior Engineer at NVIDIA
+
 - role: Visiting students
 members:
 - name: Otto Piramuthu

_data/news.yml

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 headline: "Rebekka and Celia are presenting at the Workshop on Mining and Learning with Graphs ([MLG](https://mlg-europe.github.io/2025/)) in Porto with a keynote and two posters, respectively."
 
 - date: 14. August 2025
-headline: "Tom is [presenting](https://cohere.com/events/Cohere-Labs-Tom-Jacobs-2025) his work on implicit regularization at Cohere Labs: Open Science Community."
+headline: "Tom is [presenting](https://cohere.com/events/Cohere-Labs-Tom-Jacobs-2025) his work on implicit regularization at Cohere Labs: Open Science Community ([video](/outreach#tom-jacobs--cohere-labs-aug-14-2025))."
 
 - date: 12. June 2025
 headline: "Tom is attending the AI & Mathematics Workshop ([AIM](https://aimath.nl/index.php/2025/03/13/4th-aim-cluster-event-tilburg/)) at Tilburg University."

_data/outreach.yml

Lines changed: 14 additions & 0 deletions
@@ -1,4 +1,18 @@
 videos:
+- title: "Weight Decay Controls Implicit Regularization: Insights on Generalization and Sparsity"
+date: 2025-08-14
+speaker: "Tom Jacobs"
+venue: "Cohere Labs"
+video: https://www.youtube.com/embed/KwxqXbgu78c?si=U-5CdYYuHwoP6r5o
+papers:
+- title: "Mask in the Mirror: Implicit Sparsification"
+authors: "Tom Jacobs, and Rebekka Burkholz"
+conference: ICLR 2025
+link: https://openreview.net/forum?id=U47ymTS3ut
+- title: "Mirror, Mirror of the Flow: How Does Regularization Shape Implicit Bias?"
+authors: "Tom Jacobs, Chao Zhou, and Rebekka Burkholz"
+conference: ICML 2025
+link: https://openreview.net/forum?id=MLiR9LS5PW
 - title: "Rewiring Graph Neural Networks: When Less is More and Structure Matters"
 date: 2025-02-13
 speaker: "Celia Rubio-Madrigal"

_site/index.html

Lines changed: 1 addition & 1 deletion
@@ -136,7 +136,7 @@ <h3>News</h3>
 <hr/>
 
 <b>14 Aug 2025</b>
-<p>Tom is <a href="https://cohere.com/events/Cohere-Labs-Tom-Jacobs-2025">presenting</a> his work on implicit regularization at Cohere Labs: Open Science Community.</p>
+<p>Tom is <a href="https://cohere.com/events/Cohere-Labs-Tom-Jacobs-2025">presenting</a> his work on implicit regularization at Cohere Labs: Open Science Community (<a href="/outreach#tom-jacobs--cohere-labs-aug-14-2025">video</a>).</p>
 
 <hr/>
 
_site/news.html

Lines changed: 1 addition & 1 deletion
@@ -83,7 +83,7 @@ <h4>15 September 2025</h4>
 <hr />
 
 <h4>14 August 2025</h4>
-<p>Tom is <a href="https://cohere.com/events/Cohere-Labs-Tom-Jacobs-2025">presenting</a> his work on implicit regularization at Cohere Labs: Open Science Community.</p>
+<p>Tom is <a href="https://cohere.com/events/Cohere-Labs-Tom-Jacobs-2025">presenting</a> his work on implicit regularization at Cohere Labs: Open Science Community (<a href="/outreach#tom-jacobs--cohere-labs-aug-14-2025">video</a>).</p>
 
 <hr />
 
_site/outreach/index.html

Lines changed: 32 additions & 0 deletions
@@ -85,6 +85,38 @@ <h1 id="outreach">Outreach</h1>
 
 <h2 id="videos" class="anchor">Videos</h2>
 
+<h3 id="tom-jacobs--cohere-labs-aug-14-2025">Tom Jacobs @ Cohere Labs (Aug 14, 2025)</h3>
+<h4 id="weight-decay-controls-implicit-regularization-insights-on-generalization-and-sparsity">Weight Decay Controls Implicit Regularization: Insights on Generalization and Sparsity</h4>
+
+<div class="row">
+
+<div class="col-sm-6 clearfix">
+
+<p>Based on papers:</p>
+<ul>
+
+<li>
+<strong>Mask in the Mirror: Implicit Sparsification</strong>,
+Tom Jacobs, and Rebekka Burkholz,
+<em>ICLR 2025</em>. (<a href="https://openreview.net/forum?id=U47ymTS3ut">Link to paper</a>)
+</li>
+
+<li>
+<strong>Mirror, Mirror of the Flow: How Does Regularization Shape Implicit Bias?</strong>,
+Tom Jacobs, Chao Zhou, and Rebekka Burkholz,
+<em>ICML 2025</em>. (<a href="https://openreview.net/forum?id=MLiR9LS5PW">Link to paper</a>)
+</li>
+
+</ul>
+
+</div>
+
+<div class="col-sm-6 clearfix">
+<iframe width="374" height="210" src="https://www.youtube.com/embed/KwxqXbgu78c?si=U-5CdYYuHwoP6r5o&amp;autoplay=0" title="video player" frameborder="0" allow="accelerometer; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen=""></iframe>
+</div>
+
+</div>
+
 <h3 id="celia-rubio-madrigal--cohere-labs-feb-13-2025">Celia Rubio-Madrigal @ Cohere Labs (Feb 13, 2025)</h3>
 <h4 id="rewiring-graph-neural-networks-when-less-is-more-and-structure-matters">Rewiring Graph Neural Networks: When Less is More and Structure Matters</h4>
 
_site/publications/index.html

Lines changed: 3 additions & 3 deletions
@@ -389,7 +389,7 @@ <h2 id="accepted-papers">Accepted papers</h2>
 <div id="jacobs2025mirror" class="col-sm-10">
 <!-- Title -->
 
-<div class="title"><a href="https://arxiv.org/abs/2504.12883"><b>Mirror, Mirror of the Flow: How Does Regularization Shape Implicit Bias?</b></a></div>
+<div class="title"><a href="https://openreview.net/forum?id=MLiR9LS5PW"><b>Mirror, Mirror of the Flow: How Does Regularization Shape Implicit Bias?</b></a></div>
 
 <!-- Author -->
 <div class="author">
@@ -410,7 +410,7 @@ <h2 id="accepted-papers">Accepted papers</h2>
 <!-- Links/Buttons -->
 <div class="links"><a class="conf btn btn-sm z-depth-0">ICML</a><a class="bibtex btn btn-sm z-depth-0" role="button">Bib</a>
 <a class="abstract btn btn-sm z-depth-0" role="button">Abs</a>
-<!-- <a href="https://arxiv.org/pdf/2504.12883" class="btn btn-sm z-depth-0" role="button">PDF</a> -->
+<!-- <a href="https://openreview.net/pdf?id=MLiR9LS5PW" class="btn btn-sm z-depth-0" role="button">PDF</a> -->
 </div>
 
 
@@ -424,7 +424,7 @@ <h2 id="accepted-papers">Accepted papers</h2>
 <span class="na">author</span> <span class="p">=</span> <span class="s">{Jacobs, Tom and Zhou, Chao and Burkholz, Rebekka}</span><span class="p">,</span>
 <span class="na">booktitle</span> <span class="p">=</span> <span class="s">{Forty-second International Conference on Machine Learning}</span><span class="p">,</span>
 <span class="na">year</span> <span class="p">=</span> <span class="s">{2025}</span><span class="p">,</span>
-<span class="na">url</span> <span class="p">=</span> <span class="s">{https://arxiv.org/abs/2504.12883}</span><span class="p">,</span>
+<span class="na">url</span> <span class="p">=</span> <span class="s">{https://openreview.net/forum?id=MLiR9LS5PW}</span><span class="p">,</span>
 <span class="p">}</span></code></pre></figure>
 </div>
 </div>

_site/team/index.html

Lines changed: 14 additions & 14 deletions
@@ -264,20 +264,6 @@ <h2 id="alumni">Alumni</h2>
 
 <!-- -->
 
-<div class="row">
-<div class="col-sm-12 clearfix">
-
-<h3>Research engineers</h3>
-
-<p><a href="https://github.com/nikitaved">Nikita Vedeneev</a>:
-Dec 24-May 25. Next ⇢ Senior Engineer at NVIDIA.</p>
-
-</div>
-
-</div>
-
-<!-- -->
-
 <div class="row">
 <div class="col-sm-12 clearfix">
 
@@ -301,6 +287,20 @@ <h3>Research assistants</h3>
 
 <!-- -->
 
+<div class="row">
+<div class="col-sm-12 clearfix">
+
+<h3>Research engineers</h3>
+
+<p><a href="https://github.com/nikitaved">Nikita Vedeneev</a>:
+Dec 24-May 25. Next ⇢ Senior Engineer at NVIDIA.</p>
+
+</div>
+
+</div>
+
+<!-- -->
+
 <div class="row">
 <div class="col-sm-12 clearfix">
 