<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Science of Learning</title>
<style>
</style>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/[email protected]/dist/katex.min.css" integrity="sha384-yFRtMMDnQtDRO8rLpMIKrtPCD5jdktao2TV19YiZYWMDkUR5GQZR/NOVTdquEx1j" crossorigin="anonymous">
<link href="https://cdn.jsdelivr.net/npm/katex-copytex@latest/dist/katex-copytex.min.css" rel="stylesheet" type="text/css">
<link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/Microsoft/vscode/extensions/markdown-language-features/media/markdown.css">
<link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/Microsoft/vscode/extensions/markdown-language-features/media/highlight.css">
<style>
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe WPC', 'Segoe UI', system-ui, 'Ubuntu', 'Droid Sans', sans-serif;
font-size: 14px;
line-height: 1.6;
}
</style>
<style>
.task-list-item { list-style-type: none; } .task-list-item-checkbox { margin-left: -20px; vertical-align: middle; }
</style>
<script src="https://cdn.jsdelivr.net/npm/katex-copytex@latest/dist/katex-copytex.min.js"></script>
</head>
<body class="vscode-body vscode-light">
<h1 id="science-of--learning">Science of Learning</h1>
<p>V. Vapnik said that "Nothing is more practical than a good theory."
Here we focus on theoretical machine learning.</p>
<ul>
<li><a href="https://www.helsinki.fi/en/researchgroups/constraint-reasoning-and-optimization">https://www.helsinki.fi/en/researchgroups/constraint-reasoning-and-optimization</a></li>
<li><a href="https://www.math.ubc.ca/~erobeva/seminar.html">https://www.math.ubc.ca/~erobeva/seminar.html</a></li>
<li><a href="https://www.deel.ai/theoretical-guarantees/">https://www.deel.ai/theoretical-guarantees/</a></li>
<li><a href="http://www.vanderschaar-lab.com/NewWebsite/index.html">http://www.vanderschaar-lab.com/NewWebsite/index.html</a></li>
<li><a href="https://nthu-datalab.github.io/ml/index.html">https://nthu-datalab.github.io/ml/index.html</a></li>
<li><a href="http://www.cs.cornell.edu/~shmat/research.html">http://www.cs.cornell.edu/~shmat/research.html</a></li>
<li><a href="http://www.prace-ri.eu/best-practice-guide-deep-learning">http://www.prace-ri.eu/best-practice-guide-deep-learning</a></li>
<li><a href="https://math.ethz.ch/sam/research/reports.html?year=2019">https://math.ethz.ch/sam/research/reports.html?year=2019</a></li>
<li><a href="http://gr.xjtu.edu.cn/web/jjx323/home">http://gr.xjtu.edu.cn/web/jjx323/home</a></li>
<li><a href="https://zhouchenlin.github.io/">https://zhouchenlin.github.io/</a></li>
<li><a href="https://www.math.tamu.edu/~bhanin/">https://www.math.tamu.edu/~bhanin/</a></li>
<li><a href="https://yani.io/annou/">https://yani.io/annou/</a></li>
<li><a href="https://probability.dmi.unibas.ch/seminar.html">https://probability.dmi.unibas.ch/seminar.html</a></li>
<li><a href="http://mjt.cs.illinois.edu/courses/dlt-f19/">http://mjt.cs.illinois.edu/courses/dlt-f19/</a></li>
<li><a href="http://danroy.org/">http://danroy.org/</a></li>
<li><a href="https://www.symbiont-project.org/events/Slides-2018-03/SYMBIONT-2018-03-zimmermann.pdf">https://www.symbiont-project.org/events/Slides-2018-03/SYMBIONT-2018-03-zimmermann.pdf</a></li>
<li><a href="https://losslandscape.com/faq/">https://losslandscape.com/faq/</a></li>
<li><a href="https://mcallester.github.io/ttic-31230/">https://mcallester.github.io/ttic-31230/</a></li>
<li><a href="http://deep-phenomena.org/">http://deep-phenomena.org/</a></li>
<li><a href="https://ijcai20interpretability.github.io/">https://ijcai20interpretability.github.io/</a></li>
<li><a href="https://niceworkshop.org/">https://niceworkshop.org/</a></li>
</ul>
<p>Deep learning is a transformative technology that has delivered impressive improvements in image classification and speech recognition.
Many researchers are trying to better understand how to improve prediction performance and also how to improve training methods.
<a href="https://stats385.github.io/">Some researchers use experimental techniques; others use theoretical approaches.</a></p>
<ul>
<li><a href="https://www.cl.cam.ac.uk/~rja14/">https://www.cl.cam.ac.uk/~rja14/</a></li>
</ul>
<p>Deep learning is related to, at the very least, kernel methods, projection pursuit, and neural networks.</p>
<ul>
<li><a href="#science-of--learning">Science of Learning</a>
<ul>
<li><a href="#resource--on-deep-learning-theory">Resource on Deep Learning Theory</a>
<ul>
<li><a href="#deep-learning-reading-group">Deep Learning Reading Group</a></li>
</ul>
</li>
<li><a href="#interpretability-in-ai">Interpretability in AI</a>
<ul>
<li><a href="#interpretability-of-neural-networks">Interpretability of Neural Networks</a></li>
<li><a href="#deeplever">DeepLEVER</a></li>
<li><a href="#dlphi">DLphi</a></li>
<li><a href="#scientific-machine-learning">Scientific Machine Learning</a></li>
</ul>
</li>
<li><a href="#physics-and-deep-learning">Physics and Deep Learning</a>
<ul>
<li><a href="#machine-learning-for-physics">Machine Learning for Physics</a>
<ul>
<li><a href="#deep-learning-for-physics">Deep Learning for Physics</a></li>
</ul>
</li>
<li><a href="#physics-for-machine-learning">Physics for Machine Learning</a>
<ul>
<li><a href="#physics-informed-machine-learning">Physics Informed Machine Learning</a></li>
</ul>
</li>
<li><a href="#statistical-mechanics-and-deep-learning">Statistical Mechanics and Deep Learning</a></li>
<li><a href="#born-machine">Born Machine</a></li>
<li><a href="#quantum-machine-learning">Quantum Machine learning</a></li>
</ul>
</li>
<li><a href="#mathematics-of-deep-learning">Mathematics of Deep Learning</a>
<ul>
<li><a href="#discrete-mathematics-and--neural-networks">Discrete Mathematics and Neural Networks</a></li>
<li><a href="#numerical-analysis-for-deep-learning">Numerical Analysis for Deep Learning</a>
<ul>
<li><a href="#resnets">ResNets</a></li>
<li><a href="#differential-equations-motivated-deep-learning-methods">Differential Equations Motivated Deep Learning Methods</a></li>
</ul>
</li>
<li><a href="#control-theory-and-deep-learning">Control Theory and Deep Learning</a></li>
<li><a href="#neural-ordinary-differential-equations">Neural Ordinary Differential Equations</a></li>
</ul>
</li>
<li><a href="#dynamics-and-deep-learning">Dynamics and Deep Learning</a>
<ul>
<li><a href="#stability-for-neural-networks">Stability For Neural Networks</a></li>
</ul>
</li>
<li><a href="#differential-equation-and-deep-learning">Differential Equation and Deep Learning</a>
<ul>
<li><a href="#deep-learning-for-pdes">Deep Learning for PDEs</a></li>
<li><a href="#mathcal-h-matrix-and-deep-learning"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="script">H</mi></mrow><annotation encoding="application/x-tex">\mathcal H</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.68333em;vertical-align:0em;"></span><span class="mord mathcal" style="margin-right:0.00965em;">H</span></span></span></span> matrix and deep learning</a></li>
<li><a href="#stochastic-differential-equations-and-deep-learning">Stochastic Differential Equations and Deep Learning</a></li>
<li><a href="#finite-element-methods-and-deep-learning">Finite Element Methods and Deep Learning</a></li>
</ul>
</li>
<li><a href="#approximation-theory-for-deep-learning">Approximation Theory for Deep Learning</a>
<ul>
<li><a href="#workshop">Workshop</a></li>
<li><a href="#labs-and-groups">Labs and Groups</a></li>
<li><a href="#the-f-principle">The F-Principle</a></li>
</ul>
</li>
<li><a href="#inverse-problem-and-deep-learning">Inverse Problem and Deep Learning</a>
<ul>
<li><a href="#deep-learning-for-inverse-problems">Deep Learning for Inverse Problems</a></li>
<li><a href="#deep-inverse-optimization">Deep Inverse Optimization</a></li>
</ul>
</li>
<li><a href="#random-matrix-theory-and-deep-learning">Random Matrix Theory and Deep Learning</a>
<ul>
<li><a href="#nonlinear-random-matrix-theory">Nonlinear Random Matrix Theory</a></li>
</ul>
</li>
<li><a href="#deep-learning-and-optimal-transport">Deep learning and Optimal Transport</a>
<ul>
<li><a href="#generative-models-and-optimal-transport">Generative Models and Optimal Transport</a></li>
</ul>
</li>
<li><a href="#geometric-analysis-approach-to-ai">Geometric Analysis Approach to AI</a>
<ul>
<li><a href="#tropical-geometry-of-deep-neural-networks">Tropical Geometry of Deep Neural Networks</a></li>
</ul>
</li>
<li><a href="#topology-and-deep-learning">Topology and Deep Learning</a>
<ul>
<li><a href="#topology-optimization-and--deep-learning">Topology Optimization and Deep Learning</a></li>
</ul>
</li>
<li><a href="#algebra-and-deep-learning">Algebra and Deep Learning</a>
<ul>
<li><a href="#tensor-network">Tensor network</a></li>
<li><a href="#group-equivariant-convolutional-networks">Group Equivariant Convolutional Networks</a></li>
<li><a href="#complex-valued-neural-networks">Complex Valued Neural Networks</a></li>
<li><a href="#quaternion-neural-networks">Quaternion Neural Networks</a></li>
</ul>
</li>
<li><a href="#probabilistic-theory-and-deep-learning">Probabilistic Theory and Deep Learning</a>
<ul>
<li><a href="#bayesian-deep-learning">Bayesian Deep Learning</a></li>
</ul>
</li>
<li><a href="#statistics-and-deep-learning">Statistics and Deep Learning</a>
<ul>
<li><a href="#statistical-relational-ai">Statistical Relational AI</a></li>
<li><a href="#principal-component-neural-networks">Principal Component Neural Networks</a></li>
<li><a href="#least-squares-support-vector-machines">Least squares support vector machines</a></li>
</ul>
</li>
<li><a href="#information-theory-and-deep-learning">Information Theory and Deep Learning</a>
<ul>
<li><a href="#information-bottleneck-theory">Information bottleneck theory</a></li>
</ul>
</li>
<li><a href="#brain-science-and-ai">Brain Science and AI</a>
<ul>
<li><a href="#spiking-neural-networks">Spiking neural networks</a></li>
<li><a href="#the-thousand-brains-theory-of-intelligence">The Thousand Brains Theory of Intelligence</a></li>
</ul>
</li>
<li><a href="#cognition-science-and-deep-learning">Cognition Science and Deep Learning</a></li>
<li><a href="#the-lottery-ticket-hypothesis">The lottery ticket hypothesis</a></li>
<li><a href="#double-descent">Double Descent</a></li>
</ul>
</li>
</ul>
<h2 id="resource--on-deep-learning-theory">Resource on Deep Learning Theory</h2>
<ul>
<li><a href="http://pwp.gatech.edu/fdl-2018/program/">http://pwp.gatech.edu/fdl-2018/program/</a></li>
<li><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6052125/">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6052125/</a></li>
<li><a href="https://ori.ox.ac.uk/labs/a2i/">https://ori.ox.ac.uk/labs/a2i/</a></li>
<li><a href="https://deep-learning-drizzle.github.io/">Deep Learning Drizzle</a></li>
<li><a href="https://github.com/Stephlat/DeepRegression">A Comprehensive Analysis of Deep Regression</a></li>
<li><a href="https://gangwg.github.io/research.html">https://gangwg.github.io/research.html</a></li>
<li><a href="http://www.mit.edu/~k2smith/">http://www.mit.edu/~k2smith/</a></li>
<li><a href="http://www.dfki.de/semdeep-4/">4th Workshop on Semantic Deep Learning (SemDeep-4)</a></li>
<li><a href="https://www.lri.fr/~gcharpia/deeppractice/">Deep Learning in Practice</a></li>
<li><a href="http://guillefix.me/cosmos/static/Deep%2520learning%2520theory">Deep learning theory</a></li>
<li><a href="http://web.fsktm.um.edu.my/~cschan/iredlia.html">2018 Workshop on Interpretable & Reasonable Deep Learning and its Applications (IReDLiA)</a></li>
<li><a href="https://stats385.github.io/">Analyses of Deep Learning (STATS 385) 2019</a></li>
<li><a href="http://www.cs.ox.ac.uk/people/yarin.gal/website/blog_5058.html">The Science of Deep Learning</a></li>
<li><a href="http://workshop.tbsi.edu.cn/index.html">TBSI 2019 Retreat Conference</a></li>
<li><a href="https://people.csail.mit.edu/madry/6.883/">6.883 Science of Deep Learning: Bridging Theory and Practice -- Spring 2018</a></li>
<li><a href="http://mitliagkas.github.io/ift6085-dl-theory-class/">(Winter 2018) IFT 6085: Theoretical principles for deep learning</a></li>
<li><a href="http://principlesofdeeplearning.com/">http://principlesofdeeplearning.com/</a></li>
<li><a href="https://cbmm.mit.edu/education/courses">https://cbmm.mit.edu/education/courses</a></li>
<li><a href="http://dalimeeting.org/dali2018/workshopTheoryDL.html">DALI 2018 - Data, Learning and Inference</a></li>
<li><a href="http://www.deeplearningpatterns.com/doku.php?id=theory">On Theory@http://www.deeplearningpatterns.com </a></li>
<li><a href="https://blog.csdn.net/dQCFKyQDXYm3F8rB0/article/details/85815724">https://blog.csdn.net/dQCFKyQDXYm3F8rB0/article/details/85815724</a></li>
<li><a href="https://uvadlc.github.io/">UVA DEEP LEARNING COURSE</a></li>
<li><a href="https://rakeshchada.github.io/Neural-Embedding-Animation.html">Understanding Neural Networks by embedding hidden representations</a></li>
<li><a href="https://www.cs.washington.edu/research/tractable-deep-learning">Tractable Deep Learning</a></li>
<li><a href="https://stats385.github.io/">Theories of Deep Learning (STATS 385)</a></li>
<li><a href="https://github.com/joanbruna/stat212b">Topics Course on Deep Learning for Spring 2016 by Joan Bruna, UC Berkeley, Statistics Department</a></li>
<li><a href="http://elmos.scripts.mit.edu/mathofdeeplearning/">Mathematical aspects of Deep Learning</a></li>
<li><a href="https://deeplearning-math.github.io/">MATH 6380p. Advanced Topics in Deep Learning Fall 2018</a></li>
<li><a href="https://www.advancedtopicsindeeplearning.com/">CoMS E6998 003: Advanced Topics in Deep Learning</a></li>
<li><a href="http://www.mit.edu/~9.520/fall17/Classes/deep_learning_theory.html">Deep Learning Theory: Approximation, Optimization, Generalization</a></li>
<li><a href="https://sites.google.com/site/deeplearningtheory/">Theory of Deep Learning, ICML'2018</a></li>
<li><a href="http://dalimeeting.org/dali2018/workshopTheoryDL.html">DALI 2018, Data Learning and Inference</a></li>
<li><a href="https://github.com/joanbruna/MathsDL-spring18">MATHEMATICS OF DEEP LEARNING, NYU, Spring 2018</a></li>
<li><a href="https://www.researchgate.net/project/Theory-of-Deep-Learning">Theory of Deep Learning, project in researchgate</a></li>
<li><a href="https://physicsml.github.io/blog/DL-theory.html">THE THEORY OF DEEP LEARNING - PART I</a></li>
<li><a href="http://cognitivemedium.com/magic_paper/index.html">Magic paper</a></li>
<li><a href="https://www.padl.ws/">Principled Approaches to Deep Learning</a></li>
<li><a href="https://arxiv.org/pdf/1811.03962.pdf">A Convergence Theory for Deep Learning via Over-Parameterization</a></li>
<li><a href="https://github.com/brendenlake/AAI-site">Advancing AI through cognitive science</a></li>
<li><a href="http://stillbreeze.github.io/Deep-Learning-and-the-Demand-For-Interpretability/">Deep Learning and the Demand for Interpretability</a></li>
<li><a href="https://www.robots.ox.ac.uk/~vedaldi//research/idiu/idiu.html">Integrated and detailed image understanding</a></li>
<li><a href="http://nips2018dltheory.rice.edu/">NeuroIP 2018 workshop on Deep Learning Theory</a></li>
<li><a href="http://networkinterpretability.org/">http://networkinterpretability.org/</a></li>
<li><a href="https://interpretablevision.github.io/">https://interpretablevision.github.io/</a></li>
<li><a href="https://www.msra.cn/zh-cn/news/people-stories/wei-chen">https://www.msra.cn/zh-cn/news/people-stories/wei-chen</a></li>
<li><a href="https://www.microsoft.com/en-us/research/people/tyliu/">https://www.microsoft.com/en-us/research/people/tyliu/</a></li>
<li><a href="https://zhuanlan.zhihu.com/p/22353056">https://zhuanlan.zhihu.com/p/22353056</a></li>
<li><a href="http://qszhang.com/index.php/team/">http://qszhang.com/index.php/team/</a></li>
<li><a href="https://www.researchgate.net/profile/Hatef_Monajemi">https://www.researchgate.net/profile/Hatef_Monajemi</a></li>
<li><a href="https://indico.cern.ch/event/781223/">Symposium Artificial Intelligence for Science, Industry and Society</a></li>
<li><a href="https://arxiv.org/abs/1909.13458">https://arxiv.org/abs/1909.13458</a></li>
<li><a href="https://www.lri.fr/TAU_seminars/">TAU & GTDeepNet seminars</a></li>
</ul>
<h3 id="deep-learning-reading-group">Deep Learning Reading Group</h3>
<p><a href="http://www.cs.virginia.edu//papers.htm">yanjun</a> organized a wonderful reading group on deep learning.</p>
<ul>
<li><a href="https://a2i2.deakin.edu.au/">https://a2i2.deakin.edu.au/</a></li>
<li><a href="https://qdata.github.io/deep2Read/">https://qdata.github.io/deep2Read/</a></li>
<li><a href="https://dlta-reading.github.io/">https://dlta-reading.github.io/</a></li>
</ul>
<ul>
<li><a href="http://www.mlnl.cs.ucl.ac.uk/readingroup.html">http://www.mlnl.cs.ucl.ac.uk/readingroup.html</a></li>
<li><a href="https://labrosa.ee.columbia.edu/cuneuralnet/">https://labrosa.ee.columbia.edu/cuneuralnet/</a></li>
<li><a href="http://www.ub.edu/cvub/reading-group/">http://www.ub.edu/cvub/reading-group/</a></li>
<li><a href="https://team.inria.fr/perception/deeplearning/">https://team.inria.fr/perception/deeplearning/</a></li>
<li><a href="https://scholar.princeton.edu/csmlreading">https://scholar.princeton.edu/csmlreading</a></li>
<li><a href="https://junjuew.github.io/elijah-reading-group/">https://junjuew.github.io/elijah-reading-group/</a></li>
<li><a href="http://www.sribd.cn/DL/schedule.html">http://www.sribd.cn/DL/schedule.html</a></li>
<li><a href="http://lear.inrialpes.fr/people/gaidon/lear_xrce_deep_learning_01.html">http://lear.inrialpes.fr/people/gaidon/lear_xrce_deep_learning_01.html</a></li>
<li><a href="https://simons.berkeley.edu/events/reading-group-deep-learning">https://simons.berkeley.edu/events/reading-group-deep-learning</a></li>
<li><a href="https://csml.princeton.edu/readinggroup">https://csml.princeton.edu/readinggroup</a></li>
<li><a href="http://www.bicv.org/deep-learning/">http://www.bicv.org/deep-learning/</a></li>
<li><a href="https://www.cs.ubc.ca/labs/lci/mlrg/">https://www.cs.ubc.ca/labs/lci/mlrg/</a></li>
<li><a href="https://calculatedcontent.com/2015/03/25/why-does-deep-learning-work/">https://calculatedcontent.com/2015/03/25/why-does-deep-learning-work/</a></li>
<li><a href="https://project.inria.fr/deeplearning/">https://project.inria.fr/deeplearning/</a></li>
<li><a href="https://hustcv.github.io/reading-list.html">https://hustcv.github.io/reading-list.html</a></li>
</ul>
<h2 id="interpretability-in-ai">Interpretability in AI</h2>
<ul>
<li><a href="https://ec.europa.eu/jrc/communities/en/node/1162/article/interpretability-ai-and-its-relation-fairness-transparency-reliability-and-trust">https://ec.europa.eu/jrc/communities/en/node/1162/article/interpretability-ai-and-its-relation-fairness-transparency-reliability-and-trust</a></li>
<li><a href="https://github.com/jphall663/awesome-machine-learning-interpretability">https://github.com/jphall663/awesome-machine-learning-interpretability</a></li>
<li><a href="https://people.mpi-sws.org/~manuelgr/">https://people.mpi-sws.org/~manuelgr/</a></li>
<li><a href="https://ec.europa.eu/jrc/communities/en/community/humaint/event/2nd-humaint-winter-school-fairness-accountability-and-transparency">https://ec.europa.eu/jrc/communities/en/community/humaint/event/2nd-humaint-winter-school-fairness-accountability-and-transparency</a></li>
<li><a href="https://facctconference.org/network/">https://facctconference.org/network/</a></li>
<li><a href="https://facctconference.org/network/">https://facctconference.org/network/</a></li>
</ul>
<h3 id="interpretability-of-neural-networks">Interpretability of Neural Networks</h3>
<p><a href="http://academic.hep.com.cn/fitee/CN/10.1631/FITEE.1700808#1"> Although deep neural networks have exhibited superior performance in various tasks, interpretability is always Achilles’ heel of deep neural networks.</a>
At present, deep neural networks obtain high discrimination power at the cost of a low interpretability of their black-box representations.
We believe that high model interpretability may help people break several bottlenecks of deep learning,
e.g., learning from a few annotations, learning via human–computer communications at the semantic level,
and semantically debugging network representations.
We focus on convolutional neural networks (CNNs), and revisit the visualization of CNN representations,
methods of diagnosing representations of pre-trained CNNs, approaches for disentangling pre-trained CNN representations, learning of CNNs
with disentangled representations, and middle-to-end learning based on model interpretability.
Finally, we discuss prospective trends in explainable artificial intelligence.</p>
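<p>One of the simplest visualization tools of this kind is a gradient-based saliency map: back-propagate the class score to the input and inspect which pixels receive the largest gradients. Below is a minimal sketch, assuming PyTorch; the tiny untrained CNN and the random image are placeholders rather than the method of any particular paper cited here.</p>
<pre><code class="language-python"># Gradient-saliency sketch: which input pixels most affect the top class
# score? The tiny CNN is an untrained stand-in for a real classifier.
import torch

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, 3, padding=1), torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(),
    torch.nn.Linear(8, 10)).eval()

image = torch.randn(1, 3, 32, 32, requires_grad=True)  # placeholder image
score = model(image)[0].max()                          # top-class logit
score.backward()

# Saliency map: maximum absolute gradient over colour channels, per pixel.
saliency = image.grad.abs().max(dim=1).values.squeeze(0)
print(saliency.shape)                                  # torch.Size([32, 32])
</code></pre>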
<ul>
<li><a href="https://www.transai.org/">https://www.transai.org/</a></li>
<li><a href="http://games-cn.org/games-webinar-20190509-93/">GAMES Webinar 2019 – 93期(深度学习可解释性专题课程) </a></li>
<li><a href="http://games-cn.org/games-webinar-20190516-94/">GAMES Webinar 2019 – 94期(深度学习可解释性专题课程) | 刘日升(大连理工大学),张拳石(上海交通大学)</a></li>
<li><a href="http://qszhang.com/index.php/publications/">http://qszhang.com/index.php/publications/</a></li>
<li><a href="https://arxiv.org/abs/1812.07169">Explaining Neural Networks Semantically and Quantitatively</a></li>
<li><a href="https://www.jiqizhixin.com/articles/0211">https://www.jiqizhixin.com/articles/0211</a></li>
<li><a href="https://www.jiqizhixin.com/articles/030205">https://www.jiqizhixin.com/articles/030205</a></li>
<li><a href="https://mp.weixin.qq.com/s/xY7Cpe6idbOTJuyD3vwD3w">https://mp.weixin.qq.com/s/xY7Cpe6idbOTJuyD3vwD3w</a></li>
<li><a href="http://academic.hep.com.cn/fitee/CN/10.1631/FITEE.1700808#1">http://academic.hep.com.cn/fitee/CN/10.1631/FITEE.1700808#1</a></li>
<li><a href="https://arxiv.org/pdf/1905.11833.pdf">https://arxiv.org/pdf/1905.11833.pdf</a></li>
<li><a href="http://www.cs.sjtu.edu.cn/~leng-jw/">http://www.cs.sjtu.edu.cn/~leng-jw/</a></li>
<li><a href="https://lemondan.github.io">https://lemondan.github.io</a></li>
<li><a href="http://ise.sysu.edu.cn/teacher/teacher02/1136886.htm">http://ise.sysu.edu.cn/teacher/teacher02/1136886.htm</a></li>
<li><a href="http://www.cs.cmu.edu/~zhitingh/data/hu18texar.pdf">http://www.cs.cmu.edu/~zhitingh/data/hu18texar.pdf</a></li>
<li><a href="https://datasciencephd.eu/DSSS19/slides/GiannottiPedreschi-ExplainableAI.pdf">https://datasciencephd.eu/DSSS19/slides/GiannottiPedreschi-ExplainableAI.pdf</a></li>
<li><a href="http://www.cs.cmu.edu/~zhitingh/">http://www.cs.cmu.edu/~zhitingh/</a></li>
<li><a href="https://graphreason.github.io/">https://graphreason.github.io/</a></li>
</ul>
<ul>
<li><a href="https://beenkim.github.io/">https://beenkim.github.io/</a></li>
<li><a href="https://www.math.ucla.edu/~montufar/">https://www.math.ucla.edu/~montufar/</a></li>
<li><a href="https://link.springer.com/book/10.1007/978-3-030-28954-6">Explainable AI: Interpreting, Explaining and Visualizing Deep Learning</a></li>
<li><a href="http://www.prcv2019.com/en/index.html">http://www.prcv2019.com/en/index.html</a></li>
<li><a href="http://gr.xjtu.edu.cn/web/jiansun">http://gr.xjtu.edu.cn/web/jiansun</a></li>
<li><a href="http://www.shixialiu.com/">http://www.shixialiu.com/</a></li>
<li><a href="http://irc.cs.sdu.edu.cn/">http://irc.cs.sdu.edu.cn/</a></li>
<li><a href="https://www.seas.upenn.edu/~minchenl/">https://www.seas.upenn.edu/~minchenl/</a></li>
<li><a href="https://cs.nyu.edu/~yixinhu/">https://cs.nyu.edu/~yixinhu/</a></li>
<li><a href="http://www.cs.utexas.edu/~huangqx/">http://www.cs.utexas.edu/~huangqx/</a></li>
<li><a href="https://stats385.github.io/">https://stats385.github.io/</a></li>
</ul>
<p>Not everyone can understand relativity or quantum theory.</p>
<ul>
<li><a href="https://github.com/zqs1022/interpretableCNN">Interpretable Convolutional Neural Networks</a></li>
</ul>
<h3 id="deeplever">DeepLEVER</h3>
<blockquote>
<p>DeepLEVER aims at explaining and verifying machine learning systems via combinatorial optimization in general and SAT in particular.
<a href="http://anitideeplever.laas.fr/project">The main thesis of the DeepLever project</a> is that a solution to address the challenges faced by ML models is at the intersection of formal methods (FM) and AI. (A recent Summit on Machine Learning Meets Formal Methods offered supporting evidence to how strategic this topic is.) The DeepLever project envisions two main lines of research, concretely explanation and verification of deep ML models, supported by existing and novel constraint reasoning technologies.</p>
</blockquote>
<ul>
<li><a href="http://anitideeplever.laas.fr/deeplever-project-has-started">DeepLEVER</a></li>
<li><a href="https://aniti.univ-toulouse.fr/index.php/en/">https://aniti.univ-toulouse.fr/index.php/en/</a></li>
<li><a href="https://jpmarquessilva.github.io/">https://jpmarquessilva.github.io/</a></li>
<li><a href="https://www.researchgate.net/profile/Martin_Cooper3">https://www.researchgate.net/profile/Martin_Cooper3</a></li>
<li><a href="http://homepages.laas.fr/ehebrard/Home.html">http://homepages.laas.fr/ehebrard/Home.html</a></li>
<li><a href="http://www.merl.com/">http://www.merl.com/</a></li>
</ul>
<h3 id="dlphi">DLphi</h3>
<blockquote>
<p>Together with the participants of the Oberwolfach Seminar: Mathematics of Deep Learning, <a href="http://www.pc-petersen.eu/">I wrote a (not entirely serious) paper</a> called "The Oracle of DLPhi" proving that <code>Deep Learning techniques can perform accurate classifications on test data that is entirely uncorrelated to the training data</code>. This, however, requires a couple of non-standard assumptions such as uncountably many data points and the axiom of choice. In a sense this shows that mathematical results on machine learning need to be approached with a bit of scepticism.</p>
</blockquote>
<ul>
<li><a href="https://github.com/juliusberner/oberwolfach_workshop">https://github.com/juliusberner/oberwolfach_workshop</a></li>
<li><a href="http://www.pc-petersen.eu/">http://www.pc-petersen.eu/</a></li>
<li><a href="http://voigtlaender.xyz/">http://voigtlaender.xyz/</a></li>
<li><a href="https://math.ethz.ch/sam/research/reports.html">https://math.ethz.ch/sam/research/reports.html</a></li>
<li><a href="https://arxiv.org/abs/1901.05744">The Oracle of DLphi</a></li>
<li><a href="https://faculty.washington.edu/kutz/">https://faculty.washington.edu/kutz/</a></li>
</ul>
<h3 id="scientific-machine-learning">Scientific Machine Learning</h3>
<p><a href="https://thewinnower.com/papers/25359-the-essential-tools-of-scientific-machine-learning-scientific-ml">Scientific machine learning is a burgeoning discipline which blends scientific computing and machine learning. Traditionally, scientific computing focuses on large-scale mechanistic models, usually differential equations, that are derived from scientific laws that simplified and explained phenomena. On the other hand, machine learning focuses on developing non-mechanistic data-driven models which require minimal knowledge and prior assumptions. The two sides have their pros and cons: differential equation models are great at extrapolating, the terms are explainable, and they can be fit with small data and few parameters. Machine learning models on the other hand require "big data" and lots of parameters but are not biased by the scientists ability to correctly identify valid laws and assumptions.</a></p>
<ul>
<li><a href="https://www.scd.stfc.ac.uk/Pages/Scientific-Machine-Learning.aspx">https://www.scd.stfc.ac.uk/Pages/Scientific-Machine-Learning.aspx</a></li>
<li><a href="https://mitmath.github.io/18337/">https://mitmath.github.io/18337/</a></li>
<li><a href="https://www.stat.purdue.edu/~fmliang/STAT598Purdue/MLS.pdf">https://www.stat.purdue.edu/~fmliang/STAT598Purdue/MLS.pdf</a></li>
<li><a href="https://sciml.ai/">https://sciml.ai/</a></li>
<li><a href="https://github.com/mitmath/18S096SciML">https://github.com/mitmath/18S096SciML</a></li>
<li><a href="https://ml4sci.lbl.gov/">https://ml4sci.lbl.gov/</a></li>
<li><a href="https://www.nottingham.ac.uk/conference/fac-sci/maths-sci/scientific-computation-using-machine-learning-algorithms/">https://www.nottingham.ac.uk/conference/fac-sci/maths-sci/scientific-computation-using-machine-learning-algorithms/</a></li>
<li><a href="https://sites.google.com/lbl.gov/ml4sci/">https://sites.google.com/lbl.gov/ml4sci/</a></li>
<li><a href="https://github.com/sciann/sciann/">SciANN: Neural Networks for Scientific Computations</a></li>
</ul>
<h2 id="physics-and-deep-learning">Physics and Deep Learning</h2>
<p>Neuronal networks have enjoyed a resurgence both in the worlds of neuroscience, where they yield mathematical frameworks for thinking about complex neural datasets, and in machine learning, where they achieve state-of-the-art results on a variety of tasks, including machine vision, speech recognition, and language translation.<br>
Despite their empirical success, a mathematical theory of how deep neural circuits, with many layers of cascaded nonlinearities, learn and compute remains elusive.<br>
We will discuss three recent vignettes in which ideas from statistical physics can shed light on this issue.<br>
In particular, we show how dynamical criticality can help in neural learning, how the non-intuitive geometry of high dimensional error landscapes can be exploited to speed up learning, and how modern ideas from non-equilibrium statistical physics, like the Jarzynski equality, can be extended to yield powerful algorithms for modeling complex probability distributions.<br>
<a href="https://physics.berkeley.edu/news-events/events/20151005/the-statistical-physics-of-deep-learning-on-the-beneficial-roles-of">Time permitting, we will also discuss the relationship between neural network learning dynamics and the developmental time course of semantic concepts in infants.</a></p>
<p>In recent years, artificial intelligence has made remarkable advancements, impacting many industrial sectors dependent on complex decision-making and optimization.
Physics-leaning disciplines also face hard inference problems in complex systems: climate prediction, density matrix estimation for many-body quantum systems, material phase detection, protein-fold quality prediction, parametrization of effective models of high-dimensional neural activity, energy landscapes of transcription factor-binding, etc.
Methods using artificial intelligence have in fact already advanced progress on such problems.
<a href="http://www.physics.mcgill.ca/ai2019/">So, the question is not whether, but how AI serves as a powerful tool for data analysis in academic research, and physics-leaning disciplines in particular.</a></p>
<img src="https://d2r55xnwy6nx47.cloudfront.net/uploads/2017/09/InfoBottleneck_2880x1620.jpg" width="80%"/>
<ul>
<li><a href="https://zhuanlan.zhihu.com/p/94249675">https://zhuanlan.zhihu.com/p/94249675</a></li>
<li><a href="https://web.stanford.edu/~montanar/index.html">https://web.stanford.edu/~montanar/index.html</a></li>
<li><a href="https://www.microsoft.com/en-us/research/event/physics-ml-workshop/">Physics Meets ML</a></li>
<li><a href="http://apagom.com/physicsforests/">physics forests</a></li>
<li><a href="https://www.appliedmldays.org/">Applied Machine Learning Days</a></li>
<li><a href="http://www.ncsa.illinois.edu/Conferences/DeepLearningLSST/">DEEP LEARNING FOR MULTIMESSENGER ASTROPHYSICS: REAL-TIME DISCOVERY AT SCALE</a></li>
<li><a href="http://indico.ictp.it/event/8722/">Workshop on Science of Data Science | (smr 3283)</a></li>
<li><a href="http://www.physics.mcgill.ca/ai2019/">Physics & AI Workshop</a></li>
<li><a href="https://physicsml.github.io/pages/papers.html">https://physicsml.github.io/pages/papers.html</a></li>
<li><a href="http://super-ms.mit.edu/physics-ai.html">Physics-AI opportunities at MIT</a></li>
<li><a href="https://gogul.dev/software/deep-learning-meets-physics">https://gogul.dev/software/deep-learning-meets-physics</a></li>
<li><a href="https://github.com/2prime/ODE-DL/blob/master/DL_Phy.html">https://github.com/2prime/ODE-DL/blob/master/DL_Phy.md</a></li>
<li><a href="https://physics-ai.com/">https://physics-ai.com/</a></li>
<li><a href="http://physics.usyd.edu.au/quantum/Coogee2015/Presentations/Svore.pdf">http://physics.usyd.edu.au/quantum/Coogee2015/Presentations/Svore.pdf</a></li>
<li><a href="https://ocw.mit.edu/resources/res-9-003-brains-minds-and-machines-summer-course-summer-2015/index.htm">Brains, Minds and Machines Summer Course</a></li>
<li><a href="http://amos3.aapm.org/abstracts/pdf/127-36916-419554-130797.pdf">deep medcine</a></li>
<li><a href="http://www.dam.brown.edu/people/mraissi/publications/">http://www.dam.brown.edu/people/mraissi/publications/</a></li>
<li><a href="http://www.physics.rutgers.edu/gso/SSPAR/">http://www.physics.rutgers.edu/gso/SSPAR/</a></li>
<li><a href="https://community.singularitynet.io/c/education/course-brains-minds-machines">https://community.singularitynet.io/c/education/course-brains-minds-machines</a></li>
<li><a href="https://physai.sciencesconf.org/">ARTIFICIAL INTELLIGENCE AND PHYSICS</a></li>
<li><a href="http://inspirehep.net/record/1680302/references">http://inspirehep.net/record/1680302/references</a></li>
<li><a href="https://www.pnnl.gov/computing/philms/Announcements.stm">https://www.pnnl.gov/computing/philms/Announcements.stm</a></li>
<li><a href="https://tacocohen.wordpress.com/">https://tacocohen.wordpress.com/</a></li>
<li><a href="https://cnls.lanl.gov/External/workshops.php">https://cnls.lanl.gov/External/workshops.php</a></li>
<li><a href="https://www.researchgate.net/profile/Jinlong_Wu3">https://www.researchgate.net/profile/Jinlong_Wu3</a></li>
<li><a href="http://djstrouse.com/">http://djstrouse.com/</a></li>
<li><a href="https://www.researchgate.net/scientific-contributions/2135376837_Maurice_Weiler">https://www.researchgate.net/scientific-contributions/2135376837_Maurice_Weiler</a></li>
<li><a href="https://arxiv.org/abs/1710.06096">Spontaneous Symmetry Breaking in Neural Networks</a></li>
<li><a href="https://physai.sciencesconf.org/">https://physai.sciencesconf.org/</a></li>
</ul>
<h3 id="machine-learning-for-physics">Machine Learning for Physics</h3>
<ul>
<li><a href="https://dlonsc.github.io/ISC2019/7_Keynote_DL_HEP_SofiaVallecorsa.pdf">Deep Learning in High Energy Physics</a></li>
<li><a href="https://www.ipam.ucla.edu/programs/long-programs/machine-learning-for-physics-and-the-physics-of-learning/">Machine Learning for Physics and the Physics of Learning</a></li>
<li><a href="https://machine-learning-for-physicists.org/">Machine Learning for Physics</a></li>
<li><a href="http://www.thp2.nat.uni-erlangen.de/index.php/2017_Machine_Learning_for_Physicists,_by_Florian_Marquardt">2017 Machine Learning for Physicists, by Florian Marquardt</a></li>
<li><a href="https://ml4physicalsciences.github.io/2020/">https://ml4physicalsciences.github.io/2020/</a></li>
<li><a href="http://phys.cts.nthu.edu.tw/actnews/content.php?Sn=468">Machine Learning in Physics School/Workshop</a></li>
<li><a href="http://deeplearnphysics.org/">http://deeplearnphysics.org/</a></li>
</ul>
<h4 id="deep-learning-for-physics">Deep Learning for Physics</h4>
<ul>
<li><a href="https://inspirehep.net/literature/1680302">https://inspirehep.net/literature/1680302</a></li>
<li><a href="https://www.in.tum.de/cg/teaching/winter-term-1819/deep-learning-in-physics/">Master-Seminar - Deep Learning in Physics (IN2107, IN0014)</a></li>
<li><a href="https://www.ml4science.org/agenda-physics-in-ml">https://www.ml4science.org/agenda-physics-in-ml</a></li>
<li><a href="https://www.ias.edu/events/deep-learning-physics">https://www.ias.edu/events/deep-learning-physics</a></li>
<li><a href="https://dl4physicalsciences.github.io/">https://dl4physicalsciences.github.io/</a></li>
</ul>
<h3 id="physics-for-machine-learning">Physics for Machine Learning</h3>
<ul>
<li><a href="https://tartakovsky.stanford.edu/research/physics-informed-machine-learning">https://tartakovsky.stanford.edu/research/physics-informed-machine-learning</a></li>
<li><a href="https://bids.berkeley.edu/events/physics-machine-learning-workshop">Physics in Machine Learning Workshop</a></li>
<li><a href="https://www.ml4science.org/astrophysics-in-machine-learning-workshop">Physics in Machine Learning Workshop</a></li>
<li><a href="http://phys.csail.mit.edu/papers/1.pdf">A Differentiable Physics Engine for Deep Learning</a></li>
<li><a href="https://pbdl2019.github.io/">Physics Based Vision meets Deep Learning (PBDL)</a></li>
<li><a href="https://github.com/thunil/Physics-Based-Deep-Learning">Physics-Based Deep Learning</a></li>
</ul>
<h4 id="physics-informed-machine-learning">Physics Informed Machine Learning</h4>
<ul>
<li><a href="https://sites.google.com/view/icml2019phys4dl/schedule">https://sites.google.com/view/icml2019phys4dl/schedule</a></li>
<li><a href="https://icml.cc/Conferences/2019/ScheduleMultitrack?event=3531">Theoretical Physics for Deep Learning</a></li>
<li><a href="https://sites.google.com/view/icml2019phys4dl/schedule">https://sites.google.com/view/icml2019phys4dl/schedule</a></li>
<li><a href="http://www.databookuw.com/page-5/">Physics Informed Machine Learning Workshop</a></li>
<li><a href="https://github.com/maziarraissi/PINNs">Physics Informed Neural Networks</a></li>
<li><a href="https://maziarraissi.github.io/PINNs/">https://maziarraissi.github.io/PINNs/</a></li>
</ul>
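<p>To make the PINN idea referenced above concrete: the network maps coordinates to the solution value, and the squared residual of the differential equation at random collocation points is itself the training loss. A minimal sketch assuming PyTorch, for the illustrative ODE u'(t) = -u(t) with u(0) = 1 (chosen for brevity, not taken from the linked papers):</p>
<pre><code class="language-python"># Minimal physics-informed network for u'(t) = -u(t), u(0) = 1 on [0, 1].
# The loss is the mean squared ODE residual plus the initial condition.
import torch

net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

for step in range(2000):
    t = torch.rand(64, 1, requires_grad=True)   # random collocation points
    u = net(t)
    du = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    loss = ((du + u) ** 2).mean() + (net(torch.zeros(1, 1)) - 1.0).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print(net(torch.tensor([[1.0]])).item())  # should approach exp(-1) = 0.3679
</code></pre>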
<h3 id="statistical-mechanics-and-deep-learning">Statistical Mechanics and Deep Learning</h3>
<p><a href="https://www.annualreviews.org/doi/pdf/10.1146/annurev-conmatphys-031119-050745">The recent striking success of deep neural networks in machine learning raises profound questions about the theoretical principles underlying their success. For example, what can such deep networks compute? How can we train them? How does information propagate through them? Why can they generalize? And how can we teach them to imagine? We review recent work in which methods of physical analysis rooted in statistical mechanics have begun to shed conceptual insights into these questions. These insights yield connections between deep learning and diverse physical and mathematical topics, including random landscapes, spin glasses, jamming, dynamical phase transitions, chaos, Riemannian geometry, random matrix theory, free probability, and nonequilibrium statistical mechanics. Indeed, the fields of statistical mechanics and machine learning have long enjoyed a rich history of strongly coupled interactions, and recent advances at the intersection of statistical mechanics and deep learning suggest these interactions will only deepen going forward.</a></p>
<ul>
<li><a href="https://www.icts.res.in/discussion-meeting/spmml2020">Statistical Physics of Machine Learning</a></li>
<li><a href="http://smml.io/">statistical mechanics // machine learning</a></li>
<li><a href="https://arxiv.org/abs/1906.10228">A Theoretical Connection Between Statistical Physics and Reinforcement Learning</a></li>
<li><a href="https://phys.org/news/2017-02-thermodynamics.html">The thermodynamics of learning</a></li>
<li><a href="https://calculatedcontent.com/2015/03/25/why-does-deep-learning-work/">WHY DOES DEEP LEARNING WORK?</a></li>
<li><a href="https://calculatedcontent.com/2015/04/01/why-deep-learning-works-ii-the-renormalization-group/">WHY DEEP LEARNING WORKS II: THE RENORMALIZATION GROUP</a></li>
<li><a href="https://github.com/CalculatedContent/ImplicitSelfRegularization">https://github.com/CalculatedContent/ImplicitSelfRegularization</a></li>
<li><a href="https://sites.google.com/site/torbenkruegermath/home/graduate-seminar-random-matrices-spin-glasses-deep-learning">torbenkruegermath</a></li>
<li><a href="https://calculatedcontent.com/2019/12/03/towards-a-new-theory-of-learning-statistical-mechanics-of-deep-neural-networks/">TOWARDS A NEW THEORY OF LEARNING: STATISTICAL MECHANICS OF DEEP NEURAL NETWORKS</a></li>
<li><a href="https://www.annualreviews.org/doi/pdf/10.1146/annurev-conmatphys-031119-050745">Statistical Mechanics of Deep Learning</a></li>
<li><a href="https://zhuanlan.zhihu.com/p/90096775">https://zhuanlan.zhihu.com/p/90096775</a></li>
</ul>
<h3 id="born-machine">Born Machine</h3>
<p>A Born machine is a probabilistic generative model whose distribution is given by the Born rule of quantum mechanics; a toy sketch follows the references below.</p>
<ul>
<li><a href="https://journals.aps.org/prx/abstract/10.1103/PhysRevX.8.031012#fulltext">Unsupervised Generative Modeling Using Matrix Product States</a></li>
<li><a href="https://wangleiphy.github.io/talks/BornMachine-USTC.pdf">https://wangleiphy.github.io/talks/BornMachine-USTC.pdf</a></li>
<li><a href="https://github.com/congzlwag/UnsupGenModbyMPS">https://github.com/congzlwag/UnsupGenModbyMPS</a></li>
<li><a href="https://congzlwag.github.io/UnsupGenModbyMPS/">https://congzlwag.github.io/UnsupGenModbyMPS/</a></li>
<li><a href="https://github.com/congzlwag/BornMachineTomo">https://github.com/congzlwag/BornMachineTomo</a></li>
<li><a href="https://wangleiphy.github.io/talks/BornMachine.pdf">From Baltzman machine to Born Machine</a></li>
<li><a href="https://quantum.ustc.edu.cn/web/node/623">Born Machines: A fresh approach to quantum machine learning</a></li>
<li><a href="https://github.com/GiggleLiu/QuantumCircuitBornMachine">Gradient based training of Quantum Circuit Born Machine (QCBM)</a></li>
</ul>
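<p>The defining rule is that probabilities are squared amplitudes, p(x) = |&psi;(x)|<sup>2</sup>/Z. A toy NumPy sketch over 3-bit strings, where an explicit amplitude table stands in for the matrix product state or quantum circuit parameterizations used in the references above:</p>
<pre><code class="language-python"># Toy Born machine over 3-bit strings: p(x) = |psi(x)|^2 / Z (Born rule).
# A real Born machine parameterizes psi with a matrix product state or a
# quantum circuit; here psi is an explicit table of 2**3 amplitudes.
import numpy as np

rng = np.random.default_rng(1)
psi = rng.standard_normal(8) + 1j * rng.standard_normal(8)  # amplitudes
p = np.abs(psi) ** 2
p /= p.sum()                       # normalization constant Z

for s in rng.choice(8, size=5, p=p):
    print(format(s, "03b"), round(p[s], 3))
</code></pre>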
<h3 id="quantum-machine-learning">Quantum Machine learning</h3>
<p><a href="https://peterwittek.com/">Quantum Machine Learning: What Quantum Computing Means to Data Mining explains the most relevant concepts of machine learning, quantum mechanics, and quantum information theory, and contrasts classical learning algorithms to their quantum counterparts.</a></p>
<ul>
<li><a href="https://www.quantummachinelearning.org/events.html">Combining quantum information and machine learning</a></li>
<li><a href="https://www.mpl.mpg.de/divisions/marquardt-division/workshops/2019-machine-learning-for-quantum-technology/">machine learning for quantum technology/</a></li>
<li><a href="https://wangleiphy.github.io/">https://wangleiphy.github.io/</a></li>
<li><a href="https://tacocohen.wordpress.com">https://tacocohen.wordpress.com</a></li>
<li><a href="https://peterwittek.com/qml-in-2015.html">https://peterwittek.com/qml-in-2015.html</a></li>
<li><a href="https://github.com/krishnakumarsekar/awesome-quantum-machine-learning">https://github.com/krishnakumarsekar/awesome-quantum-machine-learning</a></li>
<li><a href="https://peterwittek.com/">https://peterwittek.com/</a></li>
</ul>
<ul>
<li><a href="https://wangleiphy.github.io/lectures/DL.pdf">Lecture Note on Deep Learning and Quantum Many-Body Computation</a></li>
<li><a href="http://www.math.chalmers.se/~stig/project4.pdf">Quantum Deep Learning and Renormalization</a></li>
</ul>
<hr>
<ul>
<li><a href="https://scholar.harvard.edu/madvani/home">https://scholar.harvard.edu/madvani/home</a></li>
<li><a href="https://www.elen.ucl.ac.be/esann/index.php?pg=specsess#statistical">https://www.elen.ucl.ac.be/esann/index.php?pg=specsess#statistical</a></li>
<li><a href="https://krzakala.github.io/cargese.io/program.html">https://krzakala.github.io/cargese.io/program.html</a></li>
<li><a href="https://www.quantamagazine.org/new-theory-cracks-open-the-black-box-of-deep-learning-20170921/">New Theory Cracks Open the Black Box of Deep Learning</a></li>
<li><a href="https://ai.googleblog.com/2019/03/unifying-physics-and-deep-learning-with.html">Unifying Physics and Deep Learning with TossingBot</a></li>
</ul>
<h2 id="mathematics-of-deep-learning">Mathematics of Deep Learning</h2>
<ul>
<li><a href="https://www.4tu.nl/ami/en/Agenda-Events/">Meeting on Mathematics of Deep Learning</a></li>
<li><a href="http://www.yanivplan.com/math-608d">Probability in high dimensions</a></li>
<li><a href="https://math.ethz.ch/sam/research/reports.html?year=2019">https://math.ethz.ch/sam/research/reports.html?year=2019</a></li>
<li><a href="http://rt.dgyblog.com/ref/ref-learning-deep-learning.html">Learning Deep Learning</a></li>
<li><a href="https://github.com/leiwu1990/course.math_theory_nn">Summer school on Deep Learning Theory by Weinan E</a></li>
<li><a href="http://www.mit.edu/~9.520/fall18/">.520/6.860: Statistical Learning Theory and Applications, Fall 2018</a></li>
<li><a href="https://zhuanlan.zhihu.com/p/40097048">2018上海交通大学深度学习理论前沿研讨会 - 凌泽南的文章 - 知乎</a></li>
<li><a href="https://www.researchgate.net/project/Theories-of-Deep-Learning">Theories of Deep Learning</a></li>
</ul>
<p>A mathematical theory of deep networks and of why they work as well as they do is now emerging.
<a href="http://www.mit.edu/~9.520/fall17/Classes/deep_learning_theory.html">I will review some recent theoretical results on the approximation power of deep networks</a>
including conditions under which they can be exponentially better than shallow learning.
A class of deep convolutional networks represent an important special case of these conditions,
though weight sharing is not the main reason for their exponential advantage.
I will also discuss another puzzle around deep networks: what guarantees that they generalize and
do not overfit, despite the number of weights being larger than the number of training data and despite the absence of explicit regularization in the optimization?</p>
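<p>A standard concrete instance of the exponential separation mentioned above is the triangle-wave construction in the style of Telgarsky: composing a two-piece "hat" function with itself t times yields 2<sup>t</sup> linear pieces, so each extra layer doubles the piece count, while a one-hidden-layer ReLU network needs a number of units proportional to the number of pieces. A short NumPy sketch, illustrative rather than taken from the linked course:</p>
<pre><code class="language-python"># Depth efficiency in one picture: the hat function g has 2 linear pieces,
# and its t-fold composition has 2**t pieces; each composition is one more
# ReLU layer, so the piece count grows exponentially with depth.
import numpy as np

def hat(x):
    # g(x) = 2x on [0, 1/2] and 2(1 - x) on [1/2, 1].
    return 2 * np.minimum(x, 1 - x)

x = np.linspace(0.0, 1.0, 1601)   # grid aligned with the kinks at k/16
y = x
for _ in range(4):                # compose the hat map: depth t = 4
    y = hat(y)

# The slope flips sign at every kink, so counting sign changes of the
# first difference counts the linear pieces.
s = np.sign(np.diff(y))
pieces = 1 + int(np.count_nonzero(s[1:] != s[:-1]))
print(pieces)                     # 16 = 2**4
</code></pre>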
<p>Talk: <em>Deep Neural Networks and Partial Differential Equations: Approximation Theory and Structural Properties</em>, Philipp Petersen, University of Oxford.</p>
<ul>
<li><a href="https://memento.epfl.ch/event/a-theoretical-analysis-of-machine-learning-and-par/">https://memento.epfl.ch/event/a-theoretical-analysis-of-machine-learning-and-par/</a></li>
<li><a href="http://at.yorku.ca/c/b/p/g/30.htm">http://at.yorku.ca/c/b/p/g/30.htm</a></li>
<li><a href="https://mat.univie.ac.at/~grohs/">https://mat.univie.ac.at/~grohs/</a></li>
<li><a href="https://www.math.tamu.edu/~bhanin/DL2018.html">Deep Learning: Theory and Applications (Math 689 Fall 2018)</a></li>
<li><a href="https://joanbruna.github.io/MathsDL-spring18/">Topics course Mathematics of Deep Learning, NYU, Spring 18</a></li>
<li><a href="https://skymind.ai/ebook/Skymind_The_Math_Behind_Neural_Networks.pdf">https://skymind.ai/ebook/Skymind_The_Math_Behind_Neural_Networks.pdf</a></li>
<li><a href="https://github.com/markovmodel/deeptime">https://github.com/markovmodel/deeptime</a></li>
<li><a href="https://omar-florez.github.io/scratch_mlp/">https://omar-florez.github.io/scratch_mlp/</a></li>
<li><a href="https://joanbruna.github.io/MathsDL-spring19/">https://joanbruna.github.io/MathsDL-spring19/</a></li>
<li><a href="https://github.com/isikdogan/deep_learning_tutorials">https://github.com/isikdogan/deep_learning_tutorials</a></li>
<li><a href="https://www.brown.edu/research/projects/crunch/machine-learning-x-seminars">https://www.brown.edu/research/projects/crunch/machine-learning-x-seminars</a></li>
<li><a href="http://anotherdatum.com/tce_2018.html">Deep Learning: Theory & Practice</a></li>
<li><a href="https://www.math.ias.edu/wtdl">https://www.math.ias.edu/wtdl</a></li>
<li><a href="https://www.ml.tu-berlin.de/menue/mitglieder/klaus-robert_mueller/">https://www.ml.tu-berlin.de/menue/mitglieder/klaus-robert_mueller/</a></li>
<li><a href="https://www-m15.ma.tum.de/Allgemeines/MathFounNN">https://www-m15.ma.tum.de/Allgemeines/MathFounNN</a></li>
<li><a href="https://www.math.purdue.edu/~buzzard/MA598-Spring2019/index.shtml">https://www.math.purdue.edu/~buzzard/MA598-Spring2019/index.shtml</a></li>
<li><a href="http://mathematics-in-europe.eu/?p=801">http://mathematics-in-europe.eu/?p=801</a></li>
<li><a href="https://cims.nyu.edu/~bruna/">https://cims.nyu.edu/~bruna/</a></li>
<li><a href="https://www.math.ias.edu/wtdl">https://www.math.ias.edu/wtdl</a></li>
<li><a href="https://www.pims.math.ca/scientific-event/190722-pcssdlcm">https://www.pims.math.ca/scientific-event/190722-pcssdlcm</a></li>
<li><a href="https://www.embl.de/training/events/2020/MAC20-01/">Deep Learning for Image Analysis EMBL COURSE</a></li>
<li><a href="https://deeplearning-math.github.io/2018spring.html">MATH 6380o. Deep Learning: Towards Deeper Understanding, Spring 2018</a></li>
<li><a href="https://github.com/joanbruna/MathsDL-spring19">Mathematics of Deep Learning, Courant Insititute, Spring 19</a></li>
<li><a href="http://voigtlaender.xyz/">http://voigtlaender.xyz/</a></li>
<li><a href="http://www.mit.edu/~9.520/fall19/">http://www.mit.edu/~9.520/fall19/</a></li>
<li><a href="https://gateway.newton.ac.uk/event/ofbw46/programme">The Mathematics of Deep Learning and Data Science - Programme</a></li>
</ul>
<ul>
<li><a href="https://www.brown.edu/research/projects/crunch/">Home of Math + Machine Learning + X</a></li>
<li><a href="http://crm.sns.it/event/451/">Mathematical and Computational Aspects of Machine Learning</a></li>
<li><a href="https://www.researchgate.net/project/Mathematical-Theory-for-Deep-Neural-Networks">Mathematical Theory for Deep Neural Networks</a></li>
<li><a href="https://www.researchgate.net/project/Theory-of-Deep-Learning">Theory of Deep Learning</a></li>
<li><a href="http://dalimeeting.org/dali2018/workshopTheoryDL.html">DALI 2018 - Data, Learning and Inference</a></li>
<li><a href="https://www.math-berlin.de/academics/summer-schools/2019">BMS Summer School 2019: Mathematics of Deep Learning</a></li>
<li><a href="https://www.siam.org/conferences/cm/conference/mds20">SIAM Conference on Mathematics of Data Science (MDS20)</a></li>
</ul>
<ul>
<li><a href="http://web.cs.ucla.edu/~qgu/research.html">http://web.cs.ucla.edu/~qgu/research.html</a></li>
<li><a href="https://sgo-workshop.github.io/">BRIDGING GAME THEORY AND DEEP LEARNING</a></li>
</ul>
<h3 id="discrete-mathematics-and--neural-networks">Discrete Mathematics and Neural Networks</h3>
<ul>
<li><a href="http://proceedings.mlr.press/v28/ermon13.html">http://proceedings.mlr.press/v28/ermon13.html</a></li>
<li><a href="https://sites.google.com/view/ijcai2019dso/">https://sites.google.com/view/ijcai2019dso/</a></li>
<li><a href="http://www.cas.mcmaster.ca/~deza/tokyo2018progr.html">http://www.cas.mcmaster.ca/~deza/tokyo2018progr.html</a></li>
<li><a href="https://www.cs.cornell.edu/~bistra/">https://www.cs.cornell.edu/~bistra/</a></li>
<li><a href="https://epubs.siam.org/doi/book/10.1137/1.9780898718539?mobileUi=0">Discrete Mathematics of Neural Networks: Selected Topics</a></li>
<li><a href="https://www.math.uwaterloo.ca/~bico/co759/2018/index.html">Deep Learning in Computational Discrete Optimization</a></li>
<li><a href="http://www.ams.jhu.edu/~wcook12/dl/index.html">Deep Learning in Discrete Optimization</a></li>
<li><a href="https://web-app.usc.edu/soc/syllabus/20201/30126.pdf">https://web-app.usc.edu/soc/syllabus/20201/30126.pdf</a></li>
<li><a href="http://www.columbia.edu/~yf2414/Slides.pdf">http://www.columbia.edu/~yf2414/Slides.pdf</a></li>
<li><a href="http://www.columbia.edu/~yf2414/teach.html">http://www.columbia.edu/~yf2414/teach.html</a></li>
<li><a href="https://opt-ml.org/cfp.html">https://opt-ml.org/cfp.html</a></li>
</ul>
<hr>
<ul>
<li><a href="http://www.cas.mcmaster.ca/~deza/slidesRIKEN2019/huchette.pdf">Strong mixed-integer programming formulations for trained neural networks by Joey Huchette1</a></li>
<li><a href="https://link.springer.com/article/10.1007/s10601-018-9285-6">Deep neural networks and mixed integer linear optimization</a></li>
<li><a href="http://www.dei.unipd.it/~fisch/papers/slides/2018%20Dagstuhl%20%5BFischetti%20on%20DL%5D.pdf">Matteo Fischetti, University of Padova</a></li>
<li><a href="https://arxiv.org/abs/1712.06174">Deep Neural Networks as 0-1 Mixed Integer Linear Programs: A Feasibility Study</a></li>
<li><a href="https://www.researchgate.net/profile/Matteo_Fischetti">https://www.researchgate.net/profile/Matteo_Fischetti</a></li>
<li><a href="http://www.amp.i.kyoto-u.ac.jp/tecrep/ps_file/2019/2019-001.pdf">A Mixed Integer Linear Programming Formulation to Artificial Neural Networks</a></li>
<li><a href="http://www.optimization-online.org/DB_FILE/2019/07/7276.pdf">ReLU Networks as Surrogate Models in Mixed-Integer Linear Programs</a></li>
</ul>
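<p>As a concrete instance of the big-M formulations referenced above: a trained ReLU unit y = max(0, w&middot;x + b) with known pre-activation bounds L &le; w&middot;x + b &le; U is encoded exactly with one binary variable z. A sketch assuming the PuLP modelling library; the weights and bounds below are made up:</p>
<pre><code class="language-python"># Big-M encoding of one trained ReLU unit y = max(0, a), a = w.x + b,
# given valid bounds L &lt;= a &lt;= U. The binary z is 1 iff the unit is active.
import pulp

w, b = [2.0, -1.0], 0.5
L, U = -3.5, 3.5                     # valid pre-activation bounds (big-Ms)

prob = pulp.LpProblem("relu_unit", pulp.LpMaximize)
x = [pulp.LpVariable(f"x{i}", lowBound=-1, upBound=1) for i in range(2)]
y = pulp.LpVariable("y", lowBound=0)              # y &gt;= 0
z = pulp.LpVariable("z", cat="Binary")

a = pulp.lpSum(wi * xi for wi, xi in zip(w, x)) + b
prob += y                            # objective: maximize the unit output
prob += y &gt;= a                       # y &gt;= w.x + b
prob += y &lt;= a - L * (1 - z)         # z = 1 (active): y &lt;= w.x + b
prob += y &lt;= U * z                   # z = 0 (inactive): forces y = 0

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.value(y))                 # 3.5, attained at x = (1, -1)
</code></pre>
<p>Stacking one such block per neuron turns questions about a trained network (maximal activations, robustness verification) into a mixed-integer linear program, which is the approach taken in the papers above.</p>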
<h3 id="numerical-analysis-for-deep-learning">Numerical Analysis for Deep Learning</h3>
<p>The dynamics view of deep learning treats a deep network as a discrete dynamical system.
For example, a feedforward network can be written in the recurrent form:</p>
<p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>x</mi><mrow><mi>t</mi><mo>+</mo><mn>1</mn></mrow></msup><mo>=</mo><msub><mi>f</mi><mi>t</mi></msub><mo stretchy="false">(</mo><msup><mi>x</mi><mi>t</mi></msup><mo stretchy="false">)</mo><mo separator="true">,</mo><mi>t</mi><mo>∈</mo><mo stretchy="false">[</mo><mn>0</mn><mo separator="true">,</mo><mn>1</mn><mo separator="true">,</mo><mo>⋯</mo><mtext> </mtext><mo separator="true">,</mo><mi>T</mi><mo stretchy="false">]</mo></mrow><annotation encoding="application/x-tex">x^{t+1} = f_t(x^{t}),t\in [0,1,\cdots, T]
</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.864108em;vertical-align:0em;"></span><span class="mord"><span class="mord mathdefault">x</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.864108em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathdefault mtight">t</span><span class="mbin mtight">+</span><span class="mord mtight">1</span></span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:1.093556em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.10764em;">f</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2805559999999999em;"><span style="top:-2.5500000000000003em;margin-left:-0.10764em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault mtight">t</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathdefault">x</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.843556em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathdefault mtight">t</span></span></span></span></span></span></span></span></span><span class="mclose">)</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault">t</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">∈</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">[</span><span class="mord">0</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord">1</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="minner">⋯</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault" style="margin-right:0.13889em;">T</span><span class="mclose">]</span></span></span></span></span></p>
<p>where <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>f</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">f_t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8888799999999999em;vertical-align:-0.19444em;"></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.10764em;">f</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2805559999999999em;"><span style="top:-2.5500000000000003em;margin-left:-0.10764em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault mtight">t</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> is some nonlinear function and <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>t</mi></mrow><annotation encoding="application/x-tex">t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.61508em;vertical-align:0em;"></span><span class="mord mathdefault">t</span></span></span></span> is discrete.</p>
<p>However, it is not easy to select a proper nonlinear function <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>f</mi><mi>t</mi></msub><mtext> </mtext><mtext> </mtext><mi mathvariant="normal">∀</mi><mi>t</mi><mo>∈</mo><mo stretchy="false">[</mo><mn>0</mn><mo separator="true">,</mo><mn>1</mn><mo separator="true">,</mo><mo>⋯</mo><mtext> </mtext><mo separator="true">,</mo><mi>T</mi><mo stretchy="false">]</mo></mrow><annotation encoding="application/x-tex">f_t \,\,\forall t\in[0,1,\cdots, T]</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8888799999999999em;vertical-align:-0.19444em;"></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.10764em;">f</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2805559999999999em;"><span style="top:-2.5500000000000003em;margin-left:-0.10764em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault mtight">t</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord">∀</span><span class="mord mathdefault">t</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">∈</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">[</span><span class="mord">0</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord">1</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="minner">⋯</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault" style="margin-right:0.13889em;">T</span><span class="mclose">]</span></span></span></span> and the number <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>T</mi></mrow><annotation encoding="application/x-tex">T</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.68333em;vertical-align:0em;"></span><span class="mord mathdefault" style="margin-right:0.13889em;">T</span></span></span></span>.
In other words, there is as yet no unified scientific principle or guideline for designing the structure of deep neural network models.</p>
<p>Many recursive formulas share this <code>feedback</code> form or hidden structure, in which the next input is the output of previous steps, historical records, or generated points; a minimal sketch follows.</p>
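<p>As a concrete illustration (a minimal NumPy sketch, not from any of the cited sources: the layer maps <code>f_t(x) = tanh(W_t x + b_t)</code>, the weights <code>Ws</code>, and the biases <code>bs</code> are hypothetical stand-ins for whatever <code>f_t</code> a given architecture chooses), the recurrence above is just an iterated application of layer maps:</p>
<pre><code class="language-python">import numpy as np

def forward(x, Ws, bs, sigma=np.tanh):
    """Iterate x^{t+1} = f_t(x^t) with f_t(x) = sigma(W_t x + b_t)."""
    for W, b in zip(Ws, bs):
        x = sigma(W @ x + b)
    return x

# T = 3 illustrative random layers acting on a 4-dimensional state
rng = np.random.default_rng(0)
Ws = [0.1 * rng.standard_normal((4, 4)) for _ in range(3)]
bs = [np.zeros(4) for _ in range(3)]
print(forward(rng.standard_normal(4), Ws, bs))
</code></pre>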
<ul>
<li><a href="http://www.vvz.ethz.ch/Vorlesungsverzeichnis/lerneinheit.view?lerneinheitId=136996&semkez=2020S&lang=en">401-3650-19L Numerical Analysis Seminar: Mathematics of Deep Neural Network Approximation</a></li>
<li><a href="http://www.mathcs.emory.edu/~lruthot/talks/">http://www.mathcs.emory.edu/~lruthot/talks/</a></li>
<li><a href="http://www.mathcs.emory.edu/~lruthot/courses/math789r-sp20.html">CS 584 / MATH 789R - Numerical Methods for Deep Learning</a></li>
<li><a href="https://github.com/IPAIopen/NumDL-CourseNotes">Numerical methods for deep learning</a></li>
<li><a href="http://www.mathcs.emory.edu/~lruthot/courses/NumDL/index.html">Short Course on Numerical Methods for Deep Learning</a></li>
<li><a href="http://www.ms.uky.edu/~qye/MA721/ma721F17.html">MA 721: Topics in Numerical Analysis: Deep Learning</a></li>
</ul>
<ul>
<li><a href="http://phys2018.csail.mit.edu/papers/29.pdf">Physics-Based Deep Learning for Fluid Flow</a></li>
</ul>
<h4 id="resnets">ResNets</h4>
<p><code>Deep Residual Networks</code> won 1st place in ImageNet classification, ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
They have since inspired more efficient feed-forward convolutional architectures.</p>
<p>They take a standard feed-forward ConvNet and add skip connections that bypass (or shortcut) a few convolution layers at a time. Each bypass gives rise to a residual block in which the convolution layers predict a residual that is added to the block’s input tensor.</p>
<img src="https://raw.githubusercontent.com/torch/torch.github.io/master/blog/_posts/images/resnets_1.png" width="40%"/>
<ul>
<li><a href="https://github.com/KaimingHe/deep-residual-networks">https://github.com/KaimingHe/deep-residual-networks</a></li>
<li><a href="http://torch.ch/blog/2016/02/04/resnets.html">http://torch.ch/blog/2016/02/04/resnets.html</a></li>
<li><a href="https://zh.gluon.ai/chapter_convolutional-neural-networks/resnet.html">https://zh.gluon.ai/chapter_convolutional-neural-networks/resnet.html</a></li>
<li><a href="https://www.jiqizhixin.com/articles/042201">https://www.jiqizhixin.com/articles/042201</a></li>
<li><a href="http://www.smartchair.org/hp/MSML2020/Paper/">http://www.smartchair.org/hp/MSML2020/Paper/</a></li>
<li><a href="https://github.com/liuzhuang13/DenseNet">https://github.com/liuzhuang13/DenseNet</a></li>
<li><a href="https://arxiv.org/abs/1810.11741">https://arxiv.org/abs/1810.11741</a></li>
<li><a href="https://www.sciencedirect.com/science/article/pii/S0893608019301820?via%3Dihub">Depth with nonlinearity creates no bad local minima in ResNets</a></li>
<li><a href="https://arxiv.org/abs/1910.13157">LeanConvNets: Low-cost Yet Effective Convolutional Neural Networks</a></li>
</ul>
<p><strong>Reversible Residual Network</strong></p>
<ul>
<li><a href="https://arxiv.org/abs/1707.04585">The Reversible Residual Network: Backpropagation Without Storing Activations</a></li>
<li><a href="https://ai.googleblog.com/2020/01/reformer-efficient-transformer.html">https://ai.googleblog.com/2020/01/reformer-efficient-transformer.html</a></li>
<li><a href="https://arxiv.org/abs/2001.04451">https://arxiv.org/abs/2001.04451</a></li>
<li><a href="https://ameroyer.github.io/reading-notes/architectures/2019/05/07/the_reversible_residual_network.html">https://ameroyer.github.io/reading-notes/architectures/2019/05/07/the_reversible_residual_network.html</a></li>
<li><a href="https://arxiv.org/abs/1812.04352">Layer-Parallel Training of Deep Residual Neural Networks</a></li>
</ul>
<h4 id="differential-equations-motivated-deep-learning-methods">Differential Equations Motivated Deep Learning Methods</h4>
<p>This section collects insights from numerical analysis that inspire more effective deep learning architectures.</p>
<p><a href="https://web.stanford.edu/~yplu/proj/lm/">Many effective networks can be interpreted as different numerical discretizations of differential equations. This finding brings us a brand new perspective on the design of effective deep architectures.</a></p>
<p><a href="http://www.mathcs.emory.edu/~lruthot/courses/NumDL/index.html">We show that residual neural networks can be interpreted as discretizations of a nonlinear time-dependent ordinary differential equation that depends on unknown parameters, i.e., the network weights. We show how this insight has been used, e.g., to study the <code>stability of neural networks, design new architectures, or use established methods from optimal control methods for training ResNets</code>. Finally, we discuss open questions and opportunities for mathematical advances in this area.</a></p>
<ul>
<li><a href="https://elsc.huji.ac.il/all-publications/1050">Path integral approach to random neural networks</a></li>
<li><a href="https://rkevingibson.github.io/blog/neural-networks-as-ordinary-differential-equations/">NEURAL NETWORKS AS ORDINARY DIFFERENTIAL EQUATIONS</a></li>
<li><a href="https://zhenyu-liao.github.io/pdf/pre/GDD_iCODE.pdf">Dynamical aspects of Deep Learning</a></li>
<li><a href="http://www.doc.ic.ac.uk/~ae/teaching.html#complex">Dynamical Systems and Deep Learning</a></li>
<li><a href="https://zhuanlan.zhihu.com/p/71747175">https://zhuanlan.zhihu.com/p/71747175</a></li>
<li><a href="https://web.stanford.edu/~yplu/project.html">https://web.stanford.edu/~yplu/project.html</a></li>
<li><a href="https://github.com/2prime/ODE-DL/">https://github.com/2prime/ODE-DL/</a></li>
<li><a href="https://arxiv.org/pdf/1804.04272.pdf">Deep Neural Networks Motivated by Partial Differential Equations</a></li>
</ul>
<ul>
<li><a href="https://www.researchgate.net/scientific-contributions/2107227289_Eldad_Haber">https://www.researchgate.net/scientific-contributions/2107227289_Eldad_Haber</a></li>
</ul>
<p>Residual networks as discretizations of dynamical systems:</p>
<p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>Y</mi><mn>1</mn></msub><mo>=</mo><msub><mi>Y</mi><mn>0</mn></msub><mo>+</mo><mi>h</mi><mi>σ</mi><mo stretchy="false">(</mo><msub><mi>K</mi><mn>0</mn></msub><msub><mi>Y</mi><mn>0</mn></msub><mo>+</mo><msub><mi>b</mi><mn>0</mn></msub><mo stretchy="false">)</mo><mspace linebreak="newline"></mspace><mi><mi mathvariant="normal">⋮</mi><mpadded height="+0em" voffset="0em"><mspace mathbackground="black" width="0em" height="1.5em"></mspace></mpadded></mi><mspace linebreak="newline"></mspace><msub><mi>Y</mi><mi>N</mi></msub><mo>=</mo><msub><mi>Y</mi><mrow><mi>N</mi><mo>−</mo><mn>1</mn></mrow></msub><mo>+</mo><mi>h</mi><mi>σ</mi><mo stretchy="false">(</mo><msub><mi>K</mi><mrow><mi>N</mi><mo>−</mo><mn>1</mn></mrow></msub><msub><mi>Y</mi><mrow><mi>N</mi><mo>−</mo><mn>1</mn></mrow></msub><mo>+</mo><msub><mi>b</mi><mrow><mi>N</mi><mo>−</mo><mn>1</mn></mrow></msub><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">Y_1 = Y_0 +h \sigma(K_0 Y_0 + b_0)\\
\vdots \\
Y_N = Y_{N-1} +h \sigma(K_{N-1} Y_{N-1} + b_{N-1})
</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.83333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.30110799999999993em;"><span style="top:-2.5500000000000003em;margin-left:-0.22222em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:0.83333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.30110799999999993em;"><span style="top:-2.5500000000000003em;margin-left:-0.22222em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">0</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathdefault">h</span><span class="mord mathdefault" style="margin-right:0.03588em;">σ</span><span class="mopen">(</span><span class="mord"><span class="mord mathdefault" style="margin-right:0.07153em;">K</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.30110799999999993em;"><span style="top:-2.5500000000000003em;margin-left:-0.07153em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">0</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.30110799999999993em;"><span style="top:-2.5500000000000003em;margin-left:-0.22222em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">0</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span 
class="mord"><span class="mord mathdefault">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.30110799999999993em;"><span style="top:-2.5500000000000003em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">0</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span></span><span class="mspace newline"></span><span class="base"><span class="strut" style="height:1.53em;vertical-align:-0.03em;"></span><span class="mord"><span class="mord">⋮</span><span class="mord rule" style="border-right-width:0em;border-top-width:1.5em;bottom:0em;"></span></span></span><span class="mspace newline"></span><span class="base"><span class="strut" style="height:0.83333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.32833099999999993em;"><span style="top:-2.5500000000000003em;margin-left:-0.22222em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault mtight" style="margin-right:0.10903em;">N</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:0.891661em;vertical-align:-0.208331em;"></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.328331em;"><span style="top:-2.5500000000000003em;margin-left:-0.22222em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathdefault mtight" style="margin-right:0.10903em;">N</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.208331em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathdefault">h</span><span class="mord mathdefault" style="margin-right:0.03588em;">σ</span><span class="mopen">(</span><span class="mord"><span class="mord mathdefault" style="margin-right:0.07153em;">K</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.328331em;"><span style="top:-2.5500000000000003em;margin-left:-0.07153em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathdefault mtight" 
style="margin-right:0.10903em;">N</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.208331em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.328331em;"><span style="top:-2.5500000000000003em;margin-left:-0.22222em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathdefault mtight" style="margin-right:0.10903em;">N</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.208331em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathdefault">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.328331em;"><span style="top:-2.5500000000000003em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathdefault mtight" style="margin-right:0.10903em;">N</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.208331em;"><span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span></span></p>
<p>This is nothing but a forward Euler discretization of the <code>Ordinary Differential Equation (ODE)</code>:</p>
<p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="normal">∂</mi><mi>Y</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>=</mo><mi>σ</mi><mo stretchy="false">(</mo><mi>K</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mi>Y</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>+</mo><mi>b</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo stretchy="false">)</mo><mo separator="true">,</mo><mi>Y</mi><mo stretchy="false">(</mo><mn>0</mn><mo stretchy="false">)</mo><mo>=</mo><msub><mi>Y</mi><mn>0</mn></msub><mo separator="true">,</mo><mi>t</mi><mo>∈</mo><mo stretchy="false">[</mo><mn>0</mn><mo separator="true">,</mo><mi>T</mi><mo stretchy="false">]</mo><mi mathvariant="normal">.</mi></mrow><annotation encoding="application/x-tex">\partial Y(t)=\sigma(K(t) Y(t) + b(t)), Y(0)=Y_0, t\in[0, T].
</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord" style="margin-right:0.05556em;">∂</span><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathdefault" style="margin-right:0.03588em;">σ</span><span class="mopen">(</span><span class="mord mathdefault" style="margin-right:0.07153em;">K</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose">)</span><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathdefault">b</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose">)</span><span class="mclose">)</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="mopen">(</span><span class="mord">0</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:0.8777699999999999em;vertical-align:-0.19444em;"></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.30110799999999993em;"><span style="top:-2.5500000000000003em;margin-left:-0.22222em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">0</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault">t</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">∈</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">[</span><span class="mord">0</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault" style="margin-right:0.13889em;">T</span><span class="mclose">]</span><span class="mord">.</span></span></span></span></span></p>
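<p>To see the correspondence numerically, the hedged NumPy sketch below takes <code>N</code> forward-Euler steps of size <code>h = T/N</code> on this ODE; each step is exactly one residual block of the recursion above (a time-independent <code>K</code> and <code>b</code> are assumed purely for brevity):</p>
<pre><code class="language-python">import numpy as np

def euler_resnet(Y0, K, b, T=1.0, N=10, sigma=np.tanh):
    """Forward Euler on dY/dt = sigma(K Y + b); one Euler step == one residual block."""
    h = T / N
    Y = Y0
    for _ in range(N):
        Y = Y + h * sigma(K @ Y + b)
    return Y

rng = np.random.default_rng(1)
K = 0.5 * rng.standard_normal((3, 3))
print(euler_resnet(rng.standard_normal(3), K, np.zeros(3)))
</code></pre>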
<p>The goal is to plan a path (via <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>K</mi></mrow><annotation encoding="application/x-tex">K</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.68333em;vertical-align:0em;"></span><span class="mord mathdefault" style="margin-right:0.07153em;">K</span></span></span></span> and <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>b</mi></mrow><annotation encoding="application/x-tex">b</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.69444em;vertical-align:0em;"></span><span class="mord mathdefault">b</span></span></span></span>) such that the initial data can be linearly separated.</p>
<img src="http://www.mathcs.emory.edu/~lruthot/img/DeepLearning.png" width="80%" />
<p>Another idea is to ensure stability by design / constraints on <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>σ</mi></mrow><annotation encoding="application/x-tex">\sigma</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.43056em;vertical-align:0em;"></span><span class="mord mathdefault" style="margin-right:0.03588em;">σ</span></span></span></span> and <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>K</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo separator="true">,</mo><mi>b</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">K(t), b(t)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathdefault" style="margin-right:0.07153em;">K</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose">)</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault">b</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose">)</span></span></span></span>.</p>
<p>ResNet with antisymmetric transformation matrix:</p>
<p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="normal">∂</mi><mi>Y</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>=</mo><mi>σ</mi><mo stretchy="false">(</mo><mo stretchy="false">[</mo><mi>K</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>−</mo><mi>K</mi><mo stretchy="false">(</mo><mi>t</mi><msup><mo stretchy="false">)</mo><mi>T</mi></msup><mo stretchy="false">]</mo><mi>Y</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>+</mo><mi>b</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo stretchy="false">)</mo><mo separator="true">,</mo><mi>Y</mi><mo stretchy="false">(</mo><mn>0</mn><mo stretchy="false">)</mo><mo>=</mo><msub><mi>Y</mi><mn>0</mn></msub><mo separator="true">,</mo><mi>t</mi><mo>∈</mo><mo stretchy="false">[</mo><mn>0</mn><mo separator="true">,</mo><mi>T</mi><mo stretchy="false">]</mo><mi mathvariant="normal">.</mi></mrow><annotation encoding="application/x-tex">\partial Y(t)=\sigma([K(t)-K(t)^T] Y(t) + b(t)), Y(0)=Y_0, t\in[0, T].
</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord" style="margin-right:0.05556em;">∂</span><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathdefault" style="margin-right:0.03588em;">σ</span><span class="mopen">(</span><span class="mopen">[</span><span class="mord mathdefault" style="margin-right:0.07153em;">K</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:1.1413309999999999em;vertical-align:-0.25em;"></span><span class="mord mathdefault" style="margin-right:0.07153em;">K</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8913309999999999em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault mtight" style="margin-right:0.13889em;">T</span></span></span></span></span></span></span></span><span class="mclose">]</span><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathdefault">b</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose">)</span><span class="mclose">)</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="mopen">(</span><span class="mord">0</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:0.8777699999999999em;vertical-align:-0.19444em;"></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.30110799999999993em;"><span style="top:-2.5500000000000003em;margin-left:-0.22222em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">0</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" 
style="height:0.15em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault">t</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">∈</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">[</span><span class="mord">0</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault" style="margin-right:0.13889em;">T</span><span class="mclose">]</span><span class="mord">.</span></span></span></span></span></p>
<p>Hamiltonian-like ResNet:</p>
<p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mi mathvariant="normal">d</mi><mrow><mi mathvariant="normal">d</mi><mi>t</mi></mrow></mfrac><mo stretchy="false">(</mo><mi>Y</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo separator="true">,</mo><mi>Z</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><msup><mo stretchy="false">)</mo><mi>T</mi></msup><mo>=</mo><mi>σ</mi><mo stretchy="false">[</mo><mo stretchy="false">(</mo><mi>K</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mi>Z</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo separator="true">,</mo><mo>−</mo><mi>K</mi><mo stretchy="false">(</mo><mi>t</mi><msup><mo stretchy="false">)</mo><mi>T</mi></msup><mi>Y</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><msup><mo stretchy="false">)</mo><mi>T</mi></msup><mo>+</mo><mi>b</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo stretchy="false">]</mo><mo separator="true">,</mo><mi>t</mi><mo>∈</mo><mo stretchy="false">[</mo><mn>0</mn><mo separator="true">,</mo><mi>T</mi><mo stretchy="false">]</mo><mi mathvariant="normal">.</mi></mrow><annotation encoding="application/x-tex">\frac{\mathrm d}{\mathrm d t}(Y(t), Z(t))^T=\sigma[(K(t)Z(t), -K(t)^T Y(t))^T + b(t)], t\in[0, T].
</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.05744em;vertical-align:-0.686em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.37144em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathrm">d</span><span class="mord mathdefault">t</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathrm">d</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mopen">(</span><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose">)</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault" style="margin-right:0.07153em;">Z</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose">)</span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8913309999999999em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault mtight" style="margin-right:0.13889em;">T</span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:1.1413309999999999em;vertical-align:-0.25em;"></span><span class="mord mathdefault" style="margin-right:0.03588em;">σ</span><span class="mopen">[</span><span class="mopen">(</span><span class="mord mathdefault" style="margin-right:0.07153em;">K</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose">)</span><span class="mord mathdefault" style="margin-right:0.07153em;">Z</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose">)</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord">−</span><span class="mord mathdefault" style="margin-right:0.07153em;">K</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8913309999999999em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault mtight" style="margin-right:0.13889em;">T</span></span></span></span></span></span></span></span><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span 
class="mclose">)</span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8913309999999999em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault mtight" style="margin-right:0.13889em;">T</span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathdefault">b</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose">)</span><span class="mclose">]</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault">t</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">∈</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">[</span><span class="mord">0</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault" style="margin-right:0.13889em;">T</span><span class="mclose">]</span><span class="mord">.</span></span></span></span></span></p>
<p><code>Parabolic Residual Neural Networks</code> (heat-equation-like, dissipative dynamics; form as in the Ruthotto–Haber paper "Deep Neural Networks Motivated by Partial Differential Equations" linked above):</p>
<p>$$\partial_t Y(t) = -K(t)^T \sigma(K(t) Y(t) + b(t)), \quad Y(0) = Y_0, \quad t \in [0, T].$$</p>
<p><code>Hyperbolic Residual Neural Networks</code> (second order in time, wave-equation-like and reversible; same source):</p>
<p>$$\partial_t^2 Y(t) = -K(t)^T \sigma(K(t) Y(t) + b(t)), \quad Y(0) = Y_0, \quad \partial_t Y(0) = 0, \quad t \in [0, T].$$</p>
<p><code>Hamiltonian CNN</code> (a first-order system with an auxiliary state <code>Z</code> that conserves energy; same source):</p>
<p>$$\partial_t Y(t) = -K(t)^T \sigma(K(t) Z(t) + b(t)), \quad \partial_t Z(t) = K(t)^T \sigma(K(t) Y(t) + b(t)), \quad t \in [0, T].$$</p>
<ul>
<li><a href="https://github.com/IPAIopen/NumDL-CourseNotes">Numerical methods for deep learning</a></li>
<li><a href="http://www.mathcs.emory.edu/~lruthot/courses/NumDL/index.html">Short Course on Numerical Methods for Deep Learning</a></li>
<li><a href="http://www.mathcs.emory.edu/~lruthot/talks/2019-LR-IPAM-ODE-handout.pdf">Deep Neural Networks Motivated By Ordinary Differential Equations</a></li>
<li><a href="http://www.mathcs.emory.edu/~lruthot/courses/NumDL/3-NumDNNshort-ContinuousModels.pdf">Continuous Models: Numerical Methods for Deep Learning</a></li>
<li><a href="https://arxiv.org/abs/1905.10484">Fully Hyperbolic Convolutional Neural Networks</a></li>
<li><a href="https://eldad-haber.webnode.com/selected-talks/">https://eldad-haber.webnode.com/selected-talks/</a></li>
<li><a href="http://www.mathcs.emory.edu/~lruthot/courses/NumDL/3-NumDNNshort-ContinuousModels.pdf">http://www.mathcs.emory.edu/~lruthot/courses/NumDL/3-NumDNNshort-ContinuousModels.pdf</a></li>
</ul>
<img src="https://pic4.zhimg.com/80/v2-542db02f15d327ccc7558df7a8e6e137_hd.jpg" width="60%"/>
<p><code>Numerical differential equation inspired networks</code>:</p>
<p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><msub><mi>Y</mi><mrow><mi>t</mi><mo>+</mo><mn>1</mn></mrow></msub><mo>=</mo><mo stretchy="false">(</mo><mn>1</mn><mo>−</mo><msub><mi>k</mi><mi>t</mi></msub><mo stretchy="false">)</mo><msub><mi>Y</mi><mrow><mi>t</mi><mo>−</mo><mn>1</mn></mrow></msub><mo>+</mo><msub><mi>k</mi><mi>t</mi></msub><msub><mi>Y</mi><mi>t</mi></msub><mo>+</mo><mi>h</mi><mi>σ</mi><mo stretchy="false">(</mo><msub><mi>K</mi><mi>t</mi></msub><msub><mi>Y</mi><mi>t</mi></msub><mo>+</mo><msub><mi>b</mi><mi>t</mi></msub><mo stretchy="false">)</mo><mi mathvariant="normal">.</mi></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(Linear multi-step structure)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">Y_{t+1} = (1-k_t)Y_{t-1} + k_t Y_t + h \sigma(K_{t} Y_{t} + b_{t})\tag{Linear multi-step structure}.
</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.891661em;vertical-align:-0.208331em;"></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.301108em;"><span style="top:-2.5500000000000003em;margin-left:-0.22222em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathdefault mtight">t</span><span class="mbin mtight">+</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.208331em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">(</span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.03148em;">k</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2805559999999999em;"><span style="top:-2.5500000000000003em;margin-left:-0.03148em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault mtight">t</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mord"><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.301108em;"><span style="top:-2.5500000000000003em;margin-left:-0.22222em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathdefault mtight">t</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.208331em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:0.84444em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.03148em;">k</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2805559999999999em;"><span style="top:-2.5500000000000003em;margin-left:-0.03148em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault 
mtight">t</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2805559999999999em;"><span style="top:-2.5500000000000003em;margin-left:-0.22222em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault mtight">t</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathdefault">h</span><span class="mord mathdefault" style="margin-right:0.03588em;">σ</span><span class="mopen">(</span><span class="mord"><span class="mord mathdefault" style="margin-right:0.07153em;">K</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2805559999999999em;"><span style="top:-2.5500000000000003em;margin-left:-0.07153em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathdefault mtight">t</span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2805559999999999em;"><span style="top:-2.5500000000000003em;margin-left:-0.22222em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathdefault mtight">t</span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathdefault">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2805559999999999em;"><span style="top:-2.5500000000000003em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathdefault mtight">t</span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mord">.</span></span><span class="tag"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord 
text"><span class="mord">(</span><span class="mord"><span class="mord">L</span><span class="mord">i</span><span class="mord">n</span><span class="mord">e</span><span class="mord">a</span><span class="mord">r</span><span class="mord"> </span><span class="mord">m</span><span class="mord">u</span><span class="mord">l</span><span class="mord">t</span><span class="mord">i</span><span class="mord">-</span><span class="mord">s</span><span class="mord">t</span><span class="mord">e</span><span class="mord">p</span><span class="mord"> </span><span class="mord">s</span><span class="mord">t</span><span class="mord">r</span><span class="mord">u</span><span class="mord">c</span><span class="mord">t</span><span class="mord">u</span><span class="mord">r</span><span class="mord">e</span></span><span class="mord">)</span></span></span></span></span></span></p>
<ul>
<li><a href="https://web.stanford.edu/~yplu/proj/lm/">Bridging Deep Architects and Numerical Differential Equations</a></li>
<li><a href="http://helper.ipam.ucla.edu/publications/glws3/glws3_15460.pdf">BRIDGING DEEP NEURAL NETWORKS AND DIFFERENTIAL EQUATIONS FOR IMAGE ANALYSIS AND BEYOND</a></li>
<li><a href="https://arxiv.org/abs/1710.10121">Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations</a></li>
<li><a href="http://bicmr.pku.edu.cn/~dongbin/">http://bicmr.pku.edu.cn/~dongbin/</a></li>
<li><a href="https://arxiv.org/pdf/1906.02762.pdf">https://arxiv.org/pdf/1906.02762.pdf</a></li>
<li><a href="https://zhuanlan.zhihu.com/p/87999707">Neural ODE Paper List</a></li>
</ul>
<ul>
<li><a href="https://ieeexplore.ieee.org/document/8281501">A Multiscale and Multidepth Convolutional Neural Network for Remote Sensing Imagery Pan-Sharpening</a></li>
<li><a href="https://arxiv.org/abs/1808.02376">https://arxiv.org/abs/1808.02376</a></li>
<li><a href="https://www.nature.com/articles/s41598-018-22871-z">Multimodal and Multiscale Deep Neural Networks for the Early Diagnosis of Alzheimer’s Disease using structural MR and FDG-PET images</a></li>
</ul>
<p><code>MgNet</code></p>
<p><a href="https://arxiv.org/pdf/1901.10415.pdf">As the solution space is often the dual of the data space in PDEs, the
analogous concept of feature space and data space (which are dual to each other) is introduced
in CNN. With such connections and new concept in the unified model, the function of various
convolution operations and pooling used in CNN can be better understood.</a></p>
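<p>A schematic PyTorch sketch of this multigrid reading of a CNN, with features <code>u</code> in the role of the solution and <code>f</code> in the role of the data: a few smoothing steps update <code>u</code> against the residual, and a strided convolution restricts both to the next coarser grid. Channel counts and the number of smoothing steps are assumptions; this is a loose sketch of the structure, not the paper's reference implementation.</p>
<pre><code class="language-python">import torch
import torch.nn as nn

class MgBlock(nn.Module):
    """One grid level: smoothing iterations u += B(relu(f - A(u))),
    then restriction (a strided convolution) to the next coarser grid."""
    def __init__(self, ch=16, smoothings=2):
        super().__init__()
        self.A = nn.Conv2d(ch, ch, 3, padding=1)   # "data" operator
        self.B = nn.Conv2d(ch, ch, 3, padding=1)   # "smoother"
        self.restrict = nn.Conv2d(ch, ch, 3, stride=2, padding=1)
        self.smoothings = smoothings

    def forward(self, u, f):
        for _ in range(self.smoothings):
            u = u + self.B(torch.relu(f - self.A(u)))
        return self.restrict(u), self.restrict(f)

block = MgBlock()
u = torch.zeros(1, 16, 32, 32)        # features ("solution space")
f = torch.randn(1, 16, 32, 32)        # data ("data space")
u2, f2 = block(u, f)                  # both now live on a 16x16 grid
</code></pre>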
<ul>
<li><a href="https://arxiv.org/pdf/1901.10415.pdf">MgNet: A Unified Framework of Multigrid and Convolutional Neural Network</a></li>
<li><a href="http://www.multigrid.org/img2019/img2019/Index/shortcourse.html">http://www.multigrid.org/img2019/img2019/Index/shortcourse.html</a></li>
<li><a href="https://deepai.org/machine-learning/researcher/jinchao-xu">https://deepai.org/machine-learning/researcher/jinchao-xu</a></li>
</ul>
<hr>
<ul>
<li><a href="http://www.ms.uky.edu/~qye/MA721/ma721F17.html">MA 721: Topics in Numerical Analysis: Deep Learning</a></li>
<li><a href="http://www.mathcs.emory.edu/~lruthot/teaching.html">http://www.mathcs.emory.edu/~lruthot/teaching.html</a></li>
<li><a href="https://www.math.ucla.edu/applied/cam">https://www.math.ucla.edu/applied/cam</a></li>
<li><a href="http://www.mathcs.emory.edu/~lruthot/">http://www.mathcs.emory.edu/~lruthot/</a></li>
<li><a href="https://autodiff-workshop.github.io/slides/Hueckelheim_nips_autodiff_CNN_PDE.pdf">Automatic Differentiation of Parallelised Convolutional Neural Networks - Lessons from Adjoint PDE Solvers</a></li>
<li><a href="https://www.math.tu-berlin.de/fachgebiete_ag_modnumdiff/angewandtefunktionalanalysis/v_menue/mitarbeiter/kutyniok/v_menue/kutyniok_publications/">A Theoretical Analysis of Deep Neural Networks and Parametric PDEs.</a></li>
<li><a href="https://raoyongming.github.io/">https://raoyongming.github.io/</a></li>
<li><a href="https://sites.google.com/prod/view/haizhaoyang/">https://sites.google.com/prod/view/haizhaoyang/</a></li>
<li><a href="https://github.com/HaizhaoYang">https://github.com/HaizhaoYang</a></li>
<li><a href="https://www.stat.uchicago.edu/events/rtg/index.shtml">https://www.stat.uchicago.edu/events/rtg/index.shtml</a></li>
</ul>
<h3 id="control-theory-and-deep-learning">Control Theory and Deep Learning</h3>
<p><a href="http://scriptedonachip.com/ml-control">It arose out of control theory literature when people were trying to identify highly complex and nonlinear dynamical systems. Neural networks – artificial neural networks – were first used in a supervised learning scenario in control theory. Hornik, if I remember correctly, was the first to find that neural networks were universal approximators.</a></p>
<blockquote>
<p>Supervised Deep Learning Problem
Given training data, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>Y</mi><mn>0</mn></msub></mrow><annotation encoding="application/x-tex">Y_0</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.83333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.30110799999999993em;"><span style="top:-2.5500000000000003em;margin-left:-0.22222em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">0</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>, and labels, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>C</mi></mrow><annotation encoding="application/x-tex">C</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.68333em;vertical-align:0em;"></span><span class="mord mathdefault" style="margin-right:0.07153em;">C</span></span></span></span>, find network parameters <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>θ</mi></mrow><annotation encoding="application/x-tex">\theta</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.69444em;vertical-align:0em;"></span><span class="mord mathdefault" style="margin-right:0.02778em;">θ</span></span></span></span> and
classification weights <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>W</mi><mo separator="true">,</mo><mi>μ</mi></mrow><annotation encoding="application/x-tex">W, \mu</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8777699999999999em;vertical-align:-0.19444em;"></span><span class="mord mathdefault" style="margin-right:0.13889em;">W</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault">μ</span></span></span></span> such that the DNN predicts the data-label
relationship (and generalizes to new data), i.e., solve</p>
</blockquote>
<p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mo><mi mathvariant="normal">minimize</mi><mo></mo></mo><mrow><mi>θ</mi><mo separator="true">,</mo><mi>W</mi><mo separator="true">,</mo><mi>μ</mi></mrow></msub><mi>l</mi><mi>o</mi><mi>s</mi><mi>s</mi><mo stretchy="false">[</mo><mi>g</mi><mo stretchy="false">(</mo><mi>W</mi><mo separator="true">,</mo><mi>μ</mi><mo stretchy="false">)</mo><mo separator="true">,</mo><mi>C</mi><mo stretchy="false">]</mo><mo>+</mo><mi>r</mi><mi>e</mi><mi>g</mi><mi>u</mi><mi>l</mi><mi>a</mi><mi>r</mi><mi>i</mi><mi>z</mi><mi>e</mi><mi>r</mi><mo stretchy="false">[</mo><mi>θ</mi><mo separator="true">,</mo><mi>W</mi><mo separator="true">,</mo><mi>μ</mi><mo stretchy="false">]</mo></mrow><annotation encoding="application/x-tex">\operatorname{minimize}_{ \theta,W,\mu} loss[g(W, \mu), C] + regularizer[\theta,W,\mu]
</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.036108em;vertical-align:-0.286108em;"></span><span class="mop"><span class="mop"><span class="mord mathrm">m</span><span class="mord mathrm">i</span><span class="mord mathrm">n</span><span class="mord mathrm">i</span><span class="mord mathrm">m</span><span class="mord mathrm">i</span><span class="mord mathrm">z</span><span class="mord mathrm">e</span></span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361079999999999em;"><span style="top:-2.5500000000000003em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathdefault mtight" style="margin-right:0.02778em;">θ</span><span class="mpunct mtight">,</span><span class="mord mathdefault mtight" style="margin-right:0.13889em;">W</span><span class="mpunct mtight">,</span><span class="mord mathdefault mtight">μ</span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.286108em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault" style="margin-right:0.01968em;">l</span><span class="mord mathdefault">o</span><span class="mord mathdefault">s</span><span class="mord mathdefault">s</span><span class="mopen">[</span><span class="mord mathdefault" style="margin-right:0.03588em;">g</span><span class="mopen">(</span><span class="mord mathdefault" style="margin-right:0.13889em;">W</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault">μ</span><span class="mclose">)</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault" style="margin-right:0.07153em;">C</span><span class="mclose">]</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathdefault" style="margin-right:0.02778em;">r</span><span class="mord mathdefault">e</span><span class="mord mathdefault" style="margin-right:0.03588em;">g</span><span class="mord mathdefault">u</span><span class="mord mathdefault" style="margin-right:0.01968em;">l</span><span class="mord mathdefault">a</span><span class="mord mathdefault" style="margin-right:0.02778em;">r</span><span class="mord mathdefault">i</span><span class="mord mathdefault" style="margin-right:0.04398em;">z</span><span class="mord mathdefault">e</span><span class="mord mathdefault" style="margin-right:0.02778em;">r</span><span class="mopen">[</span><span class="mord mathdefault" style="margin-right:0.02778em;">θ</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault" style="margin-right:0.13889em;">W</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault">μ</span><span class="mclose">]</span></span></span></span></span></p>
<p>This can be rewritten in compact form as</p>
<p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mo><mi mathvariant="normal">minimize</mi><mo></mo></mo><mrow><mi>θ</mi><mo separator="true">,</mo><mi>W</mi><mo separator="true">,</mo><mi>μ</mi></mrow></msub><mi>l</mi><mi>o</mi><mi>s</mi><mi>s</mi><mo stretchy="false">[</mo><mi>g</mi><mo stretchy="false">(</mo><mi>W</mi><mo stretchy="false">(</mo><mi>T</mi><mo stretchy="false">)</mo><mi>Y</mi><mo stretchy="false">(</mo><mi>T</mi><mo stretchy="false">)</mo><mo>+</mo><mi>μ</mi><mo stretchy="false">)</mo><mo separator="true">,</mo><mi>C</mi><mo stretchy="false">]</mo><mo>+</mo><mi>r</mi><mi>e</mi><mi>g</mi><mi>u</mi><mi>l</mi><mi>a</mi><mi>r</mi><mi>i</mi><mi>z</mi><mi>e</mi><mi>r</mi><mo stretchy="false">[</mo><mi>θ</mi><mo separator="true">,</mo><mi>W</mi><mo separator="true">,</mo><mi>μ</mi><mo stretchy="false">]</mo><mspace linebreak="newline"></mspace><mtext>subject to </mtext><msub><mi mathvariant="normal">∂</mi><mi>t</mi></msub><mi>Y</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>=</mo><mi>f</mi><mo stretchy="false">(</mo><mi>Y</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo separator="true">,</mo><mi>θ</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo stretchy="false">)</mo><mo separator="true">,</mo><mi>Y</mi><mo stretchy="false">(</mo><mn>0</mn><mo stretchy="false">)</mo><mo>=</mo><msub><mi>Y</mi><mn>0</mn></msub><mi mathvariant="normal">.</mi></mrow><annotation encoding="application/x-tex">\operatorname{minimize}_{ \theta,W,\mu} loss[g(W(T)Y(T)+\mu), C] + regularizer[\theta,W,\mu]\\
\text{subject to }\partial_t Y(t) = f (Y(t), \theta(t)), Y(0) = Y_0.</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.036108em;vertical-align:-0.286108em;"></span><span class="mop"><span class="mop"><span class="mord mathrm">m</span><span class="mord mathrm">i</span><span class="mord mathrm">n</span><span class="mord mathrm">i</span><span class="mord mathrm">m</span><span class="mord mathrm">i</span><span class="mord mathrm">z</span><span class="mord mathrm">e</span></span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361079999999999em;"><span style="top:-2.5500000000000003em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathdefault mtight" style="margin-right:0.02778em;">θ</span><span class="mpunct mtight">,</span><span class="mord mathdefault mtight" style="margin-right:0.13889em;">W</span><span class="mpunct mtight">,</span><span class="mord mathdefault mtight">μ</span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.286108em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault" style="margin-right:0.01968em;">l</span><span class="mord mathdefault">o</span><span class="mord mathdefault">s</span><span class="mord mathdefault">s</span><span class="mopen">[</span><span class="mord mathdefault" style="margin-right:0.03588em;">g</span><span class="mopen">(</span><span class="mord mathdefault" style="margin-right:0.13889em;">W</span><span class="mopen">(</span><span class="mord mathdefault" style="margin-right:0.13889em;">T</span><span class="mclose">)</span><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="mopen">(</span><span class="mord mathdefault" style="margin-right:0.13889em;">T</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathdefault">μ</span><span class="mclose">)</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault" style="margin-right:0.07153em;">C</span><span class="mclose">]</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathdefault" style="margin-right:0.02778em;">r</span><span class="mord mathdefault">e</span><span class="mord mathdefault" style="margin-right:0.03588em;">g</span><span class="mord mathdefault">u</span><span class="mord mathdefault" style="margin-right:0.01968em;">l</span><span class="mord mathdefault">a</span><span class="mord mathdefault" style="margin-right:0.02778em;">r</span><span class="mord mathdefault">i</span><span class="mord mathdefault" style="margin-right:0.04398em;">z</span><span class="mord mathdefault">e</span><span class="mord mathdefault" style="margin-right:0.02778em;">r</span><span 
class="mopen">[</span><span class="mord mathdefault" style="margin-right:0.02778em;">θ</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault" style="margin-right:0.13889em;">W</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault">μ</span><span class="mclose">]</span></span><span class="mspace newline"></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord text"><span class="mord">subject to </span></span><span class="mord"><span class="mord" style="margin-right:0.05556em;">∂</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2805559999999999em;"><span style="top:-2.5500000000000003em;margin-left:-0.05556em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault mtight">t</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathdefault" style="margin-right:0.10764em;">f</span><span class="mopen">(</span><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose">)</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault" style="margin-right:0.02778em;">θ</span><span class="mopen">(</span><span class="mord mathdefault">t</span><span class="mclose">)</span><span class="mclose">)</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="mopen">(</span><span class="mord">0</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:0.83333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathdefault" style="margin-right:0.22222em;">Y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.30110799999999993em;"><span style="top:-2.5500000000000003em;margin-left:-0.22222em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">0</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord">.</span></span></span></span></span></p>
<ul>
<li><a href="https://arxiv.org/abs/1908.10920">Deep Learning Theory Review: An Optimal Control and Dynamical Systems Perspective</a></li>
<li><a href="http://proceedings.mlr.press/v80/li18b/li18b.pdf">An Optimal Control Approach to Deep Learning and Applications to Discrete-Weight Neural Networks</a></li>
<li><a href="https://web.stanford.edu/~yplu/DynamicOCNN.pdf">Dynamic System and Optimal Control Perspective of Deep Learning</a></li>
<li><a href="https://www.nsf.gov/awardsearch/showAward?AWD_ID=1751636">A Flexible Optimal Control Framework for Efficient Training of Deep Neural Networks</a></li>
<li><a href="https://arxiv.org/pdf/1904.05657.pdf">Deep learning as optimal control problems: models and numerical methods</a></li>
<li><a href="https://deepai.org/publication/a-mean-field-optimal-control-formulation-of-deep-learning">A Mean-Field Optimal Control Formulation of Deep Learning</a></li>
<li><a href="http://scriptedonachip.com/ml-control">Control Theory and Machine Learning</a></li>
<li><a href="https://faculty.sites.uci.edu/khargonekar/files/2018/04/Control_ML_AI_Final.pdf">Advancing Systems and Control Research in the Era of ML and AI</a></li>
<li><a href="http://marcogallieri.micso.it/Home.html">http://marcogallieri.micso.it/Home.html</a></li>
<li><a href="http://www.eventideib.polimi.it/events/deep-learning-meets-control-theory-research-at-nnaisense-and-polimi/">Deep Learning meets Control Theory: Research at NNAISENSE and Polimi</a></li>
<li><a href="https://github.com/lakehanne/awesome-neurocontrol">Machine Learning-based Control</a></li>
<li><a href="https://www.nsf.gov/awardsearch/showAward?AWD_ID=1751636">CAREER: A Flexible Optimal Control Framework for Efficient Training of Deep Neural Networks</a></li>
<li><a href="https://www.zhihu.com/question/315809187/answer/623687046">https://www.zhihu.com/question/315809187/answer/623687046</a></li>
<li><a href="https://www4.comp.polyu.edu.hk/~cslzhang/paper/CVPR19-FOCNet.pdf">https://www4.comp.polyu.edu.hk/~cslzhang/paper/CVPR19-FOCNet.pdf</a></li>
</ul>
<h3 id="neural-ordinary-differential-equations">Neural Ordinary Differential Equations</h3>
<p><code>Neural ODE</code></p>
<ul>
<li><a href="http://www.cs.toronto.edu/~rtqichen/pdfs/neural_ode_slides.pdf">Neural Ordinary Differential Equations</a></li>
</ul>
<img src="https://rkevingibson.github.io/img/ode_networks_1.png" width="80%" />
<ul>
<li><a href="https://www.arxiv-vanity.com/papers/1908.03190/">NeuPDE: Neural Network Based Ordinary and Partial Differential Equations for Modeling Time-Dependent Data</a></li>
<li><a href="https://rajatvd.github.io/Neural-ODE-Adversarial/">Neural Ordinary Differential Equations and Adversarial Attacks</a></li>
<li><a href="http://ganguli-gang.stanford.edu/">Neural Dynamics and Computation Lab</a></li>
<li><a href="https://arxiv.org/abs/1908.03190">NeuPDE: Neural Network Based Ordinary and Partial Differential Equations for Modeling Time-Dependent Data</a></li>
<li><a href="https://math.ethz.ch/sam/research/reports.html?year=2019">https://math.ethz.ch/sam/research/reports.html?year=2019</a></li>
</ul>
<h2 id="dynamics-and-deep-learning">Dynamics and Deep Learning</h2>
<ul>
<li><a href="http://roseyu.com/">http://roseyu.com/</a></li>
<li><a href="https://link.springer.com/article/10.1007/s40304-017-0103-z">A Proposal on Machine Learning via Dynamical Systems</a></li>
<li><a href="http://www.scholarpedia.org/article/Attractor_network">http://www.scholarpedia.org/article/Attractor_network</a></li>
<li><a href="http://proceedings.mlr.press/v37/jozefowicz15.pdf">An Empirical Exploration of Recurrent Network Architectures</a></li>
<li><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3984152/">An Attractor-Based Complexity Measurement for Boolean Recurrent Neural Networks</a></li>
<li><a href="https://doaj.org/article/9d9172e9bf324cc6ac6d48ff8e234a85">Deep learning for universal linear embeddings of nonlinear dynamics</a></li>
<li><a href="http://ganguli-gang.stanford.edu/pdf/DynamLearn.pdf">Exact solutions to the nonlinear dynamics of learning in deep linear neural networks</a></li>
<li><a href="https://www.sciencedirect.com/science/article/pii/S0925231213009338">Continuous attractors of higher-order recurrent neural networks with infinite neurons</a></li>
<li><a href="https://www.researchgate.net/profile/Jiali_Yu3">https://www.researchgate.net/profile/Jiali_Yu3</a></li>
<li><a href="https://cbmm.mit.edu/sites/default/files/publications/aaai-abstract%20%281%29.pdf">Markov Transitions between Attractor States in a Recurrent Neural Network</a></li>
<li><a href="https://sagarverma.github.io/others/lit_rev_physics.pdf">A Survey on Machine Learning Applied to Dynamic Physical Systems</a></li>
<li><a href="https://deepdrive.berkeley.edu/project/dynamical-view-machine-learning-systems">https://deepdrive.berkeley.edu/project/dynamical-view-machine-learning-systems</a></li>
</ul>
<h3 id="stability-for-neural-networks">Stability For Neural Networks</h3>
<ul>
<li><a href="https://folk.uio.no/vegarant/">https://folk.uio.no/vegarant/</a></li>
<li><a href="https://www.mn.uio.no/math/english/people/aca/vegarant/index.html">https://www.mn.uio.no/math/english/people/aca/vegarant/index.html</a></li>
<li><a href="https://arxiv.org/pdf/1710.11029.pdf">https://arxiv.org/pdf/1710.11029.pdf</a></li>
<li><a href="http://www.vision.jhu.edu/tutorials/ICCV15-Tutorial-Math-Deep-Learning-Raja.pdf">http://www.vision.jhu.edu/tutorials/ICCV15-Tutorial-Math-Deep-Learning-Raja.pdf</a></li>
<li><a href="https://arxiv.org/abs/1705.03341">https://arxiv.org/abs/1705.03341</a></li>
<li><a href="https://izmailovpavel.github.io/">https://izmailovpavel.github.io/</a></li>
<li><a href="https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Zheng_Improving_the_Robustness_CVPR_2016_paper.pdf">https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Zheng_Improving_the_Robustness_CVPR_2016_paper.pdf</a></li>
</ul>
<h2 id="differential-equation-and-deep-learning">Differential Equation and Deep Learning</h2>
<p>This section covers how to use deep learning, and machine learning more generally, to solve differential equations numerically.</p>
<p>We derive upper bounds on the complexity of ReLU neural networks approximating the solution maps of parametric partial differential equations.
In particular, without any knowledge of its concrete shape, we use the inherent low-dimensionality of the solution manifold to obtain approximation rates
which are significantly superior to those provided by classical approximation results.
We use this low dimensionality to guarantee the existence of a reduced basis.
<a href="https://www.math.tu-berlin.de/fileadmin/i26_fg-kutyniok/Kutyniok/Papers/Parametric_PDEs_and_NNs_.pdf">Then, for a large variety of parametric partial differential equations, we construct neural networks that yield approximations of the parametric maps not suffering from a curse of dimension and essentially only depending on the size of the reduced basis.</a></p>
<ul>
<li><a href="https://math.ethz.ch/sam/research/reports.html?year=2019">https://math.ethz.ch/sam/research/reports.html?year=2019</a></li>
<li><a href="https://aimath.org/workshops/upcoming/deeppde/">https://aimath.org/workshops/upcoming/deeppde/</a></li>
<li><a href="https://github.com/IBM/pde-deep-learning">https://github.com/IBM/pde-deep-learning</a></li>
<li><a href="https://arxiv.org/abs/1804.04272">https://arxiv.org/abs/1804.04272</a></li>
<li><a href="https://deepai.org/machine-learning/researcher/weinan-e">https://deepai.org/machine-learning/researcher/weinan-e</a></li>
<li><a href="https://deepxde.readthedocs.io/en/latest/">https://deepxde.readthedocs.io/en/latest/</a></li>
<li><a href="https://github.com/IBM/pde-deep-learning">https://github.com/IBM/pde-deep-learning</a></li>
<li><a href="https://github.com/ZichaoLong/PDE-Net">https://github.com/ZichaoLong/PDE-Net</a></li>
<li><a href="https://github.com/amkatrutsa/DeepPDE">https://github.com/amkatrutsa/DeepPDE</a></li>
<li><a href="https://github.com/maziarraissi/DeepHPMs">https://github.com/maziarraissi/DeepHPMs</a></li>
<li><a href="https://github.com/markovmodel/deeptime">https://github.com/markovmodel/deeptime</a></li>
<li><a href="https://arxiv.org/abs/1801.06637">Deep Hidden Physics Models: Deep Learning of Nonlinear Partial Differential Equations</a></li>
<li><a href="https://rse-lab.cs.washington.edu/papers/spnets2018.pdf">SPNets: Differentiable Fluid Dynamics for Deep Neural Networks</a></li>
<li><a href="https://maziarraissi.github.io/DeepHPMs/">https://maziarraissi.github.io/DeepHPMs/</a></li>
<li><a href="https://www.math.tu-berlin.de/fileadmin/i26_fg-kutyniok/Kutyniok/Papers/Parametric_PDEs_and_NNs_.pdf">A Theoretical Analysis of Deep Neural Networks and Parametric PDEs</a></li>
<li><a href="http://ins.sjtu.edu.cn:3300/conferences/7/talks/314">Deep Approximation via Deep Learning</a></li>
</ul>
<h3 id="deep-learning-for-pdes">Deep Learning for PDEs</h3>
<ul>
<li><a href="https://link.springer.com/article/10.1007/s40304-018-0127-z">The Deep Ritz Method: A Deep Learning-Based Numerical Algorithm for Solving Variational Problems</a></li>
<li><a href="http://utstat.toronto.edu/~ali/papers/PDEandDeepLearning.pdf">Solving Nonlinear and High-Dimensional Partial Differential Equations via Deep Learning</a></li>
<li><a href="https://www.sciencedirect.com/science/article/pii/S0021999118305527">DGM: A deep learning algorithm for solving partial differential equations</a></li>
<li><a href="https://julialang.org/blog/2017/10/gsoc-NeuralNetDiffEq">NeuralNetDiffEq.jl: A Neural Network solver for ODEs</a></li>
<li><a href="https://www.pims.math.ca/scientific-event/190722-pcssdlcm">PIMS CRG Summer School: Deep Learning for Computational Mathematics</a></li>
</ul>
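<p>A minimal sketch in the spirit of these collocation-based solvers (closer to DGM-style residual minimization than to the Deep Ritz energy formulation): a small network is trained so that the residual of -u'' = f vanishes at random collocation points, with the boundary condition built into the ansatz. The toy problem -u'' = pi^2 sin(pi x) on [0, 1] with u(0) = u(1) = 0 (exact solution sin(pi x)) and the architecture are assumptions:</p>
<pre><code class="language-python">import math
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def u(x):
    # The ansatz x * (1 - x) * net(x) satisfies u(0) = u(1) = 0 exactly.
    return x * (1 - x) * net(x)

for step in range(2000):
    x = torch.rand(128, 1, requires_grad=True)        # collocation points
    ux = torch.autograd.grad(u(x).sum(), x, create_graph=True)[0]
    uxx = torch.autograd.grad(ux.sum(), x, create_graph=True)[0]
    f = (math.pi ** 2) * torch.sin(math.pi * x)
    loss = ((-uxx - f) ** 2).mean()                   # residual of -u'' = f
    opt.zero_grad()
    loss.backward()
    opt.step()
# The trained u should approximate the exact solution sin(pi * x).
</code></pre>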
<ul>
<li><a href="https://arxiv.org/abs/1806.07366">https://arxiv.org/abs/1806.07366</a></li>
<li><a href="https://mat.univie.ac.at/~grohs/">https://mat.univie.ac.at/~grohs/</a></li>
<li><a href="https://rse-lab.cs.washington.edu/">https://rse-lab.cs.washington.edu/</a></li>
<li><a href="http://www.ajentzen.de/">http://www.ajentzen.de/</a></li>
<li><a href="https://web.math.princeton.edu/~jiequnh/">https://web.math.princeton.edu/~jiequnh/</a></li>
</ul>
<h3 id="mathcal-h-matrix-and-deep-learning"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="script">H</mi></mrow><annotation encoding="application/x-tex">\mathcal H</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.68333em;vertical-align:0em;"></span><span class="mord mathcal" style="margin-right:0.00965em;">H</span></span></span></span> matrix and deep learning</h3>
<p><a href="https://web.stanford.edu/~lexing/mnnh.pdf">In this work we introduce a new multiscale artificial neural network based on the structure of H-matrices. This network generalizes the latter to the nonlinear case by introducing a local deep neural network at each spatial scale. Numerical results indicate that the network is able to efficiently approximate discrete nonlinear maps obtained from discretized nonlinear partial differential equations, such as those arising from nonlinear Schodinger equations and the KohnSham density functional theory.</a></p>
<ul>
<li><a href="https://web.stanford.edu/~lexing/mnnh.pdf">A multiscale neural network based on hierarchical matrices</a></li>
<li><a href="https://link.springer.com/article/10.1007%2Fs40687-019-0183-3">A multiscale neural network based on hierarchical nested bases</a></li>
</ul>
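<p>A loose PyTorch sketch of the multiscale idea, not the paper's architecture: a small local network acts at each spatial scale of the input, and the upsampled contributions are summed, mimicking the scale-by-scale structure of a hierarchical-matrix product. Channel counts, kernel sizes, and the number of scales are assumptions:</p>
<pre><code class="language-python">import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiscaleNet(nn.Module):
    """Loose sketch: a small local network acts at each spatial scale,
    and the upsampled contributions are summed, mimicking the
    scale-by-scale structure of a hierarchical-matrix product."""
    def __init__(self, scales=3, ch=16):
        super().__init__()
        self.nets = nn.ModuleList([
            nn.Sequential(nn.Conv1d(1, ch, 5, padding=2), nn.ReLU(),
                          nn.Conv1d(ch, 1, 5, padding=2))
            for _ in range(scales)])

    def forward(self, v):                      # v: (batch, 1, n)
        out = torch.zeros_like(v)
        for s, net in enumerate(self.nets):
            coarse = F.avg_pool1d(v, 2 ** s) if s else v
            y = net(coarse)
            if s:
                y = F.interpolate(y, size=v.shape[-1], mode='linear')
            out = out + y
        return out

model = MultiscaleNet()
print(model(torch.randn(2, 1, 64)).shape)      # torch.Size([2, 1, 64])
</code></pre>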
<p><a href="https://www.researchgate.net/project/Mathematical-Theory-for-Deep-Neural-Networks">We aim to build a theoretical foundation for the analysis of deep neural networks to answer questions such as "What are the correct approximation spaces for deep neural networks?", "What is the advantage of deep versus shallow networks?", or "To which extent are deep neural networks able to detect low dimensional structures in high dimensional data?".</a></p>
<ul>
<li><a href="https://www.researchgate.net/profile/Gitta_Kutyniok">https://www.researchgate.net/profile/Gitta_Kutyniok</a></li>
<li><a href="https://www.researchgate.net/project/Mathematical-Theory-for-Deep-Neural-Networks">https://www.researchgate.net/project/Mathematical-Theory-for-Deep-Neural-Networks</a></li>
<li><a href="https://www.academia-net.org/profil/prof-dr-gitta-kutyniok/1133890">https://www.academia-net.org/profil/prof-dr-gitta-kutyniok/1133890</a></li>
<li><a href="https://www.tu-berlin.de/index.php?id=168945">https://www.tu-berlin.de/index.php?id=168945</a></li>
<li><a href="https://www.math.tu-berlin.de/?108957">https://www.math.tu-berlin.de/?108957</a></li>
<li><a href="https://arxiv.org/abs/1801.05894">Deep Learning: An Introduction for Applied Mathematicians</a></li>
</ul>
<h3 id="stochastic-differential-equations-and-deep-learning">Stochastic Differential Equations and Deep Learning</h3>
<ul>
<li><a href="http://www.stochasticlifestyle.com/neural-jump-sdes-jump-diffusions-and-neural-pdes/">Neural Jump SDEs (Jump Diffusions) and Neural PDEs</a></li>
<li><a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3366314">Deep-Learning Based Numerical BSDE Method for Barrier Options</a></li>
<li><a href="https://www.sam.math.ethz.ch/sam_reports/reports_final/reports2017/2017-49.pdf">Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations</a></li>
</ul>
<h3 id="finite-element-methods-and-deep-learning">Finite Element Methods and Deep Learning</h3>
<ul>
<li><a href="http://www.multigrid.org/index.php?id=13">http://www.multigrid.org/index.php?id=13</a></li>
<li><a href="http://casopisi.junis.ni.ac.rs/index.php/FUMechEng/article/view/309">http://casopisi.junis.ni.ac.rs/index.php/FUMechEng/article/view/309</a></li>
<li><a href="http://people.math.sc.edu/imi/DASIV/">http://people.math.sc.edu/imi/DASIV/</a></li>
<li><a href="https://www.sam.math.ethz.ch/sam_reports/reports_final/reports2019/2019-07.pdf">Deep ReLU Networks and High-Order Finite Element Methods</a></li>
<li><a href="https://math.psu.edu/events/35992">https://math.psu.edu/events/35992</a></li>
<li><a href="https://olemiss.edu/sciencenet/trefftz/Trefftz/Exeter/Javadi.pdf">Neural network for constitutive modelling in finite element analysis</a></li>
<li><a href="https://arxiv.org/abs/1807.03973">https://arxiv.org/abs/1807.03973</a></li>
<li><a href="https://royalsocietypublishing.org/doi/10.1098/rsif.2017.0844">A deep learning approach to estimate stress distribution: a fast and accurate surrogate of finite-element analysis</a></li>
<li><a href="https://repository.tudelft.nl/islandora/object/uuid%3A615f2151-bcae-4e78-a2cb-3f1891a28275">An Integrated Machine Learning and Finite Element Analysis Framework, Applied to Composite Substructures including Damage</a></li>
<li><a href="https://github.com/oleksiyskononenko/mlfem">https://github.com/oleksiyskononenko/mlfem</a></li>
<li><a href="https://people.math.gatech.edu/~wliao60/">https://people.math.gatech.edu/~wliao60/</a></li>
<li><a href="https://www.math.tu-berlin.de/fileadmin/i26_fg-kutyniok/Kutyniok/Papers/main.pdf">https://www.math.tu-berlin.de/fileadmin/i26_fg-kutyniok/Kutyniok/Papers/main.pdf</a></li>
</ul>
<h2 id="approximation-theory-for-deep-learning">Approximation Theory for Deep Learning</h2>
<p>Universal approximation theorems establish the expressive power of wide but shallow neural networks.
This section extends such approximation results to deep neural networks.</p>
<p><a href="https://epubs.siam.org/doi/pdf/10.1137/18M118709X">We derive fundamental lower bounds on the connectivity and the memory requirements of deep neural networks guaranteeing uniform approximation rates for arbitrary function classes in <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>L</mi><mn>2</mn></msup><mo stretchy="false">(</mo><msup><mi mathvariant="double-struck">R</mi><mi>d</mi></msup><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">L^2(\mathbb R^d)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.099108em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathdefault">L</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141079999999999em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathbb">R</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.849108em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault mtight">d</span></span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span>. In other words, we establish a connection between the complexity of a function class and the complexity of deep neural networks approximating functions from this class to within a prescribed accuracy.</a></p>
<ul>
<li><a href="https://arxiv.org/abs/1901.02220">Deep Neural Network Approximation Theory</a></li>
<li><a href="https://cpb-us-w2.wpmucdn.com/blog.nus.edu.sg/dist/d/11132/files/2019/07/paper_cnn_copy.pdf">Approximation Analysis of Convolutional Neural Networks</a></li>
<li><a href="https://arxiv.org/abs/1608.03287">Deep vs. shallow networks : An approximation theory perspective</a></li>
<li><a href="https://arxiv.org/abs/1901.02220">Deep Neural Network Approximation Theory</a></li>
<li><a href="https://cpsc.yale.edu/sites/default/files/files/tr1513(1).pdf">Provable approximation properties for deep neural networks</a></li>
<li><a href="https://epubs.siam.org/doi/pdf/10.1137/18M118709X">Optimal Approximation with Sparsely Connected Deep Neural Networks</a></li>
<li><a href="http://helper.ipam.ucla.edu/publications/dlt2018/dlt2018_14936.pdf">Deep Learning: Approximation of Functions by Composition</a></li>
<li><a href="http://www.mit.edu/~9.520/fall16/Classes/deep_approx.html">Deep Neural Networks: Approximation Theory and Compositionality</a></li>
<li><a href="http://voigtlaender.xyz/DNNBonnHandout.pdf">DNN Bonn</a></li>
<li><a href="http://npfsa2017.uni-jena.de/l_notes/vybiral.pdf">From approximation theory to machine learning</a></li>
<li><a href="https://arxiv.org/abs/1808.04947">Collapse of Deep and Narrow Neural Nets</a></li>
<li><a href="https://www.math.tamu.edu/~foucart/publi/DDFHP.pdf">Nonlinear Approximation and (Deep) ReLU Networks</a></li>
<li><a href="http://www.ipam.ucla.edu/abstract/?tid=15953&pcode=GLWS3">Deep Approximation via Deep Learning</a></li>
<li><a href="https://github.com/loliverhennigh/Steady-State-Flow-With-Neural-Nets">Convolutional Neural Networks for Steady Flow Approximation</a></li>
<li><a href="https://www.eurandom.tue.nl/wp-content/uploads/2018/11/Johannes-Schmidt-Hieber-lecture-1-2.pdf">https://www.eurandom.tue.nl/wp-content/uploads/2018/11/Johannes-Schmidt-Hieber-lecture-1-2.pdf</a></li>
<li><a href="https://arxiv.org/abs/2006.00294">https://arxiv.org/abs/2006.00294</a></li>
<li><a href="https://www.sam.math.ethz.ch/sam_reports/reports_final/reports2019/2019-64_fp.pdf">Efficient approximation of high-dimensional functions with deep neural networks</a></li>
</ul>
<h4 id="workshop">Workshop</h4>
<ul>
<li><a href="https://www.mfo.de/occasion/1842b">https://www.mfo.de/occasion/1842b</a></li>
<li><a href="https://www.mfo.de/occasion/1947a">https://www.mfo.de/occasion/1947a</a></li>
<li><a href="https://github.com/juliusberner/oberwolfach_workshop">https://github.com/juliusberner/oberwolfach_workshop</a></li>
<li><a href="https://www.math.tu-berlin.de/fileadmin/i26_fg-kutyniok/Petersen/DGD_Approximation_Theory.pdf">DGD Approximation Theory Workshop</a></li>
</ul>
<h4 id="labs-and-groups">Labs and Groups</h4>
<ul>
<li><a href="https://deepai.org/profile/julius-berner">https://deepai.org/profile/julius-berner</a></li>
<li><a href="https://www.cityu.edu.hk/ma/people/profile/zhoudx.htm">https://www.cityu.edu.hk/ma/people/profile/zhoudx.htm</a></li>
<li><a href="https://dblp.uni-trier.de/pers/hd/y/Yang:Haizhao">https://dblp.uni-trier.de/pers/hd/y/Yang:Haizhao</a></li>
<li><a href="https://math.duke.edu/people/ingrid-daubechies">https://math.duke.edu/people/ingrid-daubechies</a></li>
<li><a href="http://www.pc-petersen.eu/">http://www.pc-petersen.eu/</a></li>
<li><a href="https://wwwhome.ewi.utwente.nl/~schmidtaj/">https://wwwhome.ewi.utwente.nl/~schmidtaj/</a></li>
<li><a href="https://personal-homepages.mis.mpg.de/montufar/">https://personal-homepages.mis.mpg.de/montufar/</a></li>
<li><a href="https://www.math.tamu.edu/~foucart/">https://www.math.tamu.edu/~foucart/</a></li>
<li><a href="http://www.damtp.cam.ac.uk/user/sl767/#about">http://www.damtp.cam.ac.uk/user/sl767/#about</a></li>
<li><a href="http://voigtlaender.xyz/publications.html">http://voigtlaender.xyz/publications.html</a></li>
</ul>
<h3 id="the-f-principle">The F-Principle</h3>
<blockquote>
<p>Understanding the training process of Deep Neural Networks (DNNs) is a fundamental problem in the area of deep learning. Studying the training process from the frequency perspective makes important progress in understanding the strengths and weaknesses of DNNs, such as generalization and convergence speed, and may form part of “a reasonably complete picture about the main reasons behind the success of modern machine learning” (E et al., 2019).</p>
</blockquote>
<blockquote>
<p>The “Frequency Principle” was first named in the paper (Xu et al., 2018); then (Xu 2018; Xu et al., 2019) used more convincing experiments and a simple theory to demonstrate the universality of the Frequency Principle. Bengio's paper (Rahaman et al., 2019) also uses the simple theory in (Xu 2018; Xu et al., 2019) to understand the mechanism underlying the Frequency Principle for the ReLU activation function. Note that the second version of Rahaman et al. (2019) points out this citation clearly, but they reorganize this citation into “related works” in the final version. Later, Luo et al. (2019) studied the Frequency Principle in the general setting of deep neural networks and mathematically proved the Frequency Principle under the assumption of infinite samples. Zhang et al. (2019) studied the Frequency Principle in the NTK regime with finite sample points, explicitly characterizing the converging speed for each frequency and accurately predicting the learning results.</p>
</blockquote>
<p><a href="https://www.researchgate.net/project/Deep-learning-in-Fourier-domain">We aim to develop a theoretical framework on Fourier domain to analyze the Deep Neural Network (DNN) training process and understand the DNN generalization. We exemplified our theoretical results through DNNs fitting 1-d functions and the MNIST dataset.</a></p>
<ul>
<li><a href="https://www.researchgate.net/project/Deep-learning-in-Fourier-domain">Deep learning in Fourier domain</a></li>
<li><a href="http://ins.sjtu.edu.cn:3300/conferences/7/talks/319">Deep Learning Theory: The F-Principle and An Optimization Framework</a></li>
<li><a href="https://arxiv.org/abs/1901.06523">Frequency Principle: Fourier Analysis Sheds Light on Deep Neural Networks</a></li>
<li><a href="https://arxiv.org/abs/1811.01316">Nonlinear Collaborative Scheme for Deep Neural Networks</a></li>
<li><a href="https://arxiv.org/abs/1906.00425">The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies</a></li>
<li><a href="https://arxiv.org/abs/1811.10146">Frequency Principle in Deep Learning with General Loss Functions and Its Potential Application</a></li>
<li><a href="https://arxiv.org/pdf/1906.09235v1.pdf">Theory of the Frequency Principle for General Deep Neural Networks</a></li>
<li><a href="https://arxiv.org/pdf/1905.10264.pdf">Explicitizing an Implicit Bias of the Frequency Principle in Two-layer Neural Networks</a></li>
<li><a href="https://www.researchgate.net/profile/Zhiqin_Xu">https://www.researchgate.net/profile/Zhiqin_Xu</a></li>
<li><a href="https://github.com/xuzhiqin1990/F-Principle">https://github.com/xuzhiqin1990/F-Principle</a></li>
<li><a href="https://ins.sjtu.edu.cn/people/xuzhiqin/">https://ins.sjtu.edu.cn/people/xuzhiqin/</a></li>
</ul>
<h2 id="inverse-problem-and-deep-learning">Inverse Problem and Deep Learning</h2>
<p><a href="https://deep-inverse.org/">There is a long history of algorithmic development for solving inverse problems arising in sensing and imaging systems and beyond.
Examples include medical and computational imaging, compressive sensing, as well as community detection in networks. Until recently,
most algorithms for solving inverse problems in the imaging and network sciences were based on static signal models derived from physics or intuition,
such as wavelets or sparse representations.</a></p>
<p><a href="https://deep-inverse.org/">Today</a>, the best performing approaches for the aforementioned image reconstruction and sensing problems are based on deep learning,
which learn various elements of the method including
i) signal representations,
ii) stepsizes and parameters of iterative algorithms,
iii) regularizers, and iv) entire inverse functions.
For example, it has recently been shown that solving a variety of inverse problems by transforming an iterative, physics-based algorithm into a deep network
whose parameters can be learned from training data, offers faster convergence and/or a better quality solution.
Moreover, even with very little or no learning, deep neural networks enable superior performance for classical linear inverse problems
such as denoising and compressive sensing. Motivated by those success stories, researchers are redesigning traditional imaging and sensing systems.</p>
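<p>A minimal sketch of this unrolling idea: a few iterations of ISTA for sparse recovery from y = Ax become network layers with a learned stepsize and soft-threshold each (a simplification of LISTA, which also learns the matrices; the forward operator <code>A</code> and all hyperparameters here are assumptions):</p>
<pre><code class="language-python">import torch
import torch.nn as nn

class UnrolledISTA(nn.Module):
    """K iterations of soft-thresholded gradient descent on
    0.5 * ||y - A x||^2, with a learned stepsize and threshold per layer."""
    def __init__(self, A, K=10):
        super().__init__()
        self.A = A
        self.steps = nn.Parameter(0.1 * torch.ones(K))
        self.thetas = nn.Parameter(0.01 * torch.ones(K))

    def forward(self, y):
        x = torch.zeros(self.A.shape[1], y.shape[1])
        for t, theta in zip(self.steps, self.thetas):
            x = x - t * (self.A.T @ (self.A @ x - y))             # gradient step
            x = torch.sign(x) * torch.relu(torch.abs(x) - theta)  # soft-threshold
        return x

A = torch.randn(20, 50)               # hypothetical forward operator
model = UnrolledISTA(A)
y = torch.randn(20, 5)                # 5 measurement vectors
x_hat = model(y)                      # trained end-to-end in practice
</code></pre>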
<ul>
<li><a href="https://earthscience.rice.edu/mathx2019/">MATH + X SYMPOSIUM ON INVERSE PROBLEMS AND DEEP LEARNING IN SPACE EXPLORATION</a></li>
</ul>
<ul>
<li><a href="http://cpaior2019.uowm.gr/">Sixteenth International Conference on the Integration of Constraint Programming, Artificial Intelligence, and Operations Research</a></li>
<li><a href="https://github.com/mughanibu/Deep-Learning-for-Inverse-Problems">https://github.com/mughanibu/Deep-Learning-for-Inverse-Problems</a></li>
<li><a href="https://cv.snu.ac.kr/research/VDSR/">Accurate Image Super-Resolution Using Very Deep Convolutional Networks</a></li>
<li><a href="https://earthscience.rice.edu/mathx2019/">https://earthscience.rice.edu/mathx2019/</a></li>
<li><a href="https://www.researchgate.net/publication/329395098_On_Deep_Learning_for_Inverse_Problems">https://www.researchgate.net/publication/329395098_On_Deep_Learning_for_Inverse_Problems</a></li>
<li><a href="https://www.dlip.org/">Deep Learning and Inverse Problem</a></li>
<li><a href="https://www.scec.org/publication/8768">https://www.scec.org/publication/8768</a></li>
<li><a href="https://amds123.github.io/">https://amds123.github.io/</a></li>
<li><a href="https://github.com/IPAIopen">https://github.com/IPAIopen</a></li>
<li><a href="https://imaginary.org/snapshot/deep-learning-and-inverse-problems">https://imaginary.org/snapshot/deep-learning-and-inverse-problems</a></li>
<li><a href="https://www.researchgate.net/scientific-contributions/2150388821_Jaweria_Amjad">https://www.researchgate.net/scientific-contributions/2150388821_Jaweria_Amjad</a></li>
<li><a href="https://zif.ai/inverse-reinforcement-learning/">https://zif.ai/inverse-reinforcement-learning/</a></li>
<li><a href="https://kailaix.github.io/ADCMESlides/Inverse.pdf">Physics Based Machine Learning for Inverse Problems</a></li>
<li><a href="https://www.ece.nus.edu.sg/stfpage/elechenx/Papers/TGRS_Learning.pdf">https://www.ece.nus.edu.sg/stfpage/elechenx/Papers/TGRS_Learning.pdf</a></li>
</ul>
<h3 id="deep-learning-for-inverse-problems">Deep Learning for Inverse Problems</h3>
<ul>
<li><a href="https://arxiv.org/abs/1803.00092">Deep Learning for Inverse Problems</a></li>
<li><a href="https://deep-inverse.org/">Solving inverse problems with deep networks</a></li>
<li><a href="https://arxiv.org/abs/1901.03707">Neumann Networks for Inverse Problems in Imaging</a></li>
<li><a href="https://deepai.org/publication/unsupervised-deep-learning-algorithm-for-pde-based-forward-and-inverse-problems">https://deepai.org/publication/unsupervised-deep-learning-algorithm-for-pde-based-forward-and-inverse-problems</a></li>
</ul>
<h3 id="deep-inverse-optimization">Deep Inverse Optimization</h3>
<ul>
<li><a href="https://github.com/tankconcordia/deep_inv_opt">deep inverse optimization</a></li>
<li><a href="https://ori.ox.ac.uk/deep-irl/">https://ori.ox.ac.uk/deep-irl/</a></li>
<li><a href="https://physai.sciencesconf.org/data/pages/perez_2019_03_Institut_Pascal_AI_and_Physics_noanim.pdf">https://physai.sciencesconf.org/data/pages/perez_2019_03_Institut_Pascal_AI_and_Physics_noanim.pdf</a></li>
</ul>
<h2 id="random-matrix-theory-and-deep-learning">Random Matrix Theory and Deep Learning</h2>
<p>Random matrix theory studies matrices whose entries are sampled from specific probability distributions.
The weight matrices of a deep neural network are initialized at random, and because the model is over-parameterized,
it is hard to pin down the role of any individual parameter; random matrix theory instead characterizes the spectrum of a weight matrix as a whole (see the sketch immediately below).</p>
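<p>As a minimal sketch, assuming NumPy, of the basic computation these papers build on: the eigenvalue spectrum of a randomly initialized weight matrix concentrates on the support of the Marchenko-Pastur law, which is the baseline against which trained weights are compared in works such as the implicit self-regularization paper listed below. The layer shape and initialization scale are illustrative.</p>
<pre><code class="language-python">import numpy as np

n_out, n_in = 500, 1000                            # illustrative layer shape
W = np.random.randn(n_out, n_in) / np.sqrt(n_in)   # common random-init scaling
eigs = np.linalg.eigvalsh(W @ W.T)                 # spectrum of the correlation matrix W W^T

# Marchenko-Pastur support for aspect ratio q = n_out / n_in (unit variance):
q = n_out / n_in
lam_min, lam_max = (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2
print(f"empirical spectrum: [{eigs.min():.3f}, {eigs.max():.3f}]")
print(f"MP support:         [{lam_min:.3f}, {lam_max:.3f}]")
</code></pre>
<p>After training, deviations of the empirical spectrum from this support (e.g., heavy tails or outlier eigenvalues) are the kind of signal that this line of work studies.</p>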
<ul>
<li><a href="http://romaincouillet.hebfree.org/">http://romaincouillet.hebfree.org/</a></li>
<li><a href="https://zhenyu-liao.github.io/">https://zhenyu-liao.github.io/</a></li>
<li><a href="https://dionisos.wp.imt.fr/">https://dionisos.wp.imt.fr/</a></li>
<li><a href="https://project.inria.fr/paiss/">https://project.inria.fr/paiss/</a></li>
<li><a href="https://zhenyu-liao.github.io/activities/">https://zhenyu-liao.github.io/activities/</a></li>
<li><a href="https://arxiv.org/abs/1810.01075">Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning</a></li>
<li><a href="https://zhenyu-liao.github.io/pdf/pre/Matrix_talk_liao_handout.pdf">Recent Advances in Random Matrix Theory for Modern Machine Learning</a></li>
<li><a href="https://ir.library.louisville.edu/cgi/viewcontent.cgi?article=2227&context=etd">Features extraction using random matrix theory</a></li>
<li><a href="https://papers.nips.cc/paper/6857-nonlinear-random-matrix-theory-for-deep-learning.pdf">Nonlinear random matrix theory for deep learning</a></li>
<li><a href="https://arxiv.org/pdf/1702.05419.pdf">A RANDOM MATRIX APPROACH TO NEURAL NETWORKS</a></li>
<li><a href="http://proceedings.mlr.press/v48/couillet16.pdf">A Random Matrix Approach to Echo-State Neural Networks</a></li>
<li><a href="https://hal.archives-ouvertes.fr/hal-01962073">Harnessing neural networks: A random matrix approach</a></li>
<li><a href="https://www.csail.mit.edu/event/tensor-programs-swiss-army-knife-nonlinear-random-matrix-theory-deep-learning-and-beyond">Tensor Programs: A Swiss-Army Knife for Nonlinear Random Matrix Theory of Deep Learning and Beyond</a></li>
<li><a href="https://arxiv.org/abs/1902.04760">Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation</a></li>
<li><a href="http://www-math.mit.edu/~edelman/publications/random_matrix_theory_innovative.pdf">Random Matrix Theory and its Innovative Applications∗</a></li>
<li><a href="https://romaincouillet.hebfree.org/docs/conf/ELM_icassp.pdf">https://romaincouillet.hebfree.org/docs/conf/ELM_icassp.pdf</a></li>
<li><a href="https://romaincouillet.hebfree.org/docs/conf/NN_ICML.pdf">https://romaincouillet.hebfree.org/docs/conf/NN_ICML.pdf</a></li>
<li><a href="http://www.vision.jhu.edu/tutorials/CVPR16-Tutorial-Math-Deep-Learning-Raja.pdf">http://www.vision.jhu.edu/tutorials/CVPR16-Tutorial-Math-Deep-Learning-Raja.pdf</a></li>
<li><a href="https://www.lri.fr/TAU_seminars/videos/Romain_Couillet_12juin2017/talk_lri.pdf">A Random Matrix Framework for BigData Machine Learning</a></li>
</ul>
<h3 id="nonlinear-random-matrix-theory">Nonlinear Random Matrix Theory</h3>
<ul>
<li><a href="https://ai.google/research/pubs/pub46342">https://ai.google/research/pubs/pub46342</a></li>
<li><a href="http://people.cs.uchicago.edu/~pworah/nonlinear_rmt.pdf">http://people.cs.uchicago.edu/~pworah/nonlinear_rmt.pdf</a></li>
<li><a href="https://toc.csail.mit.edu/node/1314">A SWISS-ARMY KNIFE FOR NONLINEAR RANDOM MATRIX THEORY OF DEEP LEARNING AND BEYOND</a></li>
<li><a href="https://simons.berkeley.edu/talks/9-24-mahoney-deep-learning">https://simons.berkeley.edu/talks/9-24-mahoney-deep-learning</a></li>
<li><a href="https://cs.stanford.edu/people/mmahoney/">https://cs.stanford.edu/people/mmahoney/</a></li>
<li><a href="https://www.stat.berkeley.edu/~mmahoney/f13-stat260-cs294/">https://www.stat.berkeley.edu/~mmahoney/f13-stat260-cs294/</a></li>
<li><a href="https://arxiv.org/abs/1902.04760">https://arxiv.org/abs/1902.04760</a></li>
<li><a href="https://melaseddik.github.io/">https://melaseddik.github.io/</a></li>
<li><a href="https://thayafluss.github.io/">https://thayafluss.github.io/</a></li>
</ul>