---
title: "MITx 6.431x -- Probability - The Science of Uncertainty and Data + Unit_1.Rmd"
author: "John HHU"
date: "2022-11-05"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## R Markdown
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.
When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
```{r cars}
summary(cars)
```
## Including Plots
You can also embed plots, for example:
```{r pressure, echo=FALSE}
plot(pressure)
```
Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.
## Unit 0: Overview
Course overview
Course introduction, objectives, and study guide
Syllabus, Calendar, and grading policy
Homework mechanics and standard notation
Discussion forum and collaboration guidelines
Textbook information
MicroMasters, Certification, and Honor Pledge
Entrance Survey
Important Preliminary Survey
## Unit 1: Probability models and axioms
Lec. 1: Probability models and axioms (10 Questions)
Exercises due Sep 7, 2022, 7:59 PM GMT+8
Mathematical background: Sets; sequences, limits, and series; (un)countable sets
Solved problems
Problem Set 1 (6 Questions)
Problem Set due Sep 7, 2022, 7:59 PM GMT+8
## Unit 2: Conditioning and independence
Unit overview
Lec. 2: Conditioning and Bayes' rule (5 Questions)
Exercises due Sep 14, 2022, 7:59 PM GMT+8
Lec. 3: Independence (7 Questions)
Exercises due Sep 14, 2022, 7:59 PM GMT+8
Solved problems
Problem Set 2 (4 Questions)
Problem Set due Sep 14, 2022, 7:59 PM GMT+8
## Important dates
Tue, Aug 30, 2022
Course starts

Wed, Sep 7, 2022
Exercises: Lec. 1: Probability models and axioms, due 7:59 PM GMT+8
Problem Set: Problem Set 1, due 7:59 PM GMT+8

Wed, Sep 14, 2022
Exercises: Lec. 2: Conditioning and Bayes' rule, due 7:59 PM GMT+8
Exercises: Lec. 3: Independence, due 7:59 PM GMT+8
Problem Set: Problem Set 2, due 7:59 PM GMT+8

Wed, Sep 21, 2022
Exercises: Lec. 4: Counting, due 7:59 PM GMT+8
Problem Set: Problem Set 3, due 7:59 PM GMT+8

Fri, Sep 30, 2022
Exercises: Lec. 5: Probability mass functions and expectations, due 7:59 PM GMT+8
Exercises: Lec. 6: Variance; Conditioning on an event; Multiple r.v.'s, due 7:59 PM GMT+8
Exercises: Lec. 7: Conditioning on a random variable; Independence of r.v.'s, due 7:59 PM GMT+8
Problem Set: Problem Set 4, due 7:59 PM GMT+8

Wed, Oct 5, 2022
Mid Term: Exam 1, due 7:59 PM GMT+8

Thu, Oct 13, 2022
Verification Upgrade Deadline
You are still eligible to upgrade to a Verified Certificate! Pursue it to highlight the knowledge and skills you gain in this course.

Fri, Oct 14, 2022
Exercises: Lec. 8: Probability density functions, due 7:59 PM GMT+8
Exercises: Lec. 9: Conditioning on an event; Multiple r.v.'s, due 7:59 PM GMT+8
Exercises: Lec. 10: Conditioning on a random variable; Independence; Bayes' rule, due 7:59 PM GMT+8
Problem Set: Problem Set 5, due 7:59 PM GMT+8

Wed, Oct 26, 2022
Exercises: Lec. 11: Derived distributions, due 7:59 PM GMT+8
Exercises: Lec. 12: Sums of independent r.v.'s; Covariance and correlation, due 7:59 PM GMT+8
Exercises: Lec. 13: Conditional expectation and variance revisited; Sum of a random number of independent r.v.'s, due 7:59 PM GMT+8
Problem Set: Problem Set 6, due 7:59 PM GMT+8

Wed, Nov 9, 2022
Exercises: Lec. 14: Introduction to Bayesian inference, due 7:59 PM GMT+8
Exercises: Lec. 15: Linear models with normal noise, due 7:59 PM GMT+8
Exercises: Lec. 16: Least mean squares (LMS) estimation, due 7:59 PM GMT+8
Problem Set: Problem Set 7, due 7:59 PM GMT+8

Mon, Nov 14, 2022
Verified only
Mid Term: Exam 2, due 7:59 PM GMT+8

Wed, Nov 23, 2022
Exercises: Lec. 18: Inequalities, convergence, and the Weak Law of Large Numbers, due 7:59 PM GMT+8
Exercises: Lec. 19: The Central Limit Theorem (CLT), due 7:59 PM GMT+8
Exercises: Lec. 20: An introduction to classical statistics, due 7:59 PM GMT+8
Problem Set: Problem Set 8, due 7:59 PM GMT+8

Wed, Dec 7, 2022
Exercises: Lec. 21: The Bernoulli process, due 7:59 PM GMT+8
Exercises: Lec. 22: The Poisson process, due 7:59 PM GMT+8
Exercises: Lec. 23: More on the Poisson process, due 7:59 PM GMT+8
Problem Set: Problem Set 9, due 7:59 PM GMT+8

Fri, Dec 16, 2022
Exercises: Lec. 24: Finite-state Markov chains, due 7:59 PM GMT+8
Exercises: Lec. 25: Steady-state behavior of Markov chains, due 7:59 PM GMT+8
Problem Set: Problem Set 10, due 7:59 PM GMT+8

Mon, Dec 19, 2022
Verified only
Final Exam: Final Exam, due 7:59 PM GMT+8

Tue, Dec 20, 2022
Course ends
After the course ends, the course content will be archived and no longer active.
Audit Access Expires
You lose all access to this course, including your progress.

Fri, Jan 6, 2023
Certificate Available
Day certificates will become available for passing verified learners.
## Course / Unit 0: Overview / Course overview
# 1. Course character and objectives

Welcome. Before diving into this class, it is useful to have a sense of its character and objectives.
First, we want to *introduce the probabilistic way of thinking*. This involves understanding the nature of probabilistic models, the key concepts, and the mathematical language that goes with them. At the same time, we want to expose you to some of the main types of models that tend to arise in applications.
Second, we want to introduce the basic tools of probability theory expressed in the language of mathematics. We will develop a fair number of mathematical skills. Indirectly, we also want to advance your ability to think with precision and to express your thinking in a mathematical language. On the other hand, this is not a mathematics class. Our aim is not to teach you how to prove theorems, nor is it to teach you recipes--how to plug numbers into formulas without thinking. Instead, we will emphasize the interpretation of basic concepts and related facts at an intuitive level, always aiming to complement mathematical arguments with intuitive explanations.
Finally, our most important goal is to bring you to a level where you're ready to apply what you have learned to real-world problems, say, in the context of your job or in a research project. This is a very ambitious goal, and the course covers perhaps 40% more than what you would see in a typical introduction to probability class. But we believe that our ambitious goals are realistic. Calculus and mental concentration are all you need. The material in this class has been refined, condensed, and codified over about 50 years of residential offerings at MIT. As a consequence, our hope is that the material is organized and presented in a way that allows learning to move at a fast pace.

Finally, I should add some comments about what this class is not about, so as to keep your expectations realistic.
First, as should be clear from what I said before, this is not some kind of overview class for general scientific literacy. It is not just about understanding what you hear. We really want you to be able to use what you hear. Also, while the subject is very much driven by applications, we will not go through the details of real-world examples. Instead, we will go through many examples that serve to enhance your general understanding. In the same spirit, there will not be much in terms of demos, illustrations through plots, or computational exercises. We hope that you will find the mix of material that we have chosen to be really useful and that the end result will be rewarding.
Slides: [clean](https://courses.edx.org/assets/courseware/v1/a92671eded03002e1b74615efb21f21e/asset-v1:MITx+6.431x+2T2022+type@asset+block/lectureslides_Overview1.pdf)

Printable transcript available [here](https://courses.edx.org/assets/courseware/v1/56a1f087231004befb83d79bbcc2db2c/asset-v1:MITx+6.431x+2T2022+type@asset+block/transcripts_Course_overview1.pdf).
# 2. Why study probability?
Why study probability? If you're watching this clip, it is probably because you have already registered for this class and therefore are already somewhat convinced that the subject is useful. Nevertheless, let me add some more perspective. Until quite recently, scientific literacy meant calculus, some physics, and some chemistry. With the more recent addition of familiarity with computers and computation, this was all you needed to know in order to make sense of the world.
But these days, there's not much you can understand about what is going on around you if you do not understand the uncertainty attached to pretty much every phenomenon. In fact, I predict that for most of you in your careers, you're more likely to have to deal with uncertainty, for example, analyzing noisy data, rather than having to calculate integrals. Probability is now a central component of scientific literacy. What is it that has changed and caused this shift?
I can think of two main factors. As science and engineering move forward, we end up dealing with more and more complex systems. And in a complex system, we cannot expect to have a perfect model of each component or to know the exact state of every piece of the system. So uncertainty is now at the foreground and needs to be modeled. The second factor is that we live in an information society. Data and information play an increasingly central role, both in our individual lives and in the economy as a whole. Now, data and information are only useful because they can tell us something we did not know. Their reason for existence is to reduce uncertainty.

But if your goal is to reduce uncertainty, to fight it, you'd better understand its nature. You'd better have the tools to describe it and analyze it. And this is why probability theory and its children--statistics and inference--are a must. If these arguments sound a bit too abstract, just think of any scientific field, and you quickly realize that maybe, other than the motion of the planets, everything else involves uncertainty and calls for probabilistic models. Think of physics. Quantum mechanics has taught us that nature is inherently uncertain. Think of biological evolution. It progresses through the accumulation of many random effects, like mutations, within an uncertain environment. Think also of the haystack of biological data that we are accumulating and that needs to be sifted using statistical tools in order to make progress in the biomedical sciences. Think of communications and signal processing. These fields are almost by definition a fight against noise, an effort to clean signals from the noise that nature has added. Think of management. Customer demand is random, and you want to be able to model it and predict it. Think of finance. Markets are uncertain, and whoever has the best methods to analyze financial data has an advantage. Think of transportation systems. Random disruptions due to weather or accidents are a major concern. Think of trends in social networks, which spread like epidemics but in ways that are hard to predict.

I could go on and on, giving you many more examples. But the message is hopefully clear. Most phenomena of interest involve significant randomness. And the only reason we collect and manipulate data is because we want to fight this randomness as much as we can. And the first step in fighting an enemy like randomness is to study and understand your enemy.
Slides: [clean](https://courses.edx.org/assets/courseware/v1/ae4cc75a89354c128dbf45eb11794268/asset-v1:MITx+6.431x+2T2022+type@asset+block/lectureslides_Overview2.pdf)

Printable transcript available [here](https://courses.edx.org/assets/courseware/v1/f338645c7e42246c16ce8049110cf3c4/asset-v1:MITx+6.431x+2T2022+type@asset+block/transcripts_Course_overview2.pdf).
# 3. Course contents

Let me now walk you quickly through a high-level summary of this class. Units 1 to 5, together with Unit 8, include material that, at some level, gets covered in any undergraduate probability class. In Units 1 to 5, we introduce the general framework of probability theory and learn how to put together models, calculate probabilities, calculate certain types of averages, and apply the general rules for incorporating new evidence into a model. Unit 8 also covers material that is standard in a first course: the laws of large numbers, that is, what happens when you average many random measurements. Unit 6 adds some more depth to the basic material by considering a few special topics that, although they may not always get covered in a first course, are nevertheless indispensable if you are to have a working knowledge of the subject.
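As a rough illustration of what "averaging many random measurements" looks like in practice, here is a minimal R sketch. It is not part of the course materials: the fair six-sided die, the seed, and the sample size `n` are assumptions chosen purely for illustration.

```r
# Sketch only: running average of simulated fair-die rolls.
# The die, the seed, and n are illustrative assumptions, not course content.
set.seed(1)                                   # arbitrary seed, for reproducibility
n <- 10000                                    # number of simulated measurements
rolls <- sample(1:6, size = n, replace = TRUE)

# Average of the first k rolls, for every k from 1 to n
running_mean <- cumsum(rolls) / seq_len(n)

true_mean <- mean(1:6)                        # expected value of one fair roll
final_gap <- abs(running_mean[n] - true_mean) # distance after all n rolls

# Running average after 10, 100, 1000, and n rolls
print(running_mean[c(10, 100, 1000, n)])
```

With a sample this large, the running average should typically settle close to `true_mean`, which is the behavior the laws of large numbers in Unit 8 make precise.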
The less conventional part of this class comes in the remaining three units. In Unit 7, we study the subject of statistical inference in some depth. Even though at some level it is just an application of the basic theory in earlier units, we discuss it in enough detail to get you ready to use it in real-world inference problems, whatever your field happens to be. The other non-standard component comes in Units 9 and 10, which provide you with an introduction to the simplest and most basic models of random processes that evolve in time. This is because most real-world phenomena do involve a time aspect, and also because this is where you can finally get to use, in interesting ways, the tools that you will have accumulated.
Slide: [clean](https://courses.edx.org/assets/courseware/v1/2133f8269975c59e752a3908ec9b4ec8/asset-v1:MITx+6.431x+2T2022+type@asset+block/lectureslides_Overview3.pdf)

Printable transcript available [here](https://courses.edx.org/assets/courseware/v1/c52059c1eb93233faf1a05de95b97064/asset-v1:MITx+6.431x+2T2022+type@asset+block/transcripts_Course_overview3.pdf).
## Course / Unit 0: Overview / Course introduction, objectives, and study guide
# 1. Introduction and Course Team
Welcome to 6.431x, an introduction to probabilistic models, including random processes and the basic elements of statistical inference.
The world is full of uncertainty: accidents, storms, unruly financial markets, noisy communications. The world is also full of data. Probabilistic modeling and the related field of statistical inference are the keys to analyzing data and making scientifically sound predictions.
The course covers all of the basic probability concepts, including:

- multiple discrete or continuous random variables, expectations, and conditional distributions
- laws of large numbers
- the main tools of Bayesian inference methods
- an introduction to random processes (Poisson processes and Markov chains)
# 2. Course objectives
Upon successful completion of this course, you will:

At a conceptual level:

- Master the basic concepts associated with probability models.
- Be able to translate models described in words to mathematical ones.
- Understand the main concepts and assumptions underlying Bayesian and classical inference.
- Obtain some familiarity with the range of applications of inference methods.

At a more technical level:

- Become familiar with basic and common probability distributions.
- Learn how to use conditioning to simplify the analysis of complicated models.
- Have facility manipulating probability mass functions, densities, and expectations.
- Develop a solid understanding of the concept of conditional expectation and its role in inference.
- Understand the power of laws of large numbers and be able to use them when appropriate.
- Become familiar with the basic inference methodologies (for both estimation and hypothesis testing) and be able to apply them.
- Acquire a good understanding of two basic stochastic processes (Bernoulli and Poisson) and their use in modeling.
- Learn how to formulate simple dynamical models as Markov chains and analyze them.
# 3. Meet the Course Team
Instructors
Professor John Tsitsiklis
Dr. John Tsitsiklis is a Clarence J. Lebel Professor in the Department of Electrical Engineering and Computer Science, and the director of the Laboratory for Information and Decision Systems at MIT.
His research interests are in the fields of systems, optimization, control, and operations research. He is a coauthor of Parallel and Distributed Computation: Numerical Methods (1989, with D. Bertsekas), Neuro-Dynamic Programming (1996, with D. Bertsekas), Introduction to Linear Optimization (1997, with D. Bertsimas), and Introduction to Probability (1st ed. 2002, 2nd ed. 2008, with D. Bertsekas). He is also a coinventor in seven awarded U.S. patents.
He is a member of the National Academy of Engineering, and a Fellow of the IEEE (1999) and of INFORMS (2007). His distinctions include the ACM Sigmetrics Achievement Award (2016), the INFORMS John von Neumann Theory Prize (2018), and the IEEE Control Systems Award (2018). He holds honorary doctorates from the Université catholique de Louvain (2008), the Athens University of Economics and Business (2018), and the Harokopio University.
Professor Tsitsiklis has been teaching probability for over 20 years.
Professor Patrick Jaillet
Patrick Jaillet is Dugald C. Jackson Professor in the Department of Electrical Engineering and Computer Science and a member of the Laboratory for Information and Decision Systems at MIT.
Professor Jaillet's research interests include online optimization and learning; machine learning; and decision making under uncertainty. Professor Jaillet's teaching covers subjects such as machine learning; algorithms; mathematical programming; network science and models; and probability. Dr. Jaillet's consulting activities primarily focus on the development of optimization-based analytic solutions in various industries, including defense, financial, electronic marketplace, and information technology.
Professor Jaillet was a Fulbright Scholar in 1990 and the recipient of many research and teaching awards. He is a Fellow of the Institute for Operations Research and the Management Sciences (INFORMS), a member of the Mathematical Optimization Society (MOS), and a member of the Society for Industrial and Applied Mathematics (SIAM). He is currently an Associate Editor for INFORMS Journal on Optimization, Networks, and Naval Research Logistics, and was an Associate Editor for Operations Research from 1994 until 2005 and for Transportation Science from 2002 until 2017.
Professor Dimitri Bertsekas
Dimitri P. Bertsekas is McAfee Professor of Engineering in the Electrical Engineering and Computer Science Department of MIT. In 2019, he was also appointed a full time professor in the department of Computer, Information, and Decision Systems Engineering at Arizona State University, Tempe, while maintaining a research position at MIT.
His research spans several fields, including optimization, control, large-scale computation, and data communication networks, and is closely tied to his teaching and book authoring activities. He has written numerous research papers, and seventeen books and research monographs, several of which are used as textbooks in MIT classes.
Professor Bertsekas was awarded the INFORMS 1997 Prize for Research Excellence in the Interface Between Operations Research and Computer Science for his book "Neuro-Dynamic Programming", the 2000 Greek National Award for Operations Research, the 2001 ACC John R. Ragazzini Education Award, the 2009 INFORMS Expository Writing Award, the 2014 ACC Richard E. Bellman Control Heritage Award for "contributions to the foundations of deterministic and stochastic optimization-based methods in systems and control," the 2014 Khachiyan Prize for Life-Time Accomplishments in Optimization, and the SIAM/MOS 2015 George B. Dantzig Prize. In 2018, he was awarded, jointly with his coauthor John Tsitsiklis, the INFORMS John von Neumann Theory Prize, for the contributions of the research monographs "Parallel and Distributed Computation" and "Neuro-Dynamic Programming". In 2001, he was elected to the United States National Academy of Engineering for "pioneering contributions to fundamental research, practice and education of optimization/control theory, and especially its application to data communication networks."
Professor Bertsekas has been teaching probability for over 15 years.
On the Forum
Silun Zhang
Silun joined the MicroMasters in Statistics and Data Science program in 2021. He is a Postdoctoral Associate at the Institute of Data, Systems, and Society (IDSS) at MIT, and holds a Ph.D. degree in Applied Mathematics, specializing in Optimization and Systems Theory.
His research interests include nonlinear control, networked systems, rigid-body attitude control, and modeling large-scale systems.
During his Ph.D., he served as a TA for three courses and gained extensive experience in teaching theoretical materials in a highly interactive and heuristic manner.
You may have seen him on the forum in the course Data Analysis: Statistical Modeling and Computations for Applications. He will be the main instructor answering your questions on the discussion forum.
Behind the Scenes
Qing He
Qing He received her PhD in the MIT Department of Electrical Engineering and Computer Science. Her research interests include inference, signal processing, and wireless communications -- all of which rely on the fundamental concepts taught in 6.041x/6.431x. Qing has taken several probability classes at MIT, and has been a teaching assistant for this course for two semesters.
Eren Kizildag
Eren Kizildag is a graduate student in the Department of Electrical Engineering and Computer Science at MIT, doing research in the Laboratory for Information and Decision Systems (LIDS) and the Research Laboratory of Electronics (RLE). His research interests include probability, signal processing, and optimization.
Even though you will not see him in videos, Eren has made significant contributions to the written content of this course.
Jimmy Li
Jimmy Li received his PhD from MIT’s Department of Electrical Engineering and Computer Science. His research focused on applying the tools taught in this and related courses to problems in marketing. He took 6.041x/6.431x as an undergraduate and has also been a TA for the course three times.
Jagdish Ramakrishnan
Jagdish Ramakrishnan received his PhD from the Department of Electrical Engineering and Computer Science at MIT. His dissertation focused on optimizing the delivery of radiation therapy cancer treatments dynamically over time. His general research interests include systems modeling, optimization, and resource allocation. He was a teaching assistant for this course twice while at MIT.
Katie Szeto
Katie Szeto received her Bachelor and Master of Engineering degrees from MIT. Her Master’s thesis explored applications of probabilistic rank aggregation algorithms. Katie took 6.041x/6.431x with Professor Tsitsiklis when she was a sophomore at MIT. Later, as a graduate student, she was a teaching assistant for the class.
Kuang Xu
Kuang Xu received his PhD from MIT’s Department of Electrical Engineering and Computer Science. His research focused on the design and performance analysis of large-scale networks, such as data centers and the Internet, which involve a significant amount of uncertainty and randomness. Kuang took his first probability course in his junior year, and served as a teaching assistant for 6.041x/6.431x in 2012.
Karene Chu
Karene Chu received her Ph.D. in mathematics from the University of Toronto in 2012. Since then she has been a postdoctoral fellow first at the University of Toronto/Fields Institute, and then at MIT, with research focus on knot theory. She has taught multiple courses in mathematics at the University of Toronto where she received a teaching award.
Since then, as a digital learning lab fellow at MIT, she has made major contributions to the MITx courses in mathematics, including the Calculus series and the Differential Equations series. She is now leading the effort in the production and running of the IDSS MicroMasters Program in Statistics and Data Science.
Special thanks to
Video Producer
Robby Macbain
IDSS Support Staff
Susana Kevorkova
Jeremy Rossen
MITx Support Staff
David Chotin
Shelly Upton
Kyle Boots
Lana Scott
# 4. Study guide
A guide on how to use the wealth of available material
This class provides you with a great wealth of material, perhaps more than you can fully digest. This "guide" offers some tips about how to use this material.
Start with the overview of a unit, when available. This will help you get an overview of what is to happen next. Similarly, at the end of a unit, watch the unit summary to consolidate your understanding of the "big picture" and of the relation between different concepts.
Watch the lecture videos. You may want to download the slides (clean or annotated) at the beginning of each lecture, especially if you cannot receive high-quality streaming video. Some of the lecture clips proceed at a moderate speed. Whenever you feel comfortable, you may want to speed up the video and run it faster, at 1.5x.
Do the exercises! The exercises that follow most of the lecture clips are a most critical part of this class. Some of the exercises are simple adaptations of what you may have just heard. Other exercises will require more thought. Do your best to solve them right after each clip (do not defer this for later) so that you can consolidate your understanding. After your attempt, whether successful or not, do look at the solutions, which you will be able to see as soon as you submit your own answers.
Solved problems and additional materials. In most of the units, we are providing you with many problems that are solved by members of our staff. We provide both video clips and written solutions. Depending on your learning style, you may pick and choose which format to focus on. But in either case, it is important that you get exposed to a large number of problems.
The textbook. If you have access to the textbook, you can find more precise statements of what was discussed in lecture, additional facts, as well as several examples. While the textbook is recommended, the materials provided by this course are self-contained. See the “Textbook information” tab in Unit 0 for more details.
Problem sets. One can really master the subject only by solving problems – a large number of them. Some of the problems will be straightforward applications of what you have learned. A few of them will be more challenging. Do not despair if you cannot solve a problem – no one is expected to do everything perfectly. However, once the problem set solutions are released (which will happen on the due date of the problem set), make sure to go over the solutions to those problems that you could not solve correctly.
Exams. The midterm exams are designed so that in an on-campus version, learners would be given two hours. The final exam is designed so that in an on-campus version, learners would be given three hours. You should not expect to spend much more than this amount of time on them. In this respect, those weeks that have exams (and no problem sets!) will not have higher demands on your time. The level of difficulty of exam questions will be somewhere between the lecture exercises and homework problems.
Time management. The corresponding on-campus class is designed so that students with appropriate prerequisites spend about 12 hours each week on lectures, recitations, readings, and homework. You should expect a comparable effort, or more if you need to catch up on background material. In a typical week, there will be 2 hours of lecture clips, but it might take you 4-5 hours when you add the time spent on exercises. Plan to spend another 3-4 hours watching solved problems and additional materials, and on textbook readings. Finally, expect about 4 hours spent on the weekly problem sets.
Additional practice problems. For those of you who wish to dive even deeper into the subject, you can find a good collection of problems at the end of each chapter of the print edition of the book, whose solutions are available online.
## Course / Unit 0: Overview / Syllabus, Calendar, and grading policy
# 1. Syllabus
The weekly deadlines for lecture exercises and problem sets are Wednesdays at 11:59AM UTC.
The closing times for exams are Tuesdays at 11:59AM UTC.
The release dates of lecture exercises, homework, and exams are roughly two weeks before the respective due dates.
Please refer to the “Dates” tab for due dates. The syllabus below and the calendar on the next page include both the release and due dates.
Warning on due time: All deadlines are at 11:59AM UTC. Note the AM and the UTC time. Please find the corresponding time at your current location.
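As a rough aid for finding the corresponding local time, the sketch below converts an 11:59AM UTC deadline to a fixed UTC offset. The function name `deadline_at_offset` and the example date are hypothetical, and a fixed offset ignores daylight saving time, so treat the result as an approximation and check it against the “Dates” tab.

```python
from datetime import datetime, timedelta, timezone


def deadline_at_offset(year, month, day, utc_offset_hours):
    """Express an 11:59AM UTC deadline at a fixed UTC offset (no DST handling)."""
    deadline_utc = datetime(year, month, day, 11, 59, tzinfo=timezone.utc)
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return deadline_utc.astimezone(local_zone)


# Hypothetical date, used only for illustration.
five_hours_behind = deadline_at_offset(2024, 3, 6, -5)
print(five_hours_behind.strftime("%Y-%m-%d %H:%M"))  # 2024-03-06 06:59
```

For a location five hours behind UTC, the deadline falls early in the morning, which is easy to miss if you read “11:59” as the end of the day.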
# 2. Grading policy
Your overall score in this class will be a weighted average of your scores for the different components, with the following weights:
20% for the lecture exercises (divided equally among 21 (out of 24) lectures)
20% for the problem sets (divided equally among 9 (out of 10) problem sets)
18% for the first midterm exam (timed)
18% for the second midterm exam (timed)
24% for the final exam (timed)
To earn a verified certificate for this course, you will need to obtain an overall score of 60% or more of the maximum possible overall score.
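The weighted average above can be sketched in a few lines. The names `WEIGHTS`, `overall_score`, and `earns_certificate`, as well as the example scores, are illustrative assumptions and not part of the course platform; each component score is taken to be a fraction between 0 and 1.

```python
# Component weights as listed in the grading policy.
WEIGHTS = {
    "lecture_exercises": 0.20,
    "problem_sets": 0.20,
    "midterm_1": 0.18,
    "midterm_2": 0.18,
    "final_exam": 0.24,
}


def overall_score(component_scores):
    """Weighted average of component scores, each given as a fraction in [0, 1]."""
    return sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)


def earns_certificate(component_scores, threshold=0.60):
    """True if the overall score reaches the 60% certificate threshold."""
    return overall_score(component_scores) >= threshold


# Hypothetical learner, for illustration only.
example_scores = {
    "lecture_exercises": 0.90,
    "problem_sets": 0.80,
    "midterm_1": 0.50,
    "midterm_2": 0.60,
    "final_exam": 0.55,
}
print(f"{100 * overall_score(example_scores):.1f}%")  # 67.0%
```

In this example the learner clears the 60% threshold even with weak exam scores, because the lecture exercises and problem sets together carry 40% of the weight.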
Lecture Exercises and Problem Sets
The lowest 2 scores among the 23 lectures will be dropped, so only 21 out of 23 lectures will count. The lowest score among the 10 problem sets will be dropped, so only 9 out of 10 problem sets will count.
This policy is meant to accommodate scheduling conflicts, illness, or other events that might keep you from completing the work before the deadline as well as you could. However, we still fully expect you to learn the material for any dropped assignments, and the exams will cover everything.
Note that not every problem set or set of lecture exercises will have the same number of raw points. For example, Problem Set 1 may have 30 points and Problem Set 2 may have 35 points. However, each one receives the same weight for the purposes of calculating your overall score.
As an illustrative example, if you receive 20 points out of 30 on Problem Set 1, this will contribute 20/30 * (20%)/9 = 1.48% to your overall score. Similarly, if you receive 30 points out of 35 on Problem Set 2, this will contribute 30/35 * (20%)/9 = 1.9% to your overall score.
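The arithmetic in this example can be checked with a short sketch. The function names are hypothetical, and the way the lowest score is dropped (by lowest fraction earned rather than lowest raw points) is an assumption, since the policy text does not spell it out.

```python
def contribution(earned, possible, counted=9, component_weight=0.20):
    """Contribution of one problem set to the overall score, as a fraction."""
    return (earned / possible) * component_weight / counted


def problem_set_component(results, counted=9, component_weight=0.20):
    """Total problem set contribution, keeping the `counted` best fractions.

    `results` is a list of (earned, possible) pairs. Dropping by lowest
    fraction is an assumption made for this sketch.
    """
    fractions = sorted((earned / possible for earned, possible in results), reverse=True)
    kept = fractions[:counted]
    return sum(fraction * component_weight / counted for fraction in kept)


print(f"{100 * contribution(20, 30):.2f}%")  # 1.48%
print(f"{100 * contribution(30, 35):.2f}%")  # 1.90%
```

Because every problem set is first converted to a fraction, a 30-point set and a 35-point set carry the same weight, as the policy states.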
Under the “Progress” tab at the top, you can see your score broken down for each assignment, as well as a summary plot.
Timed Exams
The two midterm exams and the final exam are timed exams. This means that each exam is available for approximately a week, but once you open the exam, there is a limited amount of time (48 hours), counting from when you start, within which you must complete the exam. Please plan in advance for the exams. If you do not complete the whole exam during the allowed time, you will miss the points associated with the questions that have not been answered. The exams are designed to assess your knowledge. There are no extensions granted to these deadlines. You can find the exam dates on the calendar on the previous page.
Note that the timed exams cannot be completed using the edX mobile app.
MITx Commitment to Accessibility
If you have a disability-related request regarding accessing an MITx course, including exams, please contact the course team as early in the course as possible (at least 2 weeks in advance of exams opening) to allow us time to respond in advance of course deadlines. Requests are reviewed via an interactive process to meet accessibility requirements for learners with disabilities and uphold the academic integrity for MITx.
## Course / Unit 0: Overview / Homework mechanics and standard notation
# 1. Checking and submitting an answer
For each problem, you will have between 2 and 5 attempts to submit an answer, with the exception of problems where an attempt essentially reveals the answer (e.g., True/False questions), for which you will be limited to a single attempt.
To submit your answer, click the “Submit” button. This will automatically submit the problem for grading purposes, and the edX platform will verify your answer and give you immediate feedback as to whether or not your answer is correct. To save your answer without submitting it for grading purposes, click the “Save” button. Your answer will be restored when you return to the problem.
The number of attempts allowed as well as the number of attempts you've already made will always be visible on a problem's page at the bottom, next to the “Check” button. Please note that for problems consisting of multiple parts, hitting the button will count as an attempt for all parts of the problem. Unfortunately, it is not possible to submit answers for one part at a time.
For lecture exercises, a “Show Answer(s)” button will appear immediately after you submit the correct answer or use all of your attempts. Clicking this button will reveal the correct answers and solutions.
For homework problems, the “Show Answer(s)” button will appear after the due date of the homework.
You are strongly encouraged to look at the solutions even if your answer is correct.
Answer formats
This course will use several answer formats:
Multiple choice: Select the correct option from the dropdown menu or radio buttons.
Numerical answers: Enter a number, in decimal (e.g., '3.14159'), in fractional form (e.g., '22/7'), or as a numerical expression (i.e., an expression you would enter in a calculator, e.g., '4*(3.56)/(2*(1001))'). Do not enter any letters or symbols other than well-known mathematical constants (these will be covered in the standard notation table on the next page). Exact expressions are encouraged, but to account for rounding, the system will accept a range of answers as correct. Unless otherwise specified in the problem, the default tolerance range will be +/-3% of the correct answer.
Symbolic answers: Some problems will ask for a symbolic answer (e.g., 'n*(n+1)/2'). See the next section on “Standard notation” for details on how to submit such answers.
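To get a feel for the default tolerance on numerical answers, here is a minimal sketch of a +/-3% check. It assumes the tolerance is measured relative to the correct answer, as the wording above suggests; the function `within_tolerance` is hypothetical and is not the edX grader.

```python
import math


def within_tolerance(submitted, correct, tolerance=0.03):
    """True if `submitted` lies within +/- tolerance (relative) of `correct`."""
    return abs(submitted - correct) <= tolerance * abs(correct)


# The fraction 22/7 is well within 3% of pi, while 3.0 is not.
print(within_tolerance(22 / 7, math.pi))  # True
print(within_tolerance(3.0, math.pi))     # False
```

The check shows why exact expressions are the safer choice: rounding too aggressively can push an otherwise correct answer outside the accepted range.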
Below are some example problems to help you become familiar with how these problem types work with different numbers of attempts. These problems are not graded and have no impact on your grade.
## Sample numerical problem
0 points possible (ungraded)
This problem has 20 attempts. If you get an answer wrong, you can simply try again until you run out of attempts.
Also try the “Save” button. It will save your answer without submitting it.
In this problem, just as in any lecture exercise, you will have access to the “Show Answer” button once you have submitted the correct answer, run out of attempts, or when the due date has passed. Note that in homework problems, the “Show Answer” button will not appear until after the due date.
## How many lectures are there in this course?
Solution:
There are 26 lectures, 11 problem sets, and 3 exams.
## Sample Multiple Choice
0 points possible (ungraded)
## Which choice is correct?
incorrect
correct
incorrect
incorrect
## Sample Drop-down Multiple Choice
0 points possible (ungraded)
Another format of multiple choice problems uses the dropdown menu. Often, true-or-false questions will be in this form.
Note also that for any problem that contains multiple parts, you will be able to hit the “Submit” button only after you have answered all parts.
Which of the following is true?
Which of the following is false?
Any questions?
Click “Show Discussion" below to see discussions on homework mechanics. Remember to search for the topic before adding a new post!
Discussion
[STAFF] How many lectures are there?
Hello! I got the first question correct with a bit of trial and error, but the previous page gives us a few different numbers, and none of them work. Under the scoring breakdown we see:
> 20% for the lecture exercises (divided equally among 21 (out of 24) lectures)

Then a few lines later:
> The lowest 2 scores among the 23 lectures will be dropped, so only 21 out of 23 lectures will count.

None of those numbers work in question 1 on this page. I searched the previous page for the answer that eventually got marked correct and got no results. Is this an error? Thanks!
# 2. Standard notation
Many exercises and problems throughout the course will ask you to provide an algebraic answer in terms of symbols. Please follow the guidelines below when entering your responses. Below your answer textbox, the system will also display, in a "pretty" format, what it has interpreted your input to be. However, this display is not perfect (for example, it does not catch all cases of missing close parentheses) so please also check your text input carefully.
Symbols are case-sensitive: a and A are different – make sure to use the correct case as specified in the problem
Parentheses: make sure that your parentheses are properly balanced – each open parenthesis should have a matching close parenthesis!
Elementary arithmetic operations: use the symbols + , - , * , / for addition, subtraction, multiplication, and division, respectively
1 + bc - d/e should be entered as 1+b*c-d/e
For multiplication, use * explicitly:
in the example above, enter b*c ; do NOT enter bc
for 2n(n + 1), enter 2*n*(n+1) ; do NOT enter 2n(n+1)
although the "pretty" display underneath your answer looks correct if you do not include * s, your answer will be marked incorrect!
Exponents: use the symbol ^ to denote exponentiation
$2^n$ should be entered as 2^n
$x^{n+1}$ should be entered as x^(n+1)
Square root: use the string of letters sqrt , followed by enclosing what is under the square root in parentheses
$\sqrt{-1}$ should be entered as sqrt(-1)
Mathematical constants: use the symbol e for the base of the natural logarithm, $e$; use the string of letters pi for $\pi$
$e^{i\pi} + 1$ should be entered as e^(i*(pi))+1
Order of operations: 1) parentheses, 2) exponents and roots, 3) multiplication and division, 4) addition and subtraction
$\frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ should be entered as (1/sqrt(2*(pi)))*e^(-(x^2)/2)
a/b*c is interpreted as (a/b) * c; enter a/(b*c) for a/(bc)
When in doubt, use additional parentheses to remove possible ambiguities
Natural logarithm: although in lectures and solved problems we will sometimes use the notation "log" (instead of "ln"), you should use the string of letters ln , followed by the argument enclosed in parentheses
ln(2x) should be entered as ln(2*x)
Trigonometric functions: use the usual 3-letter symbols to denote the standard trigonometric functions
sin(x) should be entered as sin(x)
Greek letters: use the Latin-character name to denote each Greek letter
$\lambda e^{-\lambda t}$ should be entered as lambda*e^(-lambda*t)
$\mu \alpha \theta$ should be entered as mu*alpha*theta
Factorials, permutations, combinations: you will not need to enter these for any symbolic answers; do NOT use ! in your answers as it will not be evaluated correctly!
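Since the "pretty" display does not catch every mistake, it can help to check an entry yourself before submitting. The sketch below tests two of the guidelines above: balanced parentheses and explicit multiplication. Both functions are hypothetical helpers, not part of the edX platform, and the multiplication check is only a heuristic: it flags patterns such as 2n or )( but cannot tell that bc was meant as b*c.

```python
import re


def parentheses_balanced(text):
    """True if every open parenthesis has a matching close parenthesis."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a close parenthesis appeared before its open
                return False
    return depth == 0


def looks_like_implicit_multiplication(text):
    """Heuristic: a digit followed by a letter or '(', or ')' followed by a symbol."""
    return re.search(r"\d[A-Za-z(]|\)[A-Za-z0-9(]", text) is not None


print(parentheses_balanced("(1/sqrt(2*(pi)))*e^(-(x^2)/2)"))  # True
print(looks_like_implicit_multiplication("2n(n+1)"))          # True
print(looks_like_implicit_multiplication("2*n*(n+1)"))        # False
```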
Sample Problem with Symbolic Answers
0 points possible (ungraded)
Each symbolic entry problem will have a Standard Notation button just above the submit button, where you can find the guidelines above. (See the button below.)
To see how the symbolic answer box works, enter the function $2e^{(x-1)^2/3}$ using the rules above. That is, type 2*e^((x-1)^2/3) into the answer box.
Below the answer box, there is a display box that shows how the system interpreted your formula. (You will get an error if you try to enter variables, like z, that are unspecified in the problem. You will get an error as you are typing if you have only one half “(" of a parenthesis. Just continue typing, and when you enter the closing parenthesis “)", the error will go away.)
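To illustrate how an entry written in this notation is read, the sketch below evaluates such a string in Python by translating ^ into Python's ** operator. This is only an approximation of the rules described above, not the edX grader: the function `evaluate` and the table `ALLOWED` are assumptions, and `eval` should never be used on untrusted input outside a toy example like this one.

```python
import math

# Names this sketch accepts; the real grader's symbol table may differ.
ALLOWED = {
    "e": math.e,
    "pi": math.pi,
    "sqrt": math.sqrt,
    "ln": math.log,
    "sin": math.sin,
    "cos": math.cos,
    "tan": math.tan,
}


def evaluate(expression, **variables):
    """Evaluate a standard-notation string for the given variable values."""
    python_expression = expression.replace("^", "**")
    namespace = {"__builtins__": {}}  # hide Python built-ins from the expression
    namespace.update(ALLOWED)
    namespace.update(variables)
    return eval(python_expression, namespace)


print(evaluate("2*e^((x-1)^2/3)", x=1))    # 2.0
print(evaluate("a/b*c", a=8, b=4, c=2))    # 4.0, read as (a/b)*c
print(evaluate("a/(b*c)", a=8, b=4, c=2))  # 1.0
```

A variable that the problem does not specify, such as z, raises a `NameError` in this sketch, which mirrors the error described above for unspecified variables.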
Any questions?
Click “Show Discussion" below to see discussions on Standard Notation. Remember to search for the topic before adding a new post!
Discussion
Cube root or root of 4th or higher degree
How should I denote a root of degree 3 or higher in the answer field?
Is anyone else unable to see the notation?
It is somehow replaced by [Math Processing Error]...
Is there a full cheat sheet of what can be entered?
Although the boundaries of what is and is not allowed are fairly clear, it would be helpful to have a cheat sheet of what is allowed. Does this exist?
## Course / Unit 0: Overview / Discussion forum and collaboration guidelines
# 1. Discussion Forum guidelines
Discussion forum overview
The course provides an online discussion forum for you to communicate with the course team and other learners. You may access the forum through the “Discussion" tab at the top of the page, as well as through many embedded discussions within each unit. We recommend using the embedded discussions within each unit to discuss topics related to a specific unit's materials, whether it's lectures, solved problems, or problem set problems. Please see the guidelines below for more information on how to use these embedded discussions.
For other more general discussions, you may use the “Discussion" tab at the top of the page. When creating a new post, please choose one of the following categories that best describes your post:
Introductions: Introduce yourself to your fellow learners and find out more about them!
MicroMasters: Ask questions related to the MITx MicroMasters Program in Statistics and Data Science and meet other MicroMasters fellows!
Course Feedback: Let the course team know how you are finding the course, what you think works well, and what you would like to see improved.
Technical Problems: Let the course team know about any technical issues you are dealing with (e.g., playing videos or entering answers).
General: Other general discussions.
Discussion forum guidelines
The discussion forum is the main way for you to communicate with the course team and other learners. We hope it contributes to a sense of community and serves as a useful resource for your learning. Here are some guidelines to help you successfully navigate and interact on the forum:
Use discussion while working through the material. Beginning with Unit 1, each lecture will contain an embedded discussion located at the bottom of the lecture overview clip, which is the first or second clip of that lecture sequence. You should discuss anything related to that lecture's video clips or exercises there. Click “Show Discussion" to see all discussions associated with the lecture, and click “Add a Post" to post a new topic. In addition, every solved problem and problem set problem will have its own embedded discussion located at the bottom of their respective pages. As with the lecture discussions, click “Show Discussion" and “Add a Post" to see and create discussion topics related to that specific problem. We recommend that you use these in-page discussion boards to help focus discussions on specific topics.
Use informative topic titles and tags. To make it easier to identify relevant discussion topics, please use titles and tags that are as informative as possible when creating a new discussion topic, e.g., “Lecture 1 / 4. Exercise: Sample space, clarify part 2”.
Be very specific. Provide as much information as possible about what you need help with: Which part of which problem or video? Why do you not understand the question? Do you need help understanding a particular concept? What have you tried so far? Use a descriptive title for your post. This will attract the attention of other learners having the same issue.
Observe the honor code. We encourage collaboration and help, but please do not ask for or post problem solutions.
Upvote good posts. This applies to questions and answers. Click on the green plus button so that good posts can be found more easily.
Search before asking. The forum can become hard to use if there are too many threads, and good discussions happen when people participate in the same thread. Before asking a question, use the search feature by clicking on the magnifying glass on the left-hand side.
Write clearly. We know that English is a second language for many of you but correct grammar will help others to respond. Avoid ALL CAPS, abbrv of wrds (abbreviating words), and excessive punctuation!!!!
Please Introduce Yourself!
Let's get started by introducing yourselves on the discussion forum. A lot of the learning in this class will happen in your interactions with each other.
Click on the post titled “Introduce yourself!” below, and respond to it by telling everyone your name, where you are from, why you are taking this course, and whatever else you would like to share! Your post will be indexed in the “Introductions” category in the forum.
# 2. Collaboration guidelines
We encourage you to interact with your fellow learners and engage in active discussion about the course. Please use the guidelines below for acceptable collaboration.
The staff will be proactive in removing posts and replies in the discussion forum that have stepped over the line.
Given a problem, it is ok to discuss the general approach to solving the problem.
You can work jointly to come up with the general steps for the solution.
It is ok to get a hint, or several hints for that matter, if you get stuck while solving a problem.
You should work out the details of the solution yourself.
It is not ok to take someone else's solution and simply copy the answers from their solution into your checkboxes.
It is not ok to take someone else's formula and plug in your own numbers to get the final answer.
It is not ok to post answers to homework and lab problems before the submission deadline.
It is not ok to look at a full step-by-step solution to a problem before the submission deadline.
It is ok to have someone show you a few steps of a solution where you have been stuck for a while, provided of course, you have attempted to solve it yourself without success.
After you have collaborated with others in generating a correct solution, a good test to see if you were engaged in acceptable collaboration is to make sure that you are able to do the problem on your own.
## Course / Unit 0: Overview / Textbook information
# 1. Textbook
The class follows closely the text Introduction to Probability, 2nd edition, by Bertsekas and Tsitsiklis, Athena Scientific, 2008; see the publisher's website for more information.
The materials provided by this course are self-contained, and the textbook is not required, but it is recommended; several past learners have found it to be a useful complement.