What does a point on the normal distribution represent?

Here’s another Quora answer I’m reposting here. This is the question, followed by my answer.

What does the value of a point on the normal distribution actually represent, if anything?

 

It’s important to note the difference between discrete and continuous random variables as we answer this question. Though naming conventions vary, I think most mathematicians would agree that a discrete random variable has a Probability Mass Function (PMF) and a continuous random variable has a Probability Density Function (PDF).

The words mass and density go a long way in helping to capture the difference between discrete and continuous random variables. For a discrete random variable, the PMF evaluated at a certain x gives the probability of x. For a continuous random variable, the PDF at a certain x does not give the probability at all; it gives the density. (As advertised!)

So what is the probability that a continuous random variable takes on a certain value? For example, assume a certain type of fish has length X that is normally distributed with mean 22 cm and standard deviation 1.6 cm. What is the probability of selecting a fish exactly 26 cm long? That is, what is P(X=26)?


The answer, for any continuous random variable, is zero. More formally, if X is a continuous random variable with support \mathcal{S}, then P(X=x)=0 for all x\in\mathcal{S}.

For the fish problem, this actually does make sense. Think about it. You pull a fish out of the water which you claim is 26 cm long. But is it really 26 cm long? Exactly 26 cm long? Like 26.00000… cm long? With what precision did you make that measurement? This should explain why the probability is zero.

If instead you want to ask about the probability of getting a fish between 25.995 and 26.005 cm long, that’s perfectly fine, and you’ll get a positive answer for the probability (it’s a small answer :-).
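Here is a minimal Python sketch of the fish calculation, using only the standard library. The function names are my own, made up for illustration; the mean and standard deviation are the ones from the example. It shows the two different quantities side by side: the density at 26 cm (about 0.011 per cm, which is not a probability) and the probability of landing in the 0.01 cm window around 26 cm (about 0.00011).

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal random variable at x (NOT a probability)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal random variable, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

MEAN, SD = 22.0, 1.6  # fish lengths in cm, from the example

# Density at exactly 26 cm: a height of the curve, in units of "per cm".
density_at_26 = normal_pdf(26.0, MEAN, SD)

# Probability of a fish between 25.995 cm and 26.005 cm: an area under the curve.
prob_interval = normal_cdf(26.005, MEAN, SD) - normal_cdf(25.995, MEAN, SD)

print("density at 26 cm:", density_at_26)
print("P(25.995 < X < 26.005):", prob_interval)
print("density * interval width:", density_at_26 * 0.01)
```

For a narrow window the probability is very nearly the density times the width of the window, and as the width shrinks to zero, so does the probability.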

Let’s return to the words mass and density for a second. Think about what those words mean in a physics context. Imagine an idealized point mass: the mass sits at a single point, so it is described by a discrete function. In reality, though, we have density functions that assign a density to each point in an object.

Think about a 1-dimensional rod with density function \rho(x)=x, x\in (0,10). What is the mass of this rod at x=5? Of course, the answer is zero! This should make intuitive sense. But we can get meaningful answers to questions like: What is the mass of the rod between x=5 and x=6? The answer is \int_5^6 x\,dx=5.5.
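The rod can be sketched in a few lines of Python as well. This is only an illustration: `rod_mass` is a hypothetical helper that uses the antiderivative x^2/2 of the density \rho(x)=x, and the interval widths in the loop are arbitrary choices.

```python
def rod_mass(a, b):
    """Mass of the rod between x=a and x=b for density rho(x) = x,
    computed from the antiderivative x**2 / 2."""
    return (b * b - a * a) / 2.0

# Mass between x=5 and x=6, matching the integral in the text.
print("mass on [5, 6]:", rod_mass(5, 6))

# Mass of ever-smaller pieces centered on x=5: it shrinks toward zero.
for h in (1.0, 0.1, 0.001):
    print("half-width", h, "-> mass", rod_mass(5 - h, 5 + h))

# A single point is an interval of width zero, so it carries no mass.
print("mass at x=5:", rod_mass(5, 5))
```

The piece centered on x=5 with half-width h has mass 10h, so the mass "at" the single point x=5 is the limit as h goes to zero, which is zero.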

Does the physical understanding of mass vs density clear things up for you?

In Defense of Calculus

In the following article, I expand and clarify my arguments that first appeared in this post.

A colleague recently sent me another article (thanks Doug) claiming that Statistics should replace Calculus as the most important math class for high school students.

Which peak to climb? (CCL, click on image for source)

The argument usually goes: Most kids won’t use Calculus. Statistics is more useful.

As you might know already, I disagree that the most important reason for teaching math is because it is useful. I don’t disagree that math is useful. Math is not just useful, but essential for STEM careers. So “usefulness” is certainly one reason for teaching math. But I don’t think it’s the most important reason for teaching math.

The most important reason for teaching math is because it is beautiful and eternal. Math is the single place in school where students can find deductive certainty and eternal truth. Even when human activity ceases, math will persist. When we study math, we tap into something bigger than ourselves. We taste the divine!

We are teaching students to think deductively—like a mathematician would. This is such an important area of knowledge for students to explore. They need to know what it means to prove something. A proof provides a kind of truth that is unattainable in other subjects, even the hard sciences. At best, the scientific method is still just guesses compared to math.

This is the most important thing we pass on to our students. Though some will, most of our students will not directly use the math we teach. This is actually true about every subject in high school. Most students will not remember the details of The Great Gatsby or remember the chemical formula for Ammonium Nitrate. But we do hope they learn the bigger skills: analyzing text and thinking scientifically. In math, the “bigger skills” are the ones I outlined above—proof, logic, reasoning, argumentation, problem solving. They can always look up the formulas.

Math is a subject that stands on its own and it is not the servant of other subjects. If we treat math as simply a subject that serves other subjects by providing useful formulas, we turn math into magic. We don’t need to defend math in this way. It stands on its own!

Calculus = The Mona Lisa

If students can take both Statistics and Calculus, that is ideal. But if I had to choose one, I would pick Calculus. The development of “the Calculus” is one of the great achievements of mankind and it’s a real crime to go through life never having been exposed to it. Can you imagine never having seen The Mona Lisa? Calculus is like the Mona Lisa of mathematics :-).

Random Walks Mural

I’ve been meaning to give the back wall of my classroom a makeover for a while. This summer I finally found some time to tackle the big project. I took down all the decorations and posters. I fixed up the wall and painted it a nice tan color. Then, I let loose the randomness!

and some added, inspirational, text :-)

I struggled with what the new mural would be–I’ve thought about it over the last few years. I considered doing some kind of fractal like the Mandelbrot Set. But it should have been obvious, given the name of my blog!! What you see in the picture above is three two-dimensional random walks in green, blue, and red. In the limiting case, one gets Brownian motion:

Brownian motion of a yellow particle in a gas. (CCL)

I honestly didn’t know what it was going to look like until I did it. I generated it as I went, rolling a die to determine the direction I would go each time. I weighted the left and right directions because of the shape of the wall (1,2=right; 3,4=left; 5=up; 6=down). For more details about the process of making it, here’s a documentary-style youtube video that explains all:
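If you'd rather roll a virtual die, here is a minimal Python sketch of the same process. The die-to-direction mapping is the one described above; the unit step length, the number of steps, and the seed are assumptions I picked for illustration, not the values used on the wall.

```python
import random

# Die-face mapping from the mural: 1,2 = right; 3,4 = left; 5 = up; 6 = down.
STEPS = {
    1: (1, 0), 2: (1, 0),    # right
    3: (-1, 0), 4: (-1, 0),  # left
    5: (0, 1),               # up
    6: (0, -1),              # down
}

def random_walk(n_steps, seed=None):
    """Return the list of (x, y) points visited, starting at the origin."""
    rng = random.Random(seed)  # seeded generator so a run can be repeated
    x, y = 0, 0
    path = [(x, y)]
    for _ in range(n_steps):
        dx, dy = STEPS[rng.randint(1, 6)]  # roll a fair six-sided die
        x, y = x + dx, y + dy
        path.append((x, y))
    return path

path = random_walk(200, seed=1)  # 200 steps and seed=1 are arbitrary choices
xs = [p[0] for p in path]
ys = [p[1] for p in path]
print("final position:", path[-1])
print("width:", max(xs) - min(xs), "height:", max(ys) - min(ys))
```

Because left and right each get two faces of the die, the walk tends to spread out about twice as much (in variance) horizontally as vertically, which suits a wide wall.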

Actually, I lied–it doesn’t tell “all.” If you really want to know more of my thought process and some of the math behind what I did, watch the Extended Edition video which has way more mathematical commentary from me. I’ve also posted the time lapse footage of the individual green, blue, and red. Just for fun, here’s an animated random walk with 25,000 iterations:

Wikipedia, Creative Commons License

A two-dimensional random walk with 25,000 iterations. Click the image for an animated version! (CCL)

I think the mural turned out pretty well! It was scary to be permanently marking my walls, not knowing where each path would take me, or how it would end up looking. At first I thought I would only do ONE random walk. However, the first random walk (in blue) went off the ceiling so I stopped. And then I decided to add two more random walks.

In retrospect, it actually makes complete sense. I teach three different courses (Algebra 2, Precalculus, and Calculus) and I’ve always associated with each of these courses a “class color”–green, blue, and red, respectively. I use the class color to label their bins, to write their objective and homework on the board, and many other things.

The phrase “Where will mathematics take you?” was also a last-minute addition, if you can believe it. There just happened to be a big space between the blue and red random walks and it was begging for attention.

good question!

What a good question for our students. The random walks provide an interesting analogy for the classroom. I’d like to say I’m always organized in my teaching. But some of the richest conversations come from a “random walk” into unexpected territory when interesting questions are raised.

Speaking of interesting questions that are raised, here are a few:

  • Can you figure out how many iterations occurred after looking at a “finished” random walk? Or perhaps a better question: What’s the probability that there were more than n iterations if we see m line segments in the random walk?
  • Given probabilities p_1, p_2, p_3, p_4 of going in the four cardinal directions, can we predict how wide and how high the random walk will grow after n iterations? Can we provide confidence intervals? (might be nice to share this info with the mural creator!)
  • After looking at a few random walks, can we detect any bias in a die? How many random walks would we want to see in order to confidently claim that a die is biased in favor of “up” or “left”…etc?
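As a partial start on the second question, here is a rough Python sketch. It assumes unit-length, independent steps and uses a normal approximation, so it only makes sense for large n. It also describes where the walk ends up after n steps, not the full width and height it sweeps out along the way (that is a harder question). The function name and the choice of n=300 are hypothetical.

```python
import math

def displacement_summary(n, p_right, p_left, p_up, p_down, z=1.96):
    """Mean, standard deviation, and approximate interval for the final
    position after n unit steps (normal approximation, large n assumed)."""
    assert abs(p_right + p_left + p_up + p_down - 1.0) < 1e-9
    summary = {}
    for axis, p_plus, p_minus in (("x", p_right, p_left), ("y", p_up, p_down)):
        # One step moves +1, -1, or 0 along this axis.
        step_mean = p_plus - p_minus
        step_var = (p_plus + p_minus) - step_mean ** 2
        mean = n * step_mean
        sd = math.sqrt(n * step_var)
        summary[axis] = {
            "mean": mean,
            "sd": sd,
            "interval": (mean - z * sd, mean + z * sd),  # roughly 95% for z=1.96
        }
    return summary

# The mural's die: right and left with probability 1/3 each, up and down 1/6 each.
mural = displacement_summary(300, 1/3, 1/3, 1/6, 1/6)
for axis in ("x", "y"):
    print(axis, mural[axis])
```

With the mural's die there is no drift in either direction, and the horizontal standard deviation is larger than the vertical one by a factor of \sqrt{2}.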

Some of the questions are easy, some are hard. If you love this stuff, you might be interested in taking a few courses in Stochastic Processes. Any other questions you can think of?

Where will math take you this coming academic year? Welcome back everyone!

Math on Quora

I may not have been very active on my blog recently (sorry for the three-month hiatus), but it’s not because I haven’t been actively doing math. And in fact, I’ve also found other outlets to share about math.

Have you used Quora yet?

Quora, at least in principle, is a grown-up version of Yahoo Answers. It’s like Stack Overflow, but more philosophical and less technical. You’ll (usually) find thoughtful questions and thoughtful answers. Like most question-answer sites, you can ‘up-vote’ an answer, so the best answers generally appear at the top of the feed.

The best part about Quora is that it somehow attracts really high quality respondents, including: Ashton Kutcher, Jimmy Wales, Jeremy Lin, and even Barack Obama. Many other mayors, famous athletes, CEOs, and the like, seem to darken the halls of Quora. For a list of famous folks on Quora, check out this Quora question (how meta!).

Also contributing quality answers is none other than me. It’s still a new space for me, but I’ve made my foray into Quora in a few small ways. Check out the following questions for which I’ve contributed answers, and give me some up-votes, or start a comment battle with me or something :-).

And here are a few posts where my comments appear:

Lego Price Statistics

Do you ever get the feeling that Lego Bricks are becoming more expensive? When we were kids, boy, it felt like they were cheaper, right? I mean, the biggest sets were $150 at most. I have a HUGE Lego collection, and it definitely seems like Legos back in my day were more affordable.

Trouble is, that’s not really true. It turns out that Lego bricks have actually gotten cheaper, by almost every measure you can think of (weight/number of pieces/licensed sets). Check out this incredibly thorough post on Lego Price statistics over time. The article is entitled, “What Happened with LEGO” by Andrew Sielen. It’s very thoughtfully done.

[ht: Gene Chase]

Why Calculus still belongs at the top

AP Calculus is often seen as the pinnacle of the high school mathematics curriculum*–or the “summit” of the mountain as Professor Arthur Benjamin calls it. Benjamin gave a compelling TED talk in 2009 making the case that this is the wrong summit and the correct summit should be AP Statistics. The talk is less than 3 minutes, so if you haven’t yet seen it, I encourage you to check it out here and my first blog post about it here.

I love Arthur Benjamin and he makes a lot of good points, but I’d like to supply some counter-points in this post, which I’ve titled “Why Calculus still belongs at the top.”

Full disclosure: I teach AP Calculus and I’ve never taught AP Statistics. However, I DO know and love statistics–I just took a grad class in Stat and thoroughly enjoyed it. But I wouldn’t want to teach it to high school students. Here’s why: For high school students, non-Calculus based Statistics seems more like magic than mathematics.

When I teach math I try, to the extent that it’s possible, to never provide unjustified statements or unproven claims. (Of course this is not always possible, but I try.) For example, in my Algebra 2 class I derive the quadratic formula. In my Precalculus class, I derive all the trig identities we ask the students to know. And in my Calculus class, I “derive” the various rules for differentiation or integration. I often tell the students that copying down the proof is completely optional and the proof will not be tested–“just sit back and relax and enjoy the show!”

But such an approach to mathematical thinking can rarely be applied in a high school Statistics course because statistics rests SO heavily on calculus and so the ‘proofs’ are inaccessible. I’d like to make a startling claim: I claim that 99.99% of AP Statistics students and 99% of AP Statistics teachers cannot even give the function-rule for the normal distribution.

Image used by permission from Interactive Mathematics. Click the image to go there and learn all about the normal distribution!

In what other math class would you talk about a function ALL YEAR and never give its rule? The normal distribution is the centerpiece (literally!) of the Statistics curriculum. And yet we never even tell them its equation nor where it comes from. That should be some kind of mathematical crime. We might as well call the normal distribution the “magic curve.”

Furthermore, a kid can go through all of AP Statistics and never think about integration, even though that’s what they’re doing every single time they look up values in those stat tables in the back of the book.
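To make the link between the table and the integral concrete, here is a minimal Python sketch. It uses the standard normal density, f(z)=\frac{1}{\sqrt{2\pi}}e^{-z^2/2}, and approximates the area under it with composite Simpson's rule. The function names and the number of subintervals are my own choices for illustration, and Simpson's rule is just one convenient way to do the integral, not how any particular textbook table was produced.

```python
import math

def std_normal_pdf(z):
    """The rule for the standard normal curve."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def table_entry(z, n=1000):
    """Approximate P(Z <= z) as 0.5 plus the integral of the density
    from 0 to z, using composite Simpson's rule with n (even) subintervals."""
    h = z / n
    total = std_normal_pdf(0.0) + std_normal_pdf(z)
    for i in range(1, n):
        weight = 4 if i % 2 else 2  # Simpson weights alternate 4, 2, 4, ...
        total += weight * std_normal_pdf(i * h)
    return 0.5 + total * h / 3.0

for z in (1.00, 1.96):
    exact = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # closed form via erf
    print(z, round(table_entry(z), 4), round(exact, 4))
```

The rounded values are what you would look up in the z-table for z=1.00 and z=1.96; each entry in the table is an area under this curve.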

I agree that statistics is more applicable to the ‘real world’ of most of these kids’ lives, and on that point, I agree with Arthur Benjamin. But I would argue that application is not the most important reason we teach mathematics. The most important thing we teach kids is mathematical thinking.

The same thing is true of every other high school subject area. Will most students ever need to know particular historical facts? No. We aim to train them in historical thinking. What about balancing an equation in Chemistry? Or dissecting a frog? They’ll likely never do that again, but they’re getting a taste of what scientists do and how they think. In general, two of our aims as secondary educators are to (1) provide a liberal education for students so they can engage in intelligent conversations with all people in all subject areas in the adult world and (2) open doors for a future career in a more narrow field of study.

So where does statistics fit into all of this? I think it’s still worth teaching, of course. It’s very important and has real world meaning. But the value I find in teaching statistics feels VERY different than the value I find in teaching every other math class. Like I said before, it feels a bit more like magic than mathematics.**

I argue that Calculus does a better job of training students to think mathematically.

But maybe that’s just how I feel. Maybe we can get Art Benjamin to stop by and weigh in!

  .

….

*In our school, and in many other schools, we actually have many more class options beyond Calculus for those students who take Calculus in their Sophomore or Junior year and want to be exposed to even more math.

** Many parts of basic Probability and Statistics can be taught with explanations and proof, namely the discrete portions–and this should be done. But working with continuous distributions can only be justified using Calculus.

Product Failure

I’ve been taking a grad course in statistics this semester and so I’ve been thinking about all sorts of real world examples of math, including the classic product-failure example that’s a mainstay of most stat classes.

One of the simplest continuous distributions is the exponential distribution, which is a pretty decent way to model product failures. The probability density of failure f(t) at time t is given by

f(t)=\frac{1}{\lambda}e^{-t/\lambda}.

I read this great article about product failure and testing in Wired this week. I encourage you to check it out. Read the last page of the article especially, where it talks about how cutting-edge companies are modeling minute variations in materials using an electron microscope and some statistics. Instead of actually testing the product over and over again using a fatigue machine, they can create surprisingly accurate models of the materials using computers. Prior to this, the behavior of materials was somewhat unpredictable.

Of course I was excited to see this figure in the article, which shows the Weibull distribution modeling failures of steel bars in a fatigue machine.

The Weibull distribution, unlike the exponential distribution, takes the age of a product into account. If the parameter k is greater than one, then the rate of product failure increases with time (for k less than one it decreases, and for k=1 it is constant). The probability density of failure f(t) at time t is

f(t)=\frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1}e^{-(t/\lambda)^k}.

The first obvious thing to note is that the exponential distribution is just a special case of the Weibull distribution, with k=1. The next thing to say is that a single Weibull distribution has at most one peak. So how is the figure above a Weibull distribution? The article says it is, but I think it might be a linear combination of two Weibull distributions, don’t you? Whatever–normalize, and you’ve got yourself a probability distribution.
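Here is a small Python sketch of the two densities above, just to check the k=1 special case numerically and to show what a two-Weibull combination looks like in code. All function names are mine, and every parameter value below is made up for illustration; none of them come from the Wired article or its figure.

```python
import math

def exponential_pdf(t, lam):
    """Exponential density with scale lam, for t > 0."""
    return math.exp(-t / lam) / lam

def weibull_pdf(t, k, lam):
    """Weibull density with shape k and scale lam, for t > 0."""
    return (k / lam) * (t / lam) ** (k - 1) * math.exp(-((t / lam) ** k))

def weibull_hazard(t, k, lam):
    """Failure rate at age t: the density divided by the probability of
    surviving to t, which simplifies to (k/lam) * (t/lam)**(k-1)."""
    return (k / lam) * (t / lam) ** (k - 1)

def two_weibull_mixture_pdf(t, w, k1, lam1, k2, lam2):
    """Weighted combination of two Weibull densities. With weights w and
    1 - w (0 <= w <= 1) the result is already normalized."""
    return w * weibull_pdf(t, k1, lam1) + (1.0 - w) * weibull_pdf(t, k2, lam2)

# k = 1 recovers the exponential density.
print(weibull_pdf(2.0, 1.0, 3.0), exponential_pdf(2.0, 3.0))

# Failure rate at two ages for shapes below, at, and above one.
for k in (0.5, 1.0, 2.0):
    print("k =", k, weibull_hazard(1.0, k, 1.0), weibull_hazard(2.0, k, 1.0))

# A hypothetical two-Weibull density evaluated at a few times.
for t in (0.5, 1.0, 3.0, 6.0):
    print(t, two_weibull_mixture_pdf(t, 0.4, 3.0, 1.0, 8.0, 6.0))
```

The hazard function makes the role of k easy to see: the failure rate falls with age when k is below one, stays flat at k=1, and climbs when k is above one.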

[pun warning!]

The real question is, if this is TWO Weibulls, would you settle for the lesser of two Weibulls?

Sorry. I had to.