Geometric Proofs of Trigonometric Identities

Sparked by a conversation this past weekend about the usefulness of the half-angle identities, I constructed geometric proofs for \sin(x/2) and \cos(x/2). Since I’ve never seen these anywhere before, I thought I’d share.

And while I was at it, I thought I’d share all my other geometric proofs, so here they are, posted mostly without comment.

Some of these are so well-known as to be not worth mentioning. Many of them have been stolen from Proofs Without Words I or Proofs Without Words II. I came up with a few of them myself. Frustratingly, almost none of them are to be found in Precalculus textbooks, where they might be learned and appreciated.

Pythag 1

________________________________________________________

Pythag 2

__________________________________________________________

Pythag 3

_____________________________________________________________

Pythag 4

________________________________________________________________

Sincos of a sum 1

_______________________________________________________________

Sincos of a sum 2

______________________________________________________________

sincos of a diff 1

_________________________________________________________________

sincos of a diff 2

______________________________________________________________

Though this one is my favorite:

sine and cosine of a sum best 1

____________________________________________________________________________

sine and cosine of a sum best 2

_______________________________________________________________________________________

Partially because of the way it naturally generalizes into the proof of the derivative of sine. If you just let \beta approach 0, \cos(\beta) approaches 1 and that point in the interior of the circle ends up on the circle, where \sin(\beta) merges with \beta itself.
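
For anyone who wants the limit written out symbolically, here is a short sketch that starts from the sum formula the diagram proves. It leans on two standard limits that are assumed here rather than proved: \sin\beta/\beta \to 1 and (1-\cos\beta)/\beta \to 0 as \beta \to 0.

```latex
\begin{aligned}
\frac{\sin(\alpha+\beta)-\sin\alpha}{\beta}
  &= \frac{\sin\alpha\cos\beta+\cos\alpha\sin\beta-\sin\alpha}{\beta} \\
  &= \cos\alpha\cdot\frac{\sin\beta}{\beta}
     \;-\;\sin\alpha\cdot\frac{1-\cos\beta}{\beta} \\
  &\longrightarrow \cos\alpha\cdot 1-\sin\alpha\cdot 0=\cos\alpha
  \qquad\text{as }\beta\to 0 .
\end{aligned}
```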

Proof of derivative of sinx

_____________________________________________________________________________________

double angle 1

______________________________________________________________________________________

double angle 2

_______________________________________________________________________________________

double angle 3

___________________________________________________________________________________

half angle 1

__________________________________________________________________________________

half angle 2

___________________________________________________________________________________________

half angle 3

____________________________________________________________________________________

 

half angle 4

____________________________________________________________________________________

And finally, one that shows that the sum of a sine and cosine function of the same argument is also a sinusoid. Since I lost the original picture and don’t feel like remaking it, you’ll have to complete the proof on your own!

sum of sine and cosine
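
While the proof is left to you, the claim itself is easy to sanity-check numerically. The sketch below assumes the usual form of the result, a\sin x + b\cos x = R\sin(x+\varphi) with R = \sqrt{a^2+b^2} and \varphi the angle of the point (a, b). The function name as_single_sinusoid is made up for this example.

```python
import math

def as_single_sinusoid(a, b):
    """Return (R, phi) such that a*sin(x) + b*cos(x) == R*sin(x + phi)."""
    R = math.hypot(a, b)      # amplitude: length of the vector (a, b)
    phi = math.atan2(b, a)    # phase: angle of the vector (a, b)
    return R, phi

# Check sin(x) + cos(x) at a handful of sample points.
R, phi = as_single_sinusoid(1.0, 1.0)
for x in [0.0, 0.5, 1.0, 2.0, -3.0]:
    lhs = math.sin(x) + math.cos(x)
    rhs = R * math.sin(x + phi)
    assert math.isclose(lhs, rhs, abs_tol=1e-12)
```

A numerical check at a few points is evidence, not a proof; the diagram is still what explains why it works.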

____________________________________________________________________________________

Update: After some feedback on twitter, I’ve decided to add a few more diagrams. Tim Brzezinski sent me a link to his website of geometric proofs of trig identities and he had some that I’ve never seen before.

Check it out!

https://www.geogebra.org/m/DxAcj8E2#material/QedMT7Pw

I’ve taken two of his diagrams and added them below.

tan of a sum 1

_________________________________________________________

tan of a sum 2

____________________________________________________________________________________

tan of a diff 1

_________________________________________________________________________________________

tan of a diff 2

National Math Festival 2017

There was mathematical mayhem in DC on Saturday!

Did you miss it? Let me try to capture the day with some photos:

That’s just ONE room, just one part of a very large and increasingly popular National Math Festival.

This was the second National Math Festival, which is held every two years (alternating with the US Science and Engineering Festival). The festival was a huge success and was very well attended. I was a little cautious about attendance predictions, given that the festival moved to the convention center from the DC Mall–a location which benefited from wandering foot-traffic.

This year, however, we benefited from the rain. It was dark and rainy all day long, but the National Math Festival provided a wonderful rainy-day escape from the dreary weather. See? Look at all the fun we’re having!

The photos you’re seeing here are all from the travelling exhibits brought to us by the Museum of Mathematics in NYC. I helped MoMATH coordinate volunteers this year, just as I did two years ago. And our volunteers were AWESOME!

We engaged thousands of people throughout the course of the day in meaningful mathematical play. There is a great need for this kind of popular focus on mathematics, one that illuminates the beauty, joy, and fun of mathematics rather than the impression people have of difficulty and drudgery.

All my photos are MoMATH-focused, since that’s where I spent my day. You can find even more of my photos here. And you can see more coverage in my twitter feed. For example, here’s a little clip of some juggling-math:


Did you miss this year’s festival? Mark your calendars for April 2019 and make it a priority!

 

2017 Pi Day Puzzle Hunt Recap

Imagine 150 teens sleuthing around the school solving puzzles, skipping lunch every day to gain advantages over other teams, students voluntarily solving extremely difficult puzzles.

Welcome to the Third Annual RMHS Pi Day Puzzle Hunt. This year 36 teams competed for $200 in prize money, trophies and swag, and of course, GLORY. 🙂

There were eight challenging puzzles this year. A mural maze had students visiting other murals throughout the school in order to obtain the URL that gained them access to the next puzzle. The puzzles took students online, to classrooms and lockers, and had them making phone calls. Teams also received a UV light during the hunt in order to reveal secret messages (or cryptograms that still required decryption!). This year we did a better job of making the puzzles start out easy and slowly get more difficult, so as not to discourage teams right away. Here are links to descriptions of all of the 2017 puzzles:

Each year we have tried to improve the hunt in substantial ways, including the appearance of “Stars” throughout the hunt, which earned extra points for teams that could find hidden elements of a puzzle or solve daily bonus puzzles. We also made the prize money and trophies better this year.

We had some bumps in the road, but overall, the 2017 hunt was a success. Months of work, and now our third puzzle hunt is in the books.

For more details, including photos, videos, and the puzzles, visit the Pi Day Puzzle Hunt Website.

See you next year, kids!

Derivatives of Trigonometric Functions

First, let’s present the standard approach. This is from the calculus textbook I teach out of.

der of sinx

This was, as far as I was concerned, the only possible proof. The pedagogical flexibility lay entirely in how to frame the question, how to get students to discover the fact on their own (via graphical techniques), and how to add extra meaning to the result.

The most important question, so I thought for years, was really how one introduces and understands the fact that \lim_{x \to 0} \frac{\sin x}{x}=1. Some textbooks introduce it more or less out of the blue as “an important limit to know” and prove it via the Squeeze Theorem. Others prefer to wait until halfway through the above proof, realizing only then that this limit is important and solving it with a purpose in mind. There is also a difference of opinion as to how much rigor is required to establish the key inequality, that \sin \theta < \theta < \tan \theta. My textbook uses an area argument, but others prove the inequality with a nested sequence of segment inequalities.
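
To see the squeeze in action, here is a small numerical sketch. Dividing \sin\theta < \theta < \tan\theta through by \sin\theta and taking reciprocals gives \cos\theta < \frac{\sin\theta}{\theta} < 1 for 0 < \theta < \pi/2, and both bounds head to 1. The helper name squeeze_bounds is invented for this illustration.

```python
import math

def squeeze_bounds(theta):
    """Return (cos(theta), sin(theta)/theta, 1.0) for a small positive angle in radians."""
    return math.cos(theta), math.sin(theta) / theta, 1.0

for theta in [1.0, 0.5, 0.1, 0.01, 0.001]:
    # The key inequality, valid for 0 < theta < pi/2.
    assert math.sin(theta) < theta < math.tan(theta)
    lower, ratio, upper = squeeze_bounds(theta)
    # The ratio is trapped between cos(theta) and 1.
    assert lower < ratio < upper
    print(f"theta={theta:<6} cos(theta)={lower:.8f}  sin(theta)/theta={ratio:.8f}")
```

The table only suggests the limit, of course. The Squeeze Theorem is what turns the suggestion into a proof.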

My personal preference is for students to encounter \lim_{x \to 0} \frac{\sin x}{x}=1 “naturally” by attempting to graph y=\frac{\sin x}{x} in precalculus, along with other interesting functions like y=x \cdot \sin x, y=x \cdot \cos x, y = x + \sin x, y = e^{-x} \cdot \sin x, and y = \sin(1/x). These are more or less exercises in recognizing the so-called “envelope” of the product or sum of a periodic function and another function and have various scientific applications. The very informal geometric argument for why \lim_{x \to 0} \frac{\sin x}{x}=1 that one encounters in precalc prepares one for the more formal proof in calculus via the Squeeze Theorem.

All of this hard work to prove that \lim_{x \to 0} \frac{\sin x}{x}=1 almost seems to make it the real theorem and leaves \frac{d}{dx} [\sin x] = \cos x as a corollary.

By contrast, consider this:

Proof of derivative of sinx

I’m tempted to make no further comment, since this beautiful and striking diagram so thoroughly and clearly explains why the derivative of sine is cosine. Tiny changes in the sine of an angle are proportional to the cosine of that angle since the red arc length above is effectively a tangent to the circle. I would go so far as to say that until you see a diagram like this, you don’t even really understand the theorem at all. Why don’t we teach the derivative of sine this way? Why is this figure not in all the textbooks? I think I know the answers to these questions. The answers involve a long story about the history of calculus, the banishment of infinitesimals during the quest for rigor, and the abandonment of geometry as a satisfactory basis for analysis. But these diagrams are just too beautiful to give up and it’s cruel of us to keep them hidden from our students.

Here’s another calculus proof:

Proof of derivative of arcsinx

Compare this to the standard treatment you find in textbooks:

crapderarcsine

Which one of these proofs excites you? Which one makes you really feel like you understand the theorem and why it’s true?

I have created an entire series and I post them here without further comment.

Proof of derivative of tanx


Proof of derivative of arctanx


Proof of derivative of secx


Proof of derivative of arcsecx
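
The diagrams explain why these derivatives are what they are. A quick numerical spot-check can at least confirm that the formulas are right. The sketch below compares a central difference quotient with the standard closed-form derivative at one sample point per function. The sample points and the helper names (central_difference, asec) are choices made for this example, and arcsec is taken to be \arccos(1/x).

```python
import math

def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

def asec(x):
    """arcsec(x), taken here as arccos(1/x) for |x| >= 1."""
    return math.acos(1 / x)

# (name, function, claimed derivative, sample point)
checks = [
    ("sin",    math.sin,  math.cos,                                      0.7),
    ("arcsin", math.asin, lambda x: 1 / math.sqrt(1 - x * x),            0.3),
    ("tan",    math.tan,  lambda x: 1 / math.cos(x) ** 2,                0.7),
    ("arctan", math.atan, lambda x: 1 / (1 + x * x),                     0.7),
    ("sec",    lambda x: 1 / math.cos(x),
               lambda x: math.tan(x) / math.cos(x),                      0.7),
    ("arcsec", asec,      lambda x: 1 / (abs(x) * math.sqrt(x * x - 1)), 2.0),
]

# Absolute gap between the numerical slope and the claimed formula.
errors = {name: abs(central_difference(f, x) - df(x)) for name, f, df, x in checks}
for name, err in errors.items():
    print(f"{name:>6}: |numerical - formula| = {err:.2e}")
```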

 

The Product Rule

At some point in every calculus class, we must discover and prove the product rule for derivatives. How a calculus teacher chooses to do this probably says a lot about their pedagogy and educational priorities.

Some teachers might simply write the rule on the board, expect students to accept it, and immediately launch into examples. Should we try to let the students discover the formula on their own? Should we perhaps lead them into a trap by suggesting that the derivative of a product of two functions is the product of the derivatives and let them find counterexamples? Should we state the theorem, but let the students try to prove it on their own? Should we perhaps have an entire mini-lesson on what it even means to have a product of two functions?

Should we try to motivate the entire discussion with a particularly intuitive pair of functions whose product has some real-world significance? Should we interpret the product of two functions geometrically, as the area of the corresponding rectangle? If properly motivated and explained, do we actually gain anything by doing the rigorous proof via limits?

The Status Quo

As a foil, here is the introduction to and proof of the product rule from the textbook that I teach out of.

product rule

I understand that textbooks have limited space and are no substitute for a full curriculum, but I think we can all agree that this is awful. There is no motivating example and no geometric intuition is called upon. The author merely proves the theorem, dryly and without understanding or purpose. The author even admits that the proof is unsatisfying and unedifying and apologizes in advance for its opaque maneuvers! Some proofs involve “clever steps that may appear unmotivated to a reader”.

In other words, reader, I am clever and you are not. This proof crucially involves cleverness, but since you’re not clever, you never would have thought of it yourself. I will perform some algebraic manipulations here in blue — they may appear unmotivated to you, but that’s your fault. In fact, I haven’t motivated them at all, but I don’t need to explain my clever methods to you. This is a calculus textbook after all, not a motivational textbook on explaining one’s cleverness. I have proved the rule, what else do you want me to do? If you want meaning and understanding, please consult your local religious figures for guidance.

Can we do better? Yes, I think we can. My friend James Key and I have used the phrase “tyranny of the blue text” to refer to totally opaque and unmotivated algebraic moves in textbook math proofs, since the offending expressions are often rendered in blue. Proving an important theorem to students via seemingly arbitrary, unmotivated algebraic tricks is an intellectual crime, and we should endeavor to banish the tyranny of the blue text from our classrooms and from our consciousness.

Idea #1: A Word Problem

Suppose a particular factory produces toys 24 hours a day.

Let W(t) be a continuous model of the number of workers at the factory at time t. The value of this function fluctuates throughout the day as workers leave and arrive according to their various particular schedules.

Let E(t) be a continuous model of the number of toys produced per worker per hour at time t. This function measures the overall efficiency of the factory at a particular time of day. This could reasonably be expected to fluctuate due to external factors like the electricity supply, the weather (solar panels!), or the tiredness of the workers.

Then (WE)(t) = W(t) \cdot E(t) is the total rate at which the factory produces toys, measured in toys per hour, at a particular time t.

W'(t) is the rate of change of W with respect to t, in other words the rate at which the workforce at the factory is rising or falling, as workers leave and arrive.

E'(t) is the rate of change of E with respect to t, in other words the rate at which the efficiency of the factory (on a per worker basis) is changing at a particular time t.

(WE)'(t) is the rate at which the factory’s output is changing, at a given time t. In other words, if (WE)'(t) is positive and big, the factory’s output is increasing a lot at that moment, but if (WE)'(t) is positive and small, the factory’s output is increasing only a little at that time t.

Using our own common sense, what should (WE)'(t) depend on? Surely W'(t) is relevant, since even if efficiency holds steady, if workers are pouring into the factory at time t, the factory’s output will go up. But surely E'(t) is also relevant, since even if the workforce holds steady, if the workers are becoming more efficient, then the factory’s overall output will go up. But the current size of the workforce, W(t), is also relevant, since if, for example, efficiency is going up but the current workforce is very small, those gains in efficiency will not translate into large increases in output. And the current efficiency, E(t), is also relevant, since if, for example, workers are pouring into the factory, but the current toy production per worker per hour is very small, then those extra workers will also not translate into large increases in output.

Just by having these conversations, we prime our students to have a deep appreciation of what the product rule is about, what differentiation is about, why we would ever want to multiply two functions, and why we would ever want to learn calculus.

This year, when giving this exact introduction to the product rule, I had a student guess the product rule right there on the spot, just from talking out the logic of the toy factory.
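
If you want numbers to go with the conversation, here is a minimal sketch of a toy factory. The particular formulas for W and E below are invented purely for illustration (a workforce and an efficiency that oscillate over a 24-hour day). The point is only that the measured rate of change of output matches W'(t)E(t) + W(t)E'(t).

```python
import math

# Hypothetical workforce model: about 50 workers, swinging by 20 over the day.
def W(t):
    return 50 + 20 * math.sin(math.pi * t / 12)

def dW(t):
    return 20 * (math.pi / 12) * math.cos(math.pi * t / 12)

# Hypothetical efficiency model: toys per worker per hour.
def E(t):
    return 3 + 0.5 * math.cos(math.pi * t / 6)

def dE(t):
    return -0.5 * (math.pi / 6) * math.sin(math.pi * t / 6)

def output(t):
    """Total production rate (WE)(t), in toys per hour."""
    return W(t) * E(t)

def product_rule(t):
    """Rate of change of output predicted by the product rule."""
    return dW(t) * E(t) + W(t) * dE(t)

def numeric_rate(t, h=1e-6):
    """Rate of change of output estimated directly from the output function."""
    return (output(t + h) - output(t - h)) / (2 * h)

for t in [2, 9, 17]:
    print(f"t={t:>2}h  product rule: {product_rule(t):8.4f}   measured: {numeric_rate(t):8.4f}")
```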

Idea #2: A Geometric Interpretation

Some calculus textbooks motivate the product rule geometrically, by interpreting the product of two functions as the area of the rectangle whose side lengths are the values of the two functions at a given time.

prod rule #1

This sloppy picture is taken from a presentation I gave at an NCTM conference a few years ago about calculus proofs. The area of the rectangle with side lengths f and g represents the value of the fg function at a particular time. A moment later, both f and g change, and the derivative wants to measure the size of the change. Here again, we can read the product rule directly off the diagram. A “proof” like this was probably totally sufficient to a mathematician of the 18th century, but in a post-Cauchy/Weierstrass world, we need to verify these intuitions via the definition of the derivative as a limit.

But we can hold onto our geometric intuition and have our rigor as well!

prod rule #2

The same diagram can be used to interpret that mysterious numerator in the definition of the derivative and avoid the tyranny of the blue text. The diagram motivates, but the rigor is preserved, since the limit just under the rectangle can be expanded and verified to be equivalent to the limit just above the diagram. But this time we are doing the proof with meaning and understanding.
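
For readers who want the algebra behind the rectangle spelled out, here is one way it can go. This is a sketch of the usual decomposition, not a transcription of the slide. The shorthand \Delta f = f(x+h)-f(x) and \Delta g = g(x+h)-g(x) is introduced here for convenience, and the last step assumes g is continuous at x (which follows from differentiability).

```latex
\begin{aligned}
f(x+h)\,g(x+h) - f(x)\,g(x)
  &= \bigl(f(x)+\Delta f\bigr)\bigl(g(x)+\Delta g\bigr) - f(x)\,g(x) \\
  &= \underbrace{f(x)\,\Delta g}_{\text{one strip}}
   + \underbrace{g(x)\,\Delta f}_{\text{other strip}}
   + \underbrace{\Delta f\,\Delta g}_{\text{corner}} \\[4pt]
\frac{f(x+h)\,g(x+h) - f(x)\,g(x)}{h}
  &= f(x)\,\frac{\Delta g}{h} + g(x)\,\frac{\Delta f}{h} + \frac{\Delta f}{h}\,\Delta g \\
  &\longrightarrow f(x)\,g'(x) + g(x)\,f'(x) + f'(x)\cdot 0
  \qquad\text{as } h\to 0 .
\end{aligned}
```

Each term in the numerator is a visible piece of the picture, so nothing has to be added and subtracted out of thin air.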

Teaching the product rule this way might even be considered “standard”. The only drawback is that you do kind of have to be clever to think to do all this! Would a student come up with the idea to make a rectangle on their own? I’m not sure. I don’t claim to be the first or the only one to use a rectangle to discover, motivate, and even prove the product rule, but the following is something I have never seen anywhere before and that I just came up with a month ago. It is the excuse to write this blog post.

Idea #3: Start with a particular example

prodx2

I’m getting a bit tired, so I pasted this picture in. The idea is simple. Combine a function with a known derivative and a generic second function f. And then just try to find the derivative of the product. No understanding or geometric intuition is required, but probably no teacher help or input is required either.

A reasonable calculus student who is confident, good at algebra, and experienced with limits and computing derivatives from the definition should be able to get to the end by just doing what comes naturally.

I have not tried this before in class, so I can’t say how well it will work. But if it works, then the students have proved themselves to be every bit as clever as is required, and they’ve done it on their own. The teacher can then add extra layers of understanding to the general phenomenon of the product rule and lead the class through the general proof, possibly using geometry as a guide. But the students who figured out this particular example will feel that they could have done the general case on their own. And they will be right.
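
Here is a sketch of how such a computation might go. The picture is not reproduced here, so the choice of x^2 as the function with the known derivative is an assumption made for illustration. The only facts used are the definition of the derivative and the continuity of f.

```latex
\begin{aligned}
\frac{d}{dx}\bigl[x^2 f(x)\bigr]
  &= \lim_{h\to 0}\frac{(x+h)^2 f(x+h) - x^2 f(x)}{h} \\
  &= \lim_{h\to 0}\frac{x^2\bigl(f(x+h)-f(x)\bigr) + (2xh+h^2)\,f(x+h)}{h} \\
  &= \lim_{h\to 0}\left[x^2\cdot\frac{f(x+h)-f(x)}{h} + (2x+h)\,f(x+h)\right] \\
  &= x^2 f'(x) + 2x\,f(x).
\end{aligned}
```

Expanding (x+h)^2 is the natural first move, and it sorts the terms into the right groups without any cleverness.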

Area models for multiplication throughout the K-12 curriculum

Let’s take a look at area models, shall we?

My thesis today is that area models should be ubiquitous across the entire curriculum because mathematics is a sense making discipline. As math educators, we ought to encourage our students to take every opportunity to visualize their mathematics in an effort to illuminate, explain, prove, and bring intuition.

So let’s take a walk through the K-12 math curriculum and highlight the use of area models as they might apply to arithmetic, algebra, and calculus.

base-ten-blocks

Arithmetic

Students experience area models for the first time in elementary school as they work to visualize multi-digit multiplication. The same model can be used for division, just running the logic in reverse–that is, seeking an unknown “side length” rather than an unknown area. And Base Ten Blocks can be used to help students understand the building blocks of our number system.

Here’s how you might work out 27\times 54:

27\times 54 = (20+7)(50+4)=(20)(50)+(20)(4)+(7)(50)+(7)(4)

area-model-multiplication

27\times 54=1000+80+350+28=1458
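
The same bookkeeping can be sketched in a few lines of code. This is just an illustration for positive whole numbers, with function names (place_value_parts, area_model) invented for the example: split each factor by place value, fill in the grid of partial products, and add up the cells.

```python
def place_value_parts(n):
    """Split a positive integer by place value, e.g. 27 -> [20, 7]."""
    digits = str(n)
    parts = []
    for i, d in enumerate(digits):
        value = int(d) * 10 ** (len(digits) - 1 - i)
        if value:  # skip zero digits, which contribute no area
            parts.append(value)
    return parts

def area_model(a, b):
    """Grid of partial products: one row per part of a, one column per part of b."""
    return [[x * y for y in place_value_parts(b)] for x in place_value_parts(a)]

grid = area_model(27, 54)
print(grid)                            # [[1000, 80], [350, 28]]
print(sum(sum(row) for row in grid))   # 1458
```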

The advantage of using a visual model like this is that you can easily see your calculation and explain why constituent calculations, taken together, faithfully produce the desired result. If you do a “man on the street” interview with most users or purveyors of the standard algorithm, you would almost certainly not get crystal clear explanations for why it produces results. For a further discussion of area models for multi-digit multiplication, see this article, or read Jo Boaler’s now famous book Mathematical Mindsets.

Algebra

In middle school, as students first encounter algebra, they may use area models to support their algebraic reasoning around multiplying polynomials. And in an Algebra 2 course they may learn about polynomial division and support their thinking using an area model in the same way they used area models to do division in elementary school. Here Algebra Tiles can be used as physical manipulatives to support student learning.

Here’s how you might work out (x+4)(2x+3):

(x+4)(2x+3)=(x)(2x)+(x)(3)+(4)(2x)+(4)(3)

area-model-polynomials

(x+4)(2x+3)=2x^2+3x+8x+12=2x^2+11x+12

Notice also that if you let x=10, you obtain the following result from arithmetic:

14\times 23 = 200+110+12=322
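
Here is a small sketch of that connection. Polynomials are stored as coefficient lists with the constant term first (a representation chosen for this example), every cell of the box is a product of one term from each factor, and cells of the same degree are combined. Evaluating at x = 10 recovers the arithmetic.

```python
def multiply_polynomials(p, q):
    """Multiply two polynomials given as coefficient lists, constant term first."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            # Each (i, j) pair is one cell of the box; like terms share a degree i + j.
            result[i + j] += a * b
    return result

def evaluate(coeffs, x):
    """Evaluate a polynomial (constant term first) at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

product = multiply_polynomials([4, 1], [3, 2])   # (x + 4)(2x + 3)
print(product)                # [12, 11, 2], i.e. 2x^2 + 11x + 12
print(evaluate(product, 10))  # 322
```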

The Common Core places special emphasis on making such connections. I agree with this effort, even though I can also commiserate with fellow math teachers who say things like, “My Precalculus students still use the box method for multiplying polynomials!” We definitely want to move our students toward fluency, but perhaps we should wait for them to realize that they don’t need their visual models. Eventually most students figure out on their own that it would be more efficient to do without the models.

Calculus

Later in high school, as students first study calculus, area models can be used to bring understanding to the Product Rule–a result that is often memorized without any understanding. Even the usual “textbook proof” justifies but does not illuminate.

Here’s an informal proof of the Product Rule using an area model:

The “change in” the quantity L\cdot W can be thought of as the change in the area of a rectangle with side lengths L and W. That is, let A=LW. As we change L and W by amounts \Delta L and \Delta W, we are wondering how the overall area changes (that is, what is \Delta A?).

If the side length L increases by \Delta L, the new side length is L+\Delta L. Similarly, the width is now W+\Delta W. It follows that the new area is:

A+\Delta A=(L+\Delta L)(W+\Delta W)=LW+L\Delta W+W\Delta L+\Delta L\Delta W

area-model-product-rule

Keeping in mind that A=LW, we can subtract this quantity from both sides to obtain:

\Delta A=L\Delta W+W\Delta L+\Delta L\Delta W

Dividing through by \Delta x gives:

\frac{\Delta A}{\Delta x}=L\cdot\frac{\Delta W}{\Delta x}+W\cdot\frac{\Delta L}{\Delta x}+\frac{\Delta L}{\Delta x} \frac{\Delta W}{\Delta x} \Delta x

And taking limits as \Delta x\to 0 gives the desired result:

\frac{dA}{dx}=L\cdot\frac{dW}{dx}+W\cdot\frac{dL}{dx}
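
To watch the corner term disappear, here is a small numerical sketch. The side lengths L(x) = x^2 and W(x) = 3x + 1 are hypothetical, chosen only so the numbers are easy to check by hand: at x = 2 the product rule predicts L\cdot W' + W\cdot L' = 4\cdot 3 + 7\cdot 4 = 40.

```python
def L(x):
    return x ** 2        # hypothetical side length

def W(x):
    return 3 * x + 1     # hypothetical side length

def area_change_rates(x, dx):
    """Split (change in area)/dx into the two strips and the corner piece."""
    dL = L(x + dx) - L(x)
    dW = W(x + dx) - W(x)
    strips = (L(x) * dW + W(x) * dL) / dx   # L*dW/dx + W*dL/dx
    corner = (dL * dW) / dx                 # the small corner rectangle
    return strips, corner

for dx in [0.1, 0.01, 0.001]:
    strips, corner = area_change_rates(2.0, dx)
    print(f"dx={dx:<6} strips/dx={strips:.5f}  corner/dx={corner:.5f}")
```

As dx shrinks, the strips settle toward 40 while the corner's contribution shrinks toward 0, which is exactly what the limit step claims.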

Conclusion

If you’re like me, you once looked down on area models as being for those who can’t handle the “real” algebra. But if we take that view, there’s a lot of sense-making that we’re missing out on. Area models are an important tool in our tool belt for bringing clarity and connections to our math students.

Okay, so last question: Base Ten Blocks exist, and Algebra Tiles exist. What do you think? Shall we manufacture and sell Calculus DX Tiles © ? 🙂

I’m back

Hey everyone.

I took a two year hiatus from blogging. Life got busy and I let the blog slide. I’m sorry.

But I’m back, and my New Year’s Resolution for 2017 is to post at least once a month!

new-year_resolutions_list

Here’s what I’ve been up to over the last two years:

  • Twitter. When people ask why I haven’t blogged, I say “twitter ate my blog.” It’s true. Twitter keeps feeding me brilliant things to read, engaging me in wonderful conversations, and providing the amazing fellowship of the MTBoS.
  • James Key. I consistently receive mathematical distractions from my colleague and friend, James, who has a revolutionary view on math education and a keen love for geometry. This won’t be the last time I mention his work. Go check out his blog and let’s start the revolution.

    with my nerdy friends named James


  • My Masters. I finally finished my five-year-long master’s program at Johns Hopkins. I now have an MS in Applied and Computational Mathematics…whatever that means!
  • Life. My wife and I had our second daughter, Heidi. We’re super involved in our church. I tutor two nights a week. Sue me for having a life! 🙂
family photo


  • New curriculum. In our district, like many others, we’ve been rolling out new Common Core aligned curriculum. This has been good for our district, but also a monumental chore. I’m a huge fan of the new math standards, and I’d love to chat with you about the positive transitions that come with the CCSS.
  • Curriculum development. I’ve been working with our district, helping review curriculum, write assessments, and I even helped James Key make some video resources for teachers.
  • Books. Here are a few I’ve read in the last few months: The Joy of x, Mathematical Mindsets, The Mathematical Tourist, Principles to Actions
  • Math Newsletters. Do you get the newsletters from Chris Smith or James Tanton (did you know he’s pushing three essays on us these days?). Email these guys and they’ll put you on their mailing list immediately.
  • Growing. I’ve grown a lot as a teacher in the last two years. For example, my desks are finally in groups. See?
my classroom


  • Pi day puzzle hunt! Two years ago we started a new annual tradition. To correspond with the “big” pi-day back in 2015, we launched a giant puzzle hunt that involves dozens of teams of players in a multi-day scavenger hunt. Each year we outdo ourselves. Check out some of the puzzles we’ve done in the last two years.
  • Quora. This question/answer site is awesome, but be careful. You’ll be on the site and an hour later you’ll look up and wonder what happened. Here are some of the answers I’ve written recently, most of which are math-related. I know, I know, I should have been pouring that energy into blog posts. I promise I won’t do it again.
  • National Math Festival. Two years ago we had the first ever National Math Festival on the mall in DC. It was a huge success. I helped coordinate volunteers for MoMATH and I’ll be doing it again this year. See you downtown on April 22!
famous mathematicians you might run into at the National Math Festival


Now you’ll hopefully find me more regularly hanging out here on my blog. I have some posts in mind that I think you’ll like, and I also invited my colleague Will Rose to write some guest posts here on the blog. Please give him a warm welcome.

Thanks for all the love and comments on recent posts. Be assured that Random Walks is back in business!

Extraneous Solutions – Part 2 of 3

Solving an Equation as a Sequence of Equation Replacement Operations

Part 1 was so long because I wanted to be extremely thorough and to present things to an audience that perhaps hadn’t thought much about the logic of equation solving at all. Since we’re now all experts, perhaps it’s worth it to summarize everything very succinctly.

Given an equation in one free variable, we want to find the solution set. To do this, we replace that equation with an equivalent equation whose solution set is more obvious.

(1) 8x - 5 = 5x + 1

(2) 8x = 5x + 6

(3) 3x = 6

(4) x = 2

If in the transition from (1)-(2), from (2)-(3), and from (3)-(4) we are careful to replace each equation with an equivalent equation, then by the transitivity of equivalence, the original equation and terminal equation are guaranteed to be equivalent. Since the solution set of the terminal equation is obvious, we know the solution set of the original equation, as well. Thus solving an equation requires establishing that certain equation replacement operations are indeed equivalence preserving and having the creativity and experience to know which ones to apply and in what order.

What are the Equivalence-Preserving Operations on Equations?

If a = b, then f(a) = f(b) for any well-defined function f. If a and b are expressions containing a free variable, then any value of that variable which satisfies a = b will also satisfy f(a) = f(b). In other words, if you find it useful, feel free to replace any equation with a new equation which is the result of applying any function to both sides of the original equation. Any solution to the original equation will also be a solution to the new equation.

If the function f is also one-to-one, then by definition, f(a) = f(b) \Rightarrow a = b so any solution of f(a) = f(b) will also be a solution to a = b. Thus applying f to both sides of an equation is equivalence-preserving. If f is not one-to-one, then in general, the operation is not equivalence-preserving.

In solving equation (1), we applied f(n) = n + 5, g(n) = n - 5x and h(n) = n/3 in that order. Since all three of the functions are one-to-one, we are assured that (1) and (4) are equivalent. If we had cause to apply a non-one-to-one function, then we should be vigilant for extraneous solutions.
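
As a rough illustration of the difference, the sketch below tests a finite list of candidate values against each of equations (1) through (4), and then against x = 2 and its squared version x^2 = 4. The candidate list (quarters from -10 to 10) is an arbitrary choice made for this example, so this is a spot-check and not a proof of equivalence.

```python
from fractions import Fraction

# Equations (1)-(4), each written as a true/false test on a candidate x.
equations = {
    1: lambda x: 8 * x - 5 == 5 * x + 1,
    2: lambda x: 8 * x == 5 * x + 6,
    3: lambda x: 3 * x == 6,
    4: lambda x: x == 2,
}

# Exact rational candidates: -10, -9.75, ..., 9.75, 10.
candidates = [Fraction(n, 4) for n in range(-40, 41)]

solution_sets = {
    label: {x for x in candidates if holds(x)} for label, holds in equations.items()
}
# One-to-one operations: every equation keeps the same solution set.
assert all(s == {Fraction(2)} for s in solution_sets.values())

# Squaring is not one-to-one: the solution set grows.
before = {x for x in candidates if x == 2}
after = {x for x in candidates if x * x == 4}
print(sorted(before), sorted(after))
```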

A More Interesting Example

Consider

(5) \sqrt{6x-2} - \sqrt{x+1} = 2

As I mentioned in the other post, these square roots are begging to be squared, but since there are two of them, one squaring will not be enough. Even though it’s not necessary to do so, it’s helpful to move one radical expression to the other side.

(6) \sqrt{6x-2} = 2 + \sqrt{x+1}

(7) 6x - 2 = 4 + 4\sqrt{x + 1} + x + 1 We squared!

(8) 5x - 7 = 4\sqrt{x+1}

(9) 25x^2 - 70x + 49 = 16x + 16 We squared again!

(10) 25x^2 - 86x + 33 = 0

(11) (25x - 11)(x - 3) = 0

So x \in \{\frac{11}{25}, 3\}

Since in the transition from (6)-(7) and again in the transition from (8)-(9) we had reason to apply the non-one-to-one function f(n) = n^2, we should be vigilant for extraneous solutions. [Note: since both sides of (6) are necessarily nonnegative, applying f(n) = n^2 is equivalence-preserving, so no extraneous roots will be created there.] By checking back in the original equation, we see that 3 is a solution, but \frac{11}{25} is not. I am more or less content to leave it at that. But some may ask for more clarity as to exactly what happened and when, so let’s indulge them.

I will now list each equation in reverse order along with its solution set:

(11) (25x - 11)(x - 3) = 0                            \{\frac{11}{25}, 3\}

(10) 25x^2 - 86x + 33 = 0                             \{\frac{11}{25}, 3\}

(9) 25x^2 - 70x + 49 = 16x + 16                 \{\frac{11}{25}, 3\}

(8) 5x - 7 = 4\sqrt{x+1}                                     \{3\}

Since 5\cdot\frac{11}{25} - 7 = \frac{11}{5} - \frac{35}{5} = -\frac{24}{5} \neq 4\sqrt{\frac{11}{25} + 1} = 4\sqrt{\frac{11}{25} + \frac{25}{25}} = 4\sqrt{\frac{36}{25}} = 4\cdot\frac{6}{5} = \frac{24}{5}

So we have isolated the precise moment when the extraneous solution x = \frac{11}{25} is created and it appears exactly where we would expect it, in the transition from (8) to (9) as we replaced (8) with the result of applying the non-one-to-one function f(n) = n^2 to both sides.

More specifically, if x = \frac{11}{25}, (8) reads - \frac{24}{5} = \frac{24}{5}, which is false, but (9) reads (- \frac{24}{5})^2 = ( \frac{24}{5})^2, which is true. For this particular value of x, we squared both sides and replaced a false statement with a true statement. In retrospect, we can say that x =\frac{11}{25} is not a solution to (8) or to any previous equation in the solving sequence, but is a solution to (9) and thus to all subsequent equations in the solving sequence.

(7) 6x - 2 = 4 + 4\sqrt{x + 1} + x + 1                          \{3\}

Since both sides of (7) are positive when x = 3, it does not surprise us that,

(6) \sqrt{6x-2} = 2 + \sqrt{x+1}                               \{3\}

(5) \sqrt{6x-2} - \sqrt{x+1} = 2                                \{3\}
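The list of solution sets above can also be checked mechanically. What follows is only a minimal Python sketch, not part of the original analysis; the names exact_sqrt, EQUATIONS, and status are invented for illustration. It assumes the two candidates produce perfect-square radicands (they do here), so exact rational arithmetic can stand in for the square roots and no rounding clouds the comparison.

```python
from fractions import Fraction
from math import isqrt


def exact_sqrt(q):
    """Exact square root of a rational perfect square (illustrative helper)."""
    q = Fraction(q)
    if q < 0:
        raise ValueError("negative radicand")
    rn, rd = isqrt(q.numerator), isqrt(q.denominator)
    if rn * rn != q.numerator or rd * rd != q.denominator:
        raise ValueError("radicand is not a perfect square")
    return Fraction(rn, rd)


# Equations (5)-(11) of the solving sequence, each as a true/false test on x.
EQUATIONS = {
    5: lambda x: exact_sqrt(6 * x - 2) - exact_sqrt(x + 1) == 2,
    6: lambda x: exact_sqrt(6 * x - 2) == 2 + exact_sqrt(x + 1),
    7: lambda x: 6 * x - 2 == 4 + 4 * exact_sqrt(x + 1) + x + 1,
    8: lambda x: 5 * x - 7 == 4 * exact_sqrt(x + 1),
    9: lambda x: 25 * x**2 - 70 * x + 49 == 16 * x + 16,
    10: lambda x: 25 * x**2 - 86 * x + 33 == 0,
    11: lambda x: (25 * x - 11) * (x - 3) == 0,
}


def status(x):
    """Map each equation number to whether x satisfies that equation."""
    return {n: eq(x) for n, eq in EQUATIONS.items()}


for candidate in (Fraction(3), Fraction(11, 25)):
    satisfied = [n for n, ok in status(candidate).items() if ok]
    print(candidate, "satisfies equations", satisfied)
```

Under these assumptions, 3 satisfies every equation from (5) through (11), while \frac{11}{25} satisfies only (9), (10), and (11), which matches the solution sets listed above.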

By fully analyzing the logic behind each step of our equation replacement sequence, we not only:

  • confirm that x = 3 is a solution and that x = \frac{11}{25} is not and
  • understand that squaring both sides may produce an extraneous solution

but also

  • isolate the precise step in the solving sequence in which this extraneous solution was created, answering the why, how, and when for this problem
  • confirm that the non-solution status of x = \frac{11}{25} is not merely due to an error of algebra or arithmetic, but is a direct result of the fact that this value produces an equation (8) of the form a = -a

That last point is crucial in distinguishing the phenomenon of extraneous roots from the phenomenon of user error in algebra or arithmetic. If our equation solving sequence consists solely of equivalence-preserving operations, we do not even need to check to see if solutions to our terminal equation are also solutions to our original equation. If we do decide to check, perhaps out of an abundance of caution, and find a discrepancy, then user error must be to blame.

On the other hand, if a solver does employ solution-set-enlarging operations in the solving sequence and finds that a solution to the terminal equation is not a solution to the original equation, is this because the solution is extraneous or due to user error? One could perform an analysis like I did above and confirm that the non-solution is not due to user error, but instead to the logic of the process.

Extraneous Solutions – Part 1 of 3?

Disclaimer

Within my small inner circle of math teachers, the mystery of extraneous solutions seems to be the issue of the year. I have so much to say on this topic (algebraic, logical, pedagogical, historical, linguistic) that I don’t really know where to begin. My only disclaimer is that I’m not really sure if this topic is all that important.

Solving an Equation with a Radical Expression

Consider the following equation:

(1) 2\sqrt{x+8} +5 = 11

One hardly needs algebra skills or prior knowledge to solve this, but prior experience suggests trying to isolate x.

(2) 2\sqrt{x+8} = 6 (we subtract 5 from both sides)

(3) \sqrt{x+8} = 3 (we divide both sides by 2)

Now, if the square root of something is 3, then that something must be 9, so it immediately follows that

(4) x+8 = 9

(5) x = 1 (we subtract 8 from both sides)

Squaring Both Sides

In my transition from (3) to (4), I used a bit of reasoning. Some conversational common sense told me that “if the square root of something is 3, then that something must be 9”. But that logic is usually just reduced to an algebraic procedure: “squaring both sides”. If we square both sides of equation (3), we get equation (4).

On the one hand, this seems like a natural move. Since the meaning of \sqrt{a} is “the (positive) quantity which when squared is a”, the expression \sqrt{a} is practically begging us to square it. Only then can we recover what lies inside. A quantity “which when squared is a” is like a genie “which when summoned will grant three wishes”. In both cases you know exactly what to do next.

Unfortunately, squaring both sides of an equation is problematic. If a = b is true, then a^2 = b^2 is also true. But the converse does not hold. If a^2 = b^2, we cannot conclude that a = b, because opposites have the same square.

This leads to problems when solving an equation if one squares both sides indiscriminately.

A Silly Equation Leads to Extraneous Solutions

Consider the equation,

(6) x = 4

This is an equation with one free variable. It’s a statement, but it’s a statement whose truth is impossible to determine. So it’s not quite a proposition. Logicians would call it a predicate. Linguistically, it’s comparable to a sentence with an unresolved anaphor. If someone begins a conversation with the sentence “He is 4 years old”, then without context we can’t process it. Depending on who “he” refers to, the sentence may be true or false. The goal of solving an equation is to find the solution set, the set of all values for the free variable(s) which make the sentence true.

Equation (6) is only true if x has value 4. So the solution set is \left\{4 \right\} . But if we square both sides for some reason…

(7) x^2 = 16 has solution set \left\{4, -4\right\}

We began with x = 4, “did some algebra”, and ended up with x^2 = 16. By inspection, -4 is a solution to x^2 = 16, but not to the original equation which we were solving, so we call -4 an “extraneous solution”. [Extraneous – irrelevant or unrelated to the subject being dealt with]
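For readers who like to see solution sets computed, here is a small Python sketch of the same comparison. It is only an illustration: the helper solution_set is a made-up name, and the search window of integers from -10 to 10 is an arbitrary assumption that happens to be wide enough for this example.

```python
def solution_set(predicate, candidates):
    """Return the set of candidate values that make the predicate true."""
    return {x for x in candidates if predicate(x)}


candidates = range(-10, 11)  # arbitrary search window, chosen for illustration

before = solution_set(lambda x: x == 4, candidates)      # equation (6)
after = solution_set(lambda x: x**2 == 16, candidates)   # equation (7)

print(sorted(before))          # solutions of x = 4
print(sorted(after))           # solutions of x^2 = 16
print(sorted(after - before))  # what squaring added: the extraneous solutions
print(before <= after)         # squaring lost nothing
```

The set difference after - before is exactly the set of extraneous solutions, and the subset test before <= after confirms that squaring did not discard any solution of the original equation.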

Note that the appearance of the extraneous solution in the algebra of (6)-(7) did not involve the square root operation at all. But this example was also a bit silly because no one would square both sides when presented with equation (6), so let’s look at a slightly less silly example.

Another Radical Equation

(8) 2\sqrt{x+8} + 5 = -1

(9) 2\sqrt{x+8} = -6

(10) \sqrt{x+8} = -3

People paying attention might stop here and conclude (correctly) that (10) has no solutions, since the square root of a number cannot be negative. Closer inspection of the logic of the algebraic operations in (8)-(10) enables us to conclude that the original equation (8) has no solutions either. Since a = b \iff a - 5 = b - 5, any solution to (8) will also be a solution to (9) and vice versa. Since a = b \iff a/2 = b/2, any solution to (9) will also be a solution to (10) and vice versa. So equations (8), (9), and (10) are all “equivalent” in the sense that they have the same solution set.

But what if the equation solver does not notice this fact about (10) and decides to square both sides to get at that information hidden inside the square root?

(11) x+8 = 9

(12) x = 1

Again we have an extraneous solution. x = 1 is a solution to (12), but not to the original equation (8). Where did everything go wrong? By the previous logic, (8), (9), and (10) are all equivalent. (11) and (12) are also equivalent. So the extraneous solution somehow arose in the transition from (10) to (11), by squaring both sides.

So unlike subtracting 5 from both sides or dividing both sides by 2, squaring both sides is not an equivalence-preserving operation. But we tolerate this operation because the implication goes in the direction that matters. If a = b, then a^2 = b^2, so if a and b are expressions containing a free variable x, any value of x that makes a = b true will also make a^2 = b^2 true.

In other words, squaring both sides can only enlarge the solution set. So if one stays alert to the possible creation of extraneous solutions when squaring both sides, and is willing to test solutions to the terminal equation back in the original equation, the process of squaring both sides is innocent and unproblematic.
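The "test it back in the original equation" step is easy to sketch in Python. This is a hedged illustration only: check_candidates is a hypothetical helper, and it assumes every candidate lies in the domain of the left-hand side (otherwise math.sqrt would raise an error).

```python
from math import isclose, sqrt


def check_candidates(lhs, rhs, candidates):
    """Split candidates into (genuine, extraneous) by substituting into lhs(x) = rhs."""
    genuine, extraneous = [], []
    for x in candidates:
        if isclose(lhs(x), rhs):
            genuine.append(x)
        else:
            extraneous.append(x)
    return genuine, extraneous


def lhs(x):
    """Left-hand side shared by equations (1) and (8)."""
    return 2 * sqrt(x + 8) + 5


print(check_candidates(lhs, 11, [1]))  # equation (1): x = 1 is genuine
print(check_candidates(lhs, -1, [1]))  # equation (8): x = 1 is extraneous
```

The same candidate, x = 1, passes the check for equation (1) and fails it for equation (8), even though both solving sequences end at x = 1.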

Those Who are Still Not Satisfied

Still there are some who are not satisfied with this explanation: “Why does this happen? What is really going on? Where do the extraneous solutions come from? What do they mean?”

One source of the problem is the square root operation itself. \sqrt{a} is, by the conventional definition, the positive quantity which when squared is a. The reason that we have to stress the positive quantity is that there are always two real numbers that when squared equal any given positive real number. There are a few slightly different ways of making this same point. The operation of squaring a number erases the evidence of whether that number was positive or negative, so information is lost and we are not able to reverse the squaring process.

We can also phrase the phenomenon in the language of functions. Since squaring is a common and useful mathematical practice, information will often come to us squared and we’ll need an un-squaring process to unpack that information. f(x) = x^2, for all the reasons just mentioned, is not a one-to-one function, so strictly speaking, it is not invertible. But un-squaring is too important, so we persevere. As with all non-one-to-one functions, we first restrict the domain of f(x) = x^2 to [0, \infty) to make it one-to-one, and then we invert. The resulting inverse, f^{-1}(x) = \sqrt{x}, thus has a non-negative range, and so the convention that \sqrt{a} \geq 0 is born. So every use of the square root symbol comes with the proviso that we mean the positive root, not the negative root. We inevitably lose track of this information when squaring both sides.

[Note: Students can easily lose track of these conventions. After a lot of practice solving quadratic equations, moving from x^2 = 9 effortlessly to x = \pm 3, students will often start to report that \sqrt{9} = \pm 3.]

The convention that we choose the positive root is totally arbitrary. In a world in which we restricted the domain of f(x) = x^2 to (-\infty, 0] before inverting, \sqrt{9} would be -3. In that world, x = 1 is a perfectly good solution to 2\sqrt{x+8} + 5 = -1, not extraneous at all.
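That other world is easy to simulate. In the Python sketch below, neg_sqrt is a hypothetical stand-in for the square root as it would be defined under the opposite convention; it is not a real library function.

```python
from math import sqrt


def neg_sqrt(a):
    """Hypothetical 'other world' root: the non-positive number whose square is a."""
    return -sqrt(a)


x = 1
ours = 2 * sqrt(x + 8) + 5        # left side of (8) under the usual convention
theirs = 2 * neg_sqrt(x + 8) + 5  # left side of (8) under the opposite convention

print(ours == -1)    # False: in our world, x = 1 is extraneous
print(theirs == -1)  # True: in the other world, x = 1 solves (8)
```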

A Trigonometric Equation which Yields an Extraneous Solution

For parallelism, consider the (somewhat artificial) equation:

(13) \arccos(2x-1) = \frac{4\pi}{3}

As with (10), careful and observant solvers might notice that the range of the \arccos(x) function is [0, \pi] and correctly conclude that the equation has no solutions. But there seems to be a lot going on inside that \arccos expression, so many will rush ahead and try to unpack it by “cosineing”. Indeed, since a=b \Rightarrow \cos(a) = \cos(b), this seems innocent.

(14) 2x - 1 = -\frac{1}{2}

(15) 2x = \frac{1}{2}

(16) x = \frac{1}{4}

But x = \frac{1}{4} is an extraneous solution, since \arccos(-\frac{1}{2}) = \frac{2\pi}{3}, not \frac{4\pi}{3}.

The explanation for this extraneous solution will be similar to the logic we used above. If a = b, then \cos(a) = \cos(b), so if a and b are expressions containing a free variable x, any value of x that makes a = b true will also make \cos(a) = \cos(b) true. So we will not lose any solutions by “taking the cosine of both sides”. But as the cosine function is not one-to-one, \cos(a) = \cos(b) does not imply that a = b. So taking the cosine of both sides, just like squaring both sides, can enlarge the solution set.

The above paragraph explains why extraneous solutions could appear in the solution of (13), but maybe not why they do appear. For that, we again must look to the presence of the \arccos function. Since \cos is not one-to-one, we had to arbitrarily restrict its domain to [0, \pi] prior to inverting. So every use of the \arccos symbol comes with its own proviso that we are referring to a number in a particular interval of values. In a world in which we had restricted the domain of \cos to [\pi, 2\pi] prior to inverting, x = \frac{1}{4} would be a perfectly good solution to \arccos(2x-1) = \frac{4\pi}{3}, not extraneous at all.
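Both conventions can be compared numerically. The Python sketch below is an illustration under stated assumptions: alt_arccos is a hypothetical function, built here as 2\pi minus the usual arccosine, which is one way to realize the inverse of \cos restricted to [\pi, 2\pi]. Floating-point values are compared with math.isclose rather than exact equality.

```python
from math import acos, cos, isclose, pi


def alt_arccos(y):
    """Hypothetical inverse of cos restricted to [pi, 2*pi] instead of [0, pi]."""
    return 2 * pi - acos(y)


x = 1 / 4
target = 4 * pi / 3

# Under the usual convention, x = 1/4 fails equation (13).
print(isclose(acos(2 * x - 1), target))

# Under the alternative convention, the same x satisfies it.
print(isclose(alt_arccos(2 * x - 1), target))

# Either way, x = 1/4 satisfies equation (14), the "cosined" version.
print(isclose(cos(target), 2 * x - 1))
```

The first check fails and the other two succeed, which is the arccosine analogue of x = 1 solving 2\sqrt{x+8} + 5 = -1 only under the negative-root convention.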

The above examples seem to suggest that one can avoid dealing with extraneous solutions by carefully examining one’s equations at each step. But in practice, this really isn’t possible. I saved the fun examples for the end, but as this post is already way way too long, they will have to wait for a bit later.

-Will Rose

Thanks

Thanks to John Chase for letting me guest post on his blog. Thanks to James Key for encouraging me again and again to think about extraneous solutions.