Avoid Oversimplified and Faulty Logic

Many concepts in subjects like math and science can be very technical, requiring lots of background knowledge to understand. This limits the audience of people who can discuss and work with these concepts. But given how much these subjects can influence our society, for example with the modern tech industry and computer science, there is a real need for more people to understand these concepts. Simplified explanations can go a long way towards enabling this. However, it can be hard to write such explanations while also describing the concepts accurately and incorporating scientists’ and engineers’ input on how to present their own work.

In this post, I will discuss a mistake that is often made with such explanations — namely, oversimplified logic that is faulty. I’ll then discuss some approaches for avoiding this kind of mistake.

Where Faulty Logic Arises

For some mathematical and scientific assertions, the underlying logic justifying their validity can be difficult to follow on a first encounter. Furthermore, this logic may require prerequisites that are not widely known, making the subject even less accessible. Thus, some explanations attempt to “smooth over” the logic as a compromise between justifying nothing and including all the complicated details. However, this often leads to false equivalences, or other faulty logic, that ultimately end up confusing people more.

As an example, look at how calculus is typically first taught to students, especially those who aren’t expected to become math majors. Specifically, consider how derivatives are discussed. Many textbooks and resources introduce the notation dy/dx for the derivative of y with respect to x, but then later on they “forget” that this is merely notation and instead treat this as an actual fraction. For example, a typical “proof” of the chain rule, (dy/du)(du/dx) = dy/dx, is just “cancel the dus.”

This is erroneous. The notation dy/dx is just notation that happens to look like the notation we use for fractions. It’s not the case that there are actually numbers dy and dx whose quotient equals the derivative; the derivative isn’t a fraction. In fact, we are the ones who defined this notation for the derivative in the first place. There is no reason why we couldn’t have defined the notation to be dy+dx instead of dy/dx. Then the chain rule (which is true regardless of the notation we use to express it) would instead say (dy+du)(du+dx) = dy+dx.
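To make the point concrete, here is a brief sketch of what the notation stands for, using the standard limit definition. The symbols h, f, and g are introduced here for illustration, with y = f(u) and u = g(x). Written in function notation, the chain rule contains nothing that looks like a fraction to cancel:

```latex
% Standard limit definition, treating y as a function of x.
% dy/dx is shorthand for this limit, not a quotient of two numbers.
\[
  \frac{dy}{dx} := \lim_{h \to 0} \frac{y(x+h) - y(x)}{h}
\]
% Chain rule for y = f(u), u = g(x), stated without Leibniz notation.
\[
  (f \circ g)'(x) = f'\bigl(g(x)\bigr)\, g'(x)
\]
```

Both lines say the same thing as the dy/dx version; only the way of writing it has changed.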

As an analogy, this would be similar to me inventing my own language in which the word “orange” means what in English we would call an apple, and then based on that concluding that apples are actually orange in color. You can’t define two different meanings for the same word and then just state that those meanings are related simply because they share the same word for expressing them — especially when you were the one who defined things that way. If you could do that, then you could “prove” almost anything to be true. Nothing would make any sense then.

And for the example of calculus, this gets even worse, because in the “next stage” of calculus, multivariable calculus, the chain rule has a different form: it becomes a sum of several products of derivatives rather than a single product. There, the “cancel the dus” maxim would actually give you the wrong answer. That ends up confusing students: when is it valid to “cancel the dus” and when is it not? Without some deeper understanding of the logic behind derivatives, people are left in the dark when trying to figure these concepts out.
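As a worked sketch of how the cancellation fails, take a quantity z that depends on x and y, which in turn both depend on t. The particular functions below (z = xy with x = t and y = t) are an assumed example chosen only because the arithmetic is short.

```latex
% Multivariable chain rule for z = z(x, y), x = x(t), y = y(t):
\[
  \frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt}
                + \frac{\partial z}{\partial y}\frac{dy}{dt}
\]
% Assumed example: z = xy, x = t, y = t, so z = t^2 and dz/dt = 2t.
\[
  \frac{\partial z}{\partial x}\frac{dx}{dt} = y \cdot 1 = t,
  \qquad
  \frac{\partial z}{\partial y}\frac{dy}{dt} = x \cdot 1 = t,
  \qquad
  \frac{dz}{dt} = t + t = 2t
\]
```

“Cancelling” in the first product alone would suggest that dz/dt equals t, and cancelling in both terms would suggest that dz/dt equals twice itself. Neither matches the true value 2t.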

In general, especially in subjects like math, faulty logic only works up to a point. Because the logic is faulty, there is no good reason to expect it to keep working in general, and indeed it eventually fails down the line. This sets students up for a lot of confusion later on, causing them to question what they had considered an unquestionable foundation. (Of course, if faulty logic was used in the first place, such a “foundation” was never as strong as it should have been.) And especially in technical subjects, which tend to be cumulative, not addressing faulty logic early on can lead to significantly more confusion later.

Furthermore, there will be some discerning students who can see through the faulty logic from the beginning, and these people will be confused even earlier on. As a result, they could potentially be left behind.

Tips on Avoiding Faulty Logic

So, OK, faulty logic would be a problem for any explanation of a technical assertion, including a simplified or even non-technical one. But we also said upfront that sometimes the logic behind these statements can be gnarly and complicated, and the entire reason why faulty logic seeps into educational material is the attempt to compromise here. For example, maybe a truly sound proof of the chain rule in calculus would be too complicated to discuss when students are just seeing calculus for the first time. In fact, today many of the proofs of calculus statements are discussed in a college subject called real analysis, and typically a math major’s first real analysis class is considered to be pretty difficult, almost like a “rite of passage” for becoming a mathematician. So how can we avoid faulty logic while also keeping the presentation of the material more generally accessible?

I’ll discuss some of my ideas for navigating this as successfully as possible, but first let’s talk about different types of faulty logic that seem to me to be the most common.

First, the faulty logic above with the chain rule is an example of the general fallacy of misuse of notation. Here, we forget what the notation actually means at an underlying level (what it is referring to) and just manipulate it symbolically. While in math we often try to choose notation that is suggestive of appropriate results (there is a reason we use the dy/dx notation for derivatives and not dy+dx), misusing notation is still completely illogical, and as discussed earlier it will lead to problems and confusion.

Another kind of faulty logic is circular logic, where you try to prove something but end up assuming the very thing you want to prove in its own proof. (So basically, if you imagine that you listed out logical statements where each one leads to the next, it would at some point “wrap around,” looking like a circle.) This actually comes up in a natural way. When trying to prove things, a useful method is often to work both forwards and backwards, in the hope that we can connect the two directions at some point in the middle. We then hope that if we can reverse the backwards half of our logic properly, we obtain a full proof. However, with larger arguments, sometimes we can get lost in the details and forget to be careful, committing a fallacy of circular logic.

As a general theme, what we do to come up with proofs is not necessarily what is needed to make the proofs stand up in the end. We still have to make the argument connect logically, otherwise there is no reason at the end of the day to believe in our conclusion.
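As a hypothetical illustration of the gap between scratch work and proof, consider the inequality between the arithmetic and geometric means of two non-negative numbers a and b. This example is an assumption added for concreteness. Working backwards from the goal is a fine way to discover the argument, but only the forward direction proves anything:

```latex
% Goal (assumed example): for real a, b >= 0, (a + b)/2 >= sqrt(ab).
% Scratch work, running backwards FROM the goal (not a proof):
\[
  \frac{a+b}{2} \ge \sqrt{ab}
  \;\Longrightarrow\; a - 2\sqrt{ab} + b \ge 0
  \;\Longrightarrow\; \bigl(\sqrt{a} - \sqrt{b}\bigr)^2 \ge 0
\]
% Actual proof, running forwards from a known fact TO the goal:
\[
  \bigl(\sqrt{a} - \sqrt{b}\bigr)^2 \ge 0
  \;\Longrightarrow\; a + b \ge 2\sqrt{ab}
  \;\Longrightarrow\; \frac{a+b}{2} \ge \sqrt{ab}
\]
% Why the direction matters: a false start can still reach a true end.
\[
  -1 = 1 \;\Longrightarrow\; (-1)^2 = 1^2 \;\Longrightarrow\; 1 = 1
\]
```

The last line shows that arriving at a true statement says nothing about the statement you started from, which is why presenting the backwards chain as the proof amounts to assuming the conclusion.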

A third kind of faulty logic involves making unwarranted assumptions. For example, in math it is often easy to forget that the theorems we’re proving are vastly general, applying to cases well beyond the specific ones we are familiar with. While it is useful to use familiar examples as a guide for intuition, we can’t make additional assumptions based on those examples that don’t match the general scope of the theorems.
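One hypothetical illustration, not tied to any particular theorem: an argument about continuous functions that pictures a smooth curve with a tangent line at every point is quietly assuming differentiability. The absolute value function, chosen here as an assumed example, is continuous at 0 but has no derivative there, because the one-sided limits of its difference quotient disagree:

```latex
% Assumed example: f(x) = |x| is continuous at 0,
% but its difference quotient has no two-sided limit there.
\[
  \lim_{h \to 0^+} \frac{|0+h| - |0|}{h} = 1,
  \qquad
  \lim_{h \to 0^-} \frac{|0+h| - |0|}{h} = -1
\]
```

Any step that relies on a tangent line at 0 would therefore fail for this function, even though it satisfies the stated hypothesis of continuity.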

A general tip for avoiding faulty logic is to become comfortable saying “we can’t fully prove this right now.” We can still provide value by clarifying why exactly a deeper proof is needed, and then we can say that such a proof would be out of scope for our simplified explanation. This can actually be a great way to whet the student’s appetite for later studies; for example, in calculus, we can show why the oversimplified derivations are wrong, and then we can point to real analysis as a place where those derivations will be more thoroughly explained, which can drum up interest for future real analysis classes.
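As a rough sketch of what “showing why the oversimplified derivation is wrong” could look like for the chain rule, consider the difference-quotient heuristic that sits behind the cancellation. The function u below is an assumed example, and the fragment uses amsmath for the cases environment.

```latex
% Heuristic behind "cancel the du's", where \Delta x = h,
% \Delta u = u(x+h) - u(x), and \Delta y = y(u(x+h)) - y(u(x)):
\[
  \frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta u} \cdot \frac{\Delta u}{\Delta x}
  \qquad \text{(only meaningful when } \Delta u \neq 0\text{)}
\]
% Assumed example where \Delta u = 0 for arbitrarily small h \neq 0, at x = 0:
\[
  u(x) = \begin{cases} x^2 \sin(1/x), & x \neq 0 \\ 0, & x = 0 \end{cases}
  \qquad
  u\!\left(\tfrac{1}{n\pi}\right) = 0 \text{ for } n = 1, 2, 3, \dots
\]
```

This u is differentiable at 0, yet the factor with Δu in the denominator is undefined at points arbitrarily close to 0, so the heuristic cannot simply be pushed to the limit. A rigorous proof has to handle such points separately, and that is the kind of detail a real analysis course supplies.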

This point can be restated as a maxim: there’s nothing wrong with using an approximation, but it is wrong to not call it an approximation. We need to be upfront with students about what exactly we’re claiming and what logic we’re using to get to those claims; otherwise, students will end up more confused and with gaps in their understanding. We don’t necessarily need to dive into the deeply technical details then and there, but if we are simplifying some aspect of things, then we should point this out and be clear as to why it is a simplification. Even in a non-technical setting, I think it should be doable to at least clarify which parts need deeper analysis, without necessarily going into them ourselves.

If we can avoid faulty logic with this tip, then I think we can deliver more accessible explanations that are also more technically accurate, more representative of scientists’ work, and not riddled with holes that will end up confusing people later on.

Written September 2022, edited May and November 2023.
