
Sunday, July 15, 2012

Simplicity as Evidence of Truth: How Do We Know It?


This is part 3 of a series on simplicity as evidence of truth.
  1. Simplicity as Evidence of Truth: Justifying Ockham’s Razor
  2. Simplicity as Evidence of Truth: Theories Tying Into Background Knowledge
  3. Simplicity as Evidence of Truth: How Do We Know It?
In this entry I’ll offer some concluding remarks for this series, centered on one question: how do we know that simplicity is evidence of truth? Whereas in part 1 I offered some justification for thinking that simplicity is evidence of truth, here I’ll say a bit about how in practice we come to believe that simplicity is such a guide to truth.

Recap

In part 1 of simplicity as evidence of truth I gave the following illustration to evoke the intuition that, ceteris paribus, the simplest explanation is the one most likely to be true. Suppose we investigate a new area of scientific research for which we have no background knowledge to tell us which theory is more probable and we study two variables: x and y. We have the following data:

Data Set 1
 
x:  0   1   2   3   4   5   6
y:  0   2   4   6   8  10  12


Let’s call the above situation the “Data Set 1 Scenario.” An equation presents itself for predicting the other values of x and y: y = 2x (we’ll call this equation 1), but that isn’t the only formula that fits the data. As Swinburne points out, all formulas of the following form (which we’ll call equation 2) yield the same data as well.[1]

Equation 2
 
y = 2x + (x − 1)(x − 2)(x − 3)(x − 4)(x − 5)(x − 6)z


Where z can be any function of x that is zero at x = 0, such as a multiple of x (a nonzero constant z would spoil the fit at the data point x = 0, since the six parenthetical factors multiply to 720 there). Although they agree with the data, the two equations may make very different predictions about unobserved data. For example, we can let z be x/720 to get the following equation, which predicts unobserved data differently from equation 1:

Equation 3
 
y = 2x + (x − 1)(x − 2)(x − 3)(x − 4)(x − 5)(x − 6)(x/720)


Equation 1 predicts that when x = 9, y will be 18. Equation 3 predicts that when x = 9, y will be 270. If we were forced to go with either equation 1 or equation 3 to predict further data, which one would we choose? Obviously equation 1, and the reason seems clear: simplicity. There are literally infinitely many equations fitting Data Set 1, yielding infinitely many different y values for x = 9 (if nothing else, there are infinitely many multiples of x one could use for z), yet if one were forced to correctly predict what y would be for x = 9 and the consequences of predicting incorrectly were sufficiently dire (say, upon pain of an ignominious death that one strongly wants to avoid), we would think it quite irrational to give any answer other than y = 18 for x = 9. This fact (that if the stakes were sufficiently high it would be irrational to go with anything other than y = 18 for x = 9) suggests that simplicity is indeed a tool of rationality; we use simplicity as a guide for obtaining the truth.
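The recap’s claim can be checked directly. Below is a minimal Python sketch (my own illustration, not from Swinburne; the function name eq2_family is made up) using one simple family of admissible z values, namely z = c·x: every member reproduces Data Set 1 exactly, yet different members give different predictions at x = 9.

```python
# Equation 2 with z = c*x: y = 2x + (x-1)(x-2)(x-3)(x-4)(x-5)(x-6) * c * x.
# Taking z as a multiple of x keeps the fit at the data point x = 0.
def eq2_family(x, c):
    correction = 1
    for root in range(1, 7):
        correction *= (x - root)  # vanishes at each observed x = 1..6
    return 2 * x + correction * c * x  # the factor x handles x = 0

data_set_1 = {x: 2 * x for x in range(7)}  # x = 0..6, y = 0, 2, ..., 12

for c in (0, 1, -3):  # c = 0 recovers equation 1
    # Each member fits every observed data point...
    assert all(eq2_family(x, c) == y for x, y in data_set_1.items())
    # ...yet the predictions at x = 9 differ wildly.
    print("c =", c, "-> y(9) =", eq2_family(9, c))
```

Since there are infinitely many choices of c, there are infinitely many perfectly fitting equations, each with its own prediction at x = 9.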

Past Success Justification

The law of parsimony has been widely used in the history of science. How is it that we know simplicity is evidence of truth? One possible response is, “In science’s history, we’ve found that simpler theories tend to be better predictors.” One could say that the reason we believe simplicity is a guide for truth is through experience. Simplicity has served us well in the past, so probably it will serve us well in the future.

There are some problems with that “past success justification” approach, however. Swinburne says the claim that “in the past, simpler scientific theories have been the better predictors” is doubtful, since in science simpler physical laws have on many occasions been supplanted by more complicated laws.[2] Swinburne also adds:
But even if simplest theories have usually proved better predictors, this would not provide a justification for subsequent use of the criterion of simplicity, for the reason that the justification itself already relies on the criterion of simplicity. There are different ways of extrapolating from the corpus of past data about the relative success which was had by actual theories and which would have been had by possible theories of different kinds, if they had been formulated. “Usually simpler theories predict better than more complex theories” is one way. [3]
There are many ways to form a pattern of “theories that predict best” (or for that matter, of true theories) that fits past experience, many of them giving different predictions about which future theories are likely to be better predictors, and, surprise surprise, simplicity is a big factor when choosing the “right” pattern. Astute readers may recognize the similarity between this situation and the Data Set 1 scenario, where innumerable theories fit Data Set 1 with perfect success yet give different future predictions.

If you’re skeptical of the existence of innumerable patterns fitting past experience, note that if nothing else one can construct an outrageously complicated disjunction of specific descriptions like “If (a man named Newton does such-and-such at such-and-such time) or (a man named Einstein does such-and-such at such-and-such time) or (...) or (...)..., then the resulting equation will be at least approximately true.” If one objects, “Well, if Newton had been named Smith, the same results would have obtained,” one should note that (a) this still doesn’t change the fact that the massive disjunction perfectly fits past successes; and (b) we can add such “would have been” matters to the disjunction to get something like “If (a man named Newton or Smith did such-and-such at such-and-such time) or (...)..., then the resulting equation will be at least approximately true.”

What’s more, just as equation 2 used a relatively complex set of parenthetical factors (x − 1)(x − 2)…(x − 6) to fit the observed data perfectly, yielding infinitely many equations with infinitely many different possible future results (largely thanks to there being infinitely many choices for z), so too one can adopt a similar approach for past successes. There are infinitely many models that perfectly fit past knowledge (and the knowledge that would have obtained in other circumstances) and give detailed predictions for the future, yet contradict each other over those future expectations. If nothing else, there are many massive disjunctions (“If (A) or (B) or (C) or (D)…, then the theory is a good one”) that perfectly fit past successes yet give different predictions of what is likely to be true in the future. Each of these infinitely many models, if it had been used in the past, would have yielded great results. So clearly there are infinitely many models fitting past successes, and at least some of them (indeed, infinitely many) contradict each other over future expectations. As noted in part 1 of this series, Swinburne observes that the criterion “Choose the theory which postulates that the future resembles the past” is empty (infinitely many mutually inconsistent theories do that), and that to have real content it should be changed to “Choose the theory which postulates that the future resembles the past in the simplest respect.”[4]
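The structural point can be made concrete with a toy sketch (entirely my own illustration; the theory names and the “simple” predicate are made up): two extrapolation rules that agree on every past case of predictive success yet contradict each other about a future theory.

```python
# Past record: (theory, was it a good predictor?) -- toy data, made up.
past = [("Newton_gravity", True), ("phlogiston", False), ("Mendel_genetics", True)]

# Rule A: the generic pattern "simple theories predict well"
# (with 'simple' modeled as a toy membership test).
simple = {"Newton_gravity", "Mendel_genetics", "future_theory"}
def rule_a(theory):
    return theory in simple

# Rule B: a brute disjunction naming exactly the past successes.
def rule_b(theory):
    return theory in {"Newton_gravity", "Mendel_genetics"}

# Both rules fit the past record perfectly...
assert all(rule_a(t) == good and rule_b(t) == good for t, good in past)
# ...yet they contradict each other about the future theory.
print(rule_a("future_theory"), rule_b("future_theory"))  # True False
```

Which rule is the “right” extrapolation from the same past record is exactly the point at issue, and favoring rule A over the brute disjunction is itself an appeal to simplicity.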

So how do we know that simplicity is a tool of rationality? In practice, this is something we know via a priori intuition. Even if it is possible to construct a clever argument for why we should accept simplicity as a guide to truth, in practice that’s not why we do accept it. We accept it because it just seems to be true.

Not Just in Science

While a big focus in this series has been simplicity and science, the law of parsimony extends beyond science. Swinburne gives this nice illustration.
If we can explain an event as brought about by a person in virtue of powers of the same kind as other humans have, we do not postulate some novel power—we do not postulate that some person has a basic power of bending spoons at some distance away if we can explain the phenomenon of the spoons’ being bent by someone else bending them with his hands.[5]
Another illustration is the use of simplicity in the argument from ontological simplicity in my series on the moral argument.







[1] Swinburne, Richard. Simplicity as Evidence of Truth (Milwaukee, Wisconsin: Marquette University Press, 1996), p. 22.

[2] Swinburne, Richard. Simplicity as Evidence of Truth (Milwaukee, Wisconsin: Marquette University Press, 1996), p. 52.

[3] Swinburne, Richard. Simplicity as Evidence of Truth (Milwaukee, Wisconsin: Marquette University Press, 1996), p. 52.

[4] Swinburne, Richard. Simplicity as Evidence of Truth (Milwaukee, Wisconsin: Marquette University Press, 1996), p. 23.

[5] Swinburne, Richard. Simplicity as Evidence of Truth (Milwaukee, Wisconsin: Marquette University Press, 1996), p. 59.

Sunday, July 8, 2012

Simplicity as Evidence of Truth: Theories Tying Into Background Knowledge


This is part 2 of a series on simplicity as evidence of truth.
  1. Simplicity as Evidence of Truth: Justifying Ockham’s Razor
  2. Simplicity as Evidence of Truth: Theories Tying Into Background Knowledge
  3. Simplicity as Evidence of Truth: How Do We Know It?
As mentioned in part 1, much of this series is taken from Richard Swinburne’s Simplicity as Evidence of Truth. This article will discuss how simplicity plays a role in judging how well a theory fits in with our background knowledge. Before moving on I’ll introduce some preliminaries.

Some Facets of Simplicity

In this article it will be useful to mention a few facets of simplicity borrowed from Swinburne.[1]
  1. Number of entities. A theory that postulates fewer entities is simpler than one that postulates more. As Swinburne notes, “The application of this facet in choosing theories is simply the use of Ockham’s razor.”[2]
  2. Number of kinds of things. A theory that postulates fewer different kinds of entities is simpler than one that postulates many different kinds, e.g. a theory that postulates fewer different kinds of quark is simpler than a theory that postulates more of them.
  3. Number of laws. A theory with fewer separate laws is simpler than one with many.
More could be listed (and Swinburne does list more) but this small list will be enough for our purposes.

Simplicity Not the Only Factor

I’ve been using the phrase “ceteris paribus” an awful lot in this series precisely because there are other factors to consider when choosing a good explanation besides simplicity. The list below is largely taken from Richard Swinburne.[3]
  • Yielding the data. This category covers both explanatory scope (how much data the theory explains) and explanatory power (the probability of expecting the data if the explanation were true).
  • Content. All else held constant, the theory that makes more claims (or is a stronger claim in the sense that it “claims more”) is less likely to be true than one that makes fewer claims (or is a weaker claim in the sense that it “claims less”). For example, “At least a million animals of this species have black fur” is a stronger claim than “At least ten animals of this species have black fur.” All else held constant, the weaker claim “At least ten animals of this species have black fur” is more likely to be true than the stronger claim.
  • Fitting in with background knowledge. For example, “The hypothesis that John stole the money is rendered more probable if we know [due to our background knowledge] that John has stolen on other occasions and comes from a social group among whom stealing is widespread.”[4] The likelihood of such background beliefs being true plays a role in our judgments. However, simplicity has a role when determining how well a theory fits in with our background knowledge.
And it’s that last item that we’ll deal with next in regards to simplicity.

Simplicity and Background Knowledge

Where there is relevant background knowledge, simplicity is a factor in determining which theory “fits best” with such data. Swinburne even goes so far as to say that “fitting better” is “fitting more simply,” making for a simpler overall view of the world.[5] When discovering a new chemical substance, it’s possible that it has a special kind of quark or a special kind of chemical bond never before discovered that yields the same data, but ceteris paribus it’s simpler not to posit “more than one kind of thing” and to use what’s already available. Notice that in this case a special kind of quark or chemical bond never seen before does tie in with our background knowledge to some extent (we believe that there are such things as quarks and chemical bonds), but fitting into our background knowledge more simply (using the same sorts of quarks and chemical bonds we already know of) is the better option. Simplicity clearly plays a role in deciding how well a theory fits in with our background knowledge.

Theories often interact with each other; our current theory of genetics ties in with belief in DNA, belief in DNA ties in with belief in atoms, etc. Also, when making predictions we rely on background knowledge; e.g. at one point people thought that if the earth were really moving, birds would be swept westward the moment they let go of a tree branch. We no longer accept that as evidence that the earth is not moving because we have a different background system of physics that allows us to make different predictions. Call this network of theories and assumptions a conceptual grid. When looking into which theories tie into our background knowledge, we are assuming that ceteris paribus the world is more likely to be simple than complex, and ceteris paribus we prefer simpler conceptual grids to complex ones. Hence when discovering a new chemical, we don’t assume the chemical has entirely new types of quarks (thus giving us a more complex conceptual grid) when the chemical’s having the same sort of quarks we are familiar with will do (in terms of yielding the data, etc.).





[1] Swinburne, Richard. Simplicity as Evidence of Truth (Milwaukee, Wisconsin: Marquette University Press, 1996), pp. 29-30, 31.

[2] Swinburne, Richard. Simplicity as Evidence of Truth (Milwaukee, Wisconsin: Marquette University Press, 1996), p. 29.

[3] Swinburne, Richard. Simplicity as Evidence of Truth (Milwaukee, Wisconsin: Marquette University Press, 1996), pp. 18-19.

[4] Swinburne, Richard. Simplicity as Evidence of Truth (Milwaukee, Wisconsin: Marquette University Press, 1996), p. 18.

[5] Swinburne, Richard. Simplicity as Evidence of Truth (Milwaukee, Wisconsin: Marquette University Press, 1996), p. 45.

Thursday, July 5, 2012

Simplicity as Evidence of Truth: Justifying Ockham’s Razor


This is part 1 of a series on simplicity as evidence of truth.
  1. Simplicity as Evidence of Truth: Justifying Ockham’s Razor
  2. Simplicity as Evidence of Truth: Theories Tying Into Background Knowledge
  3. Simplicity as Evidence of Truth: How Do We Know It?
In one blog entry where I argued that objective morality could be used as evidence for theism, I used what I called the argument from ontological simplicity. In that blog entry I mentioned the principle of rationality that, all else held constant, the simplest explanation is the best one—which incidentally is one of the formulations of Ockham’s razor (also known as Occam’s razor), named after the 14th-century philosopher William of Ockham. It has also been called (perhaps more accurately) the law of parsimony. A formulation of Ockham’s razor closer to what the philosopher originally stated—and one facet of simplicity—is not to multiply explanatory entities beyond necessity. Though both versions of Ockham’s razor are often used in science, philosophy, and everyday life, is it really true that ceteris paribus the simplest explanation is the one most likely to be true?

A Quick Justification for Simplicity

Much of this entry (and its sequels) will be taken from Richard Swinburne’s excellent little book Simplicity as Evidence of Truth, including the following illustration. Suppose we investigate a new area of scientific research for which we have no background knowledge to tell us which theory is more probable and we study two variables: x and y. We have the following data:

Data Set 1
 
x:  0   1   2   3   4   5   6
y:  0   2   4   6   8  10  12


Let’s call the above situation the “Data Set 1 Scenario.” An equation presents itself for predicting the other values of x and y: y = 2x (we’ll call this equation 1), but that isn’t the only formula that fits the data. As Swinburne points out, all formulas of the following form (which we’ll call equation 2) yield the same data as well.[1]

Equation 2
 
y = 2x + (x − 1)(x − 2)(x − 3)(x − 4)(x − 5)(x − 6)z


Where z can be any function of x that is zero at x = 0, such as a multiple of x (a nonzero constant z would spoil the fit at the data point x = 0, since the six parenthetical factors multiply to 720 there). Although they agree with the data, the two equations may make very different predictions about unobserved data. For example, we can let z be x/720 to get the following equation, which predicts unobserved data differently from equation 1:

Equation 3
 
y = 2x + (x − 1)(x − 2)(x − 3)(x − 4)(x − 5)(x − 6)(x/720)


Equation 1 predicts that when x = 9, y will be 18. Equation 3 predicts that when x = 9, y will be 270. If we were forced to go with either equation 1 or equation 3 to predict further data, which one would we choose? Obviously equation 1, and the reason seems clear: simplicity. There are literally infinitely many equations fitting Data Set 1, yielding infinitely many different y values for x = 9 (if nothing else, there are infinitely many multiples of x one could use for z), yet if one were forced to correctly predict what y would be for x = 9 and the consequences of predicting incorrectly were sufficiently dire (say, upon pain of an ignominious death that one strongly wants to avoid), we would think it quite irrational to give any answer other than y = 18 for x = 9.

Examples like this strongly suggest that simplicity is among the tools of rationality. For equations in science (e.g. the multitude of equations in physics) there are literally infinitely many possible equations perfectly fitting the observed empirical data that give different predictions of unobserved data, including wild ones like equation 2 (where the factors (x − 1)(x − 2)... are added so that when x is 1, 2, etc. the data come out right). Yet out of the multitude of equations that fit the observed data, scientists rationally prefer the simpler ones when making new predictions, even if they don’t do so consciously. If simplicity isn’t a guide to truth, why is it that, if the stakes were sufficiently high, it would be irrational to go with anything other than y = 18 for x = 9?
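Reading the z of equation 3 as x/720 (which reproduces the stated prediction y = 270 at x = 9), the agreement on observed data and the divergence on unobserved data can be verified numerically. A short sketch (my own, with made-up function names):

```python
def eq1(x):
    # Equation 1: y = 2x
    return 2 * x

def eq3(x):
    # Equation 3: y = 2x + (x-1)(x-2)(x-3)(x-4)(x-5)(x-6) * (x/720)
    correction = 1
    for root in range(1, 7):
        correction *= (x - root)  # vanishes at each observed x = 1..6
    return 2 * x + correction * x / 720

# Both equations fit Data Set 1 exactly (x = 0..6)...
assert all(eq1(x) == eq3(x) for x in range(7))
# ...but they diverge on the unobserved point x = 9.
print(eq1(9), eq3(9))
```

At x = 9 the correction term is 8·7·6·5·4·3·(9/720) = 252, so equation 3 yields 18 + 252 = 270 while equation 1 yields 18.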

Objections and Rebuttals

Objection: We don’t need simplicity; we just assume that the future is like the past to make the next prediction. That is how we can favor equation 1 over equation 3.

Rebuttal: Both equations assume the future resembles the past, e.g. both equations say that for any time in the future when x = 4, we will get y = 8. Swinburne notes that the criterion, “Choose the theory which postulates that the future resembles the past” is empty (infinitely many theories that are inconsistent with each other do that), and that to have real content we should change it to “Choose the theory which postulates that the future resembles the past in the simplest respect.”[2] But in that case we have the simplicity criterion in action.

Objection: We can test and eliminate alternate theories with further testing rather than relying on simplicity. For example, equation 3 predicts that when x = 7, y would be 21 whereas equation 1 predicts y would be 14. So if we observe that y = 14 when x = 7, then we’ve confirmed equation 1 over equation 3.

Rebuttal: Even with this approach infinitely many theories will always remain. In philosophy of science, the fact that countless theories fit the empirical data is sometimes described as the empirical data underdetermining the theories. For example, suppose we do observe that y = 14 when x = 7, giving us Data Set 2 below:

Data Set 2
 
x:  0   1   2   3   4   5   6   7
y:  0   2   4   6   8  10  12  14


We can see that there will be infinitely many equations fitting Data Set 2 by noting equation 4 below, where z stands for any function of x that is zero at x = 0 (for instance, any multiple of x).

Equation 4
 
y = 2x + (x − 1)(x − 2)(x − 3)(x − 4)(x − 5)(x − 6)(x − 7)z


Thus the “just keep on testing” approach simply isn’t enough to eliminate the alternatives in the Data Set 1 scenario. Of course, it is possible that the proposed simplest theory might later be shown false by further observations, but even when that occurs, if we are to choose between empirically identical theories, all else held constant we are rational to prefer the simplest theory as the one most likely to be true.
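The persistence of alternatives after the new observation can also be checked. Below is a sketch (my own illustration; z is taken as a multiple of x, namely z = c·x, so that the x = 0 data point also fits): every member of the equation 4 family matches Data Set 2, yet members disagree at the next untested point, x = 8.

```python
# Equation 4 with z = c*x: y = 2x + (x-1)(x-2)...(x-7) * c * x.
def eq4_family(x, c):
    correction = 1
    for root in range(1, 8):
        correction *= (x - root)  # vanishes at each observed x = 1..7
    return 2 * x + correction * c * x  # the factor x handles x = 0

data_set_2 = {x: 2 * x for x in range(8)}  # x = 0..7

for c in (0, 1, 2):  # c = 0 recovers equation 1
    # Each member fits every point of Data Set 2...
    assert all(eq4_family(x, c) == y for x, y in data_set_2.items())
    # ...yet predictions at the next untested point differ.
    print("c =", c, "-> y(8) =", eq4_family(8, c))
```

However many new points we observe, the same construction with one more factor regenerates an infinite family, which is why further testing alone never singles out equation 1.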

Objection: There are reasons why we choose simpler theories that don’t have to do with simpler theories being more likely to be true; e.g. it is more convenient to work with simpler theories.

Rebuttal: Convenience is nice to have, but it’s still the case that in practice we rely on simpler theories as being true ceteris paribus. If we want to know whether, say, a bridge will be able to withstand trucks driving over it, we want a theory that gives us the truth, not one that is merely more convenient to work with. In practice, when seeking the truth we go with the simplest theories ceteris paribus. To illustrate further, consider the following case. A scientific experiment has gone horribly wrong and part of the building will be destroyed. You are trapped inside the building and have the option of being in either region #1 of the building or region #2 by the time the explosion occurs; one of those regions will be annihilated, and these are your only two options. Which region will be destroyed depends on which theory about the experiment is true: theory S (the simpler theory) or theory C (the more complicated theory). Both theories are equal in explanatory power, explanatory scope, how well they tie in with background knowledge, etc. Theory S says that region #1 will be destroyed, and theory C says region #2 will be destroyed. The rational thing to do would be to go with the simpler theory and move to region #2—not because the simpler theory is easier and more convenient to use, but because the simpler theory is more likely to be true.

To make things more concrete, suppose it came down to the Data Set 1 scenario, where the data set was this:

Data Set 1
 
x:  0   1   2   3   4   5   6
y:  0   2   4   6   8  10  12


Suppose guessing the right non-destroyed region depended on predicting the right answer for what y would be when x = 9. Again, it would be irrational to guess anything other than y = 18, even if it meant going to region #2 and even if the travel would be somewhat burdensome (e.g. you would have to rush up some flights of stairs as opposed to sitting on a comfortable couch).





[1] Swinburne, Richard. Simplicity as Evidence of Truth (Milwaukee, Wisconsin: Marquette University Press, 1996), p. 22.

[2] Swinburne, Richard. Simplicity as Evidence of Truth (Milwaukee, Wisconsin: Marquette University Press, 1996), p. 23.