Friday, December 16, 2011

Secrets of the truth cult!!

For much of their history human beings have taken part in rituals in which an authority informs other people of what is supposed to be the Truth. I call this the pulpit model of information. For centuries Europeans went to church and an authority got up in the pulpit and told them what to believe about the world (and other places).

This model was later adopted by the schools, no doubt because the schools were established by churches. Whatever the reason, schooling until recently consisted of listening to an authority tell you what to believe about the world (in universities, it still often consists of this). In school, though, you were even tested to make sure you’d learned the approved view of things.

In school you also acquire the idea that Truth is something that can be found on the printed page. Consequently we come to accept something that has been published as true, without verifying that it is.

It’s not surprising that we come to look on the truth as something that is dispensed by authorities. Consequently, we look around for people who look like authorities, and treat what they say as information. Furthermore, we treat the methods they use to come up with things to say as methods that can be used to define information. We are often wrong.

Given the track record of authorities (remember all those biological weapons that, according to authorities, Iraq was just itching to use against the West?), depending on them to tell us the truth is a questionable approach. Another problem with this approach is that there is considerable doubt as to whether we need to know the truth, anyway.

Here’s something that’s true: Churchill, Manitoba, is named for John Churchill, first governor of the Hudson’s Bay Company. That’s a fact. Despite being a fact, though, it doesn’t help me get served when I drop in to the local branch of his company.

Every day we are bombarded with truths. The newspaper tells us things like what the temperature was yesterday in Beijing and what celebrities have (or had) their birthdays today. I remember once reading in the paper that it was the late Alfred Hitchcock’s birthday and thinking “I can’t really send him a card, can I?”

Better than mere truth is information. Information is confused with many things that are not informative, though.

Facts, as we have just seen, are not necessarily informative. Unless I’ve made a bet about what the high temperature in Beijing was going to be, that fact cannot be said to inform me of anything.

Furthermore, many items of information are not factual. The idea of intelligence, for example, cannot be said to be a fact, since there is widespread disagreement about just what intelligence is. However, the concept of intelligence is informative because in speculating about it we discover useful things. We have even discovered some of the shortcomings of the idea of intelligence.

As we have also seen, authoritative statements are not necessarily informative. Another reason they're not necessarily informative is that authorities disagree with each other. In fact, many of them work according to decision models which encourage disagreement as a way of identifying the crucial issues that need to be tested. Courts of English law, for example, require two or more highly trained professionals to argue for exactly opposite positions.

People also often assume that a logically sound argument is informative. However, it need not be. We can reason as soundly as it’s possible to reason and still be wrong.

Deductive reasoning starts with a general premise or principle. It then applies that premise to a specific piece of evidence and draws a conclusion about that piece of evidence. For example, we might reason like this:

  • All Canadians are British subjects. (general principle)
  • John FitzGerald is a Canadian. (evidence)
  • Therefore, John FitzGerald is a British subject. (conclusion)
Well, that conclusion is true. However, let’s suppose we reason like this:
  • All Canadians have French first names.
  • John FitzGerald’s first name is not French.
  • Therefore, John FitzGerald is not a Canadian.
That conclusion is not true, although the reasoning is entirely sound. Since my first name is not French, the conclusion that I am not Canadian follows logically from the general principle that all Canadians have French first names. The problem, of course, is that the general principle is wrong. Consequently, statements that follow logically from it are likely to be wrong as well. That example is a bit artificial, but people draw sound conclusions from erroneous premises all the time.

For example, many people reasoned out thoroughly logical arguments which they seriously believed demonstrated that on January 1, 2000 the world would be thrown into chaos. I say their beliefs were serious because they acted on them: they stockpiled food, for example, they bought portable electric generators, and some even created fortified shelters to protect themselves from people who hadn’t stockpiled food or bought generators.

As we saw on January 1, 2000, though, the computers didn’t fail. Some of the premises in those thoroughly logical arguments had been unsound. Logic is a tool. Logic does not guarantee that your arguments will stand up any more than a hammer guarantees that the bookcase you build with it will stand up.

Information is often confused with consensus. The supposed existence of a consensus among scientists about global warming is taken to imply that the consensus opinion is highly likely to be true. Well, a hundred years ago a consensus of scientists would have told you that other races were inferior to whites.

The issue of consensus about global warming seems to have been raised initially as a red herring. That is, people argued against taking action against global warming because there was no scientific consensus about what caused it.

However, consensus has nothing to do with it. At one time there was a scientific consensus that the sun revolved around the earth. That point seems to have escaped the people who are opposed to taking action against global warming, though. Now they complain that this consensus they considered so desirable is being forced on them.

What is informative about an idea is its ability to predict events. The chief value of consensus seems to be coming up with a plan that everyone, or at least everyone important, is willing to go along with. To me, that seems a lot like what lemmings do.

Information cannot be defined by its source. If an expert meteorologist says tomorrow will be sunny, clouds don’t decide to go somewhere else just because a respected source says they will. Information is defined by its effect. Information increases the probability that we will act in effective ways. If it never rains on days when the weather forecast calls for rain, you’re going to end up lugging around a useless umbrella. If it always rains on days your bunions hurt, though, your bunions are a mine of information.

The Truth Cult © 2007, John FitzGerald

Tuesday, December 13, 2011

Another dubious sports statistic

I believe it is against Canadian law for a televised hockey game to be completed without the announcer mentioning, somewhere amid his (sic) endless recitation of players' hometowns, that getting the first goal is all-important, since the team that gets the first goal wins such a high percentage of games.

This belief seems to have come from a study of all major league baseball games between 1966 and 1987 which found that 66% of the games were won by the team that scored first. That’s an interesting finding because in baseball the visiting team is more likely to score first (since it bats first). However, the home team was still more likely to win, so the importance of the first run was still questionable. In 1998 Tom Ruane published an article in which he showed that teams scoring the first run were less likely to win than teams that were the first to score each of the second through ninth runs. The first run, it seemed, was actually the least important run to score. How can that be, you may be asking? How can a run associated with 66% of victories be unimportant?

The reason it’s unimportant is most likely that the winning team scores more runs than the losing team. Consequently, it’s more likely to score the first run. So even if scoring the first run has no effect on the chances of winning a game, the winning team is still more likely to score the first run.
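One way to see this is with a small null-model simulation (the numbers below are made up and are not the article's data): even when scoring first confers no advantage at all, the team that ends up with more goals is usually the team that scored first.

```python
import math
import random

def poisson(rng, lam):
    # Knuth's method for drawing a Poisson-distributed number of goals
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def simulate(n_games=100_000, mean_goals=3.0, seed=1):
    rng = random.Random(seed)
    winner_scored_first = decided = 0
    for _ in range(n_games):
        a = poisson(rng, mean_goals)  # team A's total, independent of team B's
        b = poisson(rng, mean_goals)
        if a == b:                    # ignore ties; we only care about decided games
            continue
        decided += 1
        # With the goals in random order, team A scores first with probability a/(a+b)
        a_scored_first = rng.random() < a / (a + b)
        if a_scored_first == (a > b):
            winner_scored_first += 1
    return winner_scored_first / decided

print(f"Winner scored first in {simulate():.0%} of decided games")
```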

To examine this possibility I chose data from another sport in which teams don’t alternate offensive and defensive sessions. I collected scores from 110 National Hockey League games played from November 30, 2006 to December 14, 2006. I included games settled by shootout, but gave no credit to the winning team for the goal awarded for the shootout. The team scoring the first goal won 70% of these games (77 of the 110). However, the winning team also scored 68% of the goals (439 of 649). So, if scoring the first goal did not improve a team’s chances of winning a game, you’d still expect the winning team to score the first goal in 68% of the games, or 75 games. The improvement here is all of two percentage points.

But is it an improvement? You can’t reasonably expect that teams scoring 68% of the goals will necessarily win exactly 68% of the games. Other factors have some effect on the outcome, so you’d expect the number of games won by the team scoring the first goal to vary somewhat around 75. Fortunately, we can estimate the probability that:
  • if scoring the first goal does not increase a team’s chances of winning and
  • if winning teams score 68% of the goals then
  • the team scoring the first goal will win 77 games.

That probability is 44%. Conventional standards of statistical significance would reject the idea that the first goal is of any importance when the probability is that high. However, arguing that the probability of the difference being real is still greater than 50% is entirely reasonable. But if we look at the difference that way, we still have to conclude that there is only a 56% chance that scoring the first goal increased the likelihood of winning a game, and that if it did increase the probability of winning a game, it increased it by only 2 percentage points (one chance in 50). Either way, that first goal doesn’t seem all that important.
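The article doesn't say exactly how that 44% was computed, so the sketch below may not reproduce it, but one simple way to estimate a probability of this kind is a binomial tail calculation under the assumption that the team scoring the first goal wins each game with probability 0.68, its share of the goals:

```python
from math import comb

n, p, observed = 110, 0.68, 77  # games, assumed per-game win probability, observed wins

# Probability of winning at least 77 of 110 games if the true chance of
# winning each game were just 68%
prob = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(observed, n + 1))
print(f"P(at least {observed} wins out of {n}) = {prob:.2f}")
```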

I propose an alternative to the Law of the All-Important First Goal/Run. I modestly call it FitzGerald's Law: the first team to score the winning goal will win. My law has as much explanatory value as the Law of the Fatal First Goal/Run, but is logically more elegant. It also reminds me of another statistical topic which baffles me: why, in a baseball game which finishes with a score of 11-10, can the player who drove in the first run for the winning team get credit for the game-winning RBI? Hm?

Another dubious sports statistic © 1995, 2006 John FitzGerald

More articles at ActualAnalysis.com

Friday, December 9, 2011

Better living through multiple linear regression analysis

I probably say somewhere on the main site that multiple regression analysis is overused, and indeed it is. Nevertheless, it does have valuable uses which I don't want to frighten people away from, so here's an article about one of them.

I regularly use regression analysis to clarify for a client the factors affecting satisfaction with the training and rehabilitation programs the client offers. It started with a review of a program about which the client knew that the more enthusiastic consumers were about the program on entry, the more satisfied they were at the end. The question was whether final satisfaction or dissatisfaction with the programs was simply a self-fulfilling prophecy – did consumers say they were satisfied or dissatisfied with the programs simply to justify their initial attitudes?

The client also collected information about consumers' opinions of various characteristics of their programs. This information was not correlated with initial attitude, nor were different types of this information correlated with each other. It was therefore easy, using multiple linear regression analysis, to estimate what proportion of final satisfaction could be explained by initial attitude toward the programs, and then see if characteristics of the programs explained the remainder of the final satisfaction (the residual, as it's known in regression analysis). It turned out that characteristics of the programs were twice as important as initial attitude in determining satisfaction with the programs.
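The client's data obviously can't be reproduced here, so here is a minimal sketch of the two-step idea on made-up numbers: regress final satisfaction on initial attitude, then see how much of the residual the program characteristics explain. The variable names and coefficients are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
initial_attitude = rng.normal(size=n)
program_quality = rng.normal(size=n)   # uncorrelated with initial attitude, as in the study
satisfaction = 0.4 * initial_attitude + 0.8 * program_quality + rng.normal(size=n)

def fit(x, y):
    """Simple linear regression; returns R-squared and the residuals."""
    X = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1 - resid.var() / y.var(), resid

# Step 1: how much of final satisfaction does initial attitude explain?
r2_attitude, residual = fit(initial_attitude, satisfaction)

# Step 2: how much of what's left over do the program characteristics explain?
r2_program, _ = fit(program_quality, residual)

print(f"R-squared for initial attitude:        {r2_attitude:.2f}")
print(f"R-squared of residual for the program: {r2_program:.2f}")
```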

So not only did multiple linear regression analysis determine that satisfaction with the programs was not a self-fulfilling prophecy, it also estimated the relative importance of initial attitude and of the actual characteristics of the programs. The analysis was made easier by the lack of correlation between the different types of information collected, but correlated information can be analyzed with more complicated designs. The possible existence of correlation, though, is the chief reason you shouldn't try this at home. Statistical and database software make it easy to do multiple linear regression analysis, but if you don't know how to deal with correlated variables or how to identify outliers (extreme observations which distort the results), you'll often get the wrong results when you use that software.

We have since gone on to use this technique to determine whether what consumers say are the important factors in determining their satisfaction are in fact the most important. We have frequently found that a simple count of the most popular explanations is contradicted by the multiple linear regression analysis. This is not surprising, since counting explanations, even if they are valid, gives us only a very rough estimate of the importance of different factors. The multiple linear regression analysis clarifies the issue.

Of course, it is also important that you use a proper hypothesis-testing design. Just turning multiple linear regression loose on a set of data is almost certain to produce a large proportion of unhelpful or misleading results.

Better Living through Multiple Linear Regression Analysis © 1999, 2011 John FitzGerald

Monday, December 5, 2011

The myth of information technology

The term information technology implies to many people that the technology to which it refers creates information, transmits it, or stores it. The technologies we group together as information technology, however, rarely perform any of these functions. They are called information technology because they use information, not because they transmit it. A cellphone, for example, converts coded electrical signals into a facsimile of a person speaking. What the person is saying, though, may be balderdash.

Information consists of data which reduce uncertainty. The technology which we refer to as information technology is blithely unaware of whether the data it deals with reduce uncertainty or not.

The data provided by "information technology" may not be informative simply because they are irrelevant. For example, if I go looking for the box score of a particular baseball game in the newspaper, the other box scores, informative as they are, simply make it more difficult for me to find the one I'm interested in. These days, though, people use their information technology to collect large amounts of information which are of no relevance to the decision they're going to make.

Then again the data may not be informative because they are not intelligible. While Turkish newspapers are informative to Turks, they are not informative to me, because I don't speak Turkish. I deal with this problem by not subscribing to Turkish newspapers. However, people often use their information technology to collect large amounts of data which they can no more interpret than I can interpret Turkish newspapers. Turning data mining software loose on the data is not guaranteed to turn it into information, either, for reasons which are discussed in other articles on the main site.

The fact is that we make information, not technology. Even those rare items of software which perform analytical functions were created by human minds. Most of what we call information technology is actually nothing more than data technology. It gives us the capacity to collect large masses of data, but it is up to us to find or define the information in it.

Few people believe everything they read in the newspaper or see on television. Few believe that every item that appears in the newspaper or on television is relevant to their concerns. It's time for the same discernment to be shown in dealing with databases.

We hear a lot these days about the problem of information overload. In fact, it is data we are overloaded with, not information. If we set out to collect data, we will drown in data. If we condescend, though, to use our analytical abilities, and set off in search of the data that we need, we will find that we can never be overloaded with information.

Thursday, November 24, 2011

Transparent evaluation

One of the ways I earn my living is by evaluating service programs. People are often wary of program evaluation, since evaluation is a word with many meanings, many of these meanings negative. In research, though, evaluation has a very simple and neutral meaning. It is simply determining whether an event of interest has happened.

Program evaluation, therefore, is simply a matter of determining whether a program is doing the things that it is supposed to be doing. I prefer to look on it, in fact, as giving a program a chance to show what it can do. So, if you're going to show what a program can do, how do you go about it?

First of all you need a plan – a statement of what the program is supposed to be doing. If you're evaluating a program for the first time, the first step is likely to be the development of a program logic model. A program logic model is simply a description of the steps in the program and the decisions made once steps are completed.

Once you have the program logic model, you then determine if the program is following the model. Obviously, to do that you need records. A crucial part of any service program is a system of records which provides:

  • descriptions of the services being provided to each consumer,
  • descriptions of the goals which these services are to help
    the consumer achieve,
  • measures of the extent to which the goals have been achieved,
    and
  • descriptions of the decisions made as a result of the
    achievement or non-achievement of goals.
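As a purely hypothetical sketch (the field names are illustrative, not a prescribed schema), a single consumer's record in such a system might look something like this:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ConsumerRecord:
    consumer_id: str
    services: List[str]                 # services being provided to the consumer
    goals: List[str]                    # goals the services are meant to support
    goal_achievement: List[float]       # extent to which each goal has been achieved (0 to 1)
    decisions: List[str] = field(default_factory=list)  # decisions made on review

record = ConsumerRecord(
    consumer_id="C-001",
    services=["job-search coaching"],
    goals=["obtain part-time employment"],
    goal_achievement=[1.0],
    decisions=["goal achieved; discharge from program"],
)
```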

Obviously judicious examination of records like those is going to help you determine if the program logic model is being followed. If the program is not implementing the plan fully, then you can take steps to improve its chances of doing so.

The records system will also permit a thoroughgoing outcome evaluation. Accurate estimates of the program's success in achieving its ultimate goals can easily be calculated.

Furthermore, a good system of records will enable program staff or anyone else to perform the outcome evaluation by themselves. When you require an external evaluation, for example, you won't have to pay your independent consultant to develop an evaluation from the ground up. The evaluative standards will be set, and the evidence will be collected. Your consultant can spend time doing something more sophisticated and effective, such as studying specific aspects of the program that you consider important.

In short, the goal of program evaluation is to make the program transparent. If program evaluation is successful, there will be general agreement about what the goals of the program are, about the ways in which the program should be trying to achieve these goals, and about what the world should look like if the program is successful. There will also be clear standards by which anyone can reliably measure the degree of success achieved by the program. That also makes evaluation more bearable for staff, since they don't have to worry about their work being evaluated by standards of which they have not been informed.

Doing all this can be a lot of work. However, the benefits are enormous, and you need spend no more money, in either the short or long term, than you could end up spending on less productive approaches.

Transparent Evaluation © 2002, John FitzGerald

From ActualAnalysis.com


Monday, October 31, 2011

Average vs. average

I have run across people who, when calculating a mean, will discard their two or three highest and two or three lowest pieces of data and calculate the mean for the rest of their data. What they want to do is protect themselves against the effects of skew, specifically the distortion of a mean by a few extreme scores.

That probably doesn't hurt, but there is a simpler and much more effective way of dealing with this problem – use the median. The median is the score that has equal numbers of scores above and below it. In other words it is the true average of your set of data (the mean is an estimate of the median). So use the MEDIAN function in your spreadsheet rather than the AVERAGE (mean) function.

There are some exceptions to this rule, though. If you're using your data to estimate a total – the total value of donations to an organization, for example – you'd use the mean. If you want to compare two sets of data with a statistical test you would usually be better off to use the mean.

And if the skewness function in your spreadsheet (usually called SKEW) gives a skewness coefficient for your set of data that is between -1.00 and 1.00, you normally don't need to worry about this at all.
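Here is a small illustration with made-up numbers: one extreme value drags the mean a long way, leaves the median almost untouched, and pushes the skewness coefficient far outside the -1 to 1 range.

```python
import statistics

donations = [20, 25, 30, 35, 40, 45, 50, 55, 60, 5000]  # one extreme value

mean = statistics.mean(donations)      # 536.0 -- pulled far upward
median = statistics.median(donations)  # 42.5  -- barely affected

# The sample skewness coefficient, as most spreadsheets compute it
n, s = len(donations), statistics.stdev(donations)
skew = n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in donations)

print(mean, median, round(skew, 2))
```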

Friday, June 24, 2011

Uninformation (4)

Information is not identical with experience

We often assume that because someone is experienced in a field, they are therefore well informed about it. One is particularly likely to believe this if the person involved is oneself. One might as well argue that because I take the streetcar every day I am an expert on public transportation, or that because I watch television every day I'm an expert on television. Obviously you acquire some knowledge from your experience, but it does not necessarily constitute an understanding of your experience.

And we may simply fail to learn from our experience. Psychologists talk about the consulting room phenomenon — faced with evidence that a diagnostic test such as the Rorschach test doesn't work the way it's supposed to, some psychologists and psychiatrists will reply that they've seen it work in their consulting rooms. In fact, individual practitioners have little opportunity to establish in their practice that a test actually works. The chief criterion they can use is the success of treatment, and even a correct diagnosis may lead to unsuccessful treatment, while an incorrect one may lead to successful treatment. We can also sometimes be a little lenient in deciding how successful we’ve been.

We have seen how authorities — people with great experience in their fields — usually disagree with each other. That is, their experience has led them to contradictory conclusions, and those conclusions cannot all be informative.

We derive information from our experience — we don't just pick it up by accident. We derive it by analyzing our experience in certain ways, acting on the conclusions we’ve drawn from our analysis, and then testing the adequacy of our conclusions.

First article in the Uninformation series

Actual Analysis
Uninformation (4) © 2011, John FitzGerald

Tuesday, June 21, 2011

Uninformation (3)

The opinions of authorities are not necessarily informative

We often treat anything printed in an authoritative journal or asserted by an expert as informative. Although authorities and experts do tend to be far better informed about their subjects than the average person, we still cannot assume that whatever they say is informative or even true. All you have to do to see why is to read what authoritative foreign journals and experts have to say about your own country. The influential journal Le Monde diplomatique once published an article whose author claimed that Canada had no constitution, but rather “a collection of texts with the force of a constitution”, and that these constitutional texts could not be challenged in lower courts. Well, the latest of this collection of texts explicitly defines the collection as the national constitution, and it explicitly gives all courts the power to review all matters of law, which of course includes the constitution.

Our lives are rife today with experts and expert opinions. The news media are constantly presenting experts and their opinions about every topic under the sun, the implication being that an expert's opinion is more informative than the opinion of someone who is not an expert.

For an assertion to be informative to us, though, we have to have some idea of the likelihood that it’s true. If the expert is an expert on gardening or cooking, verifying the accuracy of what he or she says is fairly easy. If, however, the expert is an expert on politics or medicine or some other field which requires special or complicated knowledge which you do not have, you may well have no way of verifying his or her opinion. A few years ago we saw experts queuing up to predict that the stock market would rise, if not forever, at least for a long, long time to come. Certainly these experts made arguments for their positions, but usually they were adducing as evidence for their opinion facts which the ordinary person could not verify.

Another problem about expert forecasts is that the experts are rarely experts in forecasting. J. Scott Armstrong and Kesten Green have observed that the scientific forecasts we are often encouraged to believe in are too often forecasts by scientists rather than forecasts arrived at scientifically.

Another problem is that experts are not impersonal compendia of information but human beings who advocate certain disputed positions in their field. They are advocates for ideas which other experts in their fields dispute. The Western intellectual tradition is to have as many people as possible arguing about ideas. Many of these ideas have the same quality that ideas about what was going to happen on January 1, 2000 had – they are founded on data which are not fully understood.

We can hardly expect experts to be perfect. If we cannot expect them to be perfect, then we have to assess the soundness of their opinions. If we are unable to assess the soundness of their opinions, then their opinions are not informative to us. They may well be valid, but if we cannot verify that they are valid then they are not informative. At the same time as all those experts were predicting that the stock market would rise forever, some experts were predicting that the bubble was going to burst. Those experts were right, but most of us had no way of verifying that they were. Therefore, even though they were right, they were not providing us with information.

First article in the uninformation series

Next: Information is not identical with experience

Website
Uninformation (3) © 2011, John FitzGerald

Tuesday, June 14, 2011

Uninformation (2)

2. The logical or reasonable is not necessarily informative

People often believe that if they can construct a chain of reasoning which supports their beliefs, they have thereby demonstrated that their belief is true and informative. For example, many people reasoned out arguments which they seriously believed demonstrated that on January 1, 2000 the world would be thrown into chaos. I say their beliefs were serious because they acted on them – they stockpiled food, for example, they bought portable electric generators, and some even created fortified shelters to protect themselves from people who hadn't stockpiled food or bought generators.

As we found out, they were wrong. However, I can’t say that their reasoning was any less sound than the reasoning by which I and most other people concluded that any disruption that might occur on January 1, 2000 would be minor. The people who expected chaos were sane, and their reasoning from their data was sound; it was probably as sound as my own, or sounder. In the end, one reason I and most other people were right and they were wrong is that we were using better data – data which were more informative. Another reason is that we were just luckier. In fact, no one fully understood all the factors one would have to assess to produce an accurate forecast of what would happen to the power grid on January 1, 2000. Furthermore, we probably weren’t aware of all the factors that would have to be considered.

If sound reasoning is based on invalid and inadequate data, it will reach invalid and inadequate conclusions. None of us is perfect – not even, as unlikely as it may seem, you or I – and we all at one time or another base logical conclusions on unsound data. And sometimes our reasoning just slips a gear, too. Even if our reasoning is perfect, none of us is omniscient, either. We can easily overlook important considerations.

That’s why the betting industry exists. If you’ve ever heard some of the explanations – often vehement ones – which horseplayers come up with to explain why the sure thing they bet in the last race ran as if he were pulling a milk wagon, you’ll know that relying too much on reason can not only cost you money but also lead you into an unjustified skepticism about the honesty and competence of your fellow human beings.

Obviously logic is involved in the development of information, just as facts are involved. It is not by itself informative, though. Two plus two equals four, but if the right answer is five you’re still wrong. That is why conclusions drawn from data need to be tested before they can be accepted as sound. If you think the 5-horse in the next race is going to romp, you won’t know that you’re right till the race has been run. And no matter what the weather report says, you won’t know whether it’s going to rain tomorrow or not until tomorrow arrives.

First article in the Uninformation series

Next: Information is not the statement of an authority

Actual Analysis
Uninformation (2) © 2011, John FitzGerald

Sunday, June 12, 2011

Uninformation (1)

Information consists of data which establish whether or not an assertion is false. Not all data do this. One of the reasons we have difficulty becoming and staying informed is that we sometimes accept as informative things which really aren’t, or at least aren’t necessarily. This is the first in a series of posts in which we’ll look at a few things which are not information.

1. Information is not synonymous with facts

People often confuse information with facts. Someone who knows a lot of facts is considered to be well informed. A fact is only informative, though, if it helps you settle a question you need to know the answer to. If someone is on trial for armed robbery, the Crown does not submit evidence that the defendant is a skilled bridge player, true as that evidence may be.

Here's a fact: Churchill, Manitoba, is named for John Churchill, first governor of the Hudson's Bay Company. That's a fact. Despite being a fact, though, it doesn't help me answer the question “Where do I find the men’s shirts?” whenever I drop in to one of the Bay’s branches. So for me that datum is not informative, factual though it be.

Furthermore, there are plenty of items of information that are not factual. The idea of intelligence, for example, cannot be said to be a fact, since there is widespread disagreement about just what intelligence is. However, the concept of intelligence is informative because in speculating about it we discover useful things. We have even discovered some of the shortcomings of the idea of intelligence.

Information is always derived from facts, and it always helps to predict facts. However, it need not be factual itself, and something which is factual need not be informative. As someone who has spent his life filling his memory with facts whose relevance to his life is highly questionable (see the note about John Churchill above), I realize that collecting trivia can be enjoyable. Until they tell you something useful, though, trivia are just trivial.

Next: The logical or reasonable is not necessarily informative

Actual Analysis
Uninformation (1) © 2011, John FitzGerald

Tuesday, May 3, 2011

The value of political polls

I've been questioning the value of polls here, so here's some evidence of how well they work. I was interested in Ekos Politics' claim that methods of predicting the seats won by each party in Canadian federal elections "work pretty well" (the quotation is from a PDF I can no longer find on their website, but I have a copy if you want one). Here are Ekos' final projections for the election of May 2, 2011 (you can verify them here):

Conservatives: 130 to 146 seats
New Democrats: 103 to 123
Liberals: 36 to 46
Bloc Québécois: 10 to 20
Green: 1

And the results:

Conservatives: 167 seats
New Democrats: 102
Liberals: 34
Bloc Québécois: 4
Green: 1

In other words, Ekos got the Green seats right and no other party's. Of course, there is no reason they should get them right. The regional variation in voting is so great in Canada (the BQ only runs in Quebec, for example) that you'd need extensive polling in each riding to even hope to approximate the results. Even then the non-representative samples you'd be working with would seriously limit the accuracy of your estimates.

At any rate, the Ekos projections missed the two important events of May 2: the Conservative majority and the collapse of the Bloc. Journalists will probably go on acting as if polls mean something, but that doesn't mean you have to.

Tuesday, April 26, 2011

Inside the information cult (1)

In Canada we're in the final week of a federal election campaign. So far press coverage has been dominated by poll results. Unlike, it seems, most people, I am skeptical of the utility of election poll results, and here I'll explain why.

Polls of people’s opinions can be a useful exercise if they’re done properly and the results are interpreted carefully. The polls that the press publishes, however, often fail to satisfy these criteria (at least in the form in which they are presented in the press). For one thing, they usually ask at most a handful of questions and don’t attempt to assess how meaningful the responses are. The results of this type of poll are not information but pseudo-information. Electoral poll results are obviously uninformative simply because they don’t predict election results accurately enough. Informative data must be valid, and in general poll results are not valid.

If you consult the excellent PollingReport website you will find a summary of final estimates by eighteen polls of the popular vote in the 2000 United States presidential elections. Fourteen of those predictions had George W. Bush winning the popular vote, which in fact was won by Al Gore. Sure, the election was close, but isn’t that when you most need a good prediction? These poll results are not information, but rather some devout information cultists’ simulation of information.

In 2004, 15 of 22 polls predicted that Mr. Bush would win the popular vote, but five still predicted that John Kerry would win (two predicted a saw-off, as did two in 2000). This election wasn’t quite as close as the one in 2000, but the difference between the two candidates’ support was only about 2 percentage points. Even if the polls do give you a good idea of how people are going to vote, sampling error wipes out any utility they may have when a vote is close, which is a lot of the time. Furthermore, why would we expect polls to be all that valid as measures of what the population as a whole intends to do?

First, there’s their questionable sampling to consider. Poll results often come with statements saying that, given the size of the sample polled, the results will be accurate within so many percentage points of the actual percentages 95% or 99% of the time; in Canada the press is required to provide such estimates. These estimates are derived from sampling theory. However, sampling theory assumes that the samples polled are representative (that is, that they are random samples of the population). That is not true of any political poll.

A random sample is one in which each member of a population has a known probability of appearing. If you draw a simple random sample of 10% of a jar containing 2,000 jelly beans, each jelly bean will, if you draw the sample properly, have a 10% chance of appearing in the sample. However, let’s say that you want to draw a sample of 10% of the members of a club with 2,000 members so that you can ask them (the members of your sample) some questions about the club. All of a sudden you don’t know the probability that each member has of appearing in the sample, for a very simple reason.

The simple reason is that people can refuse to take part in your sample. If you mail them your questionnaire, some will decline to complete it, others will forget, and some of the procrastinators will never get round to it. Some just won’t be interested. The problem is that you can’t tell the people who won’t return the questionnaire from the ones who will.

You will end up drawing a random sample from the population of club members who complete questionnaires. The same is true of samples in political polls. First, most people refuse to take part in political polls. Second, people have to be home to answer the phone before they can consent to take part in the poll. It’s likely that some large subgroups of the population (the young, for example, or the employed) are less likely to be at home than others. Third, the questions have to be asked in a language the person polled understands; people who can’t understand the language of the poll well enough have to be excluded.

For these and other reasons the sample you get in a political poll is never representative of the population as a whole but rather of that minority of the population that is both able and willing to take part in polls. If that minority thinks like the majority, then your results will apply to the majority as well. If the majority doesn’t think like the minority, then the results won’t apply. The catch, of course, is that you have no idea how closely the thinking of the minority corresponds to the thinking of the majority.

Even if you were able to get a representative sample, you would still have the problem that people sometimes don’t have too accurate an idea of what they’re going to do. Sometimes they change their minds between the time they take part in the poll and the time they actually vote. Sometimes they don’t know how they’re going to vote till they get in the booth. Sometimes they don’t vote. And even if they do know how they’re going to vote, why should we assume that they’ll tell us the truth?

Election polling is a cargo cult practice. We know that examining samples has been a productive practice in science, so we draw a few samples of our own to examine. However, just as the control towers at cargo cult airstrips in Melanesia don't have the crucial operating characteristics of real control towers, the samples drawn in election polls don't have the crucial operating characteristics of samples from which estimates of statistics (the percentage of people likely to vote for a political party, for example) can be reliably derived. And even if they did, the mutability of human intentions would probably keep them inaccurate.

Actual Analysis website

Inside the information cult (1) © 2011, John FitzGerald

Friday, April 8, 2011

Lady Luck is actually very democratic

A commercial for a poker site is advising us that Lady Luck hangs out with the better players. In fact, she demonstrably doesn't.

The only meaningful conception of luck that I'm aware of is the statistical one. I am identifying luck with the statistical concept of error, which, as we shall see, is well suited to be a conception of luck. Anyway, any result (winning a poker game, for example) can be statistically analyzed as the consequence of an effect (poker-playing skill, say) and error. Error is the sum of all those things that affect the result but aren't related to poker-playing skill — the specific cards you get, how alert you are, and so on.

Error is randomly distributed with a mean of zero (these characteristics follow from the mathematics required to distinguish effects from error). Since the effects of the variables that produce the error are not correlated with poker-playing skill, the mean error score for good players is zero, and the mean error score for poor players is also zero. And after all, there should be nothing about being a good player that makes you more likely to be dealt a pair of aces.
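Here is a toy simulation of that point, with entirely made-up numbers: each game's result is skill plus a random error that has nothing to do with skill, and the average error comes out near zero for good and poor players alike.

```python
import random

random.seed(0)

def simulate(skill, n_games=100_000):
    """Each result is skill plus error; return the player's mean result and mean error."""
    errors = [random.gauss(0, 1) for _ in range(n_games)]
    results = [skill + e for e in errors]
    return sum(results) / n_games, sum(errors) / n_games

for label, skill in [("good player", 2.0), ("poor player", -2.0)]:  # hypothetical skill levels
    mean_result, mean_err = simulate(skill)
    print(f"{label}: mean result {mean_result:+.2f}, mean error {mean_err:+.3f}")
```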

Saturday, April 2, 2011

Accuracy is not enough

Data are not necessarily information. They are informative only to the extent that they reduce uncertainty. If you want to know what programs are on television tonight, knowing yesterday's television schedule will not help you. Yesterday's schedule is full of data, but the data are no longer informative.

In psychometric terms, informative data are those which are valid – which predict events of interest to you. To be valid data must be accurate; in fact, the validity of information is limited by its accuracy. Of course, inaccurate data cannot be valid, and the maximum possible validity of accurate data is equal to the square root of its reliability coefficient.
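In symbols, that is the standard psychometric bound on a validity coefficient; for example, a measure with a reliability coefficient of 0.81 cannot correlate with any criterion at better than 0.9:

$$ r_{XY} \le \sqrt{r_{XX}}, \qquad \text{e.g. } r_{XX} = 0.81 \;\Rightarrow\; r_{XY} \le \sqrt{0.81} = 0.9 $$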

The minimum validity of accurate data, however, is always zero. Sometimes data are not valid simply because they are distributed in a way ill-suited to the statistics which are used to assess validity; often the distribution can be modified through a mathematical transformation and validity restored. Sometimes the data are simply irrelevant or poorly defined.

In Canada a federal election campaign is under way. As usual the press commentary about it includes frequent presentation of poll results. At the moment the poll results are the unverifiable opinions, about what they think they'll be doing a month from now, of the 30% of the population that takes part in polls. These people probably differ significantly from people who don't take part in polls. They've probably got more time on their hands, for a start, which means they're likely older, better off, and so on. That is, the results are probably not even accurate estimates of unverifiable opinions. If you check the excellent Polling Report website you'll find that American polls have been dependably incompetent at predicting the results of American presidential elections, which are simple two-candidate races. In Canada, with three national parties and a big regional party, they are likely to be even less effective.

Anyway, if you depend on any type of database, it should be checked regularly to ensure not only accuracy but also relevance and utility.

Accuracy is not Enough © 2001, 2011 John FitzGerald


More articles from www.ActualAnalysis.com


Friday, January 14, 2011

The non-confidence interval

The rise of the opinion poll to pre-eminent importance has made us all familiar with statements like "These results are accurate within three percentage points, 95 times out of 100." This is a statement of what is known in sampling theory as a confidence interval.

Usually the result to which this statement refers is an estimate of the percentage of the population holding a certain opinion. The statement of the confidence interval implies that if 100 more samples of the same size had been drawn from the same population, the percentages estimated from 95 of those samples would have been within three percentage points of the percentage in the entire population.

The mathematics used to reach this conclusion is quite elegant, and the validity of confidence intervals is inescapable as long as certain conditions are met. What is rarely mentioned, though, is that these conditions are not met very often.

First, the sample has to be a random sample from the population. In opinion polling, this assumption is never met. For one thing, most people don't co-operate with poll takers. They hang up the phone, they don't return the questionnaire in the prepaid envelope, they don't stop for the people in malls with the clipboards. At best, polling samples are random samples from that minority of the population which agrees to be sampled.

Second, there is no one standard confidence interval for a poll. The confidence interval varies with the size of the percentage being estimated. This issue is rarely mentioned by researchers of any kind. In general, the confidence interval of a percentage becomes smaller as the percentage differs from 50%. The decrease can be important with smaller samples. For example, an estimate of 50% based on a sample of 200 has a 95% confidence interval of 6.9%, given certain assumptions. An estimate of 80% based on a sample of the same size has a 95% confidence interval of 5.5%.
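Those figures are consistent with the usual large-sample formula for the margin of error of an estimated percentage (assuming, as discussed above, a genuinely random sample); a quick sketch:

```python
from math import sqrt

def margin_of_error(p, n, z=1.96):
    """95% margin of error for an estimated proportion p from a simple random sample of n."""
    return z * sqrt(p * (1 - p) / n)

print(round(100 * margin_of_error(0.50, 200), 1))  # about 6.9 percentage points
print(round(100 * margin_of_error(0.80, 200), 1))  # about 5.5 percentage points
```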

Finally, the formulas for the confidence interval assume that you are measuring something reliable. These formulas were derived originally for problems in the natural sciences, where the items being sampled have solidity and consistency. In opinion polling, on the other hand, the items being sampled tend to be ethereal and ephemeral. Today I may feel like voting for the Vegetarian Party, but by the time I get behind the screen and pick up the little pencil I may well have decided that yesterday's scandal involving the executive committee of the Vegetarian Party, a seedy restaurant, and fish cakes disguised as tofu has pretty well demonstrated the unfitness of the Vegetarian Party to govern.

At best, a poll, or any similar survey, is a measure of what the minority of people who take part in polls think at a specific hour and minute of a specific day. As a guide to action, polls need to be supplemented by other information about the issues which they investigate.

Originally published at Actual Analysis

The Non-confidence Interval © 1995, John FitzGerald

Thursday, January 13, 2011

The TRUTH about means and medians

One of the many bees I have in my bonnet is buzzing about how people talk about the mean and the median. I just read a research report in which the mean was described as the average and the median was just called the median. It was a pretty good report, so I suspect the author was trying to help her non-statistically-trained readers. I still think this can be misleading, though.

The mean of a set of scores is simply the result of adding them up and dividing by the number of scores. The median, on the other hand, is the score which has equal numbers of other scores above and below it.

The average of a set of scores is its midpoint. That is, the median is the average. The mean is an estimate of the median. We use it because it can be manipulated more effectively and profitably than the median can, usually without affecting the validity of conclusions.

Most people know that skew may make the mean inaccurate. However, most do not know that there are criteria for deciding if use of the mean should be reconsidered. I reconsider if the skewness coefficient (which is provided by spreadsheets as well as statistical software) is greater than 1 or less than -1.

Above all, do not do what some people do and throw out your highest and lowest scores, or the two highest and two lowest etc., as a protection against skew. While that probably does little harm, it doesn't help either. If you're using your data for descriptive purposes, the best solution is to use the median. That way you get the benefit of all your data.

If you're using the data for inferential purposes, you should of course be using statistical tests. I suggest you compare the result of a test of differences between means with the result of a test of differences between medians (this is often a good idea even with unskewed data). If you're comparing either means or medians without a test, you are wasting your time. You need to know how likely a difference is to happen by accident before you can decide how important it is.
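Here is a sketch of that comparison on made-up data, using a t test for the means and Mood's median test for the medians; with an extreme score in one group, the two can tell quite different stories.

```python
from scipy import stats

group_a = [12, 15, 14, 16, 13, 90, 14, 15]   # note the one extreme score
group_b = [18, 20, 19, 22, 21, 19, 20, 23]

t_stat, t_p = stats.ttest_ind(group_a, group_b)
m_stat, m_p, grand_median, table = stats.median_test(group_a, group_b)

print(f"t test on means:        p = {t_p:.3f}")
print(f"Mood's test on medians: p = {m_p:.3f}")
```

In data like these the single extreme score inflates the first group's mean and its variance, so the test on means sees little, while the test on medians still registers the difference.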

I should note that there are occasions when discarding data before calculating the mean can be useful; however, these are occasions that are best handled by people trained to deal with them. If you ever find yourself having to estimate the location parameter of a Cauchy distribution, discarding data from the tails before calculating a mean can be helpful, but even more helpful is having someone do it who's been trained to do it and does it a lot.

Main site

The Truth about Means and Medians © 2011, John FitzGerald