Sam Harris - author, neuroscientist, and philosopher - recently stirred up a lot of controversy with his suggestion that airport security should make use of profiling. Harris describes his recommendation more accurately as anti-profiling, meaning that people who are obviously low risk should be paid less attention than they currently are so that security resources can focus better on the higher risks. That is, little old ladies in wheelchairs from small-town Iowa are nowhere near as risky as a young male Muslim from the Middle East.
The singling out of Muslims is, of course, the biggest source of the controversy. In fact, Harris suggests that the profile of higher risk travelers should include “anybody who could conceivably be Muslim”. Many critics have declared this impossible since Islam is a belief system, so you can’t determine this characteristic by appearance.
Harris added an interesting, and very commendable, twist to the discussion. In response to suggestions by critics, he invited security expert and author Bruce Schneier to debate profiling in terms of actual security and posted the resulting exchange on his blog. It is my intent here to evaluate that discussion. I’m particularly fascinated by it because I had originally sided against profiling (or anti-profiling) and I think Dr. Harris has convinced me that it might actually produce net value, though I’m not entirely convinced yet. More importantly, Harris did this in a debate against a security expert. I think he made some very important and valid points and Schneier made some serious errors in argumentation, system evaluation, statistics, and security. (You can skip to the summary section at the bottom if you want to avoid the lengthy, detailed evaluation.)
Before I get to the details, I should provide a bit of my background on the subject and my potential biases. I am not a security expert but I have worked on defence and security systems for about 8 or 9 years both as system developer and project manager. I have worked on automatic target recognition systems for Defence R&D Canada (DRDC), object classification systems for DARPA, IED detection systems for DRDC and the Joint IED Defeat Organization (JIEDDO), facial recognition algorithms and software for demonstrations, and I contributed to proposals for U.S.-Canada border security. I’m also very experienced in probability and statistics, and common mistakes in their use. I am knowledgeable about security systems in principle, though less so about security operations. That doesn’t mean I lack operational experience. I’ve supported system integration of equipment on space shuttles and supported many missions from NASA Mission Control in Houston for assembling the International Space Station and inspecting shuttles on orbit post-Columbia tragedy. I’m well aware of the difference between principle, design, and practical operations. But I do have limited experience with operational security systems.
An important step in describing security systems is distinguishing the type of system. There are differences between detection, classification, recognition, identification, and verification. For the sake of the discussion, here are some brief definitions:
- Detection: the ability to determine that there is an item present worth further evaluation. Examples: motion in a camera scene, changes from the last time you patrolled an area, metal in a person’s pockets.
- Classification: describing a general category of item by its measurable characteristics. Examples: human, vehicle, water, explosive
- Recognition: determining a specific category based on interpretation of measured characteristics in a context of additional outside knowledge. Examples: middle-aged male, Muslim, Ford F-150, lake, C4 explosive
- Identification: providing a unique name of an item as the only one in the world. Examples: Osama Bin Laden, my uncle’s F-150, Lake Tahoe, my wife’s wedding ring.
- Verification: confirming identification via independent but subsequent measurement. Where identification might compare a face to a database to determine who it is, verification only compares the claimed identification against some other feature. Examples: passwords, photo ID, biometrics like thumb prints or iris scans. (Biometrics can be used for both identification and verification, but the same measurement should never be used for both.)
You might then have a security system that detects motion at your back door, classifies it as a human, recognizes it as a middle-aged Caucasian male, identifies it as Sam Harris, and verifies him with an iris scan before opening the door to let him in, each as a separate and independent system. Of course these definitions are not well-defined boundaries and the sorting into categories can be probabilistic rather than discrete. (You can say, for instance, that there is a 60% chance someone is a Muslim.) The definitions are also not necessarily universal. Under these definitions, facial recognition technology is really facial identification, for example. Recognition and identification are often used interchangeably.
One final important definition is risk. This is the combination of the probability of an outcome with the severity of its consequence. Being killed by a falling meteorite is low risk because it is highly improbable even though the outcome is very severe. Opening letters with your finger is low risk even if paper cuts are fairly common because the consequence of getting a paper cut is low severity.
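As a minimal sketch of this definition, the Python snippet below computes risk as probability multiplied by severity. The `Hazard` class and every number in it are made up purely for illustration; they are not measured values.

```python
from dataclasses import dataclass


@dataclass
class Hazard:
    name: str
    probability: float  # chance of the outcome per exposure (illustrative guess)
    severity: float     # cost of the outcome in arbitrary units (illustrative guess)

    @property
    def risk(self) -> float:
        # Risk combines how likely the outcome is with how bad it would be.
        return self.probability * self.severity


hazards = [
    Hazard("meteorite strike", probability=1e-9, severity=1_000_000),
    Hazard("paper cut", probability=0.05, severity=0.01),
]

for hazard in hazards:
    print(f"{hazard.name}: risk = {hazard.risk:.6f}")
```

Both hazards come out low risk for opposite reasons: one is severe but wildly improbable, the other is common but trivial.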
The Public Mass Debate
The debate between Harris and Schneier is written as a highly non-linear and interrupted conversation. In my evaluation I’ll try to parse the conversation into discrete arguments and address them individually.
1. The issue
Harris’ basic claim is that treating everybody to the same level of scrutiny is a waste of resources. Checking little old ladies in wheelchairs from Iowa is not as necessary as checking a Muslim from the Middle East because the little old lady is less likely to be a terrorist. They spend some time in the debate clarifying this claim. It isn’t just a matter of convenience or cost efficiency. Harris points out that, given a set of finite resources, attention to low risk individuals reduces the amount of attention to high-risk individuals.
This principle is indeed important. Safety is maximized by minimizing the highest level of risk. If you reduce a component so that it is no longer the highest risk component, you are better to move effort to the new highest risk component. Think of it in terms of the weakest link in a chain. Given the finite material of the chain, you are safest to remove a little bit of material from all of the other links and add it to that weakest link until they are all equal strength. This lowers the strength of most links a little bit, but greatly increases the weakest point and hence the overall safe load the chain can handle. Leveling the risk across the process is the optimized case. This is true even if you can acquire more resources. Those should be spread across the process to lower the overall equal risk.
I often use this argument when it comes to blind spots in cars. Almost everybody will tell you to check your blind spots for safety when changing lanes. Indeed, that is correct. But it comes with a cost. While you reduce your risk of colliding to the side, you take attention away from the road in front of you and so increase the risk of a front collision if the car in front brakes. You should keep your glance for only so long as those two risks end up equal. An even better solution is to use the SAE recommended mirror positions that eliminate blind spots. In this configuration, you always see the car on the side in either the rearview mirror, side mirror, or peripheral vision, so you never have to turn your head further than towards the side mirror, keeping the road ahead in your periphery. The overall safety is improved, but you still want to keep your mirror glances no longer than to equalize the risks of side and front collision so that the overall risk is a minimum.
For airport security, this means using available resources to address high risk individuals to the point that they are of equally low risk as everybody else.
Schneier appears to generally agree with the principle. His counter-argument isn’t based on defeating that principle, but in the efficacy and cost/benefit ratio of identifying who is high or low risk. He essentially claims that implementing such a system is (a) costly, (b) provides little benefit, and (c) adds complexity that provides additional sources of errors and potential weakness.
2. Base rate
Many of Harris’ critics, including Schneier’s original response to Harris, pointed out that there are very few Muslims who are terrorists, so profiling them would produce significant false positives and provide little useful information. Harris seems to suggest in the posted debate that this way of thinking is backwards:
The question is not, What is the probability that any given Muslim is a terrorist? The question is, What is the probability that the next terrorist will be a Muslim?
Both sides are somewhat incorrect here, at least as far as the question of profiling at airport security. The correct question is what is the relative probability that the person standing in front of the security agent is a terrorist compared to other people. Remember, the goal here is to equalize risk. If the person in front of you is ten times more likely to be a terrorist than the person behind them, you should spend ten times the effort screening them.
Let’s do the math directly. Let’s say there are 1.1 billion adults in the world, of which 1 billion are Muslims and 100 million are non-Muslims. (OK, this is far from reality but bear with me as I’m illustrating the right form of the risk calculation.) Suppose there are 4000 Muslim terrorists in the world, and 1000 non-Muslim terrorists. OK, so what is the likelihood that the next terrorist will be Muslim? Well, 4000 of the 5000 world terrorists are Muslim, or 80%. That is a high chance. Does that mean, as Sam Harris suggests, that we should scrutinize Muslims at airport security checks more than non-Muslims? Absolutely not. Under this scenario, the probability the Muslim is a terrorist is 4000 in 1 billion, or 400 per 100 million. The probability the non-Muslim is a terrorist is 1000 per 100 million, or two and a half times that of the Muslim. In fact, you should spend two and a half times as much effort scrutinizing the non-Muslim.
Harris recognizes this flaw though. He describes the inverse probability problem quite well, using the exact disease test example and numbers I used to describe this common error in my Nov. 11, 2011 article, “Humans are Probably Odd, Statistically Speaking”. However, Harris doesn’t realize he’s on the wrong side of this error when he says that what matters is that a terrorist is likely to be Muslim. What matters is the relative probability of the person being screened being a terrorist versus other people.
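The arithmetic can be checked with a few lines of Python. This is only a sketch using the deliberately unrealistic populations and terrorist counts stated in the example (1 billion and 4000, 100 million and 1000); the variable names are mine.

```python
# Hypothetical, deliberately unrealistic numbers taken from the example in the text.
muslim_adults = 1_000_000_000
non_muslim_adults = 100_000_000
muslim_terrorists = 4000
non_muslim_terrorists = 1000

# "What is the probability that the next terrorist will be a Muslim?"
p_muslim_given_terrorist = muslim_terrorists / (muslim_terrorists + non_muslim_terrorists)

# "What is the probability that this particular person is a terrorist?"
p_terrorist_given_muslim = muslim_terrorists / muslim_adults
p_terrorist_given_non_muslim = non_muslim_terrorists / non_muslim_adults

# The quantity that matters for screening effort: the ratio of the two.
relative_risk = p_terrorist_given_non_muslim / p_terrorist_given_muslim

print(f"P(Muslim | terrorist)     = {p_muslim_given_terrorist:.0%}")
print(f"P(terrorist | Muslim)     = {p_terrorist_given_muslim * 1e8:.0f} per 100 million")
print(f"P(terrorist | non-Muslim) = {p_terrorist_given_non_muslim * 1e8:.0f} per 100 million")
print(f"non-Muslim vs Muslim relative risk = {relative_risk:.1f}x")
```

With these made-up inputs, 80% of terrorists are Muslim and yet the non-Muslim in line is the higher relative risk, which is the whole point of the example.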
The absolute probability of the person being a terrorist is important too, but a separate issue. Harris correctly points out that the absolute argument can be used to say airport security is largely not needed since the probability of anyone being a terrorist is so low. In fact, that is true. You are far more likely to die in a car accident on the way to the airport than to be killed by a terrorist on the airplane. Recall that risk is minimized by focusing on the high risk areas to equalize the risks. Given the finite money and labour available in the U.S. economy for security and safety, if avoiding death is the goal then the TSA budget should be cut to almost zero and re-distributed to things more likely to kill you. This means, for instance, that Americans should be spending many times more on heart disease research, automobile safety, safety devices to prevent falls, keeping police from shooting innocent people, and on avoiding choking on their own vomit, which is still eight times more likely than dying in a terrorist attack.
That is all true in terms of risk of death. Recall risk is probability multiplied by severity of consequence. Given these things all end in your death, it is the probabilities that matter. There may be other factors to consider though. Cost to the economy is also a relevant consequence to consider. We should also be careful to be sure we are talking about marginal risk. That is, more people might die in cars but that doesn’t necessarily mean that a car ride is riskier if there are far more car trips than airplane trips. If deciding between a car or airplane, it is the relative risk of those particular trip options that matters, not the annual deaths in each.
There is no question that the U.S. spends enormously irrational amounts of money on preventing terrorism compared to other risks. Such is the politics of fear and the media doesn’t help. It’s easy to blame it on generic human irrationality, but then Norway didn’t react to their terrorist attack the same way as the U.S. It’s largely political culture.
All of that is a separate issue though. No matter how much is spent on airport security, we’re talking about how best to spend it. Harris was wrong when he put it in terms of the probability of a terrorist being Muslim. His critics were wrong to put it in terms of absolute probability. What matters is relative probability of an individual being a terrorist compared to others.
Here is where Harris is right for the wrong reason. When we plug in the real numbers to my example above, a Muslim standing in front of you is far more likely to be a terrorist than a non-Muslim, all other things being equal. In reality, both the majority of terrorists are Muslim and Muslims are fewer than non-Muslims, so these combine to make the relative risk for Muslims much higher. How much higher is open to debate, but suppose it is ten times more likely. That would mean that security should spend ten times more effort on the Muslim. If all suicidal terrorists are Muslims, then the relative risk of non-Muslims is zero so zero effort should be placed on non-Muslims and all of it on the Muslims. The actual relative risk matters.
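To make the "ten times the risk, ten times the effort" rule concrete, here is a small Python sketch. The `allocate_effort` function is a hypothetical helper of mine, and it assumes the simplest possible model: a fixed effort budget split in direct proportion to each traveller's relative risk.

```python
def allocate_effort(relative_risk: dict[str, float], total_effort: float) -> dict[str, float]:
    """Split a fixed screening budget in proportion to relative risk (assumed model)."""
    total_risk = sum(relative_risk.values())
    if total_risk <= 0:
        raise ValueError("at least one traveller must have positive relative risk")
    return {name: total_effort * risk / total_risk for name, risk in relative_risk.items()}


# Supposed 10:1 relative risk, as in the text; 11 units of effort to hand out.
ten_to_one = allocate_effort({"higher risk": 10.0, "lower risk": 1.0}, total_effort=11.0)
print(ten_to_one)

# Limiting case from the text: zero relative risk receives zero effort.
all_or_nothing = allocate_effort({"higher risk": 1.0, "lower risk": 0.0}, total_effort=11.0)
print(all_or_nothing)
```

A real allocation would also depend on how quickly extra scrutiny reduces risk, which this proportional rule ignores.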
Schneier says this is irrelevant. He is wrong in principle. His position rests on security system analysis, a topic that will be the remainder of this evaluation. But his analysis would need to take into account this real risk calculation. Any analysis that leaves it out is making incorrect calculations.
An important part of Harris’ argument is that being Muslim is causally related to the type of suicidal terrorism we are looking for. This does not imply, as some suggest, that all Muslims are terrorists. We’ve already been through that probability discussion. Causal connections are another area people often misinterpret. The logical statement that all A’s are B’s does not imply that all B’s are A’s, or even a majority.
Think of lottery winners. Winning a lottery is causally related to buying a ticket. That doesn’t mean that everyone who buys a ticket is caused to win. On the contrary, very few win. But winning is a direct result of buying a ticket. Likewise, Harris argues that suicidal terrorism is causally related to being Muslim. If the causal correlation were perfect, it would mean looking for terrorists in non-Muslims would be like looking for lottery winners in non-ticket holders. It would be a huge waste of effort, if true.
This relationship is actually important as Harris points out. The discussion above only relied on a correlative probability. If there were no causality it would imply there is a better characteristic on which to base the risk calculation. If the reason for the terrorism was related to, say, geographic land disputes regardless of belief system and the region happened to be mostly Muslim, then the correlation would be coincidental and a better measure would be the geographic origin. This is why Harris makes it clear that the belief system itself is causally related to the terrorism, even if a rare lottery-like causal relationship. The argument then is that “being Muslim” is the correct criterion, though it could possibly be made more precise with additional criteria.
Harris is correct on his argument here. Schneier agrees in principle, and is even willing to assume a causal correlation of 1, meaning 100% of suicidal terrorists would be causally due to being Muslim. (This may not be exactly true, but is probably close.) However, Schneier maintains that this is irrelevant. Again, he is wrong. The relative risk of being Muslim, as well as any other risk category differentiators, must go into any risk analysis and the causality link means it is the correct criterion. You cannot analyze whether you are equalizing risk if you don’t actually look at the relative risks.
4. Muslim Recognition
Perhaps the biggest point of contention between Harris and Schneier is the ability to recognize Muslims. Recall that recognition is an interpretation of measurable features in a greater context using outside knowledge. Many of Harris’ critics, along with Schneier, suggest that it is impossible to recognize Muslims because Islam is a belief system. They are wrong.
It is certainly impossible to perfectly recognize who is Muslim and who isn’t. But this doesn’t mean there aren’t classifiers. Recall that classification is categorizing via measured characteristic. Muslims are not random. There are measurable characteristics that statistically correlate with being Muslim. These likely include things like skin tone, name, country of origin, clothing, jewelry, mannerisms, and language. There may be more.
Of course it’s not a guarantee. That’s why statistical correlations are correlations and not one-to-one mapping functions. Some people can look Muslim and not be Muslim. Some people who are Muslim do not look it. Exceptions do not negate the correlation. Possibility is entirely different from probability. The fact that there are men in drag does not negate that there are distinctive characteristics to tell men and women apart, statistically. Harris’ argument all along has been to profile “anybody who might conceivably be Muslim”.
Sure, if a terrorist knew what security looked for they could change clothes, speak with a different accent, and otherwise attempt to look less Muslim. Schneier and others bring up this argument but it is very weak. It has two problems. First, it only serves to reduce their Muslim appearance. It doesn’t move them from the category of “conceivably Muslim” to “inconceivably Muslim”. Second, changing any characteristic from what you are normally used to provides additional signs: discomfort, bad acting, social awkwardness, and even wearing all new clothes are all cues that somebody may be pretending to be something other than their normal self. Trained observers can easily detect such things.
Schneier suggests that how accurately people can detect Muslims, or anyone conceivably a Muslim, is irrelevant. In fact, he suggests this added complexity reduces security rather than increasing it. I’ll get to that in the next section, but it is here that Schneier is most inconsistent. He claims the problem here is that, even given the causality above, that property A (terrorist) is caused by property B (Muslim), the best we can detect is B’ (estimated Muslim) so there will be detection errors. He essentially claims that how close B’ is to B, that is how accurately screeners can recognize Muslims, is irrelevant. But that is wrong. It is essential to his cost/benefit analysis. If B’ and B are almost identical, the security cost is negligible and the gain in security from equalizing risks could be far greater.
Following from the last section, how well screeners can recognize Muslims, or potential Muslims, is important. Harris makes the argument that people can actually be very good at it. Schneier suggests this isn’t the case a few times and Harris repeats his claim. Perhaps more importantly, Schneier seems to say that the accuracy is irrelevant but I don’t see how that is defensible. Schneier’s approach of a cost/benefit analysis (evaluated below) is highly dependent on this accuracy.
Risk analysis comes in again for accuracy. False positives, resulting in wasted extra scrutiny, have much less severe consequences than false negatives, which result in a terrorist getting through. Using the risk equalization principle, an optimized system would err heavily on the side of producing false positives and reducing false negatives such that the overall risk (probability times consequence) is the same for false positives and false negatives.
The ability to equalize risk in this manner assumes that the system has the equivalent of a discrimination threshold such that some continuous error curve can be produced, like a standard ROC curve. You then set the discrimination threshold such that the risks associated with false positives and false negatives are equal. (To reiterate, this isn’t setting the error rates the same. It is setting the risks the same which will involve a lot more false positives than false negatives.)
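A rough Python sketch of picking that operating point follows. Everything in it is an assumption for illustration: screening scores are modelled as unit-variance Gaussians (benign centred at 0, threats at `separation`), and the prior and the two costs are invented. It implements the equal-risk criterion described above, not any other rule for choosing a point on an ROC curve.

```python
import math


def normal_cdf(x: float) -> float:
    return 0.5 * math.erfc(-x / math.sqrt(2))


def normal_sf(x: float) -> float:
    # Survival function, 1 - CDF, computed directly for accuracy in the tail.
    return 0.5 * math.erfc(x / math.sqrt(2))


def error_risks(threshold, p_threat, cost_fp, cost_fn, separation):
    """Return (risk_fp, risk_fn, fpr, fnr) at a given discrimination threshold."""
    fpr = normal_sf(threshold)                # benign score lands above the threshold
    fnr = normal_cdf(threshold - separation)  # threat score lands below the threshold
    risk_fp = (1 - p_threat) * fpr * cost_fp
    risk_fn = p_threat * fnr * cost_fn
    return risk_fp, risk_fn, fpr, fnr


def equal_risk_threshold(p_threat, cost_fp, cost_fn, separation, lo=-10.0, hi=10.0):
    """Bisect for the threshold where false-positive and false-negative risks match."""
    for _ in range(200):
        mid = (lo + hi) / 2
        risk_fp, risk_fn, _fpr, _fnr = error_risks(mid, p_threat, cost_fp, cost_fn, separation)
        if risk_fp > risk_fn:
            lo = mid  # false positives still dominate: raise the threshold
        else:
            hi = mid
    return (lo + hi) / 2


# Invented inputs: threats are one in a million, a miss costs 10 million times a false alarm.
params = dict(p_threat=1e-6, cost_fp=1.0, cost_fn=1e7, separation=2.0)
t = equal_risk_threshold(**params)
risk_fp, risk_fn, fpr, fnr = error_risks(t, **params)
print(f"threshold={t:.3f}  FPR={fpr:.3f}  FNR={fnr:.3f}")
```

With these inputs the equal-risk point sits well inside the curve rather than at either extreme, and the false-positive rate is several times the false-negative rate.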
Here again Schneier seems to miss the mark. He says
I believe that once you start trying to specify your profile exactly, it will either encompass so many people as to be useless, or leave out so many people as to be dangerous. That is, I can’t figure out how to get your error rate down.
He is essentially claiming that you could be only at one end of the ROC curve or the other, but not in between. This implies that the system would be hypersensitive. A small adjustment in profile would have to produce a wide swing from one extreme to the other. I’m not sure how he can justify that position. It effectively means that small changes in the profile would produce a very large change in the percentage of people falling into one of the two groups, yet it’s not hard to imagine small profile changes that would only slightly change borderline cases one way or the other as a small percentage of the population. Heck, even changing the age classifier by a year younger would only make a small change and in principle could move the threshold towards more false positives while reducing the false negatives at that age. I don’t buy Schneier’s assertion.
His position makes no sense to me from the point of view of recognition system design, which is what he is talking about here in terms of recognizing Muslims. This hypersensitivity belief is also inconsistent with the overall base rate discussion and with Schneier’s overall recommendations, both of which recognize the probability of anyone being a terrorist is very small. Hence it’s hard to believe that it could suddenly swing so wildly to “leave out so many people as to be dangerous”. If it left out everybody it wouldn’t be all that dangerous, and it certainly wouldn’t leave out everybody or close to it. It’s not really that difficult to provide classifiers to recognize a majority of Muslims. The hard part is the borderline cases.
The only sensical interpretation I can make is based on the last part of his assertion, that he seems to think it’s necessary to get that error rate down. While this can certainly maximize the benefit and minimize the cost, that’s just system improvements. It doesn’t mean there isn’t a risk balance point on the ROC curve that optimizes the tradeoffs and still produces benefits greater than the costs.
One of Schneier’s main themes is on simplicity:
Complexity is the enemy of security. Adding complexity to a security system invariably introduces additional vulnerabilities. Simple systems are easier to analyze. Simpler systems have fewer security assumptions. Simpler systems are more robust against mistakes in analysis. And simpler systems are more secure.
To demonstrate this principle he describes securing a building. If you add more doors it becomes more complex and harder to secure. Complexity decreases security.
With all due respect to Schneier, I find this argument rather lazy, itself very over-simplified. He makes the mistake here of confusing the complexity of the environment to be secured with the complexity of the security system. It’s a perfectly valid argument for explaining why access to airplanes should be very limited. The more ways there are to get onto an airplane the harder it is to secure. For example, airplanes can be accessed by passengers, crew, maintenance, supply crews, security, possibly anyone who can get on the tarmac, and possibly any airport employee with access to keys to hallways that lead to airport gates. If access could be simplified, that would improve security.
Complexity of the security systems is an entirely different story. Consider Schneier’s building example. You are given a choice of four security systems: an executive package with deadbolts, security cameras and a security guard, a deluxe package with only the cameras and deadbolts, an economy package of just deadbolts, or a “good luck” package consisting of the honour system. Which is simplest and which is most secure? In fact, the simpler they get the less secure they are. This is because these systems are independent and additive. Compromising one of the components still leaves you with the others. Redundancy is often core to improved security but also adds complexity.
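The "independent and additive" claim can be sketched numerically. In the Python below, the per-layer bypass probabilities are invented, and the layers are assumed to fail independently, so the chance of an intruder defeating a package is the product of the chances of defeating each layer.

```python
import math

# Invented probability that an intruder defeats each layer on its own.
bypass_probability = {"deadbolts": 0.3, "cameras": 0.5, "guard": 0.2}

packages = {
    "executive": ["deadbolts", "cameras", "guard"],
    "deluxe": ["deadbolts", "cameras"],
    "economy": ["deadbolts"],
    "good luck": [],  # honour system: nothing to defeat
}


def breach_probability(layers: list[str]) -> float:
    # Independent layers: the intruder must defeat every one of them.
    # math.prod of an empty sequence is 1.0, i.e. a certain breach.
    return math.prod(bypass_probability[layer] for layer in layers)


breach = {name: breach_probability(layers) for name, layers in packages.items()}
for name, probability in breach.items():
    print(f"{name:>9}: {len(packages[name])} layers, breach probability {probability:.2f}")
```

Under these assumptions every layer removed makes a breach more likely, so the simplest package is also the least secure. If the layers were not independent, the product rule would no longer hold.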
Schneier makes further mistakes in describing complexity to passengers. He points out that after the failed shoe-bombing it was only certain types of shoes that passengers were required to remove at security, leading to confusion. Simplifying the rule to include all shoes made it faster, more efficient, and increased scrutiny and security. But that is due to confusion, not complexity. In the case of selective screening, the passenger has no decision to make. All of the work is done by screening agents who simply tell the passenger if they are getting a more detailed screening. And that complexity comes with a savings of letting low risk people through which would actually speed up the process.
The benefits of simplicity of the security system only apply when the complexity replaces the simplicity rather than adds to it, and does so with less benefit than the risks added by the complexity. There is no simple rule. Ironically, Schneier points this out later in his summary that the devil is in the details and you need to do a thorough analysis on each case. Unfortunately, it seems he didn’t do this on his own arguments. Simple is not inherently more secure. The details of each circumstance matter.
Schneier’s other major theme is his system analysis via cost/benefit accounting. It is here that he dismisses many of Harris’ valid points in principle because even if they do benefit, the practical costs are higher. In principle, Schneier is right. If the added security and/or efficiency of profiling is offset by reduced security from implementing the profiling system then it is a lousy trade. In practice though, Schneier may be wrong.
Once again I find Schneier’s analysis lazy. He compares the number of costs and benefits, not their values. He refers to the result as lopsided and lists his costs and benefits, which I’ll discuss one by one.
On the benefit side, we have increased efficiency as screeners ignore some primary-screening anomalies for people who don’t meet the profile.
Harris’ point is that screeners freed up from screening low risk people can spend that time increasing the screening of high risk individuals, equalizing the risk and hence maximizing the overall security. It is both security and efficiency that are improved. By how much depends on the relative risk of Muslims versus non-Muslims and the accuracy of recognizing Muslims (how close B and B’ are), both things that Schneier incorrectly dismissed as irrelevant.
On the cost side, we have decreased security resulting from our imperfect profile of Muslims,
That could be zero or near it if the system errs on the side of caution. Or it could be huge. How well it can be done matters, as Harris repeatedly points out and Schneier dismisses as irrelevant. This is a cost, but we need to know how big it is.
decreased security resulting from our ignoring of non-Muslim terrorist threats
This is zero in his analysis. He set the causal correlation of terrorist being Muslim as 1, meaning they are all Muslims. He did this because he said it was irrelevant. But it isn’t. The cost is zero if the correlation is 1, and nearly zero if it’s close to 1.
decreased security resulting in errors in implementing the system,
This is only true if the errors result in false negatives, and yet Schneier himself argues that such a system tends towards avoiding false negatives via the principal-agent problem whereby screeners would tend to cover their ass by erring on the side of being overly cautious. The system Harris proposed is one that is overly cautious and errs towards false positives to avoid false negatives. Both seem to agree that this cost should be nearly zero.
increased cost due to replacing procedures with judgment,
This too is only a security cost if it results in false negatives, which both seem to think is unlikely. There is little evidence presented that this would be much above zero. What we’re talking about is the decision process of who to check, not the procedure of checking. The current procedure consists of the word “everyone”. The non-security costs of judgment over “everyone” include the time to judge and the cost of training, but the point is that the time to judge is significantly less than the time saved by not screening low risk individuals. Training costs would be real, but how do they compare to the savings from reducing the number of parallel screeners, since the lines will move faster and fewer people are needed? (I notice this benefit never showed up on the list.)
decreased efficiency (or possibly increased cost) because of the principal-agent problem,
That is zero. The principal-agent problem doesn’t decrease efficiency from the baseline case, which is the current system. It decreases efficiency from an optimized system of equalized risk by biasing away from false negatives such that the risks (costs) from false positives go up. The cost/benefit analysis looks at costs and benefits of changing the current system, hence using it as the baseline. You can’t use the optimized system as a baseline from which to measure costs like this. It means the benefit of implementing the system will be smaller in practice than in the ideal case, but this isn’t a cost.
and decreased efficiency as screeners make their profiling judgments.
This was listed above in the “judgment” cost.
Additionally, your system is vulnerable to mistakes in your estimation of the proper profile.
This is the same as the first cost, describing how accurately Muslim recognition can be performed.
When summed up, we don’t know what the result is. It seems to be a potentially big benefit and a series of small or zero valued costs. This does nothing to argue against the profiling system. Real values of real operational programs, or rational estimates thereof, are required to do the comparison.
Schneier repeats this list, including mistakes, in his closing summary at the bottom, still without actual values or even estimated magnitudes. It’s not really a cost/benefit analysis at all, but more like a simple pros and cons list biased in favour of the cons.
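The difference between counting items and summing their values is easy to show. In the Python sketch below the item labels paraphrase the list above, but every magnitude is a placeholder in arbitrary units, not an estimate of any real cost or benefit.

```python
# Placeholder magnitudes in arbitrary units; they are NOT estimates of real values.
costs = {
    "imperfect profile": 1.0,
    "ignored non-profile threats": 0.0,
    "implementation errors": 0.5,
    "judgment replacing procedure": 1.0,
    "principal-agent problem": 0.0,
}


def net_value(benefits: dict[str, float], costs: dict[str, float]) -> float:
    # A cost/benefit comparison sums magnitudes; the number of entries is irrelevant.
    return sum(benefits.values()) - sum(costs.values())


large_benefit = {"screener time redirected to higher risks": 10.0}
small_benefit = {"screener time redirected to higher risks": 1.0}

print("items:", len(large_benefit), "benefit vs", len(costs), "costs")
print("net value with a large benefit:", net_value(large_benefit, costs))
print("net value with a small benefit:", net_value(small_benefit, costs))
```

The same one-against-five list comes out positive or negative depending entirely on the magnitudes, which is why real or estimated values are needed before calling the result lopsided.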
One of the biggest missing pieces for me in the cost/benefit, or anywhere in the discussion, is the demographics of travelers. If most travelers in the U.S. are little old ladies in wheelchairs, then this system has huge benefits as it lets them through faster and frees up a lot of screener time to focus on higher risks. On the other hand, if almost all travelers are Muslims or appear to be conceivably so, the benefits are small. This is probably the biggest factor in determining benefits of the system and neither Harris nor Schneier mentions it that I can see.
Another important missed point is the scalability of the system. Schneier spends a lot of time demonstrating that the system is hard to optimize and maximize in practice by identifying Muslims, or terrorists in general, exactly. The costs of training and implementation, and practical problems of implementation, make such an idealized system impossible.
But the profiling argument doesn’t rest on perfection. Both the costs and benefits scale with precision. Using Harris’ anti-profile approach, adding a simple set of classifiers such as “Caucasian, American, older than 65, in a wheelchair, U.S. war veteran” costs very little to implement. It takes essentially no judgment and yet lets through very low risk individuals, so it frees up some available time for additional screening of higher risk people. The benefits might be little, but so are the costs.
Neither Harris nor Schneier discusses this scalability.
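A back-of-the-envelope Python sketch shows how strongly the benefit depends on that demographic mix. The model and numbers are hypothetical: a fraction of travellers is waved through with a lighter check, and the minutes saved are shared equally among everyone else.

```python
def extra_minutes_per_remaining_traveller(
    low_risk_fraction: float, full_minutes: float, light_minutes: float
) -> float:
    """Screening time freed by the low-risk group, shared among the remaining travellers."""
    if not 0 <= low_risk_fraction < 1:
        raise ValueError("low_risk_fraction must be in [0, 1)")
    freed_per_traveller = low_risk_fraction * (full_minutes - light_minutes)
    return freed_per_traveller / (1 - low_risk_fraction)


# Hypothetical check times: 2 minutes for a full check, 0.5 minutes for a light one.
extras = {
    fraction: extra_minutes_per_remaining_traveller(fraction, full_minutes=2.0, light_minutes=0.5)
    for fraction in (0.05, 0.25, 0.50)
}
for fraction, extra in extras.items():
    print(f"{fraction:.0%} low risk -> {extra:.2f} extra minutes for each remaining traveller")
```

When only a few travellers qualify as low risk, almost nothing is freed up; when half do, the time available per remaining traveller nearly doubles under these assumptions.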
8. Gaming the system
Before reading this exchange I had tended towards the conclusion that a system that treats everyone alike is a better system, whether it be random selection or checking everyone equally. The reason I thought this way is best summed up by the MIT “Carnival Booth” algorithm. This group analyzed the existing CAPS system, which had been in place for airport security since 1999 and which made use of profiling. They concluded that random sampling was more secure because profiling allowed terrorists to test the system to find out who least fits the profile. They could do this using as few as 6 flights to determine who gets profiled and who doesn’t. A random system provides no useful information, so everyone is equally likely to be under greater scrutiny.
The risk calculation is largely the same as how I started this article. Terrorists could use the testing flights to determine who is the lowest risk, and so their overall risk of getting caught is that of the lowest risk individual. The screeners spending time on high risk individuals is therefore wasteful and lowers security. If they spent less time on those high risk individuals and more time on random sampling, they’d increase the number of random searches, and hence increase the lowest level of security and hence raise the overall likelihood of catching the terrorists.
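The arithmetic behind that argument can be sketched in a few lines of Python. This is a toy model under my own assumptions, not the MIT group’s actual analysis: screeners can give extra scrutiny to a fixed fraction of travelers (the budget), everyone who fits the profile is always selected, and whatever capacity is left over is spent on random searches of everyone else. The function name and the numbers are hypothetical.

```python
def residual_random_rate(budget, flagged_fraction):
    """Random search rate left for travelers who do NOT fit the profile.

    budget: fraction of all travelers that can receive extra scrutiny.
    flagged_fraction: fraction of all travelers who fit the profile and
    are therefore always selected (a simplifying assumption).
    """
    if not 0.0 <= flagged_fraction <= budget < 1.0:
        raise ValueError("need 0 <= flagged_fraction <= budget < 1")
    # Capacity not used on flagged travelers, spread over the unflagged ones.
    return (budget - flagged_fraction) / (1.0 - flagged_fraction)


if __name__ == "__main__":
    budget = 0.10  # hypothetical: 10% of travelers can be searched
    for flagged in (0.0, 0.05, 0.09):
        rate = residual_random_rate(budget, flagged)
        print(f"flagged {flagged:.0%}: unflagged traveler searched {rate:.1%} of the time")
```

With no profile at all, a traveler who has probed the system still faces the full random rate. The more of the budget the profile absorbs, the lower the rate faced by an operative who has learned that he is not flagged, which is the Carnival Booth concern in miniature.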
What I never seriously thought about was the probability of finding a terrorist in the lowest risk category. In Harris’ proposed scheme, it’d be incredibly unlikely. Finding somebody who by all appearances and behaviours is low risk, and then actually convincing them to become a suicidal terrorist, is very unlikely. Only somebody truly devoted to Islam, and brainwashed into martyrdom by a small subset of fundamentalist Muslims, would be willing to do such a thing. The probability of finding one that wouldn’t be profiled in Harris’ system is incredibly small. It may not be zero, but it may make the Carnival Booth algorithm, or similar concepts, moot and a small risk cost in comparison to the security benefit gained by most or all such people being on the highest level of scrutiny.
Again, the optimization occurs when those risks are equalized. A small to negligible chance of gaming the profiling system needs to be compared to the gains in security of the system.
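As a rough illustration of that comparison, the sketch below weighs the chance that attackers can field an operative who doesn’t fit the profile against the scrutiny rates each kind of traveler faces. Every input is a hypothetical placeholder, the function names are mine, and “selected for extra scrutiny” is not the same as “caught”. It is a toy model, not an analysis of any real screening program.

```python
def expected_selection_rate(p_unflagged, rate_unflagged, rate_flagged):
    """Chance an attacker is selected for extra scrutiny under profiling.

    p_unflagged: probability the attackers can field an operative who
    does not fit the profile (hypothetical input).
    rate_unflagged / rate_flagged: selection rates for travelers who do
    not / do fit the profile.
    """
    for value in (p_unflagged, rate_unflagged, rate_flagged):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must be probabilities in [0, 1]")
    return p_unflagged * rate_unflagged + (1.0 - p_unflagged) * rate_flagged


def break_even_p_unflagged(uniform_rate, rate_unflagged, rate_flagged):
    """Value of p_unflagged at which profiling and uniform screening tie."""
    if rate_flagged <= rate_unflagged:
        raise ValueError("profiling must scrutinize flagged travelers more")
    return (rate_flagged - uniform_rate) / (rate_flagged - rate_unflagged)


if __name__ == "__main__":
    # Hypothetical rates: 10% uniform, versus 5% unflagged and 90% flagged.
    uniform, low, high = 0.10, 0.05, 0.90
    for p in (0.01, 0.5, 0.99):
        rate = expected_selection_rate(p, low, high)
        print(f"p_unflagged={p:.2f}: profiling {rate:.1%} vs uniform {uniform:.0%}")
    print(f"break-even p_unflagged: {break_even_p_unflagged(uniform, low, high):.2f}")
```

In this toy model profiling only does worse than uniform screening when attackers can almost always produce an unflagged operative. Whether that holds in reality depends entirely on the real values, which is exactly what the debate lacks.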
Harris recognizes that gaming the system may be possible, but he regularly reiterates he’s talking about a system that errs very much on the side of caution and that false positives are largely inconsequential, particularly describing the consequences as merely additional questions or searches rather than, say, imprisoning people. He may miss Schneier’s point that these false positives also cost some of the benefits that Harris is looking for in the first place. After all, a profiling system that errs on the side of caution in the extreme is one that screens everyone, which is the current system that Harris is arguing against.
9. Social Concerns and System Feedback
Many of Harris’ critics, including Schneier, assert that singling out Muslims or people who could be Muslims would increase hatred for the U.S. and support for terrorist activities, if not increasing the number of terrorists outright. Certainly this is a concern and even Harris admits that. History has shown that discriminating on the basis of ethnicity or religion tends to produce bad results all around. Not only does it treat innocent people unfairly, but the general public tends to grow more bigoted by the visibility of singling people out. The U.S. already has problems with Muslims being unfairly treated.
However, this is not necessarily the correct scale of thinking. You can’t compare statistical airport screening with Nazis rounding up Jews or American camps for Japanese in WWII. Again the devil is in the details. If it is just a statistical bias in screening towards Muslims, that’s hardly visible at all. It’s not like there’d be a separate Muslim line.
If it is a low scale, anti-profiling operation such as letting little old ladies in wheelchairs through, there’s no visible bias against Muslims at all. I can’t imagine a Muslim seeing an old, feeble, handicapped passenger getting waved through and feeling hatred for that. U.S. veterans already get preferential treatment at airports. Does getting an easier time at security too really stand out as a basis for hatred? If implemented in terms of letting low risk people through instead of scrutinizing Muslims more, I suspect there’d be little backlash at all. In fact, this is the basis of the “fast lane” Nexus card for regular travelers. It simply moves the screeners from the security line to back offices looking at more in-depth classifiers. I have Canadian and NATO Secret clearance. Militaries and governments trust me with important secrets. Would Muslims really be annoyed if TSA agents also trusted me for being demonstrably low risk, or anyone with a Nexus card for being low risk? Why would Muslims suddenly become angered when the low risk is because the travelers are little old ladies? (Actually, in the proposed anti-profiling system, I suspect I wouldn’t be anti-profiled out unless I had a clearance card of some sort. But the point is that Muslims aren’t dumb and can understand why some legitimately low risk people are waved through faster.)
Harris also points out that security and intelligence already treat Muslims differently, from invasion of privacy while gathering intelligence to immigration scrutiny. Would statistical bias in airport screening really be a big motivator?
Personally I feel uncomfortable with profiling. I feel more comfortable about an anti-profiling system even though they are effectively the same thing. This is partly why I say the form of implementation may be important.
Harris and Schneier do bring up the issue of comfort in the context of considerations such as U.S. culture and Constitution. Indeed, my values tend to lean that way too. My evaluation here is based on the statements and claims, not on personal values. One of the problems with values, or instincts, is that they are based on some principle or idea that may apply in general but might not in every case. To me integrity is about evaluating cases individually and not just catering to generalized, ideological rules. Even if we decide not to do something that shows it has net value, we are choosing to do so knowing what it is costing us, and informed choice is the basis of a free society.
Potential changes in risks due to changes in the security system are a form of system feedback. There are other forms to consider. For example, would reduced scrutiny create a market for exploiting it? Would terrorists put a lot of effort into getting bombs installed into the wheelchairs of perfectly innocent, unwitting people, and then extract them on the airplane or set them off remotely? It’s possible, and such a risk would need to be accounted for.
Overall I’m not greatly impressed with the debate itself but it has made me think more about it. Sam Harris made some good and valid detailed points but fails to have any sort of rigorous system-level evaluative framework from which to make quantitative judgments. Bruce Schneier has the right approach in terms of system cost/benefit analysis but performs lazy analysis, makes use of a variety of unjustified or incorrect assertions and principles, and unjustly dismisses Harris’ valid points as irrelevant.
More importantly I think the debate leaves out some of the most significant details for determining the value of such a system, including:
- actual values of costs and benefits
- scalability of the proposed program and scaling of those costs and benefits
- risk analysis and equalization of risks to optimize the system
- demographic information to provide estimates of the value of such a system
I suspect that Schneier is probably correct in the end that an optimized system of maximal precision and value is practically impossible. I also suspect that Harris is probably correct that a basic, anti-profile system that goes easier on low risk individuals, using clear and easily measured classifiers with minimal judgment, is probably quite possible with benefits that exceed costs. However, such a low scale system will not provide a huge overall benefit either.
I will need to see actual or estimated numbers though. I haven’t read Schneier’s essays or book, so I can only hope he has some in there. So far those numbers are nowhere to be found in the debate despite the fact everything rests on them.
Ultimately I think the value in seeing an accurate analysis would be informed choice. If profiling (or anti-profiling) proves to be valuable we may still choose not to do it for other reasons, but we can’t choose to ignore that value if we don’t know what it is.