How does the Turing Test work?
The interviewer sends written questions to the other two people. He does not see them, and therefore does not know whether he is talking to the woman or the man. The purpose of the questions is to work out which interlocutor is the man and which is the woman. A typical question might be "Do you have long or short hair?"
The participants in the game receive the questions and respond to the interviewer in writing, using the teleprinter. The game seems very simple, and the problem quickly solvable. In reality it is not, because the participants are allowed to lie.
The interviewer does not know who is lying and who is sincere; he must work this out for himself. At the end of the game, the interviewer must decide which of the two participants is the man and which is the woman.
Repeating the game N times, the interviewer misidentifies the sex of the participants X times — say, 25 times over the repetitions. In the second phase of the test, one of the two participants is replaced by a computer. Now the interviewer must work out whether the respondent is a human or a machine. The procedure is otherwise the same: in the Turing Test, the judge tries to guess which interlocutor is a computer and which is a real human.
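The counting procedure described above can be made concrete with a toy simulation. Everything here is an illustrative assumption: we posit a judge who identifies the participants correctly with some fixed probability on each round, and simply tally the misidentifications X over N repetitions.

```python
import random

def run_games(n_games, p_correct, seed=0):
    """Simulate n_games rounds of the imitation game.

    p_correct is the assumed probability that the interviewer
    identifies the participants correctly in a single round.
    Returns X, the number of misidentifications.
    """
    rng = random.Random(seed)
    return sum(1 for _ in range(n_games) if rng.random() >= p_correct)

# Example: 100 rounds with a judge who is right 75% of the time.
errors = run_games(100, 0.75)
error_rate = errors / 100
```

The interesting quantity in the second phase of the test is whether this error rate changes when a computer is substituted for one of the human participants.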
For one thing, Turing suggested the test could be done in just five minutes. Some of this was made obvious nearly 50 years ago with the construction of the program known as ELIZA by computer scientist Joseph Weizenbaum. ELIZA was used to simulate a type of psychotherapist known as a Rogerian, or person-centred, therapist.
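ELIZA's core technique was simple keyword matching plus pronoun "reflection". The sketch below is a minimal illustration of that technique, not Weizenbaum's actual DOCTOR script; the patterns and canned responses are invented for the example.

```python
import re

# Pronoun reflections, as in Rogerian "mirroring".
REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are"}

# Invented keyword rules in the spirit of ELIZA's therapist script.
RULES = [
    (re.compile(r"i feel (.*)", re.I), "Why do you feel {0}?"),
    (re.compile(r"i am (.*)", re.I), "How long have you been {0}?"),
    (re.compile(r"my (.*)", re.I), "Tell me more about your {0}."),
]
FALLBACK = "Please go on."

def reflect(fragment):
    """Swap first-person words for second-person ones."""
    words = fragment.rstrip(".!?").split()
    return " ".join(REFLECTIONS.get(w.lower(), w) for w in words)

def respond(utterance):
    """Return the first matching rule's response, reflected."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(reflect(match.group(1)))
    return FALLBACK
```

Feeding it "I feel sad about my job" produces "Why do you feel sad about your job?" — a reply that appears attentive while involving no understanding at all, which is precisely why ELIZA matters for discussions of the test.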
Several patients who interacted with it thought it was real, leading to the earliest claim that the Turing Test had been passed. Any worthwhile Turing Test has to have the judge and the human player acting in as human-like a way as possible. Given that this is a test of understanding text, computers need to be judged against the abilities of the top few percent of copy-editors. If the questions are right, they can indicate whether the computer has understood the material culture of the other participants.
A test for AI based on these is known as a Winograd Schema Challenge and was first proposed as an improvement on the Turing Test. A canonical schema asks: "The trophy doesn't fit into the brown suitcase because it is too large. What is too large?" To answer, you have to understand the cultural and practical world of trophies and suitcases. First of all—as Turing's brief discussion of solipsism makes clear—it is worth asking what grounds we have for attributing intelligence (thought, mind) to other people.
If it is plausible to suppose that we base our attributions on behavioral tests or behavioral criteria, then his claim about the appropriate test to apply in the case of machines seems apt, and his conjecture that digital computing machines might pass the test seems like a reasonable—though controversial—empirical conjecture. Second, subsequent developments in the philosophy of mind—and, in particular, the fashioning of functionalist theories of the mind—have provided a more secure theoretical environment in which to place speculations about the possibility of thinking machines.
If mental states are functional states—and if mental states are capable of realisation in vastly different kinds of materials—then there is some reason to think that it is an empirical question whether minds can be realised in digital computing machines. Of course, this kind of suggestion is open to challenge; we shall consider some important philosophical objections in the later parts of this review.
There are a number of much-debated issues that arise in connection with the interpretation of various parts of Turing (1950), and that we have hitherto neglected to discuss. But since some of this interpretation has been contested, it is probably worth noting where the major points of controversy have been. Turing introduces the imitation game by describing a game in which the participants are a man, a woman, and a human interrogator. The interrogator is in a room apart from the other two, and is set the task of determining which of the other two is a man and which is a woman.
Both the man and the woman are set the task of trying to convince the interrogator that they are the woman. Turing recommends that the best strategy for the woman is to answer all questions truthfully; of course, the best strategy for the man will require some lying.
The participants in this game also use teletypewriters to communicate with one another—to avoid clues that might be offered by tone of voice, etc. Now, of course, it is possible to interpret Turing as here intending to say what he seems literally to say, namely, that the new game is one in which the computer must pretend to be a woman, and the other participant in the game is a woman. (For discussion, see, for example, Genova and Traiger.) And it is also possible to interpret Turing as intending to say that the new game is one in which the computer must pretend to be a woman, and the other participant in the game is a man who must also pretend to be a woman.
Moreover, as Moor argues, there is no reason to think that one would get a better test if the computer must pretend to be a woman and the other participant in the game is a man pretending to be a woman; and, indeed, there is some reason to think that one would get a worse test.
Perhaps it would make no difference to the effectiveness of the test if the computer must pretend to be a woman and the other participant is a woman, any more than it would make a difference if the computer must pretend to be an accountant and the other participant is an accountant; however, this consideration is simply insufficient to outweigh the strong textual evidence that supports the standard interpretation of the imitation game that we gave at the beginning of our discussion of Turing (1950). (For a dissenting view about many of the matters discussed in this paragraph, see Sterrett.) There are two different theoretical claims that are run together in many discussions of The Turing Test, and that can profitably be separated.
One claim—call it the Turing Test Claim—holds that if something can pass itself off as a person under sufficiently demanding test conditions, then we have very good reason to suppose that that thing is intelligent. Another claim—the Thinking Machine Claim—holds that an appropriately programmed computer could pass the kind of test that is described in the first claim. Some objections to the claims made in Turing (1950) are objections to the Thinking Machine Claim, but not objections to the Turing Test Claim.
Consider, for example, the argument of Searle, which we discuss further in Section 6. However, other objections are objections to the Turing Test Claim. Until we get to Section 6, we shall be confining our attention to discussions of the Turing Test Claim. Given the initial distinction that we made between different ways in which the expression The Turing Test gets interpreted in the literature, it is probably best to approach the question of the assessment of the current standing of The Turing Test by dividing cases.
True enough, we think that there is a correct interpretation of exactly what test it is that is proposed by Turing; but a complete discussion of the current standing of The Turing Test should pay at least some attention to the current standing of other tests that have been mistakenly supposed to be proposed by Turing. There are a number of main ideas to be investigated.
First, there is the suggestion that The Turing Test provides logically necessary and sufficient conditions for the attribution of intelligence.
Second, there is the suggestion that The Turing Test provides logically sufficient—but not logically necessary—conditions for the attribution of intelligence. Third, there is the suggestion that The Turing Test provides criterial—but not logically sufficient—support for the attribution of intelligence. Fourth—and perhaps not importantly distinct from the previous claim—there is the suggestion that The Turing Test provides more or less strong probabilistic support for the attribution of intelligence.
We shall consider each of these suggestions in turn. It is doubtful whether there are very many examples of people who have explicitly claimed that The Turing Test is meant to provide conditions that are both logically necessary and logically sufficient for the attribution of intelligence. Perhaps Block is one such case. However, some of the objections that have been proposed against The Turing Test only make sense under the assumption that The Turing Test does indeed provide logically necessary and logically sufficient conditions for the attribution of intelligence; and many more of the objections that have been proposed against The Turing Test only make sense under the assumption that The Turing Test provides necessary and sufficient conditions for the attribution of intelligence, where the modality in question is weaker than the strictly logical (e.g., merely nomic or causal).
Consider, for example, those people who have claimed that The Turing Test is chauvinistic; and, in particular, those people who have claimed that it is surely logically possible for there to be something that possesses considerable intelligence, and yet that is not able to pass The Turing Test.
Examples: Intelligent creatures might fail to pass The Turing Test because they do not share our way of life; intelligent creatures might fail to pass The Turing Test because they refuse to engage in games of pretence; intelligent creatures might fail to pass The Turing Test because the pragmatic conventions that govern the languages that they speak are so very different from the pragmatic conventions that govern human languages.
None of this can constitute an objection to The Turing Test unless The Turing Test delivers necessary conditions for the attribution of intelligence. Rather—as we shall see later—French supposes that The Turing Test establishes sufficient conditions that no machine will ever satisfy. Floridi and Chiriatti say that The Turing Test provides necessary but insufficient conditions for intelligence: not passing The Turing Test disqualifies an AI from being intelligent, but passing The Turing Test is not sufficient to qualify an AI as intelligent.
There are many philosophers who have supposed that The Turing Test is intended to provide logically sufficient conditions for the attribution of intelligence. That is, there are many philosophers who have supposed that The Turing Test claims that it is logically impossible for something that lacks intelligence to pass The Turing Test.
Often, this supposition goes with an interpretation according to which passing The Turing Test requires rather a lot, e.g., the production of behavior that is indistinguishable from human behavior over an entire lifetime. There are well-known arguments against the claim that passing The Turing Test—or any other purely behavioral test—provides logically sufficient conditions for the attribution of intelligence. Probably the best known of these involves Block's "Blockhead": a machine that produces conversation by consulting a vast look-up tree rather than by anything recognizable as thought. If we agree that Blockhead is logically possible, and if we agree that Blockhead is not intelligent (does not have a mind, does not think), then Blockhead is a counterexample to the claim that the Turing Test provides a logically sufficient condition for the ascription of intelligence.
After all, Blockhead could be programmed with a look-up tree that produces responses identical with the ones that you would give over the entire course of your life (given the same inputs). There are two ways to resist this argument. First, it could be denied that Blockhead is a logical possibility; second, it could be claimed that Blockhead would be intelligent (have a mind, think).
In order to deny that Blockhead is a logical possibility, it seems that what needs to be denied is the commonly accepted link between conceivability and logical possibility: it certainly seems that Blockhead is conceivable, and so, if (properly circumscribed) conceivability is sufficient for logical possibility, then it seems that we have good reason to accept that Blockhead is a logical possibility.
Since it would take us too far away from our present concerns to explore this issue properly, we merely note that it remains a controversial question whether properly circumscribed conceivability is sufficient for logical possibility. (For further discussion of this issue, see Crooke.) As for the claim that Blockhead would be intelligent: Blockhead may not be a particularly efficient processor of information; but it is at least a processor of information, and that—in combination with the behavior that is produced as a result of the processing of information—might well be taken to be sufficient grounds for the attribution of some level of intelligence to Blockhead.
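Block's look-up idea can be rendered as a toy program: every response is retrieved from a table keyed on the entire conversation history so far, with no processing beyond lookup. The sketch also shows why the construction is physically hopeless — the number of histories the table must cover grows exponentially with the length of the conversation. The table entries and the figures below are illustrative assumptions, not a feasible construction.

```python
# A Blockhead-style agent: each reply is indexed by the full history.
TABLE = {
    ("Hello.",): "Hi there.",
    ("Hello.", "Hi there.", "How are you?"): "Fine, thanks.",
}

def blockhead_reply(history):
    """Return the canned response for this exact history, if any."""
    return TABLE.get(tuple(history), "I don't follow.")

# Why the table cannot actually be built: assuming (modestly) a
# million possible utterances per turn, the number of distinct
# 50-exchange histories the table must cover is astronomical.
possible_utterances = 10**6
exchanges = 50
histories = possible_utterances ** exchanges   # 10**300 histories
```

The exponential blow-up is the point at issue in the discussion of nomic possibility below: even granting that such a tree is logically possible, nothing remotely like it could be physically realized.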
For further critical discussion of the argument of Block, see McDermott, and Pautz and Stoljar. If no true claims about the observable behavior of the entity can play any role in the justification of the ascription of the mental state in question to the entity, then there are no grounds for attributing that kind of mental state to the entity. The claim that, in order to be justified in ascribing a mental state to an entity, there must be some true claims about the observable behavior of that entity that alone—i.e., conjoined with no further claims—entail the ascription of that mental state, is a much stronger, distinctively behaviorist claim.
It may be—for all that we are able to argue—that Wittgenstein was a philosophical behaviorist; it may be—for all that we are able to argue—that Turing was one, too. However, if we go by the letter of the account given in the previous paragraph, then all that need follow from the claim that the Turing Test is criterial for the ascription of intelligence (thought, mind) is that, when other true claims (not themselves couched in terms of mentalistic vocabulary) are conjoined with the claim that an entity has passed the Turing Test, it then follows that the entity in question has intelligence (thought, mind).
Note that the parenthetical qualification that the additional true claims not be couched in terms of mentalistic vocabulary is only one way in which one might try to avoid the threat of trivialization.
The difficulty is that the addition of the true claim that an entity has a mind will always produce a set of claims that entails that that entity has a mind, no matter what other claims belong to the set! Many people have supposed that there is good reason to deny that Blockhead is a nomic or physical possibility. If this is right, then, while it may be true that Blockhead is a logical possibility, Blockhead is nonetheless not a nomic or physical possibility.
And then it seems natural to hold that The Turing Test does indeed provide nomically sufficient conditions for the attribution of intelligence: given everything else that we already know—or, at any rate, take ourselves to know—about the universe in which we live, we would be fully justified in concluding that anything that succeeds in passing The Turing Test is, indeed, intelligent possessed of a mind, and so forth.
There are ways in which the argument in the previous paragraph might be resisted. At the very least, it is worth noting that there is a serious gap in the argument that we have just rehearsed. Perhaps—for all that has been argued so far—there are nomically possible ways of producing mere simulations of intelligence. (McDermott calculates that a look-up table for a participant who makes 50 conversational exchanges would require an astronomically large number of nodes—far more than could ever be physically realized.) When we look at the initial formulation that Turing provides of his test, it is clear that he thought that the passing of the test would provide probabilistic support for the hypothesis of intelligence.
There are at least two different points to make here. First, the prediction that Turing makes is itself probabilistic: Turing predicts that, in about fifty years from the time of his writing, it will be possible to programme digital computers to make them play the imitation game so well that an average interrogator will have no more than a seventy per cent chance of making the right identification after five minutes of questioning.
Clearly, a machine that is very successful in many different runs of the game that last for quite extended periods of time and that involve highly skilled participants in the other roles has a much stronger claim to intelligence than a machine that has been successful in a single, short run of the game with highly inexpert participants.
That a machine has succeeded in one short run of the game against inexpert opponents might provide some reason for increase in confidence that the machine in question is intelligent: but it is clear that results on subsequent runs of the game could quickly overturn this initial increase in confidence.
That a machine has done much better than chance over many long runs of the imitation game against a variety of skilled participants surely provides much stronger evidence that the machine is intelligent. The probabilistic nature of The Turing Test is often overlooked.
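The evidential effect of repeated runs can be illustrated with a toy calculation. Suppose — purely as an assumption for the example — that an unintelligent machine has some fixed probability p of fooling the judge on any single independent run. The chance of serendipitous success on every one of n runs is then p to the power n, which can be driven as low as one likes by increasing n.

```python
def chance_of_fooling_every_run(p_single, n_runs):
    """Probability that sheer luck carries the machine through all
    n_runs independent runs, each passed with probability p_single."""
    return p_single ** n_runs

# Even a machine that fools the judge half the time on a single run
# is overwhelmingly unlikely to survive 20 independent runs by luck:
p = chance_of_fooling_every_run(0.5, 20)   # 0.5**20, under one in a million
```

The independence assumption is doing real work here: runs against the same judge with the same questions would not multiply in this way, which is one reason varied, skilled participants matter.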
But this interpretation of The Turing Test is vulnerable to the kind of objection lodged by Bringsjord: even on a moderately long single run with relatively expert participants, it may not be all that unlikely that an unintelligent machine serendipitously succeeds in the imitation game. In our view, given enough sufficiently long runs with different sufficiently expert participants, the likelihood of serendipitous success can be made as small as one wishes. Some of the literature about The Turing Test is concerned with questions about the framing of a test that can provide a suitable guide to future research in the area of Artificial Intelligence.
The idea here is very simple. Suppose that we have the ambition to produce an artificially intelligent entity. What tests should we take as setting the goals that putatively intelligent artificial systems should achieve? Should we suppose that The Turing Test provides an appropriate goal for research in this field? In assessing these proposals, there are two different questions that need to be borne in mind.
First, there is the question whether it is a useful goal for AI research to aim to make a machine that can pass the given test administered over the specified length of time, at the specified degree of success. Second, there is the question of the appropriate conclusion to draw about the mental capacities of a machine that does manage to pass the test administered over the specified length of time, at the specified degree of success. Opinion on these questions is deeply divided.
Some people suppose that The Turing Test does not provide a useful goal for research in AI because it is far too difficult to produce a system that can pass the test. Other people suppose that The Turing Test does not provide a useful goal for research in AI because it sets a very narrow target and thus sets unnecessary restrictions on the kind of research that gets done.
Some people think that The Turing Test provides an entirely appropriate goal for research in AI; while other people think that there is a sense in which The Turing Test is not really demanding enough, and who suppose that The Turing Test needs to be extended in various ways in order to provide an appropriate goal for AI. We shall consider some representatives of each of these positions in turn. There are some people who continue to endorse The Turing Test.
For example, Neufeld and Finnestad argue that The Turing Test is no barrier to progress in AI, requires no significant redefinition, and does not shut down other avenues of investigation. Maybe we do better just to take The Turing Test to define a watershed rather than a threshold towards which we might hope to make incremental progression. Amongst these people there are some who have gone on to offer reasons for thinking that it is doubtful that we shall ever be able to create a machine that can pass The Turing Test—or, at any rate, that it is doubtful that we shall be able to do this at any time in the foreseeable future.
Perhaps the most interesting arguments of this kind are due to French; at any rate, these are the arguments that we shall go on to consider. (Cullen sets out similar considerations.) First, if interrogators are allowed to draw on the results of research into, say, associative priming, then there is data that will very plausibly separate human beings from machines.
For example, there is research that shows that, if humans are presented with series of strings of letters, they require less time to recognize that a string is a word in a language that they speak if it is preceded by a related word in that language, rather than by an unrelated word or by a string of letters that is not a word at all.
Provided that the interrogator has accurate data about average recognition times for subjects who speak the language in question, the interrogator can distinguish between the machine and the human simply by looking at recognition times for appropriate series of strings of letters. Or so says French.
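French's proposed probe can be sketched as a toy decision rule. Assume — hypothetically — that the interrogator has baseline human recognition times, that a genuine human recognizes a word measurably faster after a related prime, and that a machine which has not modeled priming answers at a uniform pace. The timings and the threshold below are invented for illustration.

```python
def mean(xs):
    return sum(xs) / len(xs)

def looks_human(primed_times_ms, unprimed_times_ms, min_gap_ms=30):
    """Classify a respondent as human if word recognition is at least
    min_gap_ms faster, on average, after a related prime."""
    gap = mean(unprimed_times_ms) - mean(primed_times_ms)
    return gap >= min_gap_ms

# A human-like profile: related primes speed up recognition...
human = looks_human([510, 495, 520], [560, 575, 590])
# ...while a respondent answering at a uniform pace shows no gap.
machine = looks_human([550, 548, 552], [551, 549, 553])
```

Of course, as the next paragraph argues, it is unclear that the teletype setup of the test would ever deliver timing data this clean to the interrogator in the first place.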
However, it is not clear that French is right. After all, the design of The Turing Test makes it hard to see how the interrogator will get reliable information about response times to series of strings of symbols. The point of putting the computer in a separate room and requiring communication by teletype was precisely to rule out certain irrelevant ways of identifying the computer.
Perhaps it is also worth noting that administration of the kind of test that French imagines is not ordinary conversation; nor is it something that one would expect that any but a few expert interrogators would happen upon.
So, even if the circumstances of The Turing Test do not rule out the kind of procedure that French here envisages, it is not clear that The Turing Test will be impossibly hard for machines to pass. French also appeals to "rating games", in which participants are asked to rate, for example, how good pens are as weapons, or how good grand pianos are as wheelbarrows. For, in the first case, the ratings that humans make depend upon large numbers of culturally acquired associations which it would be well-nigh impossible to identify and describe, and hence which it would arguably be well-nigh impossible to program into a computer.
And, in the second case, the ratings that people actually make are highly dependent upon particular social and cultural settings and upon the particular ways in which human life is experienced.
And there would also be widespread agreement amongst competent speakers of English in the developed world that pens rate higher as weapons than grand pianos rate as wheelbarrows. It is not clear to us that the data upon which the rating games rely is as reliable as French would have us suppose. (What if the grand piano has wheels? What if the opponent has a sword or a sub-machine gun?) Moreover, even if the data is reliable, it is not obvious that any but a select group of interrogators will hit upon this kind of strategy for trying to unmask the machine; nor is it obvious that it is impossibly hard to build a machine that is able to perform in the way in which typical humans do on these kinds of tests.
There are other reasons that have been given for thinking that The Turing Test is too hard and, for this reason, inappropriate in setting goals for current research into artificial intelligence. In general, the idea is that there may well be features of human cognition that are particularly hard to simulate, but that are not in any sense essential for intelligence or thought, or possession of a mind.
The problem here is not merely that The Turing Test really does test for human intelligence; rather, the problem here is the fact—if indeed it is a fact—that there are quite inessential features of human intelligence that are extraordinarily difficult to replicate in a machine.
If this complaint is justified—if, indeed, there are features of human intelligence that are extraordinarily difficult to replicate in machines, and that could and would be reliably used to unmask machines in runs of The Turing Test—then there is reason to worry about the idea that The Turing Test sets an appropriate direction for research in artificial intelligence.
However, as our discussion of French shows, there may be reason for caution in supposing that the kinds of considerations discussed in the present section show that we are already in a position to say that The Turing Test does indeed set inappropriate goals for research in artificial intelligence. There are authors who have suggested that The Turing Test does not set a sufficiently broad goal for research in the area of artificial intelligence.
Amongst these authors, there are many who suppose that The Turing Test is too easy. We go on to consider some of these authors in the next sub-section. But there are also some authors who have supposed that, even if the goal that is set by The Turing Test is very demanding indeed, it is nonetheless too restrictive. Objection to the notion that the Turing Test provides a logically sufficient condition for intelligence can be adapted to the goal of showing that the Turing Test is too restrictive.
Consider, for example, Gunderson. Gunderson has two major complaints to make against The Turing Test. First, he claims that success in the Imitation Game might be achieved by a narrow trick rather than by general intelligence—much as a vacuum cleaner salesman might tout his machine as "all-purpose" merely because it sucks up dust. But, second, he thinks that success in the Imitation Game would be but one example of the kinds of things that intelligent beings can do and—hence—in itself could not be taken as a reliable indicator of intelligence. According to Gunderson, Turing is in the same position as the vacuum cleaner salesman if he is prepared to say that a machine is intelligent merely on the basis of its success in the Imitation Game.
There is an obvious reply to the argument that we have here attributed to Gunderson, viz., that carrying out a conversation is no narrow, isolated skill. In order to carry out a conversation, one needs to have many different kinds of cognitive skills, each of which is capable of application in other areas.
Apart from the obvious general cognitive competencies—memory, perception, and the like—a successful player of the game needs a wide range of more particular abilities. It is inconceivable that there be a machine that is startlingly good at playing the Imitation Game, and yet unable to do well at any other tasks that might be assigned to it; and it is equally inconceivable that there is a machine that is startlingly good at the Imitation Game and yet does not have a wide range of competencies that can be displayed in a range of quite disparate areas.
To the extent that Gunderson considers this line of reply, all that he says is that there is no reason to think that a machine that can succeed in the Imitation Game must have more than a narrow range of abilities; we see no reason to accept this response. More recently, Erion has defended a position that has some affinity to that of Gunderson.
In our view, at least when The Turing Test is properly understood, it is clear that anything that passes The Turing Test must have the ability to solve problems in a wide variety of everyday circumstances because the interrogators will use their questions to probe these—and other—kinds of abilities in those who play the Imitation Game. There are authors who have suggested that The Turing Test should be replaced with a more demanding test of one kind or another.
It is not at all clear that any of these tests actually proposes a better goal for research in AI than is set by The Turing Test. However, in this section, we shall not attempt to defend that claim; rather, we shall simply describe some of the further tests that have been proposed, and make occasional comments upon them. It is, of course, not essential to the game that tele-text devices be used to prevent direct access to information about the sex or genus of participants in the game.
We shall not advert to these relatively mundane kinds of considerations in what follows. Harnad claims that a better test than The Turing Test—the Total Turing Test—will be one that requires responses to all of our inputs, and not merely to text-formatted linguistic inputs.
That is, according to Harnad, the appropriate goal for research in AI has to be to construct a robot with something like human sensorimotor capabilities. It is an interesting question whether the test that Harnad proposes sets a more appropriate goal for AI research. In particular, it seems worth noting that it is not clear that there could be a system that was able to pass The Turing Test and yet that was not able to pass The Total Turing Test.
This point against Harnad can be found in Hauser, and elsewhere. A different proposal ties intelligence to creativity: roughly, a machine counts as genuinely creative only if its designers cannot explain how it produced its output. Against this proposal, it seems worth noting that there are questions to be raised about the interpretation of the explanation condition. If a computer program is long and complex, then no human agent can explain in complete detail how the output was produced. Why did the computer produce the particular output that it did, rather than some other? But if we are allowed to give a highly schematic explanation—the computer took the input, did some internal processing, and then produced an answer—then it seems that it will turn out to be very hard to support the claim that human agents ever do anything genuinely creative.
After all, we too take external input, perform internal processing, and produce outputs. What is missing from the account that we are considering is any suggestion about the appropriate level of explanation that is to be provided. One might also worry that the proposed test rules out by fiat the possibility that creativity can be best achieved by using genuine randomising devices.
Schweizer claims that a better test than The Turing Test will advert to the evolutionary history of the subjects of the test. When we attribute intelligence to human beings, we rely on an extensive historical record of the intellectual achievements of human beings.
On the basis of this historical record, we are able to claim that human beings are intelligent; and we can rely upon this claim when we attribute intelligence to individual human beings on the basis of their behavior.
According to Schweizer, if we are to attribute intelligence to machines, we need to be able to advert to a comparable historical record of cognitive achievements. So, it will only be when machines have developed languages, written scientific treatises, composed symphonies, invented games, and the like, that we shall be in a position to attribute intelligence to individual machines on the basis of their behavior.
Against Schweizer, it seems worth noting that it is not at all clear that our reason for granting intelligence to other humans on the basis of their behavior is that we have prior knowledge of the collective cognitive achievements of human beings. Damassino suggests that it would be better to require test subjects to produce an enquiry, in which performance is assessed along three dimensions: (a) comparison with human performance; (b) success in completing the enquiry; and (c) efficiency in completing the enquiry (minimisation of the number of questions asked).
The motivation given for this proposal is that, because The Turing Test attracts projects whose primary ambition is to fool judges, it is not properly concerned with whether, or how well, test subjects perform their allocated tasks. It seems to us that there is nothing here that impugns The Turing Test. It does not count against The Turing Test that public competitions based on it (with prizes attached) lead to gaming, given that everyone knows that those prizes are being awarded to entries that clearly do not pass The Turing Test.