Introduction to Automata Theory, Languages, and Computation
John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman
This classic book on formal languages, automata theory, and computational complexity has been updated to present theoretical concepts in a concise and straightforward manner, with an increased emphasis on hands-on, practical applications. This new edition comes with Gradiance, an online assessment tool developed for computer science.
Gradiance is the most advanced online assessment tool developed for the computer science discipline. With its innovative underlying technology, Gradiance turns basic homework assignments and programming labs into an interactive learning experience for students. By using a series of "root questions" and hints, it not only tests a student's capability, but actually simulates a one-on-one teacher-student tutorial that allows the student to learn the material more easily. Through the programming labs, instructors are able to test, track, and hone their students' skills, both in terms of syntax and semantics, with an unprecedented level of assessment never before offered.
Year: 2006
Edition: 3rd
Publisher: Prentice Hall
Language: English
Pages: 550
ISBN-10: 0321455363
File: PDF, 5.67 MB
INTRODUCTION TO Automata Theory, Languages, and Computation
3rd Edition

JOHN E. HOPCROFT, Cornell University
RAJEEV MOTWANI, Stanford University
JEFFREY D. ULLMAN, Stanford University

Publisher: Greg Tobin
Executive Editor: Michael Hirsch
Acquisitions Editor: Matt Goldstein
Project Editor: Katherine Harutunian
Associate Managing Editor: Jeffrey Holcomb
Cover Designer: Joyce Cosentino Wells
Digital Assets Manager: Marianne Groth
Media Producer: Bethany Tidd
Marketing Manager: Michelle Brown
Marketing Assistant: Dana Lopreato
Senior Author Support/Technology Specialist: Joe Vetere
Senior Manufacturing Buyer: Carol Melville
Media Manufacturing Buyer: Ginny Michaud

Many of the exercises that appear in this text use the stems of questions from Gradiance Corporation, which retains the copyright to all such questions. © Gradiance Corp., 2004-2006

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and Addison-Wesley was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Library of Congress Cataloging-in-Publication Data

Hopcroft, John E., 1939-
  Introduction to automata theory, languages, and computation / by John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman. -- 3rd ed.
    p. cm.
  Includes bibliographical references and index.
  ISBN 0-321-45536-3
  1. Machine theory. 2. Formal languages. 3. Computational complexity.
  I. Motwani, Rajeev. II. Ullman, Jeffrey D., 1942- III. Title.
  QA267.H56 2006
  511.3'5 dc22
  2006014263

Copyright © 2007 Pearson Education, Inc. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America. For information on obtaining permission for use of material in this work, please submit a written request to Pearson Education, Inc., Rights and Contracts Department, 75 Arlington Street, Suite 300, Boston, MA 02116, fax your request to (617) 848-7047, or e-mail at http://www.pearsoned.com/legal/permissions.htm.

1 2 3 4 5 6 7 8 9 10 CW 10 09 08 07 06

Preface

In the preface from the 1979 predecessor to this book, Hopcroft and Ullman marveled at the fact that the subject of automata had exploded, compared with its state at the time they wrote their first book, in 1969. Truly, the 1979 book contained many topics not found in the earlier work and was about twice its size. If you compare this book with the 1979 book, you will find that, like the automobiles of the 1970's, this book is "larger on the outside, but smaller on the inside." That sounds like a retrograde step, but we are happy with the changes for several reasons.

First, in 1979, automata and language theory was still an area of active research. A purpose of that book was to encourage mathematically inclined students to make new contributions to the field. Today, there is little direct research in automata theory (as opposed to its applications), and thus little motivation for us to retain the succinct, highly mathematical tone of the 1979 book.

Second, the role of automata and language theory has changed over the past two decades. In 1979, automata was largely a graduate-level subject, and we imagined our reader was an advanced graduate student, especially those using the later chapters of the book. Today, the subject is a staple of the undergraduate curriculum.
As such, the content of the book must assume less in the way of prerequisites from the student, and therefore must provide more of the background and details of arguments than did the earlier book.

A third change in the environment is that Computer Science has grown to an almost unimaginable degree in the past three decades. While in 1979 it was often a challenge to fill up a curriculum with material that we felt would survive the next wave of technology, today very many subdisciplines compete for the limited amount of space in the undergraduate curriculum.

Fourthly, CS has become a more vocational subject, and there is a severe pragmatism among many of its students. We continue to believe that aspects of automata theory are essential tools in a variety of new disciplines, and we believe that the theoretical, mind-expanding exercises embodied in the typical automata course retain their value, no matter how much the student prefers to learn only the most immediately monetizable technology. However, to assure a continued place for the subject on the menu of topics available to the computer science student, we believe it is necessary to emphasize the applications along with the mathematics. Thus, we have replaced a number of the more abstruse topics in the earlier book with examples of how the ideas are used today. While applications of automata and language theory to compilers are now so well understood that they are normally covered in a compiler course, there are a variety of more recent uses, including model-checking algorithms to verify protocols and document-description languages that are patterned on context-free grammars.

A final explanation for the simultaneous growth and shrinkage of the book is that we were today able to take advantage of the TeX and LaTeX typesetting systems developed by Don Knuth and Les Lamport. The latter, especially, encourages the "open" style of typesetting that makes books larger, but easier to read.
We appreciate the efforts of both men.

Use of the Book

This book is suitable for a quarter or semester course at the Junior level or above. At Stanford, we have used the notes in CS154, the course in automata and language theory. It is a one-quarter course, which both Rajeev and Jeff have taught. Because of the limited time available, Chapter 11 is not covered, and some of the later material, such as the more difficult polynomial-time reductions in Section 10.4, is omitted as well. The book's Web site (see below) includes notes and syllabi for several offerings of CS154.

Some years ago, we found that many graduate students came to Stanford with a course in automata theory that did not include the theory of intractability. As the Stanford faculty believes that these ideas are essential for every computer scientist to know at more than the level of "NP-complete means it takes too long," there is another course, CS154N, that students may take to cover only Chapters 8, 9, and 10. They actually participate in roughly the last third of CS154 to fulfill the CS154N requirement. Even today, we find several students each quarter availing themselves of this option. Since it requires little extra effort, we recommend the approach.

Prerequisites

To make best use of this book, students should have taken previously a course covering discrete mathematics, e.g., graphs, trees, logic, and proof techniques. We assume also that they have had several courses in programming, and are familiar with common data structures, recursion, and the role of major system components such as compilers. These prerequisites should be obtained in a typical freshman-sophomore CS program.

Exercises

The book contains extensive exercises, with some for almost every section. We indicate harder exercises or parts of exercises with an exclamation point. The hardest exercises have a double exclamation point. Some of the exercises or parts are marked with a star.
For these exercises, we shall endeavor to maintain solutions accessible through the book's Web page. These solutions are publicly available and should be used for self-testing. Note that in a few cases, one exercise B asks for modification or adaptation of your solution to another exercise A. If certain parts of A have solutions, then you should expect the corresponding parts of B to have solutions as well.

Gradiance On-Line Homeworks

A new feature of the third edition is that there is an accompanying set of on-line homeworks using a technology developed by Gradiance Corp. Instructors may assign these homeworks to their class, or students not enrolled in a class may enroll in an "omnibus class" that allows them to do the homeworks as a tutorial (without an instructor-created class). Gradiance questions look like ordinary questions, but your solutions are sampled. If you make an incorrect choice you are given specific advice or feedback to help you correct your solution. If your instructor permits, you are allowed to try again, until you get a perfect score.

A subscription to the Gradiance service is offered with all new copies of this text sold in North America. For more information, visit the Addison-Wesley web site www.aw.com/gradiance or send email to computing@aw.com.

Support on the World Wide Web

The book's home page is

http://www-db.stanford.edu/~ullman/ialc.html

Here are solutions to starred exercises, errata as we learn of them, and backup materials. We hope to make available the notes for each offering of CS154 as we teach it, including homeworks, solutions, and exams.

Acknowledgements

A handout on "how to do proofs" by Craig Silverstein influenced some of the material in Chapter 1. Comments and errata on drafts of the second edition (2000) were received from: Zoe Abrams, George Candea, Haowen Chen, Byong-Gun Chun, Jeffrey Shallit, Bret Taylor, Jason Townsend, and Erik Uzureau.
We also received many emails pointing out errata in the second edition of this book, and these were acknowledged on-line in the errata sheets for that edition. However, we would like to mention here the following people who provided large numbers of significant errata: Zeki Bayram, Sebastian Hick, Kang-Rae Lee, Christian Lemburg, Nezam Mahdavi-Amiri, Dave Maier, A. P. Marathe, Mark Meuleman, Mustafa Sait-Ametov, Alexey Sarytchev, Jukka Suomela, Rod Topor, Po-Lian Tsai, Tom Whaley, Aaron Windsor, and Jacinth H.T. Wu. The help of all these people is gratefully acknowledged. Remaining errors are ours, of course.

J. E. H.
R. M.
J. D. U.
Ithaca NY and Stanford CA
February, 2006

Table of Contents

1 Automata: The Methods and the Madness
  1.1 Why Study Automata Theory?
    1.1.1 Introduction to Finite Automata
    1.1.2 Structural Representations
    1.1.3 Automata and Complexity
  1.2 Introduction to Formal Proof
    1.2.1 Deductive Proofs
    1.2.2 Reduction to Definitions
    1.2.3 Other Theorem Forms
    1.2.4 Theorems That Appear Not to Be If-Then Statements
  1.3 Additional Forms of Proof
    1.3.1 Proving Equivalences About Sets
    1.3.2 The Contrapositive
    1.3.3 Proof by Contradiction
    1.3.4 Counterexamples
  1.4 Inductive Proofs
    1.4.1 Inductions on Integers
    1.4.2 More General Forms of Integer Inductions
    1.4.3 Structural Inductions
    1.4.4 Mutual Inductions
  1.5 The Central Concepts of Automata Theory
    1.5.1 Alphabets
    1.5.2 Strings
    1.5.3 Languages
    1.5.4 Problems
  1.6 Summary of Chapter 1
  1.7 Gradiance Problems for Chapter 1
  1.8 References for Chapter 1

2 Finite Automata
  2.1 An Informal Picture of Finite Automata
    2.1.1 The Ground Rules
    2.1.2 The Protocol
    2.1.3 Enabling the Automata to Ignore Actions
    2.1.4 The Entire System as an Automaton
    2.1.5 Using the Product Automaton to Validate the Protocol
  2.2 Deterministic Finite Automata
    2.2.1 Definition of a Deterministic Finite Automaton
    2.2.2 How a DFA Processes Strings
    2.2.3 Simpler Notations for DFA's
    2.2.4 Extending the Transition Function to Strings
    2.2.5 The Language of a DFA
    2.2.6 Exercises for Section 2.2
  2.3 Nondeterministic Finite Automata
    2.3.1 An Informal View of Nondeterministic Finite Automata
    2.3.2 Definition of Nondeterministic Finite Automata
    2.3.3 The Extended Transition Function
    2.3.4 The Language of an NFA
    2.3.5 Equivalence of Deterministic and Nondeterministic Finite Automata
    2.3.6 A Bad Case for the Subset Construction
    2.3.7 Exercises for Section 2.3
  2.4 An Application: Text Search
    2.4.1 Finding Strings in Text
    2.4.2 Nondeterministic Finite Automata for Text Search
    2.4.3 A DFA to Recognize a Set of Keywords
    2.4.4 Exercises for Section 2.4
  2.5 Finite Automata With Epsilon-Transitions
    2.5.1 Uses of ε-Transitions
    2.5.2 The Formal Notation for an ε-NFA
    2.5.3 Epsilon-Closures
    2.5.4 Extended Transitions and Languages for ε-NFA's
    2.5.5 Eliminating ε-Transitions
    2.5.6 Exercises for Section 2.5
  2.6 Summary of Chapter 2
  2.7 Gradiance Problems for Chapter 2
  2.8 References for Chapter 2

3 Regular Expressions and Languages
  3.1 Regular Expressions
    3.1.1 The Operators of Regular Expressions
    3.1.2 Building Regular Expressions
    3.1.3 Precedence of Regular-Expression Operators
    3.1.4 Exercises for Section 3.1
  3.2 Finite Automata and Regular Expressions
    3.2.1 From DFA's to Regular Expressions
    3.2.2 Converting DFA's to Regular Expressions by Eliminating States
    3.2.3 Converting Regular Expressions to Automata
    3.2.4 Exercises for Section 3.2
  3.3 Applications of Regular Expressions
    3.3.1 Regular Expressions in UNIX
    3.3.2 Lexical Analysis
    3.3.3 Finding Patterns in Text
    3.3.4 Exercises for Section 3.3
  3.4 Algebraic Laws for Regular Expressions
    3.4.1 Associativity and Commutativity
    3.4.2 Identities and Annihilators
    3.4.3 Distributive Laws
    3.4.4 The Idempotent Law
    3.4.5 Laws Involving Closures
    3.4.6 Discovering Laws for Regular Expressions
    3.4.7 The Test for a Regular-Expression Algebraic Law
    3.4.8 Exercises for Section 3.4
  3.5 Summary of Chapter 3
  3.6 Gradiance Problems for Chapter 3
  3.7 References for Chapter 3

4 Properties of Regular Languages
  4.1 Proving Languages Not to Be Regular
    4.1.1 The Pumping Lemma for Regular Languages
    4.1.2 Applications of the Pumping Lemma
    4.1.3 Exercises for Section 4.1
  4.2 Closure Properties of Regular Languages
    4.2.1 Closure of Regular Languages Under Boolean Operations
    4.2.2 Reversal
    4.2.3 Homomorphisms
    4.2.4 Inverse Homomorphisms
    4.2.5 Exercises for Section 4.2
  4.3 Decision Properties of Regular Languages
    4.3.1 Converting Among Representations
    4.3.2 Testing Emptiness of Regular Languages
    4.3.3 Testing Membership in a Regular Language
    4.3.4 Exercises for Section 4.3
  4.4 Equivalence and Minimization of Automata
    4.4.1 Testing Equivalence of States
    4.4.2 Testing Equivalence of Regular Languages
    4.4.3 Minimization of DFA's
    4.4.4 Why the Minimized DFA Can't Be Beaten
    4.4.5 Exercises for Section 4.4
  4.5 Summary of Chapter 4
  4.6 Gradiance Problems for Chapter 4
  4.7 References for Chapter 4

5 Context-Free Grammars and Languages
  5.1 Context-Free Grammars
    5.1.1 An Informal Example
    5.1.2 Definition of Context-Free Grammars
    5.1.3 Derivations Using a Grammar
    5.1.4 Leftmost and Rightmost Derivations
    5.1.5 The Language of a Grammar
    5.1.6 Sentential Forms
    5.1.7 Exercises for Section 5.1
  5.2 Parse Trees
    5.2.1 Constructing Parse Trees
    5.2.2 The Yield of a Parse Tree
    5.2.3 Inference, Derivations, and Parse Trees
    5.2.4 From Inferences to Trees
    5.2.5 From Trees to Derivations
    5.2.6 From Derivations to Recursive Inferences
    5.2.7 Exercises for Section 5.2
  5.3 Applications of Context-Free Grammars
    5.3.1 Parsers
    5.3.2 The YACC Parser-Generator
    5.3.3 Markup Languages
    5.3.4 XML and Document-Type Definitions
    5.3.5 Exercises for Section 5.3
  5.4 Ambiguity in Grammars and Languages
    5.4.1 Ambiguous Grammars
    5.4.2 Removing Ambiguity From Grammars
    5.4.3 Leftmost Derivations as a Way to Express Ambiguity
    5.4.4 Inherent Ambiguity
    5.4.5 Exercises for Section 5.4
  5.5 Summary of Chapter 5
  5.6 Gradiance Problems for Chapter 5
  5.7 References for Chapter 5

6 Pushdown Automata
  6.1 Definition of the Pushdown Automaton
    6.1.1 Informal Introduction
    6.1.2 The Formal Definition of Pushdown Automata
    6.1.3 A Graphical Notation for PDA's
    6.1.4 Instantaneous Descriptions of a PDA
    6.1.5 Exercises for Section 6.1
  6.2 The Languages of a PDA
    6.2.1 Acceptance by Final State
    6.2.2 Acceptance by Empty Stack
    6.2.3 From Empty Stack to Final State
    6.2.4 From Final State to Empty Stack
    6.2.5 Exercises for Section 6.2
  6.3 Equivalence of PDA's and CFG's
    6.3.1 From Grammars to Pushdown Automata
    6.3.2 From PDA's to Grammars
    6.3.3 Exercises for Section 6.3
  6.4 Deterministic Pushdown Automata
    6.4.1 Definition of a Deterministic PDA
    6.4.2 Regular Languages and Deterministic PDA's
    6.4.3 DPDA's and Context-Free Languages
    6.4.4 DPDA's and Ambiguous Grammars
    6.4.5 Exercises for Section 6.4
  6.5 Summary of Chapter 6
  6.6 Gradiance Problems for Chapter 6
  6.7 References for Chapter 6

7 Properties of Context-Free Languages
  7.1 Normal Forms for Context-Free Grammars
    7.1.1 Eliminating Useless Symbols
    7.1.2 Computing the Generating and Reachable Symbols
    7.1.3 Eliminating ε-Productions
    7.1.4 Eliminating Unit Productions
    7.1.5 Chomsky Normal Form
    7.1.6 Exercises for Section 7.1
  7.2 The Pumping Lemma for Context-Free Languages
    7.2.1 The Size of Parse Trees
    7.2.2 Statement of the Pumping Lemma
    7.2.3 Applications of the Pumping Lemma for CFL's
    7.2.4 Exercises for Section 7.2
  7.3 Closure Properties of Context-Free Languages
    7.3.1 Substitutions
    7.3.2 Applications of the Substitution Theorem
    7.3.3 Reversal
    7.3.4 Intersection With a Regular Language
    7.3.5 Inverse Homomorphism
    7.3.6 Exercises for Section 7.3
  7.4 Decision Properties of CFL's
    7.4.1 Complexity of Converting Among CFG's and PDA's
    7.4.2 Running Time of Conversion to Chomsky Normal Form
    7.4.3 Testing Emptiness of CFL's
    7.4.4 Testing Membership in a CFL
    7.4.5 Preview of Undecidable CFL Problems
    7.4.6 Exercises for Section 7.4
  7.5 Summary of Chapter 7
  7.6 Gradiance Problems for Chapter 7
  7.7 References for Chapter 7

8 Introduction to Turing Machines
  8.1 Problems That Computers Cannot Solve
    8.1.1 Programs that Print "Hello, World"
    8.1.2 The Hypothetical "Hello, World" Tester
    8.1.3 Reducing One Problem to Another
    8.1.4 Exercises for Section 8.1
  8.2 The Turing Machine
    8.2.1 The Quest to Decide All Mathematical Questions
    8.2.2 Notation for the Turing Machine
    8.2.3 Instantaneous Descriptions for Turing Machines
    8.2.4 Transition Diagrams for Turing Machines
    8.2.5 The Language of a Turing Machine
    8.2.6 Turing Machines and Halting
    8.2.7 Exercises for Section 8.2
  8.3 Programming Techniques for Turing Machines
    8.3.1 Storage in the State
    8.3.2 Multiple Tracks
    8.3.3 Subroutines
    8.3.4 Exercises for Section 8.3
  8.4 Extensions to the Basic Turing Machine
    8.4.1 Multitape Turing Machines
    8.4.2 Equivalence of One-Tape and Multitape TM's
    8.4.3 Running Time and the Many-Tapes-to-One Construction
    8.4.4 Nondeterministic Turing Machines
    8.4.5 Exercises for Section 8.4
  8.5 Restricted Turing Machines
    8.5.1 Turing Machines With Semi-infinite Tapes
    8.5.2 Multistack Machines
    8.5.3 Counter Machines
    8.5.4 The Power of Counter Machines
    8.5.5 Exercises for Section 8.5
  8.6 Turing Machines and Computers
    8.6.1 Simulating a Turing Machine by Computer
    8.6.2 Simulating a Computer by a Turing Machine
    8.6.3 Comparing the Running Times of Computers and Turing Machines
  8.7 Summary of Chapter 8
  8.8 Gradiance Problems for Chapter 8
  8.9 References for Chapter 8

9 Undecidability
  9.1 A Language That Is Not Recursively Enumerable
    9.1.1 Enumerating the Binary Strings
    9.1.2 Codes for Turing Machines
    9.1.3 The Diagonalization Language
    9.1.4 Proof That Ld Is Not Recursively Enumerable
    9.1.5 Exercises for Section 9.1
  9.2 An Undecidable Problem That Is RE
    9.2.1 Recursive Languages
    9.2.2 Complements of Recursive and RE Languages
    9.2.3 The Universal Language
    9.2.4 Undecidability of the Universal Language
    9.2.5 Exercises for Section 9.2
  9.3 Undecidable Problems About Turing Machines
    9.3.1 Reductions
    9.3.2 Turing Machines That Accept the Empty Language
    9.3.3 Rice's Theorem and Properties of the RE Languages
    9.3.4 Problems about Turing-Machine Specifications
    9.3.5 Exercises for Section 9.3
  9.4 Post's Correspondence Problem
    9.4.1 Definition of Post's Correspondence Problem
    9.4.2 The "Modified" PCP
    9.4.3 Completion of the Proof of PCP Undecidability
    9.4.4 Exercises for Section 9.4
  9.5 Other Undecidable Problems
    9.5.1 Problems About Programs
    9.5.2 Undecidability of Ambiguity for CFG's
    9.5.3 The Complement of a List Language
    9.5.4 Exercises for Section 9.5
  9.6 Summary of Chapter 9
  9.7 Gradiance Problems for Chapter 9
  9.8 References for Chapter 9

10 Intractable Problems
  10.1 The Classes P and NP
    10.1.1 Problems Solvable in Polynomial Time
    10.1.2 An Example: Kruskal's Algorithm
    10.1.3 Nondeterministic Polynomial Time
    10.1.4 An NP Example: The Traveling Salesman Problem
    10.1.5 Polynomial-Time Reductions
    10.1.6 NP-Complete Problems
    10.1.7 Exercises for Section 10.1
  10.2 An NP-Complete Problem
    10.2.1 The Satisfiability Problem
    10.2.2 Representing SAT Instances
    10.2.3 NP-Completeness of the SAT Problem
    10.2.4 Exercises for Section 10.2
  10.3 A Restricted Satisfiability Problem
    10.3.1 Normal Forms for Boolean Expressions
    10.3.2 Converting Expressions to CNF
    10.3.3 NP-Completeness of CSAT
    10.3.4 NP-Completeness of 3SAT
    10.3.5 Exercises for Section 10.3
  10.4 Additional NP-Complete Problems
. . . 10.4.1 Describing NPcomplete Problems . . . . . 10.4.2 The Problem of Independent Sets . . . . . . 10.4.3 The NodeCover Problem . . . . . . . . . . 10.4.4 The Directed HamiltonCircuit Problem . . 10.4.5 Undirected Hamilton Circuits and the TSP 10.4.6 Summary of NPComplete Problems . . . . 10.4.7 Exercises for Section 10.4 . . . . . . . . . . Summary of Chapter 10 . . . . . . . . . . . . . . . Gradiance Problems for Chapter 10 . . . . . . . . . References for Chapter 10 . . . . . . . . . . . . . . 11 Additional Classes of Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452 . 456 . 458 . 458 . 459 . 459 . 463 . 465 . 471 . 473 . 473 . 477 . 478 . 481 483 11.1 Complements of Languages in NP . . . . . . . . . . . . . . . . . 484 11.1.1 The Class of Languages CoNP . . . . . . . . . . . . . . 484 11.1.2 NPComplete Problems and CoNP . . . . . . . . . . . . 485 11.1.3 Exercises for Section 11.1 . . . . . . . . . . . . . . . . . . 486 11.2 Problems Solvable in Polynomial Space . . . . . . . . . . . . . . 487 11.2.1 PolynomialSpace Turing Machines . . . . . . . . . . . . . 487 11.2.2 Relationship of PS and NPS to Previously Dened Classes488 11.2.3 Deterministic and Nondeterministic Polynomial Space . . 490 11.3 A Problem That Is Complete for PS . . . . . . . . . . . . . . . . 492 11.3.1 PSCompleteness . . . . . . . . . . . . . . . . . . . . . . . 492 11.3.2 Quantied Boolean Formulas . . . . . . . . . . . . . . . . 493 11.3.3 Evaluating Quantied Boolean Formulas . . . . . . . . . . 494 11.3.4 PSCompleteness of the QBF Problem . . . . . . . . . . . 496 11.3.5 Exercises for Section 11.3 . . . . . . . . . . . . . . . . . . 501 11.4 Language Classes Based on Randomization . . . . . . . . . . . . 501 11.4.1 Quicksort: an Example of a Randomized Algorithm . . . 502 11.4.2 A TuringMachine Model Using Randomization . . . 
. . . 503 11.4.3 The Language of a Randomized Turing Machine . . . . . 504 11.4.4 The Class RP . . . . . . . . . . . . . . . . . . . . . . . . 506 11.4.5 Recognizing Languages in RP . . . . . . . . . . . . . . . 508 11.4.6 The Class ZPP . . . . . . . . . . . . . . . . . . . . . . . 509 11.4.7 Relationship Between RP and ZPP . . . . . . . . . . . . 510 11.4.8 Relationships to the Classes P and NP . . . . . . . . . . 511 11.5 The Complexity of Primality Testing . . . . . . . . . . . . . . . . 512 11.5.1 The Importance of Testing Primality . . . . . . . . . . . . 512 11.5.2 Introduction to Modular Arithmetic . . . . . . . . . . . . 514 11.5.3 The Complexity of ModularArithmetic Computations . . 516 11.5.4 RandomPolynomial Primality Testing . . . . . . . . . . . 517 11.5.5 Nondeterministic Primality Tests . . . . . . . . . . . . . . 518 TABLE OF CONTENTS 11.5.6 Exercises for Section 11.5 . . 11.6 Summary of Chapter 11 . . . . . . . 11.7 Gradiance Problems for Chapter 11 . 11.8 References for Chapter 11 . . . . . . Index xvii . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 521 . 522 . 523 . 524 527 Chapter 1 Automata: The Methods and the Madness Automata theory is the study of abstract computing devices, or \machines." Before there were computers, in the 1930's, A. Turing studied an abstract machine that had all the capabilities of today's computers, at least as far as in what they could compute. Turing's goal was to describe precisely the boundary between what a computing machine could do and what it could not do his conclusions apply not only to his abstract Turing machines, but to today's real machines. In the 1940's and 1950's, simpler kinds of machines, which we today call \ nite automata," were studied by a number of researchers. 
These automata, originally proposed to model brain function, turned out to be extremely useful for a variety of other purposes, which we shall mention in Section 1.1. Also in the late 1950's, the linguist N. Chomsky began the study of formal "grammars." While not strictly machines, these grammars have close relationships to abstract automata and serve today as the basis of some important software components, including parts of compilers.

In 1969, S. Cook extended Turing's study of what could and what could not be computed. Cook was able to separate those problems that can be solved efficiently by computer from those problems that can in principle be solved, but in practice take so much time that computers are useless for all but very small instances of the problem. The latter class of problems is called "intractable," or "NP-hard." It is highly unlikely that even the exponential improvement in computing speed that computer hardware has been following ("Moore's Law") will have significant impact on our ability to solve large instances of intractable problems.

All of these theoretical developments bear directly on what computer scientists do today. Some of the concepts, like finite automata and certain kinds of formal grammars, are used in the design and construction of important kinds of software. Other concepts, like the Turing machine, help us understand what we can expect from our software. Especially, the theory of intractable problems lets us deduce whether we are likely to be able to meet a problem "head-on" and write a program to solve it (because it is not in the intractable class), or whether we have to find some way to work around the intractable problem: find an approximation, use a heuristic, or use some other method to limit the amount of time the program will spend solving the problem.

In this introductory chapter, we begin with a very high-level view of what automata theory is about, and what its uses are.
Much of the chapter is devoted to a survey of proof techniques and tricks for discovering proofs. We cover deductive proofs, reformulating statements, proofs by contradiction, proofs by induction, and other important concepts. A final section introduces the concepts that pervade automata theory: alphabets, strings, and languages.

1.1 Why Study Automata Theory?

There are several reasons why the study of automata and complexity is an important part of the core of Computer Science. This section serves to introduce the reader to the principal motivation and also outlines the major topics covered in this book.

1.1.1 Introduction to Finite Automata

Finite automata are a useful model for many important kinds of hardware and software. We shall see, starting in Chapter 2, examples of how the concepts are used. For the moment, let us just list some of the most important kinds:

1. Software for designing and checking the behavior of digital circuits.

2. The "lexical analyzer" of a typical compiler, that is, the compiler component that breaks the input text into logical units, such as identifiers, keywords, and punctuation.

3. Software for scanning large bodies of text, such as collections of Web pages, to find occurrences of words, phrases, or other patterns.

4. Software for verifying systems of all types that have a finite number of distinct states, such as communications protocols or protocols for secure exchange of information.

While we shall soon meet a precise definition of automata of various types, let us begin our informal introduction with a sketch of what a finite automaton is and does. There are many systems or components, such as those enumerated above, that may be viewed as being at all times in one of a finite number of "states." The purpose of a state is to remember the relevant portion of the system's history. Since there are only a finite number of states, the entire history generally cannot be remembered, so the system must be designed carefully, to remember what is important and forget what is not. The advantage of having only a finite number of states is that we can implement the system with a fixed set of resources. For example, we could implement it in hardware as a circuit, or as a simple form of program that can make decisions looking only at a limited amount of data or using the position in the code itself to make the decision.

Example 1.1: Perhaps the simplest nontrivial finite automaton is an on/off switch. The device remembers whether it is in the "on" state or the "off" state, and it allows the user to press a button whose effect is different, depending on the state of the switch. That is, if the switch is in the off state, then pressing the button changes it to the on state, and if the switch is in the on state, then pressing the same button turns it to the off state.

    Start --> (off) --Push--> (on)
              (on)  --Push--> (off)

Figure 1.1: A finite automaton modeling an on/off switch

The finite-automaton model for the switch is shown in Fig. 1.1. As for all finite automata, the states are represented by circles; in this example, we have named the states on and off. Arcs between states are labeled by "inputs," which represent external influences on the system. Here, both arcs are labeled by the input Push, which represents a user pushing the button. The intent of the two arcs is that whichever state the system is in, when the Push input is received it goes to the other state.

One of the states is designated the "start state," the state in which the system is placed initially. In our example, the start state is off, and we conventionally indicate the start state by the word Start and an arrow leading to that state.

It is often necessary to indicate one or more states as "final" or "accepting" states. Entering one of these states after a sequence of inputs indicates that the input sequence is good in some way. For instance, we could have regarded the state on in Fig. 1.1 as accepting, because in that state, the device being controlled by the switch will operate. It is conventional to designate accepting states by a double circle, although we have not made any such designation in Fig. 1.1. □

Example 1.2: Sometimes, what is remembered by a state can be much more complex than an on/off choice. Figure 1.2 shows another finite automaton that could be part of a lexical analyzer. The job of this automaton is to recognize the keyword then. It thus needs five states, each of which represents a different position in the word then that has been reached so far. These positions correspond to the prefixes of the word, ranging from the empty string (i.e., nothing of the word has been seen so far) to the complete word.

    Start --> (empty) --t--> (t) --h--> (th) --e--> (the) --n--> (then)

Figure 1.2: A finite automaton modeling recognition of then

In Fig. 1.2, the five states are named by the prefix of then seen so far. Inputs correspond to letters. We may imagine that the lexical analyzer examines one character of the program that it is compiling at a time, and the next character to be examined is the input to the automaton. The start state corresponds to the empty string, and each state has a transition on the next letter of then to the state that corresponds to the next-larger prefix. The state named then is entered when the input has spelled the word then. Since it is the job of this automaton to recognize when then has been seen, we could consider that state the lone accepting state. □

1.1.2 Structural Representations

There are two important notations that are not automaton-like, but play an important role in the study of automata and their applications.

1. Grammars are useful models when designing software that processes data with a recursive structure.
The best-known example is a "parser," the component of a compiler that deals with the recursively nested features of the typical programming language, such as expressions (arithmetic, conditional, and so on). For instance, a grammatical rule like E → E + E states that an expression can be formed by taking any two expressions and connecting them by a plus sign; this rule is typical of how expressions of real programming languages are formed. We introduce context-free grammars, as they are usually called, in Chapter 5.

2. Regular Expressions also denote the structure of data, especially text strings. As we shall see in Chapter 3, the patterns of strings they describe are exactly the same as what can be described by finite automata. The style of these expressions differs significantly from that of grammars, and we shall content ourselves with a simple example here. The UNIX-style regular expression '[A-Z][a-z]*[ ][A-Z][A-Z]' represents capitalized words followed by a space and two capital letters. This expression represents patterns in text that could be a city and state, e.g., Ithaca NY. It misses multiword city names, such as Palo Alto CA, which could be captured by the more complex expression

'[A-Z][a-z]*([ ][A-Z][a-z]*)*[ ][A-Z][A-Z]'

When interpreting such expressions, we only need to know that [A-Z] represents a range of characters from capital "A" to capital "Z" (i.e., any capital letter), and [ ] is used to represent the blank character alone. Also, the symbol * represents "any number of" the preceding expression. Parentheses are used to group components of the expression; they do not represent characters of the text described.

1.1.3 Automata and Complexity

Automata are essential for the study of the limits of computation. As we mentioned in the introduction to the chapter, there are two important issues:

1. What can a computer do at all? This study is called "decidability," and the problems that can be solved by computer are called "decidable."
This topic is addressed in Chapter 9.

2. What can a computer do efficiently? This study is called "intractability," and the problems that can be solved by a computer using no more time than some slowly growing function of the size of the input are called "tractable." Often, we take all polynomial functions to be "slowly growing," while functions that grow faster than any polynomial are deemed to grow too fast. The subject is studied in Chapter 10.

1.2 Introduction to Formal Proof

If you studied plane geometry in high school any time before the 1990's, you most likely had to do some detailed "deductive proofs," where you showed the truth of a statement by a detailed sequence of steps and reasons. While geometry has its practical side (e.g., you need to know the rule for computing the area of a rectangle if you need to buy the correct amount of carpet for a room), the study of formal proof methodologies was at least as important a reason for covering this branch of mathematics in high school.

In the USA of the 1990's it became popular to teach proof as a matter of personal feelings about the statement. While it is good to feel the truth of a statement you need to use, important techniques of proof are no longer mastered in high school. Yet proof is something that every computer scientist needs to understand. Some computer scientists take the extreme view that a formal proof of the correctness of a program should go hand-in-hand with the writing of the program itself. We doubt that doing so is productive. On the other hand, there are those who say that proof has no place in the discipline of programming. The slogan "if you are not sure your program is correct, run it and see" is commonly offered by this camp.

Our position is between these two extremes. Testing programs is surely essential. However, testing goes only so far, since you cannot try your program on every input. More importantly, if your program is complex, say a tricky recursion or iteration, then if you don't understand what is going on as you go around a loop or call a function recursively, it is unlikely that you will write the code correctly. When your testing tells you the code is incorrect, you still need to get it right.

To make your iteration or recursion correct, you need to set up an inductive hypothesis, and it is helpful to reason, formally or informally, that the hypothesis is consistent with the iteration or recursion. This process of understanding the workings of a correct program is essentially the same as the process of proving theorems by induction. Thus, in addition to giving you models that are useful for certain types of software, it has become traditional for a course on automata theory to cover methodologies of formal proof. Perhaps more than other core subjects of computer science, automata theory lends itself to natural and interesting proofs, both of the deductive kind (a sequence of justified steps) and the inductive kind (recursive proofs of a parameterized statement that use the statement itself with "lower" values of the parameter).

1.2.1 Deductive Proofs

As mentioned above, a deductive proof consists of a sequence of statements whose truth leads us from some initial statement, called the hypothesis or the given statement(s), to a conclusion statement. Each step in the proof must follow, by some accepted logical principle, from either the given facts, or some of the previous statements in the deductive proof, or a combination of these.

The hypothesis may be true or false, typically depending on values of its parameters. Often, the hypothesis consists of several independent statements connected by a logical AND. In those cases, we talk of each of these statements as a hypothesis, or as a given statement.

The theorem that is proved when we go from a hypothesis H to a conclusion C is the statement "if H then C." We say that C is deduced from H.
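The structure of an "if H then C" claim can be made concrete in code. The following sketch is ours, not the book's; the helper name `check_if_then` is hypothetical. It treats H and C as predicates over a parameter and searches a finite sample for a value where H holds but C fails:

```python
# Hypothetical helper: empirically test an "if H then C" claim over a
# finite sample of parameter values. Passing is evidence, not a proof;
# a single failure, however, refutes the claim.
def check_if_then(hypothesis, conclusion, candidates):
    """Return the first value where H is true but C is false, else None."""
    for x in candidates:
        if hypothesis(x) and not conclusion(x):
            return x
    return None

# A true claim: if x is a multiple of 4, then x is even.
assert check_if_then(lambda x: x % 4 == 0,
                     lambda x: x % 2 == 0, range(100)) is None

# A false claim: if x is even, then x is a multiple of 4 (fails at x = 2).
assert check_if_then(lambda x: x % 2 == 0,
                     lambda x: x % 4 == 0, range(1, 100)) == 2
```

Note that such a check only tests the claim where the hypothesis holds; values where H is false say nothing either way, just as the text observes that H is neither true nor false until its parameters are fixed.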
An example theorem of the form "if H then C" will illustrate these points.

Theorem 1.3: If x ≥ 4, then 2^x ≥ x^2. □

It is not hard to convince ourselves informally that Theorem 1.3 is true, although a formal proof requires induction and will be left for Example 1.17.

First, notice that the hypothesis H is "x ≥ 4." This hypothesis has a parameter, x, and thus is neither true nor false. Rather, its truth depends on the value of the parameter x; e.g., H is true for x = 6 and false for x = 2.

Likewise, the conclusion C is "2^x ≥ x^2." This statement also uses parameter x and is true for certain values of x and not others. For example, C is false for x = 3, since 2^3 = 8, which is not as large as 3^2 = 9. On the other hand, C is true for x = 4, since 2^4 = 4^2 = 16. For x = 5, the statement is also true, since 2^5 = 32 is at least as large as 5^2 = 25.

Perhaps you can see the intuitive argument that tells us the conclusion 2^x ≥ x^2 will be true whenever x ≥ 4. We already saw that it is true for x = 4. As x grows larger than 4, the left side, 2^x, doubles each time x increases by 1. However, the right side, x^2, grows by the ratio ((x+1)/x)^2. If x ≥ 4, then (x+1)/x cannot be greater than 1.25, and therefore ((x+1)/x)^2 cannot be bigger than 1.5625. Since 1.5625 < 2, each time x increases above 4 the left side 2^x grows by more than the right side x^2. Thus, as long as we start from a value like x = 4 where the inequality 2^x ≥ x^2 is already satisfied, we can increase x as much as we like, and the inequality will still be satisfied.

We have now completed an informal but accurate proof of Theorem 1.3. We shall return to the proof and make it more precise in Example 1.17, after we introduce "inductive" proofs.

Theorem 1.3, like all interesting theorems, involves an infinite number of related facts, in this case the statement "if x ≥ 4 then 2^x ≥ x^2" for all integers x.
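The informal argument above is easy to spot-check numerically. The following sketch is ours, not part of the book's proof; a finite check is evidence rather than proof, which the induction in Example 1.17 supplies:

```python
# Theorem 1.3 holds on a sample of integers x >= 4 ...
for x in range(4, 50):
    assert 2**x >= x**2

# ... while the conclusion alone fails below the threshold:
assert 2**3 < 3**2   # 8 < 9, so C is false at x = 3

# The ratio bound from the informal argument: for x >= 4,
# ((x+1)/x)^2 is at most (5/4)^2 = 1.5625, which is less than 2.
for x in range(4, 50):
    assert ((x + 1) / x) ** 2 <= 1.5625 < 2
```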
In fact, we do not need to assume x is an integer, but the proof talked about repeatedly increasing x by 1, starting at x = 4, so we really addressed only the situation where x is an integer.

Theorem 1.3 can be used to help deduce other theorems. In the next example, we consider a complete deductive proof of a simple theorem that uses Theorem 1.3.

Theorem 1.4: If x is the sum of the squares of four positive integers, then 2^x ≥ x^2.

PROOF: The intuitive idea of the proof is that if the hypothesis is true for x, that is, x is the sum of the squares of four positive integers, then x must be at least 4. Therefore, the hypothesis of Theorem 1.3 holds, and since we believe that theorem, we may state that its conclusion is also true for x. The reasoning can be expressed as a sequence of steps. Each step is either the hypothesis of the theorem to be proved, part of that hypothesis, or a statement that follows from one or more previous statements.

By "follows" we mean that if the hypothesis of some theorem is a previous statement, then the conclusion of that theorem is true, and can be written down as a statement of our proof. This logical rule is often called modus ponens; i.e., if we know H is true, and we know "if H then C" is true, we may conclude that C is true. We also allow certain other logical steps to be used in creating a statement that follows from one or more previous statements. For instance, if A and B are two previous statements, then we can deduce and write down the statement "A and B."

Figure 1.3 shows the sequence of statements we need to prove Theorem 1.4. While we shall not generally prove theorems in such a stylized form, it helps to think of proofs as very explicit lists of statements, each with a precise justification. In step (1), we have repeated one of the given statements of the theorem: that x is the sum of the squares of four integers. It often helps in proofs if we name quantities that are referred to but not named, and we have done so here, giving the four integers the names a, b, c, and d.

In step (2), we put down the other part of the hypothesis of the theorem: that the values being squared are each at least 1. Technically, this statement represents four distinct statements, one for each of the four integers involved.
It often helps in proofs if we name quantities that are referred to but not named, and we have done so here, giving the four integers the names a, b, c, and d. In step (2), we put down the other part of the hypothesis of the theorem: that the values being squared are each at least 1. Technically, this statement represents four distinct statements, one for each of the four integers involved. ; CHAPTER 1. AUTOMATA: THE METHODS AND THE MADNESS 8 1. 2. 3. 4. 5. Statement x = a2 + b2 + c2 + d2 a1 b1 c1 d1 a2 1 b2 1 c2 1 d2 1 x4 2x x2 Justi cation Given Given (2) and properties of arithmetic (1), (3), and properties of arithmetic (4) and Theorem 1.3 Figure 1.3: A formal proof of Theorem 1.4 Then, in step (3) we observe that if a number is at least 1, then its square is also at least 1. We use as a justi cation the fact that statement (2) holds, and \properties of arithmetic." That is, we assume the reader knows, or can prove simple statements about how inequalities work, such as the statement \if y 1, then y2 1." Step (4) uses statements (1) and (3). The rst statement tells us that x is the sum of the four squares in question, and statement (3) tells us that each of the squares is at least 1. Again using wellknown properties of arithmetic, we conclude that x is at least 1 + 1 + 1 + 1, or 4. At the nal step (5), we use statement (4), which is the hypothesis of Theorem 1.3. The theorem itself is the justi cation for writing down its conclusion, since its hypothesis is a previous statement. Since the statement (5) that is the conclusion of Theorem 1.3 is also the conclusion of Theorem 1.4, we have now proved Theorem 1.4. That is, we have started with the hypothesis of that theorem, and have managed to deduce its conclusion. 2 1.2.2 Reduction to Denitions In the previous two theorems, the hypotheses used terms that should have been familiar: integers, addition, and multiplication, for instance. 
In many other theorems, including many from automata theory, the terms used in the statement may have implications that are less obvious. A useful way to proceed in many proofs is:

If you are not sure how to start a proof, convert all terms in the hypothesis to their definitions.

Here is an example of a theorem that is simple to prove once we have expressed its statement in elementary terms. It uses the following two definitions:

1. A set S is finite if there exists an integer n such that S has exactly n elements. We write ||S|| = n, where ||S|| is used to denote the number of elements in a set S. If the set S is not finite, we say S is infinite. Intuitively, an infinite set is a set that contains more than any integer number of elements.

2. If S and T are both subsets of some set U, then T is the complement of S (with respect to U) if S ∪ T = U and S ∩ T = ∅. That is, each element of U is in exactly one of S and T; put another way, T consists of exactly those elements of U that are not in S.

Theorem 1.5: Let S be a finite subset of some infinite set U. Let T be the complement of S with respect to U. Then T is infinite.

PROOF: Intuitively, this theorem says that if you have an infinite supply of something (U), and you take a finite amount away (S), then you still have an infinite amount left. Let us begin by restating the facts of the theorem as in Fig. 1.4.

Original Statement              New Statement
S is finite                     There is an integer n such that ||S|| = n
U is infinite                   For no integer p is ||U|| = p
T is the complement of S        S ∪ T = U and S ∩ T = ∅

Figure 1.4: Restating the givens of Theorem 1.5

We are still stuck, so we need to use a common proof technique called "proof by contradiction." In this proof method, to be discussed further in Section 1.3.3, we assume that the conclusion is false. We then use that assumption, together with parts of the hypothesis, to prove the opposite of one of the given statements of the hypothesis.
We have then shown that it is impossible for all parts of the hypothesis to be true and for the conclusion to be false at the same time. The only possibility that remains is for the conclusion to be true whenever the hypothesis is true. That is, the theorem is true.

In the case of Theorem 1.5, the contradiction of the conclusion is "T is finite." Let us assume T is finite, along with the statement of the hypothesis that says S is finite; i.e., ||S|| = n for some integer n. Similarly, we can restate the assumption that T is finite as ||T|| = m for some integer m.

Now one of the given statements tells us that S ∪ T = U, and S ∩ T = ∅. That is, the elements of U are exactly the elements of S and T. Thus, there must be n + m elements of U. Since n + m is an integer, and we have shown ||U|| = n + m, it follows that U is finite. More precisely, we showed the number of elements in U is some integer, which is the definition of "finite." But the statement that U is finite contradicts the given statement that U is infinite. We have thus used the contradiction of our conclusion to prove the contradiction of one of the given statements of the hypothesis, and by the principle of "proof by contradiction" we may conclude the theorem is true. □

Proofs do not have to be so wordy. Having seen the ideas behind the proof, let us reprove the theorem in a few lines.

Statements With Quantifiers

Many theorems involve statements that use the quantifiers "for all" and "there exists," or similar variations, such as "for every" instead of "for all." The order in which these quantifiers appear affects what the statement means. It is often helpful to see statements with more than one quantifier as a "game" between two players, "for-all" and "there-exists," who take turns specifying values for the parameters mentioned in the theorem. "For-all" must consider all possible choices, so for-all's choices are generally left as variables.
However, "there-exists" only has to pick one value, which may depend on the values picked by the players previously. The order in which the quantifiers appear in the statement determines who goes first. If the last player to make a choice can always find some allowable value, then the statement is true.

For example, consider an alternative definition of "infinite set": set S is infinite if and only if for all integers n, there exists a subset T of S with exactly n members. Here, "for-all" precedes "there-exists," so we must consider an arbitrary integer n. Now, "there-exists" gets to pick a subset T, and may use the knowledge of n to do so. For instance, if S were the set of integers, "there-exists" could pick the subset T = {1, 2, ..., n} and thereby succeed regardless of n. That is a proof that the set of integers is infinite.

The following statement looks like the definition of "infinite," but is incorrect because it reverses the order of the quantifiers: "there exists a subset T of set S such that for all n, set T has exactly n members." Now, given a set S such as the integers, player "there-exists" can pick any set T; say {1, 2, 5} is picked. For this choice, player "for-all" must show that T has n members for every possible n. However, "for-all" cannot do so. For instance, it is false for n = 4, or in fact for any n ≠ 3.

PROOF: (of Theorem 1.5) We know that S ∪ T = U and S and T are disjoint, so ||S|| + ||T|| = ||U||. Since S is finite, ||S|| = n for some integer n, and since U is infinite, there is no integer p such that ||U|| = p. So assume that T is finite; that is, ||T|| = m for some integer m. Then ||U|| = ||S|| + ||T|| = n + m, which contradicts the given statement that there is no integer p equal to ||U||. □

1.2.3 Other Theorem Forms

The "if-then" form of theorem is most common in typical areas of mathematics. However, we see other kinds of statements proved as theorems also. In this section, we shall examine the most common forms of statement and what we usually need to do to prove them.

Ways of Saying "If-Then"

First, there are a number of kinds of theorem statements that look different from a simple "if H then C" form, but are in fact saying the same thing: if hypothesis H is true for a given value of the parameter(s), then the conclusion C is true for the same value. Here are some of the other ways in which "if H then C" might appear.

1. H implies C.
2. H only if C.
3. C if H.
4. Whenever H holds, C follows.

We also see many variants of form (4), such as "if H holds, then C follows," or "whenever H holds, C holds."

Example 1.6: The statement of Theorem 1.3 would appear in these four forms as:

1. x ≥ 4 implies 2^x ≥ x^2.
2. x ≥ 4 only if 2^x ≥ x^2.
3. 2^x ≥ x^2 if x ≥ 4.
4. Whenever x ≥ 4, 2^x ≥ x^2 follows. □

In addition, in formal logic one often sees the operator → in place of "if-then." That is, the statement "if H then C" could appear as H → C in some mathematical literature; we shall not use it here.

If-And-Only-If Statements

Sometimes, we find a statement of the form "A if and only if B." Other forms of this statement are "A iff B,"¹ "A is equivalent to B," or "A exactly when B." This statement is actually two if-then statements: "if A then B," and "if B then A." We prove "A if and only if B" by proving these two statements:

1. The if part: "if B then A," and

2. The only-if part: "if A then B," which is often stated in the equivalent form "A only if B."

How Formal Do Proofs Have to Be?

The answer to this question is not easy. The bottom line regarding proofs is that their purpose is to convince someone, whether it is a grader of your classwork or yourself, about the correctness of a strategy you are using in your code. If it is convincing, then it is enough; if it fails to convince the "consumer" of the proof, then the proof has left out too much.

Part of the uncertainty regarding proofs comes from the different knowledge that the consumer may have.
Thus, in Theorem 1.4, we assumed you knew all about arithmetic, and would believe a statement like "if y ≥ 1 then y² ≥ 1." If you were not familiar with arithmetic, we would have to prove that statement by some steps in our deductive proof.

However, there are certain things that are required in proofs, and omitting them surely makes the proof inadequate. For instance, any deductive proof that uses statements which are not justified by the given or previous statements cannot be adequate. When doing a proof of an "if and only if" statement, we must surely have one proof for the "if" part and another proof for the "only-if" part. As an additional example, inductive proofs (discussed in Section 1.4) require proofs of the basis and induction parts.

The proofs can be presented in either order. In many theorems, one part is decidedly easier than the other, and it is customary to present the easy direction first and get it out of the way.

In formal logic, one may see the operator ↔ or ≡ to denote an "if-and-only-if" statement. That is, A ≡ B and A ↔ B mean the same as "A if and only if B."

When proving an if-and-only-if statement, it is important to remember that you must prove both the "if" and "only-if" parts. Sometimes, you will find it helpful to break an if-and-only-if into a succession of several equivalences. That is, to prove "A if and only if B," you might first prove "A if and only if C," and then prove "C if and only if B." That method works, as long as you remember that each if-and-only-if step must be proved in both directions. Proving any one step in only one of the directions invalidates the entire proof.

The following is an example of a simple if-and-only-if proof. It uses the notations:

1. ⌊x⌋, the floor of real number x, is the greatest integer equal to or less than x.

¹Iff, short for "if and only if," is a non-word that is used in some mathematical treatises for succinctness.
2. ⌈x⌉, the ceiling of real number x, is the least integer equal to or greater than x.

Theorem 1.7: Let x be a real number. Then ⌊x⌋ = ⌈x⌉ if and only if x is an integer.

PROOF: (Only-if part) In this part, we assume ⌊x⌋ = ⌈x⌉ and try to prove x is an integer. Using the definitions of the floor and ceiling, we notice that ⌊x⌋ ≤ x, and ⌈x⌉ ≥ x. However, we are given that ⌊x⌋ = ⌈x⌉. Thus, we may substitute the floor for the ceiling in the first inequality to conclude ⌈x⌉ ≤ x. Since both ⌈x⌉ ≤ x and ⌈x⌉ ≥ x hold, we may conclude by properties of arithmetic inequalities that ⌈x⌉ = x. Since ⌈x⌉ is always an integer, x must also be an integer in this case.

(If part) Now, we assume x is an integer and try to prove ⌊x⌋ = ⌈x⌉. This part is easy. By the definitions of floor and ceiling, when x is an integer, both ⌊x⌋ and ⌈x⌉ are equal to x, and therefore equal to each other. □

1.2.4 Theorems That Appear Not to Be If-Then Statements

Sometimes, we encounter a theorem that appears not to have a hypothesis. An example is the well-known fact from trigonometry:

Theorem 1.8: sin²θ + cos²θ = 1. □

Actually, this statement does have a hypothesis, and the hypothesis consists of all the statements you need to know to interpret the statement. In particular, the hidden hypothesis is that θ is an angle, and therefore the functions sine and cosine have their usual meaning for angles. From the definitions of these terms, and the Pythagorean Theorem (in a right triangle, the square of the hypotenuse equals the sum of the squares of the other two sides), you could prove the theorem. In essence, the if-then form of the theorem is really: "if θ is an angle, then sin²θ + cos²θ = 1."

1.3 Additional Forms of Proof

In this section, we take up several additional topics concerning how to construct proofs:

1. Proofs about sets.
2. Proofs by contradiction.
3. Proofs by counterexample.
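Theorem 1.7 lends itself to a quick numerical spot check. The sketch below (Python, used here purely for illustration; the function name is ours, not the book's) compares math.floor and math.ceil, which compute ⌊x⌋ and ⌈x⌉:

```python
import math

def floor_equals_ceiling(x: float) -> bool:
    """Return True when the floor and ceiling of x coincide."""
    return math.floor(x) == math.ceil(x)

# By Theorem 1.7, equality holds exactly when x is an integer.
assert floor_equals_ceiling(3.0)       # 3.0 is an integer value
assert not floor_equals_ceiling(3.5)   # floor(3.5) = 3 but ceil(3.5) = 4
assert all(floor_equals_ceiling(x) == float(x).is_integer()
           for x in [-2.5, -2.0, 0.0, 0.25, 7.0])
```

A finite check like this cannot replace the proof, but it can catch a misstated theorem early.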
1.3.1 Proving Equivalences About Sets

In automata theory, we are frequently asked to prove a theorem which says that the sets constructed in two different ways are the same sets. Often, these sets are sets of character strings, and the sets are called "languages," but in this section the nature of the sets is unimportant. If E and F are two expressions representing sets, the statement E = F means that the two sets represented are the same. More precisely, every element in the set represented by E is in the set represented by F, and every element in the set represented by F is in the set represented by E.

Example 1.9: The commutative law of union says that we can take the union of two sets R and S in either order. That is, R ∪ S = S ∪ R. In this case, E is the expression R ∪ S and F is the expression S ∪ R. The commutative law of union says that E = F. □

We can write a set-equality E = F as an if-and-only-if statement: an element x is in E if and only if x is in F. As a consequence, we see the outline of a proof of any statement that asserts the equality of two sets E = F; it follows the form of any if-and-only-if proof:

1. Prove that if x is in E, then x is in F.
2. Prove that if x is in F, then x is in E.

As an example of this proof process, let us prove the distributive law of union over intersection:

Theorem 1.10: R ∪ (S ∩ T) = (R ∪ S) ∩ (R ∪ T).

PROOF: The two set-expressions involved are E = R ∪ (S ∩ T) and F = (R ∪ S) ∩ (R ∪ T). We shall prove the two parts of the theorem in turn. In the "if" part we assume element x is in E and show it is in F. This part, summarized in Fig. 1.5, uses the definitions of union and intersection, with which we assume you are familiar.

Then, we must prove the "only-if" part of the theorem. Here, we assume x is in F and show it is in E. The steps are summarized in Fig. 1.6. Since we have now proved both parts of the if-and-only-if statement, the distributive law of union over intersection is proved.
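Theorem 1.10 can also be checked exhaustively on small finite sets, since Python's set operators | and & compute union and intersection. This brute-force sketch (illustrative only, not part of the text's proof) tries every triple of subsets of a three-element universe:

```python
from itertools import combinations

def distributes(R, S, T):
    """Check, for concrete sets, that R | (S & T) == (R | S) & (R | T)."""
    return R | (S & T) == (R | S) & (R | T)

# Enumerate every subset of a small universe, so the check is exhaustive over it.
universe = [0, 1, 2]
subsets = [set(c) for r in range(len(universe) + 1)
           for c in combinations(universe, r)]

assert all(distributes(R, S, T)
           for R in subsets for S in subsets for T in subsets)
```

A check over one small universe is evidence, not proof; the if-and-only-if argument in the text is what covers all sets.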
□

1. x is in R ∪ (S ∩ T)                  [given]
2. x is in R, or x is in S ∩ T          [(1) and definition of union]
3. x is in R, or x is in both S and T   [(2) and definition of intersection]
4. x is in R ∪ S                        [(3) and definition of union]
5. x is in R ∪ T                        [(3) and definition of union]
6. x is in (R ∪ S) ∩ (R ∪ T)            [(4), (5), and definition of intersection]

Figure 1.5: Steps in the "if" part of Theorem 1.10

1. x is in (R ∪ S) ∩ (R ∪ T)            [given]
2. x is in R ∪ S                        [(1) and definition of intersection]
3. x is in R ∪ T                        [(1) and definition of intersection]
4. x is in R, or x is in both S and T   [(2), (3), and reasoning about unions]
5. x is in R, or x is in S ∩ T          [(4) and definition of intersection]
6. x is in R ∪ (S ∩ T)                  [(5) and definition of union]

Figure 1.6: Steps in the "only-if" part of Theorem 1.10

1.3.2 The Contrapositive

Every if-then statement has an equivalent form that in some circumstances is easier to prove. The contrapositive of the statement "if H then C" is "if not C then not H." A statement and its contrapositive are either both true or both false, so we can prove either to prove the other. To see why "if H then C" and "if not C then not H" are logically equivalent, first observe that there are four cases to consider:

1. H and C both true.
2. H true and C false.
3. C true and H false.
4. H and C both false.

There is only one way to make an if-then statement false; the hypothesis must be true and the conclusion false, as in case (2). For the other three cases, including case (4) where the conclusion is false, the if-then statement itself is true.

Now, consider for which cases the contrapositive "if not C then not H" is false. In order for this statement to be false, its hypothesis (which is "not C") must be true, and its conclusion (which is "not H") must be false. But "not C" is true exactly when C is false, and "not H" is false exactly when H is true.
These two conditions are again case (2), which shows that in each of the four cases, the original statement and its contrapositive are either both true or both false; i.e., they are logically equivalent.

Example 1.11: Recall Theorem 1.3, whose statement was: "if x ≥ 4, then 2^x ≥ x²." The contrapositive of this statement is "if not 2^x ≥ x² then not x ≥ 4." In more colloquial terms, making use of the fact that "not a ≥ b" is the same as a < b, the contrapositive is "if 2^x < x² then x < 4." □

Saying "If-And-Only-If" for Sets

As we mentioned, theorems that state equivalences of expressions about sets are if-and-only-if statements. Thus, Theorem 1.10 could have been stated: an element x is in R ∪ (S ∩ T) if and only if x is in (R ∪ S) ∩ (R ∪ T). Another common expression of a set-equivalence is with the locution "all-and-only." For instance, Theorem 1.10 could as well have been stated "the elements of R ∪ (S ∩ T) are all and only the elements of (R ∪ S) ∩ (R ∪ T)."

The Converse

Do not confuse the terms "contrapositive" and "converse." The converse of an if-then statement is the "other direction"; that is, the converse of "if H then C" is "if C then H." Unlike the contrapositive, which is logically equivalent to the original, the converse is not equivalent to the original statement. In fact, the two parts of an if-and-only-if proof are always some statement and its converse.

When we are asked to prove an if-and-only-if theorem, the use of the contrapositive in one of the parts allows us several options. For instance, suppose we want to prove the set equivalence E = F. Instead of proving "if x is in E then x is in F and if x is in F then x is in E," we could also put one direction in the contrapositive. One equivalent proof form is:

If x is in E then x is in F, and if x is not in E then x is not in F.

We could also interchange E and F in the statement above.
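The four-case argument above can be run mechanically. A minimal Python sketch (ours, for illustration) enumerates every truth assignment and confirms that a statement and its contrapositive always agree:

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Material implication: 'if p then q' is false only when p is true and q is false."""
    return (not p) or q

# All four combinations of truth values for H and C (cases 1-4 in the text).
for H, C in product([True, False], repeat=2):
    assert implies(H, C) == implies(not C, not H)  # contrapositive always agrees
    # The converse, implies(C, H), is NOT checked here: it differs in cases (2) and (3).
```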
1.3.3 Proof by Contradiction

Another way to prove a statement of the form "if H then C" is to prove the statement

"H and not C implies falsehood."

That is, start by assuming both the hypothesis H and the negation of the conclusion C. Complete the proof by showing that something known to be false follows logically from H and not C. This form of proof is called proof by contradiction.

Example 1.12: Recall Theorem 1.5, where we proved the if-then statement with hypothesis H = "U is an infinite set, S is a finite subset of U, and T is the complement of S with respect to U." The conclusion C was "T is infinite." We proceeded to prove this theorem by contradiction. We assumed "not C"; that is, we assumed T was finite.

Our proof was to derive a falsehood from H and not C. We first showed, from the assumptions that S and T are both finite, that U also must be finite. But since U is stated in the hypothesis H to be infinite, and a set cannot be both finite and infinite, we have proved the logical statement "false." In logical terms, we have both a proposition p (U is finite) and its negation, not p (U is infinite). We then use the fact that "p and not p" is logically equivalent to "false." □

To see why proofs by contradiction are logically correct, recall from Section 1.3.2 that there are four combinations of truth values for H and C. Only the second case, H true and C false, makes the statement "if H then C" false. By showing that H and not C leads to falsehood, we are showing that case (2) cannot occur. Thus, the only possible combinations of truth values for H and C are the three combinations that make "if H then C" true.

1.3.4 Counterexamples

In real life, we are not told to prove a theorem. Rather, we are faced with something that seems true, a strategy for implementing a program for example, and we need to decide whether or not the "theorem" is true.
To resolve the question, we may alternately try to prove the theorem, and if we cannot, try to prove that its statement is false. Theorems generally are statements about an infinite number of cases, perhaps all values of their parameters. Indeed, strict mathematical convention will only dignify a statement with the title "theorem" if it has an infinite number of cases; statements that have no parameters, or that apply to only a finite number of values of their parameter(s), are called observations. It is sufficient to show that an alleged theorem is false in any one case in order to show it is not a theorem. The situation is analogous to programs, since a program is generally considered to have a bug if it fails to operate correctly for even one input on which it was expected to work.

It often is easier to prove that a statement is not a theorem than to prove it is a theorem. As we mentioned, if S is any statement, then the statement "S is not a theorem" is itself a statement without parameters, and thus can be regarded as an observation rather than a theorem. The following are two examples: first an obvious nontheorem, and second a statement that just misses being a theorem and that requires some investigation before resolving the question of whether it is a theorem or not.

Alleged Theorem 1.13: All primes are odd. (More formally, we might say: if integer x is a prime, then x is odd.)

DISPROOF: The integer 2 is a prime, but 2 is even. □

Now, let us discuss a "theorem" involving modular arithmetic. There is an essential definition that we must first establish. If a and b are positive integers, then a mod b is the remainder when a is divided by b, that is, the unique integer r between 0 and b − 1 such that a = qb + r for some integer q. For example, 8 mod 3 = 2, and 9 mod 3 = 0.
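The definition of a mod b can be mirrored directly in code; for positive integers, Python's built-in divmod returns exactly the q and r of the definition. This sketch (function and variable names are ours, for illustration) also brute-forces the small search that previews the discussion that follows:

```python
def mod(a: int, b: int) -> int:
    """a mod b for positive integers: the unique r with 0 <= r <= b - 1 and a = q*b + r."""
    q, r = divmod(a, b)  # Python's divmod agrees with this definition when a, b > 0
    return r

assert mod(8, 3) == 2  # 8 = 2*3 + 2
assert mod(9, 3) == 0  # 9 = 3*3 + 0

# Brute-force search for pairs with a mod b = b mod a over a small range:
matches = [(a, b) for a in range(1, 20) for b in range(1, 20)
           if mod(a, b) == mod(b, a)]
assert all(a == b for a, b in matches)  # only the diagonal pairs a = b match
```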
Our first proposed theorem, which we shall determine to be false, is:

Alleged Theorem 1.14: There is no pair of integers a and b such that a mod b = b mod a. □

When asked to do things with pairs of objects, such as a and b here, it is often possible to simplify the relationship between the two by taking advantage of symmetry. In this case, we can focus on the case where a < b, since if b < a we can swap a and b and get the same equation as in Alleged Theorem 1.14. We must be careful, however, not to forget the third case, where a = b. This case turns out to be fatal to our proof attempts.

Let us assume a < b. Then a mod b = a, since in the definition of a mod b we have q = 0 and r = a. That is, when a < b we have a = 0 × b + a. But b mod a < a, since anything mod a is between 0 and a − 1. Thus, when a < b, b mod a < a mod b, so a mod b = b mod a is impossible. Using the argument of symmetry above, we also know that a mod b ≠ b mod a when b < a.

However, consider the third case: a = b. Since x mod x = 0 for any integer x, we do have a mod b = b mod a if a = b. We thus have a disproof of the alleged theorem:

DISPROOF: (of Alleged Theorem 1.14) Let a = b = 2. Then a mod b = b mod a = 0. □

In the process of finding the counterexample, we have in fact discovered the exact conditions under which the alleged theorem holds. Here is the correct version of the theorem, and its proof.

Theorem 1.15: a mod b = b mod a if and only if a = b.

PROOF: (If part) Assume a = b. Then as we observed above, x mod x = 0 for any integer x. Thus, a mod b = b mod a = 0 whenever a = b.

(Only-if part) Now, assume a mod b = b mod a. The best technique is a proof by contradiction, so assume in addition the negation of the conclusion; that is, assume a ≠ b. Then since a = b is eliminated, we have only to consider the cases a < b and b < a. We already observed above that when a < b, we have a mod b = a and b mod a < a.
Thus, these statements, in conjunction with the hypothesis a mod b = b mod a, let us derive a contradiction. By symmetry, if b < a then b mod a = b and a mod b < b. We again derive a contradiction of the hypothesis, and conclude the only-if part is also true. We have now proved both directions and conclude that the theorem is true. □

1.4 Inductive Proofs

There is a special form of proof, called "inductive," that is essential when dealing with recursively defined objects. Many of the most familiar inductive proofs deal with integers, but in automata theory, we also need inductive proofs about such recursively defined concepts as trees and expressions of various sorts, such as the regular expressions that were mentioned briefly in Section 1.1.2. In this section, we shall introduce the subject of inductive proofs first with "simple" inductions on integers. Then, we show how to perform "structural" inductions on any recursively defined concept.

1.4.1 Inductions on Integers

Suppose we are given a statement S(n), about an integer n, to prove. One common approach is to prove two things:

1. The basis, where we show S(i) for a particular integer i. Usually, i = 0 or i = 1, but there are examples where we want to start at some higher i, perhaps because the statement S is false for a few small integers.

2. The inductive step, where we assume n ≥ i, where i is the basis integer, and we show that "if S(n) then S(n + 1)."

Intuitively, these two parts should convince us that S(n) is true for every integer n that is equal to or greater than the basis integer i. We can argue as follows. Suppose S(n) were false for one or more of those integers. Then there would have to be a smallest value of n, say j, for which S(j) is false, and yet j ≥ i. Now j could not be i, because we proved in the basis part that S(i) is true. Thus, j must be greater than i. We now know that j − 1 ≥ i, and S(j − 1) is true. However, we proved in the inductive part that if n ≥ i, then S(n) implies S(n + 1).
Suppose we let n = j − 1. Then we know from the inductive step that S(j − 1) implies S(j). Since we also know S(j − 1), we can conclude S(j).

We have assumed the negation of what we wanted to prove; that is, we assumed S(j) was false for some j ≥ i. In each case, we derived a contradiction, so we have a "proof by contradiction" that S(n) is true for all n ≥ i.

Unfortunately, there is a subtle logical flaw in the above reasoning. Our assumption that we can pick the least j ≥ i for which S(j) is false depends on our believing the principle of induction in the first place. That is, the only way to prove that we can find such a j is to prove it by a method that is essentially an inductive proof. However, the "proof" discussed above makes good intuitive sense, and matches our understanding of the real world. Thus, we generally take as an integral part of our logical reasoning system:

The Induction Principle: If we prove S(i) and we prove that for all n ≥ i, S(n) implies S(n + 1), then we may conclude S(n) for all n ≥ i.

The following two examples illustrate the use of the induction principle to prove theorems about integers.

Theorem 1.16: For all n ≥ 0:

  Σ_{i=1}^{n} i² = n(n + 1)(2n + 1)/6    (1.1)

PROOF: The proof is in two parts: the basis and the inductive step; we prove each in turn.

BASIS: For the basis, we pick n = 0. It might seem surprising that the theorem even makes sense for n = 0, since the left side of Equation (1.1) is Σ_{i=1}^{0} when n = 0. However, there is a general principle that when the upper limit of a sum (0 in this case) is less than the lower limit (1 here), the sum is over no terms and therefore the sum is 0. That is, Σ_{i=1}^{0} i² = 0. The right side of Equation (1.1) is also 0, since 0 × (0 + 1) × (2 × 0 + 1)/6 = 0. Thus, Equation (1.1) is true when n = 0.

INDUCTION: Now, assume n ≥ 0. We must prove the inductive step, that Equation (1.1) implies the same formula with n + 1 substituted for n. The latter formula is

  Σ_{i=1}^{n+1} i² = [n + 1]([n + 1] + 1)(2[n + 1] + 1)/6    (1.2)

We may simplify Equations (1.1) and (1.2) by expanding the sums and products on the right sides. These equations become:

  Σ_{i=1}^{n} i² = (2n³ + 3n² + n)/6    (1.3)

  Σ_{i=1}^{n+1} i² = (2n³ + 9n² + 13n + 6)/6    (1.4)

We need to prove (1.4) using (1.3), since in the induction principle, these are statements S(n + 1) and S(n), respectively. The "trick" is to break the sum to n + 1 on the left of (1.4) into a sum to n plus the (n + 1)st term. In that way, we can replace the sum to n by the left side of (1.3) and show that (1.4) is true. These steps are as follows:

  Σ_{i=1}^{n} i² + (n + 1)² = (2n³ + 9n² + 13n + 6)/6    (1.5)

  (2n³ + 3n² + n)/6 + (n² + 2n + 1) = (2n³ + 9n² + 13n + 6)/6    (1.6)

The final verification that (1.6) is true requires only simple polynomial algebra on the left side to show it is identical to the right side. □

Example 1.17: In the next example, we prove Theorem 1.3 from Section 1.2.1. Recall this theorem states that if x ≥ 4, then 2^x ≥ x². We gave an informal proof based on the idea that the ratio x²/2^x shrinks as x grows above 4. We can make the idea precise if we prove the statement 2^x ≥ x² by induction on x, starting with a basis of x = 4. Note that the statement is actually false for x < 4.

BASIS: If x = 4, then 2^x and x² are both 16. Thus, 2⁴ ≥ 4² holds.

INDUCTION: Suppose for some x ≥ 4 that 2^x ≥ x². With this statement as the hypothesis, we need to prove the same statement, with x + 1 in place of x, that is, 2^(x+1) ≥ (x + 1)². These are the statements S(x) and S(x + 1) in the induction principle; the fact that we are using x instead of n as the parameter should not be of concern; x or n is just a local variable.

As in Theorem 1.16, we should rewrite S(x + 1) so it can make use of S(x). In this case, we can write 2^(x+1) as 2 × 2^x. Since S(x) tells us that 2^x ≥ x², we can conclude that 2^(x+1) = 2 × 2^x ≥ 2x².
But we need something different; we need to show that 2^(x+1) ≥ (x + 1)². One way to prove this statement is to prove that 2x² ≥ (x + 1)² and then use the transitivity of ≥ to show 2^(x+1) ≥ 2x² ≥ (x + 1)². In our proof that

  2x² ≥ (x + 1)²    (1.7)

we may use the assumption that x ≥ 4. Begin by simplifying (1.7):

  x² ≥ 2x + 1    (1.8)

Divide (1.8) by x, to get:

  x ≥ 2 + 1/x    (1.9)

Since x ≥ 4, we know 1/x ≤ 1/4. Thus, the left side of (1.9) is at least 4, and the right side is at most 2.25. We have thus proved the truth of (1.9). Therefore, Equations (1.8) and (1.7) are also true. Equation (1.7) in turn gives us 2x² ≥ (x + 1)² for x ≥ 4 and lets us prove statement S(x + 1), which we recall was 2^(x+1) ≥ (x + 1)². □

Integers as Recursively Defined Concepts

We mentioned that inductive proofs are useful when the subject matter is recursively defined. However, our first examples were inductions on integers, which we do not normally think of as "recursively defined." However, there is a natural, recursive definition of when a number is a nonnegative integer, and this definition does indeed match the way inductions on integers proceed: from objects defined first, to those defined later.

BASIS: 0 is an integer.

INDUCTION: If n is an integer, then so is n + 1.

1.4.2 More General Forms of Integer Inductions

Sometimes an inductive proof is made possible only by using a more general scheme than the one proposed in Section 1.4.1, where we proved a statement S for one basis value and then proved that "if S(n) then S(n + 1)." Two important generalizations of this scheme are:

1. We can use several basis cases. That is, we prove S(i), S(i + 1), …, S(j) for some j > i.

2. In proving S(n + 1), we can use the truth of all the statements S(i), S(i + 1), …, S(n), rather than just using S(n). Moreover, if we have proved basis cases up to S(j), then we can assume n ≥ j, rather than just n ≥ i.
The conclusion to be made from this basis and inductive step is that S(n) is true for all n ≥ i.

Example 1.18: The following example will illustrate the potential of both principles. The statement S(n) we would like to prove is that if n ≥ 8, then n can be written as a sum of 3's and 5's. Notice, incidentally, that 7 cannot be written as a sum of 3's and 5's.

BASIS: The basis cases are S(8), S(9), and S(10). The proofs are 8 = 3 + 5, 9 = 3 + 3 + 3, and 10 = 5 + 5, respectively.

INDUCTION: Assume that n ≥ 10 and that S(8), S(9), …, S(n) are true. We must prove S(n + 1) from these given facts. Our strategy is to subtract 3 from n + 1, observe that this number must be writable as a sum of 3's and 5's, and add one more 3 to the sum to get a way to write n + 1.

More formally, observe that n − 2 ≥ 8, so we may assume S(n − 2). That is, n − 2 = 3a + 5b for some integers a and b. Then n + 1 = 3 + 3a + 5b, so n + 1 can be written as the sum of a + 1 3's and b 5's. That proves S(n + 1) and concludes the inductive step. □

1.4.3 Structural Inductions

In automata theory, there are several recursively defined structures about which we need to prove statements. The familiar notions of trees and expressions are important examples. Like inductions, all recursive definitions have a basis case, where one or more elementary structures are defined, and an inductive step, where more complex structures are defined in terms of previously defined structures.

Example 1.19: Here is the recursive definition of a tree:

BASIS: A single node is a tree, and that node is the root of the tree.

INDUCTION: If T₁, T₂, …, T_k are trees, then we can form a new tree as follows:

1. Begin with a new node N, which is the root of the tree.
2. Add copies of all the trees T₁, T₂, …, T_k.
3. Add edges from node N to the roots of each of the trees T₁, T₂, …, T_k.

Figure 1.7 shows the inductive construction of a tree with root N from k smaller trees.
□

[Figure 1.7: Inductive construction of a tree, with new root N and subtrees T₁, T₂, …, T_k]

Example 1.20: Here is another recursive definition. This time we define expressions using the arithmetic operators + and ×, with both numbers and variables allowed as operands.

BASIS: Any number or letter (i.e., a variable) is an expression.

INDUCTION: If E and F are expressions, then so are E + F, E × F, and (E). For example, both 2 and x are expressions by the basis. The inductive step tells us x + 2, (x + 2), and 2 × (x + 2) are all expressions. Notice how each of these expressions depends on the previous ones being expressions. □

Intuition Behind Structural Induction

We can suggest informally why structural induction is a valid proof method. Imagine the recursive definition establishing, one at a time, that certain structures X₁, X₂, … meet the definition. The basis elements come first, and the fact that X_i is in the defined set of structures can only depend on the membership in the defined set of the structures that precede X_i on the list. Viewed this way, a structural induction is nothing but an induction on integer n of the statement S(X_n). This induction may be of the generalized form discussed in Section 1.4.2, with multiple basis cases and an inductive step that uses all previous instances of the statement. However, we should remember, as explained in Section 1.4.1, that this intuition is not a formal proof, and in fact we must assume the validity of this induction principle as we did the validity of the original induction principle of that section.

When we have a recursive definition, we can prove theorems about it using the following proof form, which is called structural induction. Let S(X) be a statement about the structures X that are defined by some particular recursive definition.

1. As a basis, prove S(X) for the basis structure(s) X.

2. For the inductive step, take a structure X that the recursive definition says is formed from Y₁, Y₂, …, Y_k.
Assume that the statements S(Y₁), S(Y₂), …, S(Y_k) hold, and use these to prove S(X).

Our conclusion is that S(X) is true for all X. The next two theorems are examples of facts that can be proved about trees and expressions.

Theorem 1.21: Every tree has one more node than it has edges.

PROOF: The formal statement S(T) we need to prove by structural induction is: "if T is a tree, and T has n nodes and e edges, then n = e + 1."

BASIS: The basis case is when T is a single node. Then n = 1 and e = 0, so the relationship n = e + 1 holds.

INDUCTION: Let T be a tree built by the inductive step of the definition, from root node N and k smaller trees T₁, T₂, …, T_k. We may assume that the statements S(T_i) hold for i = 1, 2, …, k. That is, let T_i have n_i nodes and e_i edges; then n_i = e_i + 1.

The nodes of T are node N and all the nodes of the T_i's. There are thus 1 + n₁ + n₂ + ⋯ + n_k nodes in T. The edges of T are the k edges we added explicitly in the inductive definition step, plus the edges of the T_i's. Hence, T has

  k + e₁ + e₂ + ⋯ + e_k    (1.10)

edges. If we substitute e_i + 1 for n_i in the count of the number of nodes of T, we find that T has

  1 + [e₁ + 1] + [e₂ + 1] + ⋯ + [e_k + 1]    (1.11)

nodes. Since there are k of the "+1" terms in (1.11), we can regroup it as:

  k + 1 + e₁ + e₂ + ⋯ + e_k    (1.12)

This expression is exactly 1 more than the expression of (1.10) that was given for the number of edges of T. Thus, T has one more node than it has edges. □

Theorem 1.22: Every expression has an equal number of left and right parentheses.

PROOF: Formally, we prove the statement S(G) about any expression G that is defined by the recursion of Example 1.20: the numbers of left and right parentheses in G are the same.

BASIS: If G is defined by the basis, then G is a number or variable. These expressions have 0 left parentheses and 0 right parentheses, so the numbers are equal.
INDUCTION: There are three rules whereby expression G may have been constructed according to the inductive step in the definition:

1. G = E + F.
2. G = E × F.
3. G = (E).

We may assume that S(E) and S(F) are true; that is, E has the same number of left and right parentheses, say n of each, and F likewise has the same number of left and right parentheses, say m of each. Then we can compute the numbers of left and right parentheses in G for each of the three cases, as follows:

1. If G = E + F, then G has n + m left parentheses and n + m right parentheses; n of each come from E, and m of each come from F.

2. If G = E × F, the count of parentheses for G is again n + m of each, for the same reason as in case (1).

3. If G = (E), then there are n + 1 left parentheses in G; one left parenthesis is explicitly shown, and the other n are present in E. Likewise, there are n + 1 right parentheses in G; one is explicit and the other n are in E.

In each of the three cases, we see that the numbers of left and right parentheses in G are the same. This observation completes the inductive step and completes the proof. □

1.4.4 Mutual Inductions

Sometimes, we cannot prove a single statement by induction, but rather need to prove a group of statements S₁(n), S₂(n), …, S_k(n) together by induction on n. Automata theory provides many such situations. In Example 1.23 we sample the common situation where we need to explain what an automaton does by proving a group of statements, one for each state. These statements tell under what sequences of inputs the automaton gets into each of the states.

Strictly speaking, proving a group of statements is no different from proving the conjunction (logical AND) of all the statements. For instance, the group of statements S₁(n), S₂(n), …, S_k(n) could be replaced by the single statement S₁(n) AND S₂(n) AND ⋯ AND S_k(n).
However, when there are really several independent statements to prove, it is generally less confusing to keep the statements separate and to prove them all in their own parts of the basis and inductive steps. We call this sort of proof mutual induction. An example will illustrate the necessary steps for a mutual recursion.

Example 1.23: Let us revisit the on/off switch, which we represented as an automaton in Example 1.1. The automaton itself is reproduced as Fig. 1.8. Since pushing the button switches the state between on and off, and the switch starts out in the off state, we expect that the following statements will together explain the operation of the switch:

S₁(n): The automaton is in state off after n pushes if and only if n is even.

S₂(n): The automaton is in state on after n pushes if and only if n is odd.

We might suppose that S₁ implies S₂ and vice versa, since we know that a number n cannot be both even and odd. However, what is not always true about an automaton is that it is in one and only one state. It happens that the automaton of Fig. 1.8 is always in exactly one state, but that fact must be proved as part of the mutual induction.

[Figure 1.8: Repeat of the automaton of Fig. 1.1: start state off and state on, with Push transitions in each direction]

We give the basis and inductive parts of the proofs of statements S₁(n) and S₂(n) below. The proofs depend on several facts about odd and even integers: if we add or subtract 1 from an even integer, we get an odd integer, and if we add or subtract 1 from an odd integer, we get an even integer.

BASIS: For the basis, we choose n = 0. Since there are two statements, each of which must be proved in both directions (because S₁ and S₂ are each "if-and-only-if" statements), there are actually four cases to the basis, and four cases to the induction as well.

1. [S₁; If] Since 0 is in fact even, we must show that after 0 pushes, the automaton of Fig. 1.8 is in state off.
Since that is the start state, the automaton is indeed in state off after 0 pushes.

2. [S1; Only-if] The automaton is in state off after 0 pushes, so we must show that 0 is even. But 0 is even by definition of "even," so there is nothing more to prove.

3. [S2; If] The hypothesis of the "if" part of S2 is that 0 is odd. Since this hypothesis H is false, any statement of the form "if H then C" is true, as we discussed in Section 1.3.2. Thus, this part of the basis also holds.

4. [S2; Only-if] The hypothesis, that the automaton is in state on after 0 pushes, is also false, since the only way to get to state on is by following an arc labeled Push, which requires that the button be pushed at least once. Since the hypothesis is false, we can again conclude that the if-then statement is true.

INDUCTION: Now, we assume that S1(n) and S2(n) are true, and try to prove S1(n + 1) and S2(n + 1). Again, the proof separates into four parts.

1. [S1(n + 1); If] The hypothesis for this part is that n + 1 is even. Thus, n is odd. The "if" part of statement S2(n) says that after n pushes, the automaton is in state on. The arc from on to off labeled Push tells us that the (n + 1)st push will cause the automaton to enter state off. That completes the proof of the "if" part of S1(n + 1).

2. [S1(n + 1); Only-if] The hypothesis is that the automaton is in state off after n + 1 pushes. Inspecting the automaton of Fig. 1.8 tells us that the only way to get to state off after one or more moves is to be in state on and receive an input Push. Thus, if we are in state off after n + 1 pushes, we must have been in state on after n pushes. Then, we may use the "only-if" part of statement S2(n) to conclude that n is odd. Consequently, n + 1 is even, which is the desired conclusion for the only-if portion of S1(n + 1).

3.
[S2(n + 1); If] This part is essentially the same as part (1), with the roles of statements S1 and S2 exchanged, and with the roles of "odd" and "even" exchanged. The reader should be able to construct this part of the proof easily.

4. [S2(n + 1); Only-if] This part is essentially the same as part (2), with the roles of statements S1 and S2 exchanged, and with the roles of "odd" and "even" exchanged. □

We can abstract from Example 1.23 the pattern for all mutual inductions:

Each of the statements must be proved separately in the basis and in the inductive step.

If the statements are "if-and-only-if," then both directions of each statement must be proved, both in the basis and in the induction.

1.5 The Central Concepts of Automata Theory

In this section we shall introduce the most important definitions of terms that pervade the theory of automata. These concepts include the "alphabet" (a set of symbols), "strings" (a list of symbols from an alphabet), and "language" (a set of strings from the same alphabet).

1.5.1 Alphabets

An alphabet is a finite, nonempty set of symbols. Conventionally, we use the symbol Σ for an alphabet. Common alphabets include:

1. Σ = {0, 1}, the binary alphabet.

2. Σ = {a, b, ..., z}, the set of all lowercase letters.

3. The set of all ASCII characters, or the set of all printable ASCII characters.

1.5.2 Strings

A string (or sometimes word) is a finite sequence of symbols chosen from some alphabet. For example, 01101 is a string from the binary alphabet Σ = {0, 1}. The string 111 is another string chosen from this alphabet.

The Empty String

The empty string is the string with zero occurrences of symbols. This string, denoted ε, is a string that may be chosen from any alphabet whatsoever.

Length of a String

It is often useful to classify strings by their length, that is, the number of positions for symbols in the string. For instance, 01101 has length 5.
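These definitions of alphabet, string, empty string, and length map directly onto familiar programming notions. The following sketch is illustrative only (the names SIGMA and is_over are ours, not the book's): it models an alphabet as a set of one-character symbols.

```python
# Illustrative sketch: an alphabet as a Python set of one-character symbols.
SIGMA = {"0", "1"}          # the binary alphabet {0, 1}

def is_over(w, alphabet):
    """True if every symbol of string w is drawn from the given alphabet."""
    return all(c in alphabet for c in w)

assert is_over("01101", SIGMA)      # a string over {0, 1}
assert not is_over("012", SIGMA)    # 2 is not a symbol of {0, 1}
assert is_over("", SIGMA)           # the empty string is over any alphabet
assert len("01101") == 5            # length counts positions: |01101| = 5
```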
It is common to say that the length of a string is "the number of symbols" in the string; this statement is colloquially accepted but not strictly correct. Thus, there are only two symbols, 0 and 1, in the string 01101, but there are five positions for symbols, and its length is 5. However, you should generally expect that "the number of symbols" can be used when "number of positions" is meant.

The standard notation for the length of a string w is |w|. For example, |011| = 3 and |ε| = 0.

Powers of an Alphabet

If Σ is an alphabet, we can express the set of all strings of a certain length from that alphabet by using an exponential notation. We define Σ^k to be the set of strings of length k, each of whose symbols is in Σ.

Example 1.24: Note that Σ^0 = {ε}, regardless of what alphabet Σ is. That is, ε is the only string whose length is 0.

If Σ = {0, 1}, then Σ^1 = {0, 1}, Σ^2 = {00, 01, 10, 11},

Σ^3 = {000, 001, 010, 011, 100, 101, 110, 111},

and so on. Note that there is a slight confusion between Σ and Σ^1. The former is an alphabet; its members 0 and 1 are symbols. The latter is a set of strings; its members are the strings 0 and 1, each of which is of length 1. We shall not try to use separate notations for the two sets, relying on context to make it clear whether {0, 1} or similar sets are alphabets or sets of strings. □

The set of all strings over an alphabet Σ is conventionally denoted Σ*. For instance, {0, 1}* = {ε, 0, 1, 00, 01, 10, 11, 000, ...}. Put another way,

Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ ···

Sometimes, we wish to exclude the empty string from the set of strings. The set of nonempty strings from alphabet Σ is denoted Σ^+. Thus, two appropriate equivalences are:

Type Convention for Symbols and Strings

Commonly, we shall use lowercase letters at the beginning of the alphabet (or digits) to denote symbols, and lowercase letters near the end of the alphabet, typically w, x, y, and z, to denote strings.
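For a finite alphabet, the powers Σ^k can be enumerated mechanically; only Σ* itself is infinite. A small sketch (illustrative only; the function names are ours, not the book's):

```python
from itertools import product

def sigma_k(alphabet, k):
    """Sigma^k: the set of all strings of length k over the alphabet."""
    return {"".join(p) for p in product(sorted(alphabet), repeat=k)}

SIGMA = {"0", "1"}
assert sigma_k(SIGMA, 0) == {""}                      # Sigma^0 = {epsilon}
assert sigma_k(SIGMA, 2) == {"00", "01", "10", "11"}  # Sigma^2
assert len(sigma_k(SIGMA, 3)) == 8                    # |Sigma^3| = 2^3

# Sigma* is infinite, so we can only enumerate a finite prefix of it:
def sigma_star_up_to(alphabet, n):
    """All strings over the alphabet of length at most n."""
    result = set()
    for k in range(n + 1):
        result |= sigma_k(alphabet, k)
    return result

assert len(sigma_star_up_to(SIGMA, 3)) == 1 + 2 + 4 + 8
```

Note that `sigma_k(SIGMA, 1)` returns a set of one-character strings, which mirrors the book's point that Σ and Σ^1 are formally different sets even though they look alike.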
You should try to get used to this convention, to help remind you of the types of the elements being discussed.

Σ^+ = Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ ···

Σ* = Σ^+ ∪ {ε}

Concatenation of Strings

Let x and y be strings. Then xy denotes the concatenation of x and y, that is, the string formed by making a copy of x and following it by a copy of y. More precisely, if x is the string composed of i symbols x = a1a2···ai and y is the string composed of j symbols y = b1b2···bj, then xy is the string of length i + j: xy = a1a2···ai b1b2···bj.
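In a programming language, concatenation of strings is the ordinary string-joining operation, and the length equation |xy| = i + j can be checked directly. A brief sketch (illustrative only; the variable names are ours):

```python
# Illustrative sketch: concatenation of strings and its length property.
x = "01101"   # i = 5 symbols
y = "110"     # j = 3 symbols

xy = x + y    # concatenation: a copy of x followed by a copy of y
assert xy == "01101110"
assert len(xy) == len(x) + len(y)   # |xy| = i + j

# The empty string is the identity for concatenation:
assert "" + x == x + "" == x
```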