清华大学计算导论英文版Introduction to the Theory of Computation--688IT编程网

Introduction to the Theory of Computation, Michael Sipser

Chapter 0: Introduction

Automata, Computability and Complexity:

• They are linked by the question:

o “What are the fundamental capabilities and limitations of computers?”

• The theories of computability and complexity are closely related. In complexity theory, the objective is to classify problems as easy ones and hard ones, whereas in computability theory he classification of problems is by those that are solvable and those that are not. Computability theory introduces several of the concepts used in complexity theory.

• Automata theory deals with the definitions and properties of mathematical models of computation. • One model, called the finite automaton, is used in text processing, compilers, and hardware design.

Another model, called the context – free grammar, is used in programming languages and artificial intelligence.

Strings and Languages:

• The string of the length zero is called the empty string and is written as ε.

• A language is a set of strings.

ignore subsequent bad blocksDefinitions, Theorems and Proofs:

• Definitions describe the objects and notions that we use.

• A proof is a convincing logical argument that a statement is true.

• A theorem is a mathematical statement proved true.

• Occasionally we prove statements that are interesting only because they assist in the proof of another, more significant statement. Such statements are called lemmas.

• Occasionally a theorem or its proof may allow us to conclude easily that other, related statements are true. These statements are called corollaries of the theorem.

Chapter 1: Regular Languages

Introduction:

• An idealized computer is called a “computational model” which allows us to set up a manageable mathematical theory of it directly.

• As with any model in science, a computational model may be accurate in some ways but perhaps not in others.

• The simplest model is called “finite state machine” or “finite automaton”.

Finite Automata:

• Finite Automata are good models for computers with an extremely limited amount of memory, like for example an automatic door, elevator or digital watches.

• Finite automata and their probabilistic counterpart “Markov chains” are useful tools when we are attempting to recognize patterns in data. These devices are used in speech processing and in optical character recognition. Markov chains have even been used to model and predict price changes in financial markets.

• State diagrams are described on p.34.

• The output of an finite automaton is “accepted” if the automaton is now in an accept state (double circle) and reject if it is not.

• A finite automaton is a list of five objects: o Set of states

o Input alphabet

o Rules for moving

o Start state

o Accepts states

• y x =)1,(δ, means that a transition from x to y exists when the machine reads a 1.

• Definition: A finite automaton is a 5 – tuple ),,,,(0F q Q δΣ, where

1. Q is a finite set called the states.

2. Σ is a finite set called the alphabet.

3. Q Q →Σ×:δ is the transition function

4. Q q ∈0 is the start state

5. Q F ⊆ is the set of accept states.

• If A is the set of all strings that machine M accepts, we say that A is the language of machine M and write L(M) = A. We say M recognizes A.

• A language is called a “regular language” if some finite automaton recognizes it.

• A finite automaton has only a finite number of states, which means a finite memory.

• Fortunately, for many languages (although they are infinite) you don’t need to remember the entire input (which is not possible for a finite automaton). You only need to remember certain crucial

information.

The Regular Operations:

• We define 3 operations on languages, called the regular operations, and use them to study properties of the regular languages.

• Definition: Let A and B be languages. We define the regular operations union , concatenation and star as follows:

o Union: }|{B x or A x x B A ∈∈=∪

o Concatenation: }|{B y and A x xy B A ∈∈=o

o Star: }0|...{*21A x each and k x x x A i k ∈≥=

• Example: Let the alphabet Σ be the standard 26 letters {a, b, …, z}. If language A = {good, bad} and language B = {boy, girl}, then:

o =∪B A {good, bad, boy, girl}

o =B A o {goodboy, goodgirl, badboy, badgirl}

o =*A {ε, good, bad, goodgood, goodbad, badgood, badbad, goodgoodgood, goodgoodbas, …} • The class of regular languages is closed under the union operation. In other word, if A and B are regular languages, so is B A ∪.

• The class of regular languages is closed under the concatenation operation.

• The class of regular languages is closed under the intersection operation.

• The class of regular languages is closed under the star operation. Nondeterminism :

• Nondeterminism is a generalization of determinism, so every deterministic finite automaton is automatically a nondeterministic finite automaton.

• In a DFA (deterministic finite automaton), every state always has exactly one exiting transition arrow for each symbol in the alphabet. In an NFA (nondeterministic finite automaton) a state may have zero,

one or many exiting arrows for each alphabet symbol.

• How does an NFA compute? Suppose that we are running an NFA on an input string and come to a state with multiple ways to proceed. Fro example, say that we are in state q1 in NFA N1 and that the

next input symbol is a 1. After reading that symbol, the machine splits into multiple copies of itself and follows all the possibilities in parallel. Each copy of the machine takes one of the possible ways to proceed and continues as before. If there are subsequent choices, the machine splits again. If the next input symbol doesn’t appear on any of the arrows exiting the state occupied by a copy of the ma

chine, that copy of the machine dies, along with the branch of the computation associated with it. Finally, if any one of these copies of the machine is in an accepts state ate the end of the input, the NFA accepts the input string. If a state with an ε symbol on an exiting arrow is encountered, something similar

happens. Without reading any input, the machine splits into multiple copies, one following each of the exiting ε - labelled arrows and one staying at the current state. Then the machine proceeds

nondeterministically as before.

• Nondeterministic finite automata are useful in several respects. As we will show, every NFA can be converted into an equivalent DFA, and constructing NFAs is sometimes easier than directly

constructing DFAs. An NFA may be much smaller than its deterministic counterpart, or its functioning may be easier to understand.

Nondeterministic Finite Automaton:

• Definition: A nondeterministic finite automaton is a 5 – tuple ),,,,(0F q Q δΣ, where

1. Q is a finite set of states.

2. Σ is a finite alphabet.

3. )(:Q P Q →Σ×εδ is the transition function, }{εε∪Σ=Σ

4. Q q ∈0 is the start state.

5. Q F ⊆ is the set of accept states.

• In a DFA the transition function takes a state and an input symbol and produces the next state. In a NFA the transition function takes a state and an input symbol or the empty string and produces the set

of possible next states.

• For any set Q we write P(Q) to be the collection of all subsets of Q (Power ser of Q).

• Deterministic and nondeterministic finite automaton recognize the same class of languages. • Two machines are equivalent if they recognize the same language.

• Every NFA has an equivalent DFA.

• If k is the number of states of the NFA, it has k 2 subsets of states. Each subset corresponds to one of

the possibilities that the DFA must remember, so the DFA simulating the NFA will have k 2 states. • NFA transforming to DFA: o The DFA M accepts (means it is in an accept state) if one of the possible states that the NFA N

could be in at this point, is an accept state.

o A language is regular if and only if some NFA recognizes it.

Regular Expressions:

• Definition: Say that R is a regular expression if R is: 1. a for some a in the alphabet Σ.

2. ε.

3. ∅, 1*∅ = ∅, ∅* = {ε}, ∅=∅o R

4. )21(R R ∪, where R1 and R2 are regular expressions.

5. )21(R R o , where R1 and R2 are regular expressions.

6. *)1(R , where R1 is a regular expression.

• The value of a regular expression is a language.

• Regular expressions have an important role in computer science applications. In applications involving text, user may want to search for strings that satisfy certain patterns . Regular expressions provide a

powerful method for describing such patterns.

• We can write Σ as shorthand for the regular expression )10(∪. More generally, if Σis any alphabet,

the regular expression Σ describes the language consisting of all strings of length 1 over that alphabet, and *Σ describes the language consisting of all strings over that alphabet. Similarly 1*Σ is the language that contains all strings that end in a 1. The language )1*(*)0(Σ∪Σ consists of all strings that either start with a 0 or end with a 1.

• Precedence in regular expressions: ∪>>o *

• When we want to make clear a distinction between a regular expression R and the language that it

describes, we write L(R) to be the language of R.

• A language is regular if and only if some regular expression describes it. Generalized Nondeterministic Finite Automaton:

• Definition: A generalized nondeterministic finite automaton, ),,,,(accept start q q Q δΣ is a 5 – tuple where

1. Q is the finite set of states.

2. Σ is the input alphabet.

3. R q Q q Q start accept →−×−}){(}){(:δ is the transition function.

4. start q is the start state.

5. accept q is the accept state.

• The GNFA reads blocks of symbols form the input, not necessarily just one symbol at a time as in a

n ordinary NFA.

• For convenience we require that GNFAs always have a special form that meets the following

conditions:

o The start state has transition arrows going to every other state but no arrows coming in from

any other state.

o There is only a single accept state, and it has arrows coming in from every other state but no

arrows going to any other state. Furthermore, the accept state is not the same as the start state.

o Except for the start and accept states, one arrow goes from every state to every other state and

also from each state to itself.

• We can easily convert a DFA into a GNFA in the special form. We simply add a new start state with

an ε arrow to the old start state and a new accept state with ε arrows form the old accept states. If an

y arrows have multiple labels (or if there are multiple arrows going between the same two states in the same direction), we replace each with a single arrow whose label is the union of the previous labels. Finally, we add arrows labeled ∅ between states that had no arrows. This last step won’t change the language recognized because a transition labeled with ∅ can never be used.

• We let M be the DFA for language A. The we convert M to a GNFA G by adding a new start state and a new accept state and additional transition arrows as necessary. We use the procedure CONVERT(G),

which takes a GNFA and returns an equivalent regular expression.

• CONVERT(G): Generates a regular expression R out of a GNFA G on p. 73. Nonregular Languages:

• To understand the power of finite automata you must also understand their limitations. In this section

we show how to prove that certain languages cannot be recognized by any finite automaton.

• Let’s take the language }0|10{≥=n B n n . If we attempt to find a DFA that recognizes B, we discover

that the machine seems to need to remember how many 0s have been seen so far as it reads the input. Because the number of 0s isn’t limited, the machine will have to keep track of an unlimited number of possibilities. But it cannot do so with any finite number of states.

• Our technique for proving nonregularity stems from a theorem about regular languages, traditionally called the pumping lemma . This theorem states that all regular languages have a special property. If we

can show that a language does not have this property, we are guaranteed that it is not regular. The property states that all strings in the language can be “pumped” if they are at least as long as a certain special value, called the pumping length. That means each such string contains a section that can be repeated any number of times with the resulting string remaining in the language.

• Pumping Lemma: If A is a regular language, then there is a number p (the pumping length) where, if s is any string in A of length at least p, then s may be divided into three pieces, s = xyz, satisfying the

following conditions:

1. for each A z xy i i ∈≥,0

2. 0||>y

3. p xy ≤||

• To use the pumping lemma to prove that a language B is not regular, first assume that B is regular in order to obtain a contradiction. The use the pumping lemma to guarantee the existence of a pumping

length p such that all strings of length p or greater in B can be pumped. Next, find a string s in B that has length p or greater but that cannot be pumped. Finally, demonstrate that s cannot be pumped by considering all ways of dividing s into x, y and z (taking condition 3 of the pumping lemma into account if convenient) and, for each such division, finding a value i where B z xy i ∉. This final step often involves grouping the various ways of dividing s into several cases and analysing them

individually. The existence of s contradicts the pumping lemma if B were regular. Hence B cannot be regular. Finding s sometimes takes a bit of creative thinking. You may need to hunt through several candidates for s before you discover one that works. Try members of B that seem to exhibit the “essence” of B’s nonregularity.

Chapter 2: Context – Free Languages

Introduction:

• In this chapter we introduce context – free grammars, a more powerful method, than finite automata and regular expressions, of describing languages. Such grammars can describe certain features that

have a recursive structure which makes them useful in a variety of applications (study of human languages, compilation of programming languages).

Context – Free Grammars:

• A grammar consists of a collection of substitution rules, also called productions. Each rule appears as a line in the grammar and comprises a symbol and a string, separated by an arrow. The symbol is

called a variable. The string consists of variables and other symbols called terminals.

• You use a grammar to describe a language by generating each string of that language in the following manner.

1. Write down the start variable . It is the variable on the left – hand side of the top rule, unless

specified otherwise.

2. Find a variable that is written down and a rule that starts with that variable. Replace the written

down variable with the right – hand side of that rule.

3. Repeat step 2 until no variables remain.

• All strings generated in this way constitute the language of the grammar. We write L(G) for the language of grammar G.

• Any language that can be generated by some context – free grammar is called a context – free language (CFL).

• Definition: A context – free grammar is a 4 – tuple ),,,(S R V Σ, where

1. V is a finite set called the variables.

2. Σ is a finite set, disjoint from V, called terminals.

3. R is a finite set of rules, with each rule being a variable and a string of variables and terminals.

4. V S ∈ is the start variable.

• We write v u ∗

⇒ if v u = or if a sequence k u u u ,...,,21 exists for 0≥k and

v u u u u k ⇒⇒⇒⇒⇒ (21)

• The language of the grammar is }|*{w S w ∗⇒Σ∈

688IT编程网

清华大学计算导论英文版Introduction to the Theory of Computation

发表评论

推荐文章

随机森林算法介绍及R语言实现

基于随机森林优化的神经网络算法在冬小麦产量预测中的应用研究_百度文 ...

基于正则化贪心森林算法的情感分析方法研究

随机森林算法和grandientboosting算法

基于随机森林的图像分类算法研究

热门文章

随机森林特征选择原理

自动驾驶系统中的随机森林算法解析

随机森林算法及其在生物信息学中的应用

监督学习中的随机森林算法解析(六)

随机森林算法在数据分析中的应用

机器学习——随机森林,RandomForestClassifier参数含义详解

随机森林的算法

随机森林算法作用

监督学习中的随机森林算法解析(十)

随机森林算法案例

随机森林案例

二分类问题常用的模型

绘制ssd框架训练流程

一种基于信息熵和DTW的多维时间序列相似性度量算法

SVM训练过程范文

如何使用支持向量机进行股票预测与交易分析

二分类交叉熵损失函数binary

tinybert_训练中文文本分类模型_概述说明

基于门控可形变卷积和分层Transformer的图像修复模型及其应用

人工智能开发技术的测试和评估方法

最新文章

基于随机森林的数据分类算法改进

人工智能中的智能识别与分类技术

基于人工智能技术的随机森林算法在医疗数据挖掘中的应用

随机森林回归模型的建模步骤

r语言随机森林预测模型校准曲线

《2024年随机森林算法优化研究》范文

标签列表