## Formal language

**Symbol**

An abstract entity that has no meaning by itself, often called uninterpreted. Letters from various alphabets, digits and special characters are the most commonly used symbols.

**Alphabet**

- A finite set of symbols.
- An alphabet is often denoted by sigma, yet can be given any name.
- B = {0, 1} Says B is an alphabet of two symbols, 0 and 1.
- C = {a, b, c} Says C is an alphabet of three symbols, a, b and c.

Sometimes space and comma are in an alphabet while other times they are meta symbols used for descriptions.

**String also called a Word**

A finite sequence of symbols from an alphabet. 01110 and 111 are strings from the alphabet B above. aaabccc and b are strings from the alphabet C above.

- A null string is a string with no symbols, usually denoted by epsilon. The null string has length zero. The null string is usually denoted epsilon.
- Vertical bars around a string indicate the length of a string expressed as a natural number. For example |00100| = 5, |aab| = 3, | epsilon | = 0

**Formal Language, also called a Language**

A set of strings from an alphabet. The set may be empty, finite or infinite. Because a language is a set of strings, the words language and set are often used interchangeably in talking about formal languages.

- L(M) is the notation for a language defined by a machine M. The machine M accepts a certain set of strings, thus a language.
- L(G) is the notation for a language defined by a grammar G. The grammar G recognizes a certain set of strings, thus a language.
- M(L) is the notation for a machine that accepts a language. The language L is a certain set of strings.
- G(L) is the notation for a grammar that recognizes a language. The language L is a certain set of strings.
- The union of two languages is a language. L = L1 union L2
- The intersection of two languages is a language. L = L1 intersect L2
- The complement of a language is a language. L = sigma* – L1
- The difference of two languages is a language. L = L1 – L2

## Need for formal computational models

Computability is the study of the limits of computing: what can be computed and, more interesting, what cannot be computed.

Note that we are not interested in showing that there are problems we don’t know how to solve today but maybe we’ll solve tomorrow, rather we are interested in showing that there are problems that can be shown (proved) to have no possible algorithmic solution!

We can’t agree on what can or cannot be computed unless we agree on a reasonable model of a computer. We need a formal model of a computer that

- allows each algorithm to be described in a finite number of steps and
- each step can be carried out mechanically.

The most basic computer is a mathematical concept called a Turing machine. Counter programs are a very different, yet equally powerful, computational model. The Church-Turing thesis states that all known models of computation are equivalent.

Using these formal models of computation we can show that there are certain problems that are undecidable, which means they cannot be solved by any computer or computational model.

## Non-computational problems

**Decision Problem**

**Decision** **problems** have yes or no answers.

Mathematicians like to simplify analysis, so they consider only decision problems. Nevertheless many problems can be reduced to a decision problem.

All existence problems are decision problems, for example does the graph has a cycle is a decision problem.

Many construction problems are related to decision problems by asking if a potential solution of the problem is in fact a solution. The potential solution to the problem is part of the input to the problem.

If that if there are only polynomial solutions and each solution can be generated in polynomial time then the related decision problem can solve the construction problem in polynomial time by generating and checking each potential solution.

**Definition of Class P Problem**

Class *P* is a class of all decision problems that can be solved in polynomial time by deterministic algorithms.

Deterministic algorithms are the algorithms that do not guess at a solution.

An interesting question is, are there any decision problem that is not in *P*? In fact there is a problem that can have a solution, for example the Halting problem.

**Halting Problem**

The halting problem is there a program that can determine if any program will halt on an input or continue running for ever.

**Decision Problems without Polynomial Time Solutions**

There are many decidable problems which do not have polynomial time solutions.

**Hamiltonian circuit:** Does a graph have a Hamiltonian circuit, a circuit that visit each node only once?

**Traveling salesman decision problem:** Does a weight graph have Hamiltonian circuit with length *N*?

**Knapsack Problem decision problem:** Can the knapsack be loaded with items valued at *V*?

**Bin packing decision problem:** Can *n* rational numbers (all less than 1) be put into *M* bins size 1?

**Graph coloring decision problem:** Can a graph be colored with *N* different color so that no two colored vertices are adjacent.

Note that for all these problems a potential solution can be checked in polynomial time.

**Definition of Nondeterministic Algorithm**

An informal definition for a nondeterministic algorithm is an algorithm that can guess at the solution (nondeterministic step) but checks that the solution is correct using a deterministic algorithm.

**Definition of Class NP Problems**

Class *NP* is the class of all decision problems that a nondeterministic algorithm can solve in polynomial time.

Note *P* is a subset of *NP*

**We do not know if P is equal to NP.**

*NP-Complete Problems*

The informal definition of *NP*-complete (*NPC*) problem is *NP* problem that is as difficult as any other problem in *NP*.

**Definition of ****Polynomially**** Reducible**

A decision problem *D*_{1} is polynomially reducible to another decision problem *D*_{2} if there exist a function *t* that transform instances of *D*_{1} to instances of *D*_{2} such that

*t*maps all yes/no instances of*D*_{1}to yes/no instances of*D*_{2}*t*is polynomial time

**Definition of NPC Problems**

A decision problem, *D*, is *NP* complete if

- It belongs to
*NP* - Every problem in
*NP*is polynomial reducible to*D*

Part 1 in the definition is generally trivial to demonstrate, but part 2 of the definition can be very difficult.

Generally scientist show that a know *NPC* decision problem is polynomial reducible to the problem. This required someone to demonstrate the first *NPC* problem.

Stephen Cook in 1971 showed that the **CNF-****satisfiability**** problem** is *NPC*. Since then all the other decision problem mention above have been shown to *NPC*.

## Diagonal argument

Cantor used his diagonal argument to show that some infinite sets are actually ‘bigger’ than the set of positive integers.

Two sets are said to have the same *cardinality* if it is possible to pair off every element from the first set with with every element from the second set, without doubling up and without leaving anything out. You can think of cardinality as a measure of the size of a set.

Now pairing off the elements of a set with the set of positive integers is equivalent to writing down the elements of that set in a list. That way, you can pair the first item on that list with 1, the second item on the list with 2 and so on. Now what Cantor did was show that some sets are so big that they can not be paired off with the positive integers in that way. In other words, there are some sets that have so many elements that you can not write them down in a list. Even if you try to make a list, some element will always be missing from it.

Consider the set of real numbers between 00 and 11 (inclusive). Now all the elements of this set can be expressed by a decimal point and infinitely many digits following it. For example, 0=.0000..., 1/2=.5000., 1/3=.3333...,1=.999... Now say that you made a list of real numbers between 0 and 1. Now you can use the diagonal argument to find a number that is not on your list. Consider the number a∈[0,1] such that the first digit after the decimal point of aa is different from the first digit after the decimal point of the first item on your list, the second digit after the decimal point of aa is different from the second digit after the decimal point of the second item of the list, the third digit after the decimal point of a is different from the third digit after the decimal point of the third item on your list and so on.

Note that aa can not be on your list since it differs from every item on your list by at least one digit. So no matter how hard you try to make a list of all real numbers between 0 and 1, there will be some number that is not on your list. [0,1] is a different kind of infinite set. This infinity is *bigger* than the infinity of the set of positive integers.

## Russel’s paradox

Russell’s paradox is a formal, rigorous version of an old notion sometimes demonstrated as the “barber paradox”: Suppose the barber shaves everybody in town, except for all of those who shave themselves. Who shaves the barber? If he shaves himself, then he doesn’t shave himself; if he doesn’t, then he does.

A slightly more rigorous version is the Grelling paradox: define “heterological” to mean “a word which does not describe itself”. (For example, “long” is a heterological word: “long” is not long). Is “heterological” heterological? Again, there’s an infinite loop of yes/no/yes.

The problem with both of these is that they involve words. They feel slippery and inexact. They just show that words are fuzzy, which we already knew. What Russell added was to phrase these ideas in mathematical terms of set theory. Suppose we let sets contain themselves, which sounds weird but is actually very handy. For example, “the set of all sets with more than three elements” seems like a perfectly ordinary thing to say. And in fact, that set must itself be a member of itself, since it surely contains more than three elements.

OK, but now let’s define “the set of all sets that do not contain themselves”. That’s in words, which we knew were slippery, but we can also put it in logic:

{x|x∉x}

And now we’ve got a rigorous version using pure set theory. And now we’ve found a contradictory notion in set theory itself. Since set theory is widely regarded as underlying all of mathematics, this is a problem.

It turns out to be hard to fix. Just banning sets-that-contain-themselves turns out not to work, and also removes useful tools that we needed to make sets drive mathematics. There are a number of options that define “membership” more weakly, of which the most popular is Zermelo-Frankel set theory, which avoid that paradox (though still having to cope with others, like the Axiom of Choice).

Actually delving into those quickly gets out of layman’s territory, but the layman’s takeaway is that our intuitive notions of mathematics turn out to not work so well, and the ones that work are less intuitive than we might hope. In terms of mathematics and science, the only reply to that is “tough cookies, nobody promised you your intuition wouldn’t suck”, but philosophically there’s still kind of a muddle about what’s “really true”.

## Deterministic Finite Automaton ( DFA ) and Non – deterministic Finite Automaton ( NFA )

In deterministic finite automata, there’s exactly one state transition for every input symbol-state pair. There are also no epsilon transitions, meaning that you’re not allowed to change states without consuming anything from the input.

In non-deterministic finite automata, there can be 0 or more state transitions for every input-state pair. You can also have epsilon transitions. When there’s no state transition for a given input-state pair, then we say that the automata had crashed, meaning that it can’t proceed processing the input, and therefore it doesn’t accept the input. When there’s more than one choice for a state transition for a given input-state pair, then the machine can just follow all possible paths (think of it as parallel computation), if one of the paths ends up in an accept state, then we say that the automata accepted the input string.

Both automata are equivalent in terms of power though. It may seem that a non-deterministic automata is more powerful, but both automata are proven to be equivalent, meaning that they recognize the same class of languages called regular languages. The proof of equivalence is by construction in which you show that given a DFA, you can construct an equivalent NFA, and vice versa. The proof can be found in any textbook on theory of computation.

## Regular languages and regular sets

In theoretical computer science and formal language theory, a **regular language** (also called a **rational language**^{[1][2]}) is a formal language that can be expressed using a regular expression, in the strict sense of the latter notion used in theoretical computer science (as opposed to many regular expressions engines provided by modern programming languages, which are augmented with features that allow recognition of languages that cannot be expressed by a classic regular expression).

Alternatively, a regular language can be defined as a language recognized by a finite automaton. The equivalence of regular expressions and finite automata is known as **Kleene’s theorem**^{[3]} (after American mathematician Stephen Cole Kleene). In the Chomsky hierarchy, regular languages are defined to be the languages that are generated by Type-3 grammars (regular grammars).

Regular languages are very useful in input parsing and programming language design.

## Equivalence of DFA and NFA

Two DFAs are said to be equivalent if they recognize the same language. That is, they both accept exactly the same set of strings, and reject exactly the same set of strings

Every DFA is an NFA. and every NFA can be converted to DFA via using the subset construction.

Every DFA can be reduced to a regular expression using the Arden’s Rule or/and Kleene Algebra.

Also, every regular expression, ‘regex’, defines a regular language, that can be converted to a NFA using epsilons transitions.

So, regular expressions, NFA and DFA are all equivalent in expressiveness of the regular language which can be defined by a right linear grammar.

## Non-regular languages

The existence of non-regular languages is guaranteed by the fact that the regular languages of any alphabet are countable, and we know that the set of all subsets of strings is not countable. Nevertheless, the point of establishing non-regular languages is not so much one of existence, but of illustrating that certain languages which are “computable” in some sense are not regular.

## Pumping lemma

There are two Pumping Lemmas, which are defined for

1. Regular Languages, and

2. Context – Free Languages

**Pumping Lemma for Regular Languages**

For any regular language L, there exists an integer n, such that for all x ∈ L with |x| ≥ n, there exists u, v, w ∈ Σ∗, such that x = uvw, and

(1) |uv| ≤ n

(2) |v| ≥ 1

(3) for all i ≥ 0: uv^{i}w ∈ L

In simple terms, this means that if a string v is ‘pumped’, i.e., if v is inserted any number of times, the resultant string still remains in L.

Pumping Lemma is used as a proof for irregularity of a language. Thus, if a language is regular, it always satisfies pumping lemma. If there exists at least one string made from pumping which is not in L, then L is surely not regular.

The opposite of this may not always be true. That is, if Pumping Lemma holds, it does not mean that the language is regular.

## Pushdown Automaton ( PDA )

A pushdown automaton is a way to implement a context-free grammar in a similar way we design DFA for a regular grammar. A DFA can remember a finite amount of information, but a PDA can remember an infinite amount of information.

Basically a pushdown automaton is −

**“Finite state machine” + “a stack”**

A pushdown automaton has three components −

- an input tape,
- a control unit, and
- a stack with infinite size.

The stack head scans the top symbol of the stack.

A stack does two operations −

**Push**− a new symbol is added at the top.**Pop**− the top symbol is read and removed.

A PDA may or may not read an input symbol, but it has to read the top of the stack in every transition.

A PDA can be formally described as a 7-tuple (Q, ∑, S, δ, q_{0}, I, F) −

**Q**is the finite number of states**∑**is input alphabet**S**is stack symbols**δ**is the transition function: Q × (∑ ∪ {ε}) × S × Q × S***q**is the initial state (q_{0}_{0}∈ Q)**I**is the initial stack top symbol (I ∈ S)**F**is a set of accepting states (F ∈ Q)

## Deterministic Pushdown Automaton ( DPDA )

In automata theory, a **deterministic pushdown automaton** (**DPDA** or **DPA**) is a variation of the pushdown automaton. The class of deterministic pushdown automata accepts the deterministic context-free languages, a proper subset of context-free languages.

## Non – equilvalence of PDA and DPDA

The main (and only) difference between DPDA and NPDA is that DPDAs are deterministic, whereas NPDAs are non-deterministic. With some abuse of notation, we can say that NPDAs are a generalization of DPDAs: every DPDA can be simulated by an NPDA, but the converse doesn’t hold (there are context-free languages which cannot be accepted by a DPDA).

The main advantage of DPDAs is that we can simulate them much more easily with our deterministic computers (real hardware is always deterministic). In fact, simulating general DPDAs is not fast enough for most purposes, and so when parsing code we usually use LALR grammars which are weaker than DPDAs.

**Chomsky Classification of Languages:**

Grammar Type |
Production Rules |
Language Accepted |
Automata |
Closed Under |

Type-3 (Regular Gramar) |
A→a or A→aB where A,B ∈ N(non terminal) and a∈T(Terminal) | Regular | Finite Automata | Union, Intersection, Complementation, Concatenation, Kleene Closure |

Type-2 (Context Free Grammar) |
A->ρ where A ∈N and ρ ∈ (T∪N) |
Context Free | Push Down Automata | Union, Concatenation, Kleene Closure |

Type-1 (Context Sensitive Grammar) |
α→β where α, β∈(T∪N)^{*} and len(α) <= len(β) and α should contain atleast 1 non terminal. |
Context Sensitive | Linear Bound Automata | Union, Intersection, Complementation, Concatenation, Kleene Closure |

Type-0 (Recursive Enumerable) |
α → β where α, β∈(T∪N) |
Recursive Enumerable | Turing Machine | Union, Intersection, Concatenation, Kleene Closure |