- A set is an unordered collection of objects, usually listed in curly braces. The author tends to use variable names like
*A*and*B*to denote sets. For example, the set of binary numbers is denoted by $ A =\{0,1\}.$ - A sequence is an ordered list of symbols. For example, the sequence of two zeros followed by a one is denoted by $ x=001. $
- A
*string*is a sequence of symbols. The author tends to use variable names like*x*and*y*to denote strings. - A language is a set of strings. For example, the language containing all binary numbers less than $ 1000 $ is denoted by $ L = \{0, 1, 10, 11, 100, 101, 110, 111\}. $ This creates a hierarchy. We start with a list of symbols, concatenate them to form sequences called strings, and then group them to form languages.
- An
*operator*is something that changes one set of objects into another. In this section, the operators change one set or two sets of strings into another. Since sets of strings are called languages, the operators turn one or more languages into another language.

- The author uses the symbol $ \lambda $ to denote a special type of string, known as the
*empty string*. - It is common to denote the empty string by a $ \varepsilon $ as is done in the wikipedia article on regular expressions, so be aware of that as you read.
- The empty string is a string, it just happens to be the string created by a sequence of zero symbols.
- It is easy to confuse the empty string with the empty set, but don't do that! Recall that strings are sequences of symbols and that languages are sets of strings. Thus, the empty string is a type of string and not a language, and the empty set is a language that has no strings in it. This means that $ \{\lambda\} \neq \{\} $, that is the language containing only the empty string does not equal the language containing no strings. There is exactly one string in $ \{\lambda\} $ and there are no strings in $\{\}$.

- The Cartesian product was invented by Rene Descartes, the same guy who came up with
*I think therefore I am.*When not changing the world of philosophy, he tinkered with math. You've been using his notation since grammar school when you created your first plots of*x*versus*y*on the coordinate axes. Every dot on the coordinate axes corresponds to an ordered pair*(x,y)*. The order means that the first element always tells you what corresponds to the x-axis and the second element always tells you what corresponds to the y-axis. Thus, the point*(2,3)*is different from the point*(3,2)*. The notation $ C \times D $ describes the set of all ordered pairs, with the first element of the ordered pair coming from the set*C*and the second from the set*D*. So, if $C=\{1,2\} $ and $ D = \{cow, pig\} $ then $ C \times D = \{ (1, cow), (1, pig), (2, cow), (2,pig)\}.$ Notice how it takes all the elements from set*C*and creates an ordered pair with each element of set*D*. - On page 124, shortly after the definitions of the Cartesian product from the previous page, the author introduces a notation. Make sure you understand what the notation $ A^n $ means.
- The function notation $ f: A \rightarrow B$ is read as follows:
*f is a function that takes things from set A and turns them into things from set B.*The colon after the*f*tells you that*f*is a function. The set*A*after the colon means that this is the set that the function operates on, and the right arrow says that this is mapped to something from B. Using this notation, we could define a function $ f: C \rightarrow D$ for the sets in the previous bullet point as follows: $ f(1) = cow, f(2) = pig. $

- We often want to specify the set of all possible strings that can be created from a set of symbols. Let
*V*denote a set of symbols. Then*V**denotes the set of all possible strings that can be created by concatenating zero of more symbols together. - Formally,
*V**is defined as follows: $ V^*\cup_{k=0}^{\infty} V^k $. You've seen similar notation before, but you may not recognize it. If I want to write a shorthand notation for the sum of integers from 1 to infinity, rather than writing*1+2+3+…*we often write $\sum_{k=1}^{\infty} k.$. The notation in the definition of*V**does something similar, but it does it for union. Thus, $ \cup_{k=0}^{\infty} V^k = V^0 \cup V^1 \cup V^2 \ldots$

- Definition 4 on page 868 of the textbook mixes together the words
*accepts*and*recognizes*. We will always use them in a specific way.*Accept*will always be used when we are talking about strings, as in*machine M accepts string x*, and*Recognize*will always be used when we are talking about languages, as in*machine M recognizes language L*. - A more precise definition of what it means for a finite state machine to recognize a language, and one that you will need to succeed in the homework, is as follows: M recognizes language L if it accepts all strings in L and fails to accept any string not in L.
- Right after Definition 3 the author notes that when a state has two circles, it means that it is an accept or final state. I don't like calling these
*final*states because the machine can continue to operate on inputs after it reaches one of these states, including transitioning to*non-final*states. I prefer calling these*accept*states because the machine accepts the input string only if it ends up in an accept state after all symbols in the string are processed.

- There is
*notational overloading*happening in definition 1. Regular expressions are shorthand notations for describing languages. Thus, every expression in definition 1 describes a language. - $ \emptyset = \{ \} $, that is, the first symbol denotes the empty language.
- Writing $ \lambda $ as a regular expression overloads the symbol lambda. Lambda denotes the empty string, so using lambda as a regular expression actually denotes the language $ \{\lambda\} $.
- The symbol
*x*used here denotes a single symbol, or a string of length one. Since regular expressions denote languages, using*x*as a regular expression actually denotes the language $ \{x\} $.

- The
*regular grammars*that generate regular languages mentioned on page 851 are precisely the languages generated by regular expressions. These are also known as*type 3 grammars*. - My favorite example of a context sensitive grammar is illustrated by the following two sentences where the meaning of “flies” and “like” are determined by their context, meaning the words around them:
Time flies like lightning. Fruit flies like bananas.