When you merge two FSMs you often get redundant "don't care" nodes, but you can also get nodes that are either impossible to enter [dead code] or impossible to leave [halt], because no legal transition permits it. Joining FSMs involves identifying and pruning both types of node.
No. Consider the case of just recognizing a decimal digit: compare the graph using the alternation (0|1|2|3|4|5|6|7|8|9) vs the graph using the class [:digit:].
Using the OR alternation, including start, you have 11 nodes: start has 10 exiting transitions, and each digit node has a single entering transition. Using the digit class, you have 2 nodes, with 10 transitions that all get you from start to the digit-class node.
Obviously this is simplistic, because the members of the character class form a subgraph which itself has to be recognized. The important point here is that the subgraph as a whole can represent a /single/ node in a much more complex graph - its constituent characters need not be repeated in the complex graph. More on this below.
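To make the counting concrete, here is a small sketch of the two encodings as dict-based transition maps. The state names ("start", "saw_0", "digit", etc.) are purely illustrative, not anything flex actually uses internally:

```python
digits = "0123456789"

# Alternation (0|1|...|9): each digit gets its own accepting node.
alt = {("start", d): f"saw_{d}" for d in digits}
alt_states = {"start"} | set(alt.values())

# Character class [:digit:]: all ten characters are edges into ONE node.
cls = {("start", d): "digit" for d in digits}
cls_states = {"start"} | set(cls.values())

print(len(alt_states), len(alt))   # 11 states, 10 transitions
print(len(cls_states), len(cls))   # 2 states, 10 transitions
```

Both machines have the same 10 transitions; only the node count differs, which is the point made below about nodes being what cost table space.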
A complex DFA that combines many different regex may present other opportunities to recognize given (possibly arbitrary) sets of characters - opportunities that may not be apparent from looking at the constituent regex.
When given the option to find equivalence classes, flex can identify sets of characters that are used repeatedly. Those characters are gathered into an "equivalence" that then can be a node in the DFA instead of redundantly repeating individual characters.
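The underlying idea is simple: two characters belong to the same equivalence class when every state treats them identically. A rough sketch, using a hypothetical transition table (not flex's internal representation):

```python
# Hypothetical state table: state -> {character: next_state}
table = {
    "start": {"0": "num", "1": "num", "a": "ident", "b": "ident"},
    "num":   {"0": "num", "1": "num", "a": "err",   "b": "err"},
}

def signature(ch):
    # A character's behaviour across ALL states; characters with
    # identical signatures are interchangeable in the DFA.
    return tuple(table[s].get(ch) for s in sorted(table))

classes = {}
for ch in "01ab":
    classes.setdefault(signature(ch), []).append(ch)

print(sorted(classes.values()))   # [['0', '1'], ['a', 'b']]
```

Here '0'/'1' collapse into one class and 'a'/'b' into another, so the tables can be indexed by 2 class numbers instead of 4 characters.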
Remember that DFAs are deterministic - a node can't take different actions depending on which of multiple transitions entered (or left) it ... so if you want the same character to be recognized in a different context (leading to a different action), you must repeat it in a different node.
This is where being able to identify essentially arbitrary sets of characters and coalesce them into a recognizer "class" is useful. If a given set of N(>1) characters is used M times in the graph, then by coalescing them you remove M(N-1) nodes from your graph. The number of /transitions/ in the graph remains the same, but recall that it is the /nodes/ that consume space in the lexer tables.
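The savings arithmetic, spelled out with example numbers (the 10 digits used in 3 places):

```python
N, M = 10, 3                       # class size, number of uses in the graph
nodes_without_class = M * N        # each use spells out all N characters
nodes_with_class = M * 1           # each use is a single class node
saved = nodes_without_class - nodes_with_class
print(saved, M * (N - 1))          # 27 27 -- i.e. M(N-1) nodes removed
```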
You're mixing abstraction levels here: <digit>, <digits>, <number>, and <value> are lexical tokens, whereas <expr> is syntax.
However ...
Yacc and bison CAN handle characters as tokens, but even assuming you have defined <digit> and <digits> elsewhere in your grammar, neither yacc nor bison can find this kind of equivalence. In yacc it will result in a reduce/reduce conflict. In bison what happens depends on the kind of parser you asked for (LALR, SLR, LR, GLR), but in any case the result won't be pretty.
Assuming instead that you meant for <number> and <value> to be recognized by the lexer rather than the parser ... flex (not lex) could discover that <number> and <value> are equivalent, but since they would lead to different actions - returning different tokens - both would be included in the DFA. However, whichever one happened to be tried first would be the only one ever recognized, and your parser would only ever get one of the expected tokens.
Algorithms for turning graphs into table-driven FSMs, or equivalently a switch/case statement, are well known.
Assuming an appropriate graphical IDE, a designer certainly could specify a state graph and code for actions, and have a program generated from it. Given the right input from the designer, a great deal of checking could be done against the graph to verify that it covers enumerated inputs and transitions, that specified inputs lead to specified actions, that action code exists, etc.
But what is NOT possible is to verify that all /possible/ inputs and state transitions have been enumerated. Nor is it possible to verify that required actions have been specified, or necessarily that the actions are being taken in proper context ... those are things for which the tool simply MUST trust the graph designer.
George