CollinReeser/TuringMachine

This is the reference compiler for the Turing machine description language
henceforth known as Turing Tzarpit. Please take a moment to revel in how
satisfying a name that is.

Turing Tzarpit reference:

	A Turing Tzarpit source file consists of two distinct elements:
		- A list of zero or more directives
		- A list of states, themselves merely a list of transitions to other
			states.

	For the purposes of clarity, the following is a short legend that gives
	meaning to the syntax ruleset described later:

		() - Anything enclosed in () can have zero or one instance.
		<> - Anything enclosed in <> can have zero or more instances.
		[] - Anything enclosed in [] can have one or more instances.
		"" - Anything enclosed in "" should be interpreted as an element of the
			set described.
				- While enclosed in "":
					+ - The set of integers from 0 inclusive, upward
					C - Single character
					S - String
					U - Unique string
					P - A string that is used elsewhere in the source file as
						a unique string
		'' - Any character enclosed in '' should be taken literally
		L"" - Anything enclosed in L"" should be taken as a literal string
		? - Denotes a choice, wherein 'R'?'L'?'S' denotes that here,
			having one of 'R', 'L', or 'S' is correct.

	Syntax ruleset:

		#start "P"
		(#empty "C")
		(#cells "+")
		(#steps "+")
		(#speed "+")

		[state "U"
			[if "C" -> "C" , 'R' ? 'L' ? 'S' 
				<| "C" -> "C" , 'R' ? 'L' ? 'S'> 
			{
				("P" ? L"accept" ? L"reject")
			}]
		]

	The directives are as follows:

		#cells "+" denotes that the Turing machine simulation tape should have
			"+" cells. A value of 0, or omitting this directive, defaults the
			number of tape cells to 1000.

		#steps "+" denotes that the Turing machine simulation should take at
			most "+" state transitions before force-halting. A value of 0, or
			omitting this directive, defaults the maximal number of transition 
			steps to 1000.

		#speed "+" denotes that, for any nonzero value, each tape movement
			should take "+" seconds. This is meant to emulate the mechanical
			nature of the Turing machine more realistically. A value of 0, or
			omitting this directive, makes the simulation execute as quickly
			as possible.

		#empty "C" denotes the character, "C", which will be used to initialize
			the tape at simulation onset. Can be considered the empty symbol.
			Omitting this directive defaults the empty symbol to '_'.

		#start "P" denotes the name of a declared state in which the Turing
			machine should begin its simulation.
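
	The defaulting rules above can be sketched in a few lines of Python. This
	is only an illustration, not the compiler's actual code: the function
	name, the dictionary layout, and the choice to treat a missing #start as
	an error (inferred from #start not being wrapped in () in the ruleset) are
	all assumptions.

```python
# Hypothetical helper; names are illustrative and not taken from the compiler.
DEFAULT_CELLS = 1000
DEFAULT_STEPS = 1000
DEFAULT_EMPTY = "_"


def resolve_directives(given):
    """Fill in the documented defaults for a dict of parsed directive values."""
    if "start" not in given:
        # Assumption: #start is mandatory, since the ruleset does not mark it optional.
        raise ValueError("#start directive is required")
    return {
        "start": given["start"],
        # A value of 0 behaves the same as omitting the directive.
        "cells": given.get("cells", 0) or DEFAULT_CELLS,
        "steps": given.get("steps", 0) or DEFAULT_STEPS,
        # 0 means "run as quickly as possible".
        "speed": given.get("speed", 0),
        "empty": given.get("empty", DEFAULT_EMPTY),
    }
```

	For instance, a file containing only "#start s" and "#cells 0" would
	resolve to 1000 cells, 1000 steps, no delay, and '_' as the empty symbol.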

	State declarations are as follows:

		state "U" denotes the beginning of a state definition, identifiable by
			"U". The state definition continues until another state "U"
			statement is encountered, or the end of the file is reached. A state
			definition consists of one or more transition statements.

	Transition statements are as follows:

		if "C" -> "C" , 'R' ? 'L' ? 'S' forms the transition condition in its
			entirety. The first "C" is the character that must be on the tape
			to initiate the transition. The second "C" is the character that
			will be written to the tape as part of the transition. The choice
			between 'R', 'L', and 'S' gives the option of moving the tape head
			to the right ('R'), or to the left ('L'), or remaining stationary
			('S').

			From here, either a pipe (|) or a left curly brace ({) is expected
			by the parser. Additional "C" -> "C" , 'R' ? 'L' ? 'S' clauses can
			be chained by pipes until a left curly brace is encountered. If a
			left curly brace is encountered, then the "P" enclosed in the curly
			braces is the string that denotes the name of a declared state that
			is being transitioned to. If there is no "P" (that is, the curly
			braces are empty: {}), then a transition back to the same state is
			implied. Instead of "P", either L"accept" or L"reject" can be used;
			these are two states that always exist in the Turing machine
			generated by Turing Tzarpit.

			If the simulation transitions to L"accept", the simulation halts and
			accepts the input. If the simulation transitions to L"reject", the
			simulation halts and rejects the input. For the simulation to reach
			the L"accept" state, at least one transition in at least one state
			must explicitly declare a transition to L"accept". By contrast, any
			state that describes no valid transition for the current
			configuration of the Turing machine simulation implicitly
			transitions to L"reject", so it is redundant to declare transitions
			to L"reject" explicitly.
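
	The transition semantics described above can be sketched as a single
	step function in Python. This is a minimal sketch, not the generated
	simulator: the table layout (state, read) -> (write, move, target) and the
	use of None for empty braces are assumptions made for illustration, and
	the sketch does not handle the head leaving the tape, which this
	reference does not specify.

```python
# Hypothetical table layout: (state, read_char) -> (write_char, move, target).
# A target of None models empty braces {}, i.e. stay in the same state.
MOVES = {"R": 1, "L": -1, "S": 0}


def step(table, state, tape, head):
    """Apply one transition in place and return the new (state, head)."""
    rule = table.get((state, tape[head]))
    if rule is None:
        # No transition matches the current configuration: implicit reject.
        return "reject", head
    write, move, target = rule
    tape[head] = write          # write the second "C"
    head += MOVES[move]         # move right, left, or stay
    return (state if target is None else target), head
```

	A state with the single clause "if A -> B , R {}" would then rewrite an A
	to a B, move right, and stay in the same state, while any other symbol
	under the head sends the machine to reject.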

	The following is an example of a Turing Tzarpit program that accepts
		exactly the strings matched by the regular expression "A*B":

		#start start_state
		#empty _
		#cells 500

		state start_state
			if A -> A , R
			{}

			if B -> B , R
			{
				check_finish
			}

		state check_finish
			if _ -> _ , S
			{
				accept
			}
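
	To make the behavior of this example concrete, here is a small Python
	model of the same machine. It is a sketch under stated assumptions rather
	than the compiler's output: the table encoding, the run function, and the
	decision to return the current state when the step limit is hit are all
	hypothetical.

```python
# Hand-translated transition table for the example above (hypothetical encoding).
# (state, read) -> (write, move, target); a target of None means "stay in this state".
A_STAR_B = {
    ("start_state", "A"): ("A", "R", None),
    ("start_state", "B"): ("B", "R", "check_finish"),
    ("check_finish", "_"): ("_", "S", "accept"),
}
MOVES = {"R": 1, "L": -1, "S": 0}


def run(table, start, text, empty="_", cells=1000, steps=1000):
    """Simulate until accept, reject, or the step limit; return the final state."""
    # Input goes at the beginning of the tape; the rest holds the empty symbol.
    tape = list(text) + [empty] * max(0, cells - len(text))
    state, head, taken = start, 0, 0
    while state not in ("accept", "reject") and taken < steps:
        rule = table.get((state, tape[head]))
        if rule is None:
            state = "reject"    # implicit reject: no matching transition
            break
        write, move, target = rule
        tape[head] = write
        head += MOVES[move]     # note: tape bounds are not checked in this sketch
        if target is not None:
            state = target
        taken += 1
    return state


if __name__ == "__main__":
    for word in ["B", "AAB", "AA", "ABA", ""]:
        print(repr(word), run(A_STAR_B, "start_state", word, cells=500))
```

	Under this model, "B" and "AAB" end in accept, while "AA", "ABA", and the
	empty input end in reject because some state has no matching transition.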

	The executable produced by compiling a Turing Tzarpit source file behaves
		as follows:

		- Running the executable by itself will act as though the simulation is
			run without any input.
		- Running the executable along with one extra argument will act as
			though the argument is the string input to the Turing machine,
			placing the input onto the beginning of the tape.
		- Running the executable along with two extra arguments will act as
			though the first argument is the string input to the Turing machine,
			placing the input onto the beginning of the tape. The second
			argument acts as a flag to the simulator that it should print out
			configuration information at every transition. 
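
	The three invocation forms above can be pictured with the following
	Python sketch. It is illustrative only: parse_args and initial_tape are
	hypothetical names, and treating the mere presence of a second argument
	as the flag (regardless of its value) is an assumption, since this
	reference does not say what the flag must contain.

```python
def parse_args(argv):
    """Return (input_string, verbose) from an argv-style list.

    argv[0] is the program name, as in most command-line conventions.
    """
    extra = argv[1:]
    text = extra[0] if len(extra) >= 1 else ""   # no argument: run without input
    verbose = len(extra) >= 2                    # assumption: any second argument enables tracing
    return text, verbose


def initial_tape(text, empty="_", cells=1000):
    """Place the input at the beginning of the tape; fill the rest with the empty symbol."""
    return list(text) + [empty] * max(0, cells - len(text))
```

	With these helpers, an invocation with the single argument AAB yields the
	input "AAB" with tracing off, and a four-cell tape for the input "AB"
	would be A, B, _, _.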

	Additional notes:

		This Turing Tzarpit compiler recognizes and ignores both single- and
		multi-line C-style comments.
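
		As a rough illustration of what ignoring comments means, the Python
		sketch below removes them with a regular expression. It assumes that
		"single-line" refers to the // form and "multi-line" to the /* */
		form, and it is not how the compiler itself is known to handle them;
		strip_comments is a hypothetical name.

```python
import re

# Matches /* ... */ (possibly spanning lines) or // up to the end of the line.
_COMMENT = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)


def strip_comments(source):
    """Return the source text with C-style comments removed."""
    return _COMMENT.sub("", source)
```

		Newlines that end a // comment are kept, so line structure outside
		of multi-line comments is preserved.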
		
