(c) Dmitry Ponyatov <dponyatov@gmail.com> , GNU LesserGPL, 2017
github: https://github.com/ponyatov/DLR
PDF for mobile: https://github.com/ponyatov/DLR/releases/latest
online manual (preview): http://ponyatov.github.io/DLR/
I have some trouble with the latex2html converter, so the HTML version is for preview only; download the full .pdf with working links, adapted for reading on mobile devices and slide projectors.
Rich manual and support library for writing dynamic script languages
    import sys

    # command set
    def nop(): pass          # do nothing
    def bye(): sys.exit(0)   # stop system

    # we use the Python parser and a Python list as program memory
    program = [ nop, bye ]

    # interpreter
    def interpreter():
        Ip = 0
        while True:
            assert Ip < len(program)
            command = program[Ip]
            print '%.4X' % Ip, command   # trace
            Ip += 1
            command()

    if __name__ == '__main__':
        interpreter()

0000 <function nop at 0x7ff7bb790b18>
0001 <function bye at 0x7ff7bb790c08>
    import sys

    class VM:
        # command set
        def nop(self): pass              # do nothing
        def bye(self): self._bye = True  # stop single VM only
        # sequence interpreter
        def interpreter(self):
            while not self._bye:
                assert self.Ip < len(self.program)
                command = self.program[self.Ip]
                print '%.4X' % self.Ip, command   # trace
                self.Ip += 1
                command(self)
        # virtual machine constructor
        def __init__(self, P=[]):
            self.program = P        # load program
            self.Ip = 0             # set instruction pointer
            self._bye = False       # stop interpreter flag
            self.interpreter()      # run interpreter

    if __name__ == '__main__':
        VM([ VM.nop, VM.bye ])      # every command needs the VM. prefix

0000 <unbound method VM.nop>
0001 <unbound method VM.bye>
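The class-based interpreter above can be condensed into a runnable sketch. This is not the book's exact code, just a minimal Python 3 rendition; the `trace` attribute is my own addition (the trace is collected in a list instead of printed):

```python
# Minimal Python 3 sketch of the command-list VM (hypothetical names).
class VM:
    def nop(self):                # do nothing
        pass
    def bye(self):                # stop this VM only
        self._bye = True
    def __init__(self, P=()):
        self.program = list(P)    # load program
        self.Ip = 0               # instruction pointer
        self._bye = False         # stop flag
        self.trace = []           # collected trace instead of print
        self.interpreter()
    def interpreter(self):
        while not self._bye:
            assert self.Ip < len(self.program)
            command = self.program[self.Ip]              # FETCH
            self.trace.append((self.Ip, command.__name__))
            self.Ip += 1                                 # to next command
            command(self)                                # EXECUTE

vm = VM([VM.nop, VM.bye])
print(vm.trace)   # [(0, 'nop'), (1, 'bye')]
```

In Python 3, `VM.nop` is a plain function, so the "unbound method" style `command(self)` call still works unchanged.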
    class VM:
        # register file shared between VM instances:
        # a VM can run multiple programs,
        # but it has only one register set
        Rg = [0,1,2,3,4,5,6,7]

Commands operating on registers need a more complex encoding in \term{program memory}: the opcode, plus one or more register numbers / data items:
    class VM:
        def ld(self):
            ' load register '
            assert self.Ip+2 < len(self.program)   # check Ip
            # get register number
            index = self.program[self.Ip]
            # skip _first_ command parameter
            self.Ip += 1
            # get data to be loaded
            data = self.program[self.Ip]
            # skip _second_ command parameter
            self.Ip += 1
            # load register
            self.Rg[index] = data
        def interpreter(self):
            ...
            print '%.4X' % self.Ip, command, self.Rg   # trace with registers

    if __name__ == '__main__':
        VM([                       # every command needs the VM. prefix
            VM.ld, 1, 'Rg[1]',     # Rg[1] <- 'string'
            VM.nop, VM.bye ])

0000 <unbound method VM.ld>  [0, 1, 2, 3, 4, 5, 6, 7]
0003 <unbound method VM.nop> [0, 'Rg[1]', 2, 3, 4, 5, 6, 7]
0004 <unbound method VM.bye> [0, 'Rg[1]', 2, 3, 4, 5, 6, 7]
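The variable-length encoding (opcode followed by inline operands) can be exercised end to end. A self-contained Python 3 sketch, hypothetical names, of the same `ld` decoding:

```python
# Sketch of a variable-length command: `ld` consumes two inline
# parameters (register number, value) from program memory.
class VM:
    def __init__(self, P):
        self.program = list(P)
        self.Rg = [0, 1, 2, 3, 4, 5, 6, 7]   # register file
        self.Ip = 0
        self._bye = False
        self.interpreter()
    def nop(self): pass
    def bye(self): self._bye = True
    def ld(self):
        # operands live right after the opcode in program memory
        assert self.Ip + 2 <= len(self.program)
        index = self.program[self.Ip]; self.Ip += 1   # register number
        data  = self.program[self.Ip]; self.Ip += 1   # value to load
        self.Rg[index] = data
    def interpreter(self):
        while not self._bye:
            command = self.program[self.Ip]
            self.Ip += 1
            command(self)

vm = VM([VM.ld, 1, 'Rg[1]', VM.nop, VM.bye])
print(vm.Rg)   # [0, 'Rg[1]', 2, 3, 4, 5, 6, 7]
```

Note how `ld` itself advances `Ip` past its parameters; that is what makes the instruction variable-length.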
\term{Registers} are as fast and native for a hardware CPU as they are \emph{extra slow and ineffective for software interpretation}: every data operation requires extra fetching and decoding of register numbers and data from program memory. So we can use registers in an interpreter only if we are going to play with compilation or profiling for some real hardware or simulated CPU machine code, but never in a normal interpreter.
    <command> <addr1> <addr2> <addr3>

denotes that the computer must do something with addr1 and addr2 and put the result into addr3 in memory.
    class VM:
        D = []   # shared data stack
        R = []   # CALL/RET return stack
    VM([ VM.ld, 1, 'Rg[1]',
         VM.nop, VM.bye ])

but you must use VM. prefixes, and the most annoying thing is manual address computation for JMP ((un)conditional jump to address, used in all loop and if/else structures) commands.

We will use David Beazley's \href{http://www.dabeaz.com/ply/}{PLY parser generator library} for writing a tiny assembler-like language able to process VM commands, labels, and simple control structures. PLY is an acronym for Python Lex-Yacc, as it is an implementation of the lex and yacc parsing tools for Python.

PLY install:

    $ sudo pip install ply

or on Debian Linux:

    $ sudo apt install python-ply

A typical syntax parser consists of two parts: the lexer and the parser itself.
    VM('''
        R1 = 'Rg[1]'
        nop
        bye
    ''')

First we reorder the code and add a compiler(source) method:

    class VM:
        def __init__(self, P=''):
            self.compiler(P)        # run parser/compiler
            self.interpreter()      # run interpreter
        def interpreter(self):
            self._bye = False       # stop flag
            while not self._bye:
                ...
                command = self.program[self.Ip]   # FETCH command
                ...
                self.Ip += 1                      # to next command
                command()                         # DECODE/EXECUTE
        def compiler(self,src):
            # set instruction pointer
            # (we will change it when moving the entry point)
            self.Ip = 0
            # (we don't have a parser yet)
            self.program = [ self.nop, self.bye ]

0000 <bound method VM.nop of <__main__.VM instance at 0x02280378>> [0, 1, 2, 3, 4, 5, 6, 7]
0001 <bound method VM.bye of <__main__.VM instance at 0x02280378>> [0, 1, 2, 3, 4, 5, 6, 7]
    import ply.lex  as lex
    import ply.yacc as yacc

For the first time we implement the lexer only, to view what \term{lexeme}s we will get at the lexing stage. Try to build a lexer using the ply.lex class:

    def compiler(self,src):
        ...
        lexer = lex.lex()

ERROR: No token list is defined
ERROR: No rules of the form t_rulename are defined
ERROR: No rules defined for state 'INITIAL'
Traceback (most recent call last):
  File "C:\Python\lib\site-packages\ply\lex.py", line 910, in lex
    raise SyntaxError("Can't build lexer")

For the lexer we need a tokens[] list containing the token types, t_xxx() regexp/action rules for every token type, and a t_error() lexer error callback function:

    def compiler(self,src):
        ...
        self.lexer(src)
    def lexer(self,src):
        lexer = lex.lex()

    def lexer(self,src):
        # token types
        tokens = ['COMMAND']
        # regexp/action rules
        def t_COMMAND(t):
            r'[a-z]+'
            return t
        # required lexer error callback
        def t_error(t): raise SyntaxError('lexer: %s' % t)
        # create ply.lex object
        lexer = lex.lex()
        # feed source code
        lexer.input(src)
        # get next token
        while True: print lexer.token()
SyntaxError: lexer: LexToken(error,"\n  R1 = 'Rg[1]'\n  nop\n  bye\n\t",1,0)

In the error report you can see the problematic symbol at the first position, so we need to drop space symbols:

    def lexer(self,src):
        # regexp/action rules
        t_ignore = ' \t\r\n'   # drop spaces

SyntaxError: lexer: LexToken(error,"R1 = 'Rg[1]'\n  nop\n  bye\n\t",1,9)

Look at the last two numbers: this is line number = 1 and lexer position = 9. Add extra empty lines at the start of the source string: something strange, the line number does not change. This is because PLY does not track the end-of-line character; you must do it yourself:

    def lexer(self,src):
        # regexp/action rules
        t_ignore = ' \t\r'       # drop spaces (no EOL)
        def t_newline(t):        # special rule for EOL
            r'\n'
            t.lexer.lineno += 1  # increment line counter
            # do not return a token,
            # it will be ignored by the parser
SyntaxError: lexer: LexToken(error,"R1 = 'Rg[1]'\n  nop\n  bye\n\t",4,11)

Line numbering now works OK; the lexer position counts how many characters were processed by the lexer in total.

Add register parsing, and return a modified token with the matched string replaced by the register number:

    def lexer(self,src):
        # token types
        tokens = ['COMMAND','REGISTER']
        # regexp/action rules
        def t_REGISTER(t):
            r'R[0-9]+'
            t.value = int(t.value[1:])
            return t

LexToken(REGISTER,1,4,11)   ### format: LexToken(type,value,lineno,lexpos)
SyntaxError: lexer: LexToken(error,"= ...

Add comment lexing: a comment starts with #:
    # ===== lexer code section =====
    t_ignore = ' \t\r'           # drop spaces (no EOL)
    t_ignore_COMMENT = r'\#.+'   # line comment

Lexer rules can be defined in two forms: a t_xxx(t) function with the regexp defined as its __doc__ docstring value, or a t_yyy = r'...' string.

Using regexp t\_strings you have no control over the order of lexer rule matching, and this is a big disadvantage in cases where operators like + ++ = == exist in the language syntax. We will use one t\_string as a sample, but it is good practice to use functions only.

    def lexer(self,src):
        # token types
        tokens = ['COMMAND','REGISTER','EQ']
        # regexp/action rules
        t_EQ = r'='

LexToken(REGISTER,1,4,11)
LexToken(EQ,'=',4,14)

\subsection{Lexing strings (lexer states)}

String lexing is a very special case. Using \term{string literal}s we want to be able to use some standard \term{escape sequences} like \r \t \n \xFF \u1234. For example we change the program source; note the r''' prefix (a Python ``raw string'') and the \t inserted as an escape sequence:

    VM(r'''
        R1 = 'R\t[1]'
        nop
        bye
    ''')

Strings can be parsed using the lexer itself, switching between multiple \term{lexer state}s: each \emph{lexer state defines its own set of active tokens and rules}. The main state has the INITIAL name. First define the extra states:

    def lexer(self,src):
        # extra lexer states
        states = (('string','exclusive'),)   # don't forget the comma
ERROR: No rules defined for state 'string'

We need at least one rule; the first candidate is the EOL rule: line numbers must be counted through the whole source in ANY state:

    # regexp/action rules (ANY)
    def t_ANY_newline(t):   # special rule for EOL
        r'\n'
        t.lexer.lineno += 1

WARNING: No error rule is defined for exclusive state 'string'
WARNING: No ignore rule is defined for exclusive state 'string'

    # regexp/action rules (ANY)
    # required lexer error callback
    def t_ANY_error(t): raise SyntaxError('lexer: %s' % t)
    # regexp/action rules (STRING)
    t_string_ignore = ''   # don't ignore anything

For moving between states we need mode switching sequences:

    # regexp/action rules (INITIAL)
    def t_begin_string(t):
        r'\''
        t.lexer.push_state('string')
    # regexp/action rules (STRING)
    def t_string_end(t):
        r'\''
        t.lexer.pop_state()   # return to INITIAL

Any char in string state must be stored somewhere, forming the resulting string. We can do it in the lexer object as a custom attribute:

    # token types
    tokens = ['COMMAND','REGISTER','EQ','STRING']
    # regexp/action rules (INITIAL)
    def t_begin_string(t):
        r'\''
        t.lexer.push_state('string')
        t.lexer.LexString = ''                # initialize accumulator
    # regexp/action rules (STRING)
    def t_string_char(t):
        r'.'
        t.lexer.LexString += t.value          # accumulate
    def t_string_end(t):
        r'\''
        t.lexer.pop_state()                   # return to INITIAL
        t.type  = 'STRING'                    # override token type
        t.value = t.lexer.LexString           # accumulator to value
        return t                              # return resulting string token

And finally add the special \term{escape sequences}:
    # regexp/action rules (STRING)
    def t_string_tab(t):
        r'\\t'
        t.lexer.LexString += '\t'
    def t_string_cr(t):
        r'\\r'
        t.lexer.LexString += '\r'
    def t_string_lf(t):
        r'\\n'
        t.lexer.LexString += '\n'
    def t_string_char(t):   # must be the last rule
        ...

LexToken(REGISTER,1,2,9)
LexToken(EQ,'=',2,12)
LexToken(STRING,'R\t[1]',2,21)
LexToken(COMMAND,'nop',3,31)
LexToken(COMMAND,'bye',4,43)
None
None
...

\subsection{End of file lexing}

The end of the source can be processed in two variants: a t_eof() rule, or None returned by the next lex.token() call. Just fix the lexer print loop:

    def lexer(self,src):
        # get next token
        while True:
            next_token = lexer.token()
            if not next_token: break   # on None
            print next_token

LexToken(REGISTER,1,2,9)
LexToken(EQ,'=',2,12)
LexToken(STRING,'R\t[1]',2,21)
LexToken(COMMAND,'nop',3,31)
LexToken(COMMAND,'bye',4,43)

\begin{center}{\Huge Lexer done!}\end{center}
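For comparison, the finished lexer's behaviour (spaces dropped, manual line counting, register numbers, and a quote-triggered string state with \t \r \n escapes) can be imitated with nothing but the stdlib re module. A hypothetical sketch, not PLY:

```python
import re

# Two "states": normal regexp matching, plus a string mode entered on
# a quote, where characters are accumulated and escapes translated.
ESCAPES = {'t': '\t', 'r': '\r', 'n': '\n'}
TOKEN_RE = re.compile(r"(R[0-9]+)|(=)|([a-z]+)|(\n)|([ \t\r]+)|(')|(.)")

def tokenize(src):
    tokens, lineno, pos = [], 1, 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        pos = m.end()
        reg, eq, cmd, nl, sp, quote, err = m.groups()
        if nl: lineno += 1                     # track EOL ourselves
        elif sp: pass                          # drop spaces
        elif reg: tokens.append(('REGISTER', int(reg[1:]), lineno))
        elif eq: tokens.append(('EQ', '=', lineno))
        elif cmd: tokens.append(('COMMAND', cmd, lineno))
        elif quote:                            # enter "string" state
            acc = ''
            while src[pos] != "'":
                if src[pos] == '\\':           # escape sequence
                    acc += ESCAPES.get(src[pos+1], src[pos+1]); pos += 2
                else:
                    acc += src[pos]; pos += 1
            pos += 1                           # leave "string" state
            tokens.append(('STRING', acc, lineno))
        else:
            raise SyntaxError('lexer: %r line %d' % (err, lineno))
    return tokens

print(tokenize("R1 = 'R\\t[1]'\nnop\nbye"))
```

The string loop is exactly the push_state/pop_state idea, just written by hand: while inside the quotes, a different set of rules applies.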
    def compiler(self,src):
        # ===== init code section =====
        # set instruction pointer entry point
        self.Ip = 0
        # compile entry code
        self.program = [ self.nop ]
        # ===== lexer code section =====
        ...
        # ===== parser/compiler code section =====
        ...
        # ===== compile final code =====
        self.program += [ self.bye ]

    # ===== lexer code section =====
    # extra lexer states
    states = (('string','exclusive'),)
    # token types
    tokens = ['COMMAND','REGISTER','EQ','STRING']
    ...
    # ===== parser/compiler code section =====
    # create ply.yacc object, without extra files
    parser = yacc.yacc(debug=False,write_tables=None)
    # feed & parse source code using lexer
    parser.parse(src,lexer)

Now we see the term \term{compile} for the first time, used in a couple with \term{parse}. This is because we use a special technique called \term{syntax-directed translation}: while the parser traverses the language's syntactic structures, \emph{every syntax rule executes compiler code on rule match}.

And the \term{compile} term in this case means nothing more than adding machine commands (bytecodes, or tiny executable elements in our demo case) to the \term{compiler buffer}, i.e. self.program[] memory.

This method is very suitable for simple \term{imperative languages} (languages that say what to do step by step) like assemblers, which can be implemented using only global data structures, like symbol tables and lists of defined functions, and don't require transferring or computing data between the nodes of a tree-represented program (\term{attribute grammar} methods).
ERROR: no rules of the form p_rulename are defined

As for the lexer, we need a set of p_rules.

\subsection{Backus--Naur form}
For the lexer we used \term{regular expressions}: very understandable and easy-to-use text templates, until we try to match such simple elements as numbers and identifiers. But to specify a \term{programming language grammar}, \emph{regexps can't match recursive nested elements}, like matching expressions with groups of brackets [dragon]. And now a metalanguage (a language that describes another language) comes into play, specially designed to describe artificial language grammars: \term{BNF}, or \term{Backus-Naur form}.

Our assembly language can be described as:

    program : <empty>
    program : program command   { /* action */ memory += $2 }

or, in a form where rules with the same left side are grouped with the or-element, the yacc BNF variant:

    program :
            | program command

Note the recursion: program refers to program itself as a subelement. In this production we describe that \term{program}$_0$ can be empty, \term{or} (\verb$|$) can be concatenated from (sub)\term{program}$_1$ followed by \term{command}$_2$. The parser code will \term{recursively} try to match program$_1$ using the same rule, until the recursion ends up at the empty program : part.
Every time a parser (sub)rule matches, the code in {action} will be executed. This code (\cpp, \py, or any other language your \term{parser code generator} supports) can do anything you want. It can use indexes to access rule elements: the \$0 index returns the result (\$0 corresponds to the left side of the rule, i.e. the nonterminal value), \$1 stands for the program$_1$ value and \$2 for the command value. For example, with the tiny ``nop bye'' program and this grammar:

    program : <empty>           { $0 = "what to do:\n" }
    program : program command   { $0 = $1 + $2 }
    command : NOP               { $0 = "do nothing\n" }
    command : BYE               { $0 = "stop system\n" }

the parser will start from the topmost \term{start production} (all rules with equal left nonterminals will be grouped), trying to match every rule top-down in a \term{greedy} way: match the longest right part with depth-first search. As a result the parser will return you the string:

    what to do:
    do nothing
    stop system
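Since the left-recursive rule `program : | program command` consumes exactly one command per step, a hand-written equivalent is just a loop. A hypothetical Python sketch of the same actions:

```python
# Actions from the grammar above: each matched `command` contributes
# one string, the empty `program` rule contributes the header.
ACTIONS = {'NOP': 'do nothing\n', 'BYE': 'stop system\n'}

def parse_program(tokens):
    result = 'what to do:\n'          # program : <empty> action
    for tok in tokens:                # one `program command` step each
        if tok not in ACTIONS:
            raise SyntaxError('parser: %r' % (tok,))
        result += ACTIONS[tok]        # $0 = $1 + $2
    return result

print(parse_program(['NOP', 'BYE']))
```

This iterative form is exactly what an LALR parser like yacc does with left recursion: it reduces one `command` at a time, keeping the stack flat.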
At this point I tried to write out the parsing process step by step, but it is too cryptic, so we skip this trace with its parser stack pushing and element shifting. But we should note that every time the parser needs a terminal, the \emph{parser will automatically call the lexer} to get the next token to match against.

Returning to our sheep: we are not so lucky with BNF syntax usage in \py. The PLY parsing library uses a not-so-short grammar syntax: we must define a special function for every production, giving the BNF in its docstring:
    # ===== parser/compiler code section =====
    # grammar rules
    def p_program_epsilon(p):
        ' program : '
        p[0] = 'what to do:\n'       # $0 = ...
    def p_program_recursive(p):
        ' program : program command '
        p[0] = p[1] + p[2]           # $0 = $1 + $2
    # required parser error callback
    def p_error(p): raise SyntaxError('parser: %s' % p)
    # create ply.yacc object, without extra files
    parser = yacc.yacc(debug=False,write_tables=None)
    # feed & parse source code using lexer
    parser.parse(src,lexer)

    VM(' nop bye ')

ERROR: Symbol 'command' used, but not defined
WARNING: Token 'EQ' defined, but not used
WARNING: Token 'REGISTER' defined, but not used
WARNING: Token 'COMMAND' defined, but not used
WARNING: Token 'STRING' defined, but not used
WARNING: There are 4 unused tokens

We need the p_error(p) error callback function, which will be called on syntax errors.

Here we have some problem: our lexer returns all commands as one universal COMMAND token, so we need to analyze its value, or just fix the lexer:

    # ===== lexer code section =====
    # token types
    tokens = ['NOP','BYE','REGISTER','EQ','STRING']
    # replace t_COMMAND by:
    def t_NOP(t):
        r'nop'
        return t
    def t_BYE(t):
        r'bye'
        return t

As you see, this PLY code is not compact and easy to read, and one of the things we are going to do much, much later is to make a special language for writing parsers with a lighter and easier-to-read syntax. Mark this as a TODO for DLR.
    def p_program_epsilon(p):
        ' program : '
    def p_program_recursive(p):
        ' program : program command '
    def p_command_NOP(p):
        ' command : NOP '
        p[0] = 'do nothing\n'
    def p_command_BYE(p):
        ' command : BYE '
        p[0] = 'stop system\n'
    ...
    # feed & parse source code using lexer
    print parser.parse(src,lexer)

Here we added a print command to see that the parser can return values.

WARNING: Token 'STRING' defined, but not used
WARNING: Token 'EQ' defined, but not used
WARNING: Token 'REGISTER' defined, but not used
WARNING: There are 3 unused tokens
what to do:
do nothing
stop system
0000 <bound method VM.nop of <__main__.VM instance at 0x023E6918>> [0, 1, 2, 3, 4, 5, 6, 7]
0001 <bound method VM.bye of <__main__.VM instance at 0x023E6918>> [0, 1, 2, 3, 4, 5, 6, 7]
Now make compiler() emit the real code:

    def compiler(self,src):
        # ===== init code section =====
        # set instruction pointer entry point
        self.Ip = 0
        # clean up program memory
        self.program = []
        # ===== parser/compiler code section =====
        def p_program_epsilon(p):
            ' program : '
        def p_program_recursive(p):
            ' program : program command '
        def p_command_NOP(p):
            ' command : NOP '
            self.program.append(self.nop)
        def p_command_BYE(p):
            ' command : BYE '
            self.program.append(self.bye)
        # feed & parse source code using lexer
        parser.parse(src,lexer)

Now the compiler does not add any entry code, and the traced code is our program.

WARNING: Token 'STRING' defined, but not used
WARNING: Token 'EQ' defined, but not used
WARNING: Token 'REGISTER' defined, but not used
0000 <bound method VM.nop of <__main__.VM>> [0, .., 7]
0001 <bound method VM.bye of <__main__.VM>> [0, .., 7]

We have some warnings about terminals not used in our grammar; they are linked with the register load command we omitted. Let's add this command's grammar. First recover the full sample program:
    if __name__ == '__main__':
        VM(r'''    # use r''': we have escapes in the string constant
            R1 = 'R\t[1]'
            nop
            bye
        ''')

Remember the REGISTER, EQ and STRING tokens we have already defined in the lexer:
    def p_command_R_load(p):
        ' command : REGISTER EQ constant '
        # compile ld command opcode
        self.program.append(self.ld)
        # compile register number using the value
        # from terminal REGISTER at p[1] in the production
        self.program.append(p[1])
        # compile constant
        self.program.append(p[3])
    def p_constant_STRING(p):
        ' constant : STRING '
        p[0] = p[1]

We defined a constant nonterminal for later use: a constant can be not only a string, but also a number, or a pointer to any object.

0000 <unbound method VM.ld>  [0, 1, 2, 3, 4, 5, 6, 7]
0003 <unbound method VM.nop> [0, 'R\t[1]', 2, 3, 4, 5, 6, 7]
0004 <unbound method VM.bye> [0, 'R\t[1]', 2, 3, 4, 5, 6, 7]
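Syntax-directed translation in miniature: a hypothetical stdlib-only sketch where each matched "rule" immediately appends opcodes to the compile buffer (a crude line-splitting parser stands in for yacc; all names are made up for the example):

```python
# Opcode placeholders for the sketch.
def ld(vm): pass
def nop(vm): pass
def bye(vm): pass

CMDS = {'nop': nop, 'bye': bye}

def compile_line(line, program):
    words = line.split()
    if len(words) == 3 and words[1] == '=':      # command : REGISTER EQ constant
        # opcode, register number, constant -- three program cells
        program += [ld, int(words[0][1:]), words[2].strip("'")]
    elif len(words) == 1 and words[0] in CMDS:   # command : NOP | BYE
        program.append(CMDS[words[0]])
    elif words:
        raise SyntaxError(line)

program = []
for line in ["R1 = 'hello'", "nop", "bye"]:
    compile_line(line, program)
print([getattr(op, '__name__', op) for op in program])
# ['ld', 1, 'hello', 'nop', 'bye']
```

The key point is the same as in the PLY version: nothing is built up front; "compiling" is just appending to `program` as each rule fires.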
Returning to \F syntax: it is very simple.

We are not going to write yet another tutorial on \F here. It is better if you take a break now and look into the first chapters of a really cool book: Leo Brodie's Starting \F [starting]. You will be amazed by \F's miraculous simplicity, and scared by its dragon's den [dragon] of zen syntax, data stack bit-banging, and functional limitations (no float numbers, no files; do you want data structures? make arrays yourself!).
Also, we will sidestep the ANS Forth standard in some places; see the notes.

Let's start from the main \F word: REPL (Read-Eval-Print-Loop), implementing a command shell (still without compilation and error checking):

    : INTERPRET                 \ REPL interpreter loop
      begin
                                \ get next word name from input stream
        word    ( -- str:wordname )
                                \ find word entry point
        find    ( str:wordname -- addr:xt )
                                \ execute found xt (execution token)
        execute ( xt -- )
      again ;

Don't forget how we should pass this source for compilation and run: enclose it in VM(''' ... code ... ''') in the \py source.
\subsection{Expanding VM using inheritance}

The current parser will not process this source, as the syntax has notably changed. We will not break the VM demo parser; instead we use \emph{VM class inheritance to redefine the parser} for the special \F dialect, and add some extra commands to the original VM. When you are going to implement your own language on top of the VM, you should go this way too.

    # it should look like this:
    class FORTH(VM):
        t_ignore_COMMENT = r'\#.*|\\.*|\(.*?\)'   # comments

And here we have a problem, Houston! All elements of the parser are encapsulated in the compiler() method using closures, and oops, we must rework the VM class itself.

How we can do it, you can see in this short sample: a lexer-only variant of the simplest Hello World parser using inheritance:

    import sys ; sys.path.insert(0,'./ply')
    import ply.lex  as lex
    import ply.yacc as yacc
    # issue 142 fix: enable patches
    # required for parser class inheritance
    ply_class_inherit = True

    class VM:
        tokens = ['WORD']
        t_ignore = ' \t\r\n'
        def t_WORD(self,t):
            r'[a-z]+'
            print t
        def t_error(self,t): raise SyntaxError(t)
        def __init__(self,src=''):
            self.lexer = lex.lex(object=self)
            self.lexer.input(src)
            while self.lexer.token(): pass

    class FORTH(VM):
        t_ignore = ' \t\r'
        def t_newline(self,t):
            r'\n'
            t.lexer.lineno += 1

    FORTH(''' hello world ''')

I faced a problem in PLY: its lexer generator has trouble with t_error uniqueness validation if there is more than one t_error function (or method) defined in one module. So I made a tiny fix in ply/lex.py, see issue 142, but my pull request is still not merged, and you must clone the patched PLY right here:

    ~/DLR$ git clone -o ponyatov https://github.com/ponyatov/ply.git

To import the patched PLY we need to push it into the search path before /usr/lib:

    import sys ; sys.path.insert(0,'./ply')
    import ply.lex as lex ; ply_class_inherit = True   # use fix
Going to rewrite the VM, we need the t_rules as class-level static strings and methods, moved out of compiler():

    class VM:
        # ===== lexer code section =====
        t_ignore = ' \t\r'            # drop spaces (no EOL)
        t_ignore_COMMENT = r'\#.*'    # line comment
        # regexp/action rules (ANY state)
        def t_ANY_newline(self,t):    # special rule for EOL
            r'\n'
            t.lexer.lineno += 1       # increment line counter
            # do not return a token, it will be ignored by the parser
        # token types
        tokens = ['NOP','BYE','REGISTER','EQ','STRING']
        # required lexer error callback, must be a method
        def t_ANY_error(self,t): raise SyntaxError('lexer: %s' % t)

Expand the comment regexp and the token set in FORTH:
    class FORTH(VM):
        t_ignore_COMMENT = r'\#.*|\\.*|\(.*?\)'   # comments
        tokens = ['NOP','BYE','REGISTER','EQ','STRING',
                  'COLON','SEMICOLON','BEGIN','AGAIN']

Change the lexer creation:

    class VM:
        def compiler(self,src):
            ...
            # create lexer object
            # note: we point at the object (instance!) with the rules
            self.lexer = lex.lex(object=self)
            # feed source code
            self.lexer.input(src)

and the parser the same way:

    class VM:
        def compiler(self,src):
            # create ply.yacc object, without extra files
            parser = yacc.yacc(debug=False,write_tables=None,
                               module=self)   # here we must point at the instance
            # parse source code using the lexer
            parser.parse(src,self.lexer)

In \F syntax we have no ability to distinguish between VM commands defined in the vocabulary and yet undefined words using only regexps. So we need to group these tokens into one universal ID token, and do some lookup to detect which token type we have at every point.
    class FORTH(VM):
        tokens = ['ID','COLON','SEMICOLON','BEGIN','AGAIN']

We removed some token types, and PLY will give us a report of what's wrong. (PLY was developed by David M. Beazley specially for student learning purposes, so good verbosity is a key feature of this parser generator. As we are going to build a power metaprogramming system, which can be thought of as an \textit{interactive compiler \& algorithm development environment}, we should go this way too: in the parser generators and other tools we make later, visualization and verbosity are must-haves.)

ERROR: /home/ponyatov/DLR/VM.py:62 : Symbol 'NOP' used, but not defined as a token or a rule
ERROR: /home/ponyatov/DLR/VM.py:65 : Symbol 'BYE' used, but not defined as a token or a rule
ERROR: /home/ponyatov/DLR/VM.py:68 : Symbol 'REGISTER' used, but not defined as a token or a rule
ERROR: /home/ponyatov/DLR/VM.py:68 : Symbol 'EQ' used, but not defined as a token or a rule
ERROR: /home/ponyatov/DLR/VM.py:73 : Symbol 'STRING' used, but not defined as a token or a rule
WARNING: Token 'AGAIN' defined, but not used
WARNING: Token 'BEGIN' defined, but not used
WARNING: Token 'SEMICOLON' defined, but not used
WARNING: Token 'COLON' defined, but not used
WARNING: Token 'ID' defined, but not used
WARNING: There are 5 unused tokens
ERROR: Infinite recursion detected for symbol 'constant'
ERROR: Infinite recursion detected for symbol 'command'
ply.yacc.YaccError: Unable to build parser

Here we see source line numbers pointing directly at the problematic grammar rules, and a list of unused tokens. So we need to drop the corresponding rules from the inherited class. The second tiny patch I've done for PLY lets us mark parser rules as deleted using = None assignment:

    class FORTH(VM):
        t_NOP = p_NOP = None
        t_BYE = p_BYE = None
        t_REGISTER = p_Rload = None
        p_constant_STRING = None

ERROR: /home/ponyatov/DLR/VM.py:60: Symbol 'command' used, but not defined as a token or a rule

We must define which commands we are going to see in our programs:
    # grammar overrides
    def p_ID(self,p):        ' command : ID '
    def p_COLON(self,p):     ' command : COLON '
    def p_SEMICOLON(self,p): ' command : SEMICOLON '
    def p_BEGIN(self,p):     ' command : BEGIN '
    def p_AGAIN(self,p):     ' command : AGAIN '

    # lexeme regexp overrides
    t_COLON = ':' ; t_SEMICOLON = ';'
    def t_BEGIN(self,t):
        r'begin'
        return t
    def t_AGAIN(self,t):
        r'again'
        return t
    def t_ID(self,t):   # this rule must be the last rule
        r'[a-zA-Z0-9_]+'
        return t
    assert self.Ip < len(self.program)

\subsection{Token overwrite for commands and \F words}

Let's add this fix at the beginning of the source and run a dummy program as usual:

    FORTH(r'''   # use r''': we have escapes in string constants
        nop
        bye
    ...

The base VM compiler detects these commands directly in the parser syntax rules, and compiles them. In the new inherited FORTH machine we disabled these rules, and moved all word-looking char sequences into one ID token. In the sample source you can see that such an ID can be a new word name, a VM command, or a control structure statement (in \F all control structures must be implemented using generic \F-words; we'll see how to make your own controls later, but for now we will use a parser-based method to show how to implement them in your assembly or compiler). In the ID lex rule we can detect what this ID really is, and use \term{terminal overwrite}.
We have to add some hint letting us check what a given word name is:

    class FORTH(VM):
        # command lookup table: string -> method
        cmd = { 'nop':VM.nop, 'bye':VM.bye }

and a vocabulary:

    class FORTH(VM):
        # vocabulary of all defined words
        voc = {}

and use both in the t_ID lexer rule:

    # add extra types for tokens
    tokens = ['ID','CMD','VOC',
              'COLON','SEMICOLON','BEGIN','AGAIN']
    def t_ID(self,t):   # this rule must be the last rule
        r'[a-zA-Z0-9_]+'
        # first look up in the vocabulary
        if t.value in self.voc: t.type = 'VOC'
        # then check if it is a command name
        if t.value in self.cmd: t.type = 'CMD'
        return t

Note the checking order: this variant lets us define new words with names equal to VM commands, and later in the \term{\F-assembler} /forthasm/ it will let you compile or execute a single VM command in an interactive \F system. And define new special rules in the grammar:
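As an aside, the ID-classification trick can be tried standalone. A hypothetical sketch with made-up table contents:

```python
# Terminal overwrite: every word-looking lexeme starts as ID,
# then dictionary lookups upgrade its token type.
cmd = {'nop': 'VM.nop', 'bye': 'VM.bye'}   # command lookup table
voc = {'INTERPRET': 2}                      # vocabulary of defined words

def classify(word):
    t = 'ID'                     # default: unknown new word name
    if word in voc: t = 'VOC'    # first look in the vocabulary
    if word in cmd: t = 'CMD'    # then command names (win on clash)
    return t

print([classify(w) for w in ('nop', 'INTERPRET', 'NEWWORD')])
# ['CMD', 'VOC', 'ID']
```

Because the command check runs last, a word defined in both tables is reported as CMD, which matches the checking-order note above.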
    def p_CMD(self,p):
        ' command : CMD '
        # compile the command using the cmd{} lookup table
        self.program.append(self.cmd[p[1]])
    def p_VOC(self,p):
        ' command : VOC '

The dummy program compiles and executes (check: VM.interpreter() uses command(self)), but we need to run the last defined word as the program entry point. To do this we must implement all the parser/compiler stuff we made as stubs.

\subsection{Implement : COLON ; definition}
    class FORTH(VM):
        # vocabulary of all defined words
        voc = {}
        def p_COLON(self,p):
            ' command : COLON ID '
            # store current compilation pointer into voc
            self.voc[p[2]] = len(self.program)
            print self.voc
        def p_ID(self,p):
            ' command : ID '

{'INTERPRET': 2}
0000 <unbound method VM.nop> [0, 1, 2, 3, 4, 5, 6, 7]
0001 <unbound method VM.bye> [0, 1, 2, 3, 4, 5, 6, 7]

What did we do? On : INTERPRET we \emph{marked the current compiler position} in the vocabulary. The number 2 is the address (index, program[2]) of the command that will be compiled next.

    class FORTH(VM):
        # command lookup table
        cmd = { 'nop':VM.nop, 'bye':VM.bye, 'ret':VM.ret }
        def p_SEMICOLON(self,p):
            ' command : SEMICOLON '
            self.program.append(self.cmd['ret'])

    class VM:
        def dump(self):
            print
            for i in range(len(self.program)):
                print '%.4X:' % i, self.program[i]
            print
        def __init__(self, P=''):
            self.compiler(P)     # run parser/compiler
            self.dump()          # dump compiled program
            self.interpreter()   # run interpreter

We added ret command compilation, run on ; in the source code. If you look at the program dump, you'll see that the INTERPRET item in the vocabulary points at the address of the word's first command, which here was compiled by ; (our parser still ignores all code between : and ;):

{'INTERPRET': 2}
0000: <unbound method VM.nop>
0001: <unbound method VM.bye>
0002: <unbound method VM.ret>

\subsection{CALL and RET commands}

As you can see from [starting] and especially from [threaded], a \F program is a tangle of spaghetti \term{threaded code} (distinguish this term from the \term{thread} term in \term{multitasking}), where every word uses other words via nested \term{call}s. Every time a call occurs, the current execution (interpreter) pointer is pushed onto a special \term{return stack}, and execution jumps to the required word's code. When the called word finishes its work, the ret command pops the \term{return address} from the stack and returns execution to the command after the call that was used (there is a variant of pure threaded code you can see in [thbell]: only addresses, without a call opcode, and a special address for ret).

    class VM:
        D = []   # data stack
        R = []   # CALL/RET return stack
        def call(self):
            # push return address (Ip points to the call parameter)
            self.R.append(self.Ip+1)
            self.Ip = self.program[self.Ip]   # jmp
        def ret(self):
            assert self.R                     # check non-empty
            self.Ip = self.R.pop()            # return to marked address

    class FORTH(VM):
        # command lookup table
        cmd = { 'nop':VM.nop, 'bye':VM.bye,
                'call':VM.call, 'ret':VM.ret }
        def p_VOC(self,p):
            ' command : VOC '
            self.program.append(VM.call)          # opcode
            self.program.append(self.voc[p[1]])   # cfa

Now we can implement and check calls for colon-defined words:
    : NOOP ;
    : INTERPRET     \ REPL interpreter loop
      NOOP
      ...

{'NOOP': 2, 'INTERPRET': 3}
0000: <unbound method VM.nop>
0001: <unbound method VM.bye>
NOOP:
0002: <unbound method VM.ret>
INTERPRET:
0003: <unbound method VM.call> 0002 : NOOP
0005: <unbound method VM.ret>

To execute this we must add some code; let us use the last defined word as the program entry point. To do this, we replace the first command with a call _entry:

    FORTH(r'''   # use r''': we have escapes in string constants
    \ test FORTH comment syntax for the inherited parser
    : NOOP ;
    : INTERPRET     \ REPL interpreter loop
      NOOP
    ''')

    class VM:
        def init_code(self):
            # set instruction pointer entry point
            self.Ip = 0
            # clean up program memory
            self.program = []
        def compiler(self,src):
            # ===== init code section =====
            self.init_code()

    class FORTH(VM):
        def init_code(self):
            VM.init_code(self)
            self.program.append(VM.call)   # call ...
            self.program.append(0)         # _entry = 1
            self.program.append(VM.bye)    # bye
        def p_COLON(self,p):
            ' command : COLON ID '
            # store current compilation pointer into voc
            # reset _entry to current cfa
            self.program[1] = self.voc[p[2]] = len(self.program)
            print self.voc

\subsection{Change tracing log output}

I have a strange error in the ret command in NOOP, so we need to change the tracing log generation: it must include both stacks, and have no info on registers, as \F doesn't use them.
.Making program dump we need fast conversion from current address into record i nvocabulary. So we must use reversed vocabulary to do reverse lookup fro maddr to associated label (word name, variable,\ldots) .Also we need to change logging in interpreter: we don't need registers, bu tneed stacks to be printed on every command (if trace enabled) .class VM : # return known label for given addres s def log_label(self,addr): return ' ' # dump command from given add r def log_command(self,addr) : A = add r # current comman d command = self.program[A ] # check if we have known label s L = self.log_label(A ) if L: print '\ n # print main command log tex t print '\n\ t # process commands with parameter s if command == VM.call : # target addres s T = self.program[A+1 ] # print target addr with known labe l print T,self.log_label(T) , A += 1 return A+1 # return next command addres s # log extra state (in interpreter trace ) def log_state(self) : print self.Rg,Note that
log_command
returns the address of the next command --- this is required as we have commands of variable length. The interpreter doesn't use the returned addr, as stepping is done inside the commands themselves, but in the
dump
method every next dumped addr must be computed by
log_command
.
    def interpreter(self):
        self._bye = False                   # stop flag
        while not self._bye :
            assert self.Ip < len(self.program)
            command = self.program[self.Ip] # FETCH command
            self.log_command(self.Ip)       # log command
            self.log_state()                # log state
            self.Ip += 1                    # to next command
            command(self)                   # DECODE/EXECUTE
    def dump(self):
        addr = 0
        # loop over program
        while addr < len(self.program):
            # addr: command <extra>
            addr = self.log_command(addr)
        print ; print '-'*55
class FORTH(VM) :
    # vocabulary of all defined words
    voc = {}
    # reversed vocabulary {addr:name} for fast label lookup
    revoc = {}
    def log_state(self):
        print 'R:%s' % self.R,
    def p_COLON(self,p):
        ' command : COLON ID '
        # store current compilation pointer into voc
        # reset _entry to current cfa
        self.program[1] = self.voc[p[2]] = len(self.program)
        # add reversed pair {addr:label}
        self.revoc[len(self.program)] = p[2]
        print self.voc
{'NOOP': 3, 'INTERPRET': 4}
0000 <unbound method VM.call> 4 INTERPRET
0002 <unbound method VM.bye>
NOOP:
0003 <unbound method VM.ret>
INTERPRET:
0004 <unbound method VM.call> 3 NOOP
0006 <unbound method VM.ret>
-------------------------------------------------------
0000 <unbound method VM.call> 4 INTERPRET R:[]
INTERPRET:
0004 <unbound method VM.call> 3 NOOP R:[2]
NOOP:
0003 <unbound method VM.ret> R:[2, 6]
0006 <unbound method VM.ret> R:[2]
0002 <unbound method VM.bye> R:[]
This trace log looks much better.
Also we can add an extra check for the call command: check that the parameter after the call opcode is an integer address
:
    def call(self):
        # push return address (Ip points to the call parameter)
        self.R.append(self.Ip+1)
        self.Ip = self.program[self.Ip]  # jmp
        assert type(self.Ip) == int      # addr must be integer
\subsection{Throw out registers}
\F is a pure stack language, so we throw the registers out\note{until we want to play with a self-made JIT}.
class FORTH(VM) :
    Rg = ld = None  # throw out registers
\subsection{Vocabulary structure}
You can see a strange
cfa
label in the listings above. In \F there is a set of fields every word in the vocabulary contains; only the
IMMED
flag is strictly required (it marks words executed in compilation mode).
The
JMP
command is equal to
goto
in a high-level language.
class VM :
    def jmp(self):
        # get addr from jmp parameter
        self.Ip = self.program[self.Ip]
        # check type: addr must be integer
        assert type(self.Ip) == int
        # check range
        assert self.Ip < len(self.program)
    def log_command(self,addr):
        ...
        if command in [ VM.jmp, VM.call ]:
            print T,self.log_label(T),
class FORTH(VM) :
    cmd = { 'jmp':VM.jmp, 'call':VM.call, 'ret':VM.ret,..
}
    def p_BEGIN(self,p):
        ' command : BEGIN '
        # mark Ip, pushing it on the return stack
        self.R.append(len(self.program))
    def p_AGAIN(self,p):
        ' command : AGAIN '
        # jmp opcode
        self.program.append(self.cmd['jmp'])
        # jmp parameter: pop the marked Ip
        self.program.append(self.R.pop())
INTERPRET:
0006 <unbound method VM.call> 5 NOOP
0008 <function word at 0x018A51B0>
0009 <function find at 0x018A51F0>
000A <unbound method VM.jmp> 8
000C <unbound method VM.ret>
(*)
word
and
find
will be defined later.
As you see in the listing, there is one
jmp
command, compiled by
again
at address 0x000A; it jumps to 0x0008, marked by
begin
between NOOP and word in the source code
.
\subsection{Parsing input stream and search in vocabulary}
class FORTH(VM) :
    PAD = list('hello world ')
    def word(self):
        # result string
        S = ''
        # spaces
        BL = ' \t\r\n'
        # skip leading spaces
        while True :
            # pop first char from input stream
            C = self.PAD.pop(0)
            # break loop on non-space char
            if C not in BL: break
        # save first found non-BL char
        S += C
        # collect until BL char
        while True :
            C = self.PAD.pop(0)
            if C in BL: break
            S += C
        # push collected string
        self.D.append(S)
    def find(self):
        # pop word name from data stack
        S = self.D.pop()
        try :
            # push cfa of found word
            self.D.append(self.voc[S])
            # push FOUND flag
            self.D.append(True)
        except KeyError :
            # on error push word name back
            self.D.append(S)
            # push NOT FOUND flag
            self.D.append(False)
    cmd = { 'word':word, 'find':find, ...}
You can fix the parser to raise an error on undefined commands/words:
    def p_ID(self,p):
        ' command : ID '
#        raise BaseException(p[1])
If we change the source
:
: hello ;
: NOOP ;
: INTERPRET \ REPL interpreter loop
  NOOP
  begin word find again
;
we'll get a trace with found
hello
and undefined
world
:
0007 <function word at 0x018F51B0> D:[] R:[2]
0008 <function find at 0x018F51F0> D:['hello'] R:[2]
0009 <unbound method VM.jmp> 7 D:[3, True] R:[2]
0007 <function word at 0x018F51B0> D:[3, True] R:[2]
0008 <function find at 0x018F51F0> D:[3, True, 'world'] R:[2]
0009 <unbound method VM.jmp> 7 D:[3, True, 'world', False] R:[2]
0007 <function word at 0x018F51B0> D:[3, True, 'world', False] R:[2]
\subsection{IF/ELSE/ENDIF and EXECUTE/ABORT
}
: INTERPRET \ REPL interpreter loop
  begin
    \ get next word name from input stream
    word  ( -- str:wordname )
    \ find word entry point
    find  ( addr:cfa true | str:wordname false )
    if    ( addr:cfa )
      \ call to addr from stack
      execute ( addr:cfa -- )
    else  ( str:wordname )
      \ dump state, stacks and restart
      abort
    endif
  again ;
: hello ;
For a defined word, the
execute
command works like
call
but takes the address from the data stack:
class VM :
    def execute(self):
        # push ret addr
        self.R.append(self.Ip)
        # load jmp addr from stack
        assert self.D ; self.Ip = self.D.pop()
        # addr must be integer
        assert type(self.Ip) == int
        # check range
        assert self.Ip < len(self.program)
class FORTH(VM) :
    cmd = { 'execute':VM.execute,..}
abort
must restart the REPL loop, but for now we stop the whole system with a state dump:
class VM :
    def abort(self):
        print '\n\nD:%s\nR:%s' % (self.D,self.R)
        self.dump()
        raise BaseException('ABORT')
class FORTH(VM) :
    cmd = { 'abort':VM.abort,..}
D:[3, 'world']
R:[2]
{'INTERPRET': 3}
0000 <unbound method VM.call> 3 INTERPRET
0002 <unbound method VM.bye>
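The execute-from-stack mechanism above can be shown in a small runnable sketch (Python 3; a toy VM with illustrative names, not the book's exact class): a `lit` pushes the word's address, `execute` calls it, `ret` returns, `bye` stops.

```python
# Sketch: 'execute' works like 'call' but takes its target address
# from the data stack instead of from program memory.
def lit(vm):                    # push inline constant from program memory
    vm.D.append(vm.program[vm.Ip]); vm.Ip += 1

def execute(vm):
    vm.R.append(vm.Ip)          # push return address
    vm.Ip = vm.D.pop()          # jump to the address popped from data stack

def ret(vm): vm.Ip = vm.R.pop()
def bye(vm): vm.stop = True

class VM:
    def __init__(self, program):
        self.program, self.Ip = program, 0
        self.D, self.R, self.stop = [], [], False
        while not self.stop:
            cmd = self.program[self.Ip]; self.Ip += 1
            cmd(self)

# image: lit 4 execute bye | word body at addr 4: ret
vm = VM([lit, 4, execute, bye, ret])
print(vm.D, vm.R)               # [] []  -- both stacks clean after return
```

Because `execute` saves the return address exactly like `call`, a `ret` inside the executed word lands back on the `bye` that follows.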
if
/
else
is complex to understand: it uses forward jumps and \term{backpatching} with one-pass compiling. The
?jmp
command pops a flag from the data stack and jumps on false.
if
compiles
?jmp
with a fake parameter, and \term{marks} its parameter addr on the control\note{don't forget we use the return stack here, but it can be a separate control stack} stack. Later this addr will be popped by
else
/
endif
and \term{backpatched} with the real address to be jumped to when false is on the data stack.
else
compiles a
jmp
to
endif
, replacing the if addr on the control stack by the jmp addr.
class VM :
    def qjmp(self):
        assert self.D
        if not self.D.pop():  # ( bool:FALSE! -- )
            self.jmp()
        else :
            self.Ip += 1      # skip ?jmp parameter
    def log_command(self,addr):
        ...
        if command in [ VM.jmp, VM.qjmp, VM.call ]:
            ...
class FORTH(VM) :
    cmd = { 'jmp':VM.jmp, '?jmp':VM.qjmp,..
}
class FORTH(VM) :
    # lexer
    tokens = [..,'IF','ELSE','ENDIF']
    def t_IF(self,t):
        r'if'
        return t
    def t_ELSE(self,t):
        r'else'
        return t
    def t_ENDIF(self,t):
        r'endif'
        return t
    # parser
    def p_IF(self,p):
        ' command : IF '
        self.program.append(self.cmd['?jmp']) # opcode
        self.R.append(len(self.program))      # mark
        self.program.append(-1)               # fake addr
    def p_ELSE(self,p):
        ' command : ELSE '
        assert self.R ; A = self.R.pop()      # pop if ?jmp
        self.program.append(self.cmd['jmp'])  # jmp endif
        self.R.append(len(self.program))      # mark
        self.program.append(-1)               # fake addr
        self.program[A] = len(self.program)   # backpatch if
    def p_ENDIF(self,p):
        ' command : ENDIF '
        assert self.R ; A = self.R.pop()      # resolve
        self.program[A] = len(self.program)   # backpatch
#       self.program.append(self.cmd['nop']) # show endif pos
INTERPRET:
0004 <function word at 0x7f31eece71b8>
0005 <function find at 0x7f31eece7230>
0006 <unbound method VM.qjmp> 000B    # if
0008 <unbound method VM.execute>
0009 <unbound method VM.jmp> 000C     # else
000B <unbound method VM.abort>
000C <unbound method VM.nop>          # endif
000D <unbound method VM.jmp> 0004 INTERPRET
000F <unbound method VM.ret>
A special
nop
compilation was added to
p_ENDIF
to show the position where
endif
takes an action. In real code this position will be used by the first command compiled after endif.
: hello if bye endif ;
hello:
0003 <unbound method VM.qjmp> 0006   if
0005 <unbound method VM.bye>
0006 <unbound method VM.ret>         endif
This sample shows the if/endif construct without else. And a more complex nested sample
: hello if(1) nop else(1) if(2) bye endif(2) endif(1) ;
0003 <unbound method VM.qjmp> 0008   if(1)
0005 <unbound method VM.nop>
0006 <unbound method VM.jmp> 000B    else(1)
0008 <unbound method VM.qjmp> 000B   if(2)
000A <unbound method VM.bye>         endif(2)
000B <unbound method VM.ret>         endif(1)
\subsection{Implementing in-system \F compiler}
The key feature of a \F system is self-extensibility: you can expand the system at runtime by compiling new word definitions. The internal compiler is as simple as the \F syntax itself.
A \F system can work in two states: every found word
(
STATE=0
) will be immediately executed\note{default state when you start the system}, or a
call
to it will be compiled to the end of the vocabulary. The mode is selected by the
STATE
\term{variable}: you just change STATE, and the system changes mode.
false var STATE \ interpret =0 / compile
: INTERPRET \ REPL interpreter/compiler loop
  begin word find
    if ( addr:cfa )            \ word found
      STATE @ ( bool )         \ compile =true / interpret =false
      if   ( addr:cfa )        \ compile call to word
        compile
      else ( addr:cfa )        \ execute call to addr from stack
        execute
      endif
    else abort endif           \ word not found
  again ;
\subsection{Bool constants: TRUE/FALSE
}
class FORTH(VM) :
    tokens = [..,'TRUE','FALSE']
    # lexer: will be recognized as syntax elements
    def t_TRUE(self,t):
        r'true'
        return t
    def t_FALSE(self,t):
        r'false'
        return t
    # parser: allowed in variable declarations (see later)
    def p_VAR(self,p):
        ' command : init VAR ID '
    def p_init_bool(self,p):
        ' init : bool '
        p[0] = p[1]
    def p_bool_true(self,p):
        ' bool : TRUE '
        p[0] = True
    def p_bool_false(self,p):
        ' bool : FALSE '
        p[0] = False
\subsection{Variables, memory r/w, LIT and literals}
false var STATE \ interpret =0 / compile
: INTERPRET \ REPL interpreter/compiler loop
  ...
  STATE @ ( bool ) \ compile =true / interpret =false
  ...
Variables in classic \F reside in main memory. In C and UNIX, programs\note{and DOS programs with \term{small model} and larger} traditionally use separate memory segments for code
.text, initialized .data, .stack, and the uninitialized heap .bss. We can go the same separation way to increase stability, but using the same commands to read and write both variables and program memory can be very interesting. So let's use a \term{flat memory model} for data and code, just as a sample.
class VM :
    def fetch(self):                 # @ : ( addr -- value )
        assert self.D ; addr = self.D.pop()
        assert addr < len(self.program)
        self.D.append(self.program[addr])
    def store(self):                 # ! : ( value addr -- )
        assert self.D ; addr = self.D.pop()
        assert addr < len(self.program)
        self.program[addr] = self.D.pop()
The
LIT
command is required to push constants as-is onto the data stack:
    def lit(self):
        self.D.append(self.program[self.Ip])
        self.Ip += 1
    def log_command(self,addr):
        ...
        if command in [ VM.jmp, VM.qjmp, VM.call, VM.lit]:
            ...
class FORTH(VM) :
    cmd = { .., '@':VM.fetch, '!':VM.store, 'lit':VM.lit}
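The flat-memory triple lit/@/! can be exercised with a minimal runnable sketch (Python 3; a reduced toy class, not the book's full VM). A variable is just a cell in the same `program` list the code lives in.

```python
# Sketch: @ and ! read/write cells of the same 'program' memory the code
# occupies (flat memory model); lit pushes an inline constant.
class VM:
    def __init__(self):
        self.program, self.Ip, self.D = [], 0, []
    def lit(self):
        self.D.append(self.program[self.Ip]); self.Ip += 1
    def fetch(self):                 # @ : ( addr -- value )
        addr = self.D.pop()
        self.D.append(self.program[addr])
    def store(self):                 # ! : ( value addr -- )
        addr = self.D.pop()
        self.program[addr] = self.D.pop()

vm = VM()
# pretend-image: [lit-op, 4, ret-op, unused, PFA]; cell 4 holds the value 100
vm.program = ['lit-op', 4, 'ret-op', 'unused', 100]
vm.Ip = 1
vm.lit()               # pushes 4, the address of the variable cell
vm.fetch()             # @ reads program[4]
print(vm.D)            # [100]
vm.D = [42, 4]
vm.store()             # 4 ! writes program[4]
print(vm.program[4])   # 42
```

Reading and writing code memory with the same words is exactly what makes self-modifying \F-style compilation cheap.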
class FORTH(VM) :
    tokens = [..,'VAR','FETCH','STORE']
    def t_VAR(self,t):
        r'var'
        return t
    def t_FETCH(self,t):
        r'@'
        return t
    def t_STORE(self,t):
        r'!'
        return t
We can implement more complex number literals, including hex, binary and int/float numbers:
    tokens = [..,'NUM',..]
    def t_NUM(self,t):
        r'0x[0-9A-Fa-f]+ | 0b[01]+ | [\+\-]?[0-9]+(\.[0-9]*)?([eE][\+\-]?[0-9]+)?'
        if t.value[:2] == '0x' :
            t.value = int(t.value[2:],0x10)   # hex
        elif t.value[:2] == '0b' :
            t.value = int(t.value[2:],0x02)   # bin
        else :
            t.value = float(t.value)
        return t
    def t_ID(self,t): # this rule must be the last rule
        ...
class FORTH(VM) :
    def p_VAR(self,p):
        ' command : init VAR ID '
        addr = len(self.program)
        self.voc[p[3]] = addr ; self.revoc[addr] = p[3]
        # lit PFA
        self.program.append(self.cmd['lit'])      # lit ...
        self.program.append(len(self.program)+2)  # PFA
        # ret
        self.program.append(self.cmd['ret'])
        # compile PFA with init value
        self.program.append(p[1])
    def p_FETCH(self,p):
        ' command : FETCH '
        self.program.append(self.cmd['@'])
    def p_STORE(self,p):
        ' command : STORE '
        self.program.append(self.cmd['!'])
    # var/const init value can be checked by the syntax parser
    def p_init_NUM(self,p):
        ' init : NUM '
        p[0] = p[1]
    def p_init_bool(self,p):
        ' init : bool '
        p[0] = p[1]
    def p_bool_true(self,p):
        ' bool : TRUE '
        p[0] = True
    def p_bool_false(self,p):
        ' bool : FALSE '
        p[0] = False
\subsection{Mobile-targeted GUI}
You can make \F work in text console mode, but we must think mobile, so we jump-start from a GUI interface tuned for smartphone look\&feel. We must be ready to face a one-thumb interface with a vertical screen half covered by the Android keyboard, starting from the very scratch of Skynet development. You must be able to do something cool with one hand while doing Vrschikasana in a crowded bus.
For a fast start we will use the wxWidgets [zetwx] toolkit, but later we'll move to a native GUI to make the system much lighter and closer to the native host platform
.
import wx                     # import wxWidgets
wxapp  = wx.App()             # create wx GUI application
wxmain = wx.Frame(None)       # start simplest GUI widget
wxmain.Show()                 # set visible
wxapp.MainLoop()              # start wx GUI main loop
\fig{gui00.png}{height=0.65\textheight}
wxmain = wx.Frame(None,       # start simplest GUI widget
    title='GNU Dynamic (FORTH)')                 # window title
wxmain.SetIcon(wx.ArtProvider.GetIcon(wx.ART_INFORMATION))
Here we have a problem:
.MainLoop()
works in the foreground and locks everything. Later we must process GUI events independently from system work, in parallel. So we must use \py threading capabilities:
import threading              # GUI must be a separate thread
import wx                     # import wxWidgets
wxapp = wx.App()              # create wx GUI application
class MainWindow(wx.Frame):   # inherit GUI widget
    def __init__(self):
        # initialize superclass
        wx.Frame.__init__(self,None,
            title='GNU Dynamic (FORTH)')
        # set window icon
        self.SetIcon(wx.ArtProvider.GetIcon(
            wx.ART_INFORMATION))
        # show window
        self.Show()
def GUI():
    global wxmain ; wxmain = MainWindow()
    wxapp.MainLoop()
# create & start GUI thread
thread_GUI = threading.Thread(None,GUI)
thread_GUI.start()
import sys
thread_GUI.join()             # wait until GUI stops
sys.exit()
class Object :
    tag = 'obj'
    def __repr__(self): return '%s #%x' % (self.tag, id(self))
print Object()
obj #7f4dea6a5878
\fig{../tmp/sym_02.pdf}{width=0.8\textwidth}
The
__repr__()
method lets us get a dump of any object in human-readable form
pip install pytest
to do simple function-based unit testing. Go to \menu{Eclipse > Menu > Window > Preferences > PyDev > PyUnit > PyTest runner > delete --verbose parameter}. Configure \menu{Run as > Python unit-test} and press \keys{Ctrl-F11}.
import re                  # for tests
def test_hello(): pass     # use pytest
class Primitive(Object) :
    tag = 'prim'
def test_Primitive():
    assert re.match(r'prim #[0-9a-f]+',str(Primitive()))
class Symbol(Primitive) :
    tag = 'sym'
    def __init__(self, V): self.val = V
    def __repr__(self): return '%s:%s #%x' % (self.tag, self.val, id(self))
def test_Symbol():
    assert re.match(r'sym:xxx #[0-9a-f]+',str(Symbol('xxx')))
\subsection{Number}
\fig{../tmp/sym_05.pdf}{height=0.65\textheight
}
class Number(Primitive) :
    tag = 'num'
    def __init__(self, V): self.val = float(V)
    def __repr__(self): return '%s:%s #%x' % (self.tag, self.val, id(self))
def test_Number():
    assert re.match(r'num:1.2 #[0-9a-f]+',str(Number('1.2')))
As you see, the
__repr__
method repeats for every class inherited from Primitive, so we move it into the base class:
class Primitive(Object) :
    tag = 'prim'
    def __init__(self): self.val = ''
    def __repr__(self): return '%s:%s #%x' % (self.tag, self.val, id(self))
def test_Primitive():
    assert re.match(r'prim: #[0-9a-f]+',str(Primitive()))
import threading
wxapp = wx.App()                  # create wx GUI application
def startGUI():                   # wrap thread in function
    global wxmain                 # required for ScreenShot()
    wxmain = wx.Frame(None,-1,sys.argv[0])
    wxmain.Show()                 # set visible
    wxapp.MainLoop()              # start wx GUI main loop
thGUI = threading.Thread(None,startGUI)
thGUI.start()                     # start GUI thread
def ScreenShot(PNG):
    dc = wx.ScreenDC()
    X,Y,W,H = wxmain.GetRect()
    bmp = wx.EmptyBitmap(W,H)
    mdc = wx.MemoryDC(bmp)
    mdc.Blit(0,0,W,H,dc,X,Y)
    bmp.SaveFile(PNG,wx.BITMAP_TYPE_PNG)
# wait until GUI starts, and do screenshot
time.sleep(1) ; ScreenShot('sshot.png')
# do all non-GUI work here, as VM(program) run
thGUI.join()                      # wait until GUI stops
Tune the size to make the GUI look like a messenger in the bottom-right corner of the screen:
    # tune size and position to look like messenger
    X,Y,W,H = wx.ClientDisplayRect()
    # create main frame
    wxmain = wx.Frame(None,-1,sys.argv[0],
        size=(W/4,H/2),pos=(W-W/4,H-H/2))
# taskbar manager
class TaskBar(wx.TaskBarIcon):
    def __init__(self,frame):
        # init superclass
        wx.TaskBarIcon.__init__(self)
        # save frame we manage
        self.frame = frame
        # set taskbar icon
        self.icon = frame.icon
        self.SetIcon(self.icon,sys.argv[0])
        # bind open/focus on left click
        self.Bind(wx.EVT_TASKBAR_LEFT_DOWN,
            self.OnTaskBarLeftClick)
    def OnTaskBarActivate(self,event): pass
    def OnTaskBarClose(self,event):
        self.frame.Close()
    def OnTaskBarLeftClick(self,event):
        self.frame.Show()         # show frame
        self.frame.Restore()      # un(min|max)imize
        self.frame.Raise()        # make top level window
        self.frame.SetFocus()     # and set focus
and to be consistent, wrap the main frame into a class too:
class MainWindow(wx.Frame):       # main GUI window in class
    def __init__(self):
        # tune size and position to look like messenger
        X,Y,W,H = wx.ClientDisplayRect()
        # create main frame
        wx.Frame.__init__(self,None,-1,sys.argv[0],
            size=(W/4,H/2),pos=(W-W/4,H-H/2))
        # style tune
        self.icon = wx.ArtProvider.GetIcon(
            wx.ART_ADD_BOOKMARK)
        self.SetIcon(self.icon)
        self.SetBackgroundColour(wx.BLACK)
        # set visible
        self.Show()
        # make GUI taskbared
        self.tbIcon = TaskBar(self)
    def onClose(self,event):
        # required to remove taskbar icon
        self.tbIcon.RemoveIcon()
        self.tbIcon.Destroy()
        # destroy main window itself
        self.Destroy()
We still need a function to wrap the GUI in a separate thread, but it becomes very simple:
def startGUI():                   # wrap thread in function
    global wxmain                 # required for ScreenShot()
    wxmain = MainWindow()         # construct main window
    wxapp.MainLoop()              # start wx GUI main loop
thGUI = threading.Thread(None,startGUI)
thGUI.start()                     # start GUI thread
As an option, you can do screenshots:
class MainWindow(wx.Frame):       # main GUI window in class
    def shot(self,PNG='sshot.png'):  # do screen shot
        dc = wx.ScreenDC()
        X,Y,W,H = self.GetRect()
        bmp = wx.EmptyBitmap(W,H)
        mdc = wx.MemoryDC(bmp)
        mdc.Blit(0,0,W,H,dc,X,Y)
        bmp.SaveFile(PNG,wx.BITMAP_TYPE_PNG)
time.sleep(1) ; wxmain.shot()
argv[]
into one string and
interpret it as a script.
You can download the full source code from a separate github repo http://github.com/ponyatov/calc, and a \href{http://github.com/ponyatov/calc/releases/latest}{prebuilt windows binary} for a first try.
Nowadays you use a huge IDE for software development, but I prefer a more light, portable and easy way: I use the (g)vim /vim/ text editor and a Makefile\note{and Eclipse for more complex cases}. Vim has strange key bindings, and can be somewhat cryptic for a newbie, but it is very light on resources and has useful syntax-coloring customization, described in detail in vim syntax coloring /vimcolor/
.
\begin{tabular}{l l l}
src.src      & script  & sample source code \\
log.log      &         & execution log \\
ypp.ypp      & yacc    & syntax parser \\
lpp.lpp      & lex     & lexer using regexps \\
hpp.hpp      & \cpp    & headers \\
cpp.cpp      & \cpp    & runtime system we are going to implement \\
Makefile     & make    & project build script \\
rc.rc        & linux   & (g)vim start helper \\
bat.bat      & windows & (g)vim start helper \\
.gitignore   & git     & ignored file masks \\
ftdetect.vim & vim     & file type detection \\
syntax.vim   & vim     & syntax coloring /vimcolor/ for custom script \\
\end{tabular}
#!/bin/sh
gvim -p src.src log.log ypp.ypp lpp.lpp hpp.hpp cpp.cpp Makefile
\subsection{Makefile: build script}
For a project build, you need to track file interdependencies and do some actions only on changed files. So we describe our dependency/action rules in a tiny Makefile, and run the make tool by hotkey in the editor every time we need to compile or run the project. Consult Addendum: GNU Make /make/ for details; here we see only a tiny Makefile snippet
log.log: src.src ./exe.exe
	./exe.exe < $< > $@ && tail $(TAIL) $@

C = cpp.cpp ypp.tab.cpp lex.yy.c
H = hpp.hpp ypp.tab.hpp lex.yy.h
L = -lreadline

./exe.exe: $(C) $(H)
	$(CXX) $(CXXFLAGS) -o $@ $(C) $(L)

ypp.tab.cpp ypp.tab.hpp: ypp.ypp
	bison $<
lex.yy.c lex.yy.h: lpp.lpp
	flex $<
#include
:
#include "hpp.hpp"
The lexer file must end with an empty line; don't forget to place an EOL after the last string.
#include "hpp.hpp"
Another required section is rules, but for the first time it can be empty:
Now you can run flex, and get the resulting generated lexer source in the lex.yy.c file:
$ flex lpp.lpp && ls -la lex*
-rw-r--r-- 1 ponyatov ponyatov 43935 nov 20 19:43 lex.yy.c
And we have a problem: there is no
lex.yy.h
header file containing the
yylex()
and
yy_scan_string()
function declarations we require to parse every string entered in the interactive command line\note{using the readline library}. To fix it, we must add the option that will create the header file for us
CLANG
o
dragonegg
r import llvm ; llvm.test()}Tiny executable module does nothing, done usin this manualg
import llvm.core # core library import llvm.ee # execution engine (JIT/interpreter) module = llvm.core.Module.new('null_module') ; print module # JIT compiler / interpreter engine = llvm.ee.ExecutionEngine.new(module) ; print engine
.
def person():
    yield "Chelsea"
    yield "Hillary"
    yield "Bill"
for p in person(): print p
Here we define the
person()
\term{generator function}, which yields a few persons in a loop, returning an \term{iterator object} multiple times. Instead of returning values from the iterator, we will use Prolog variables, able to be bound/unbound to values in the process of \term{inference}. The return value of yield does not mean anything, so we just yield.
class var: pass  # empty container class
def person(V):
    V.value = 'Chelsea' ; yield
    V.value = 'Hillary' ; yield
    V.value = 'Bill'    ; yield
var = var()
for p in person(var): print var.value
\subsection{Variable unification}
In computer science, \term{unification} is the process of solving logic equations in the form of symbolic expressions. For example, $cons(X,cons(X,nil)) = cons(2,Y)$, where $cons(X,Y)$ is the Lisp constructor creating the pair $(X.Y)$, is a \term{first-order unification} problem that has the single solution $\{X\rightarrow 2, Y\rightarrow cons(2,nil)\}$, which can be \term{substituted} into the equation giving true. If variables are allowed to be substituted with functions/terms like
cons
, the process is called \term{high-order unification}\note{in high-order logic}.
In fact, in Prolog, DataLog and other \term{inference systems} there are \emph{no functions}; instead, \textbf{\term{terms} define \term{relations} between data elements}.
For example, some RDBMS table scheme can be described in Prolog in the form of a \term{predicate}: table( user(index(uid),login,password,name,homedir) ).
To implement unification we need control over \term{variable binding}: tracking value assignment (bound variable) and variable unbinding with value unassignment.
class var :
    def __init__(self):
        self.bound = False           # unbound variable
    def __lshift__(self,val):        # assign operator <<
        self.bound = True
        self.value = val
def person(V):
    V << 'Chelsea' ; yield
    V << 'Hillary' ; yield
    V << 'Bill'    ; yield
V = var()
for p in person(V): print V.bound,V.value
class var :
    def __repr__(self):              # dump in string form
        if self.bound : return '<bound:%s>' % self.value
        else          : return '<unbound>'
V = var() ; print V
for p in person(V): print V
<unbound>
<bound:Chelsea>
<bound:Hillary>
<bound:Bill>
We need a fix to do full unification: a variable can be unified \emph{with any value} if it is not bound, but only with the same value if it is bound
:
class var:                           # Prolog unifying variable
    def __init__(self):
        self.bound = False           # unbound variable
    def __lshift__(self,val):        # unify operator <<
        if not self.bound:           # 1) unassigned
            self.value = val         # assign var
            self.bound = True
            yield
            self.bound = False       # drop binding
        elif self.value == val:      # 2) assigned == val
            yield
    def __repr__(self):              # dump in string form
        if self.bound : return '<bound:%s>' % self.value
        else          : return '<unbound>'
def person(V):
    for i in V << 'Chelsea' : yield
    for i in V << 'Hillary' : yield
    for i in V << 'Bill'    : yield
The equivalent Prolog program is:
person(P) :- P = 'Chelsea' ; P = 'Hillary' ; P = 'Bill' .
?- person(X), write(X), nl.
For each value in
person()
we run unification twice in a loop:
for i in var << 'Hillary' :
    for j in person(var) :
        print var
# <bound:Hillary>
for i in var << 'Buddy' :
    for j in person(var) :
        print var
# (nothing was unified)
First we bind the variable to the predefined
'Hillary'
value, and then call person()
unification. As the variable is already bound, unification will succeed only for the bound value. With 'Buddy'
we never yield
.\subsection{General unification
}
In the previous section we had to bind the variable to a predefined value before
person()
unification. We had to do this because person()
takes an argument of type
var
. To call person()
with any type, we can use these general functions
:
def getval(V):                       # return value or unbound var
    if isinstance(V,var) :
        if not V.bound: return V
        else: return V.value
    else: return V
def unify(arg1,arg2):                # general unification
    arg1value = getval(arg1)         # get values
    arg2value = getval(arg2)         #
    if isinstance(arg1value,var):    # arg1 is variable
        for i in arg1value << arg2value: yield
    if isinstance(arg2value,var):    # arg2 is variable
        for j in arg2value << arg1value: yield
    else:                            # args are non-vars
        if arg1value == arg2value: yield
def person(V):
    for i in unify(V,'Chelsea'): yield
    for i in unify(V,'Hillary'): yield
    for i in unify(V,'Bill')   : yield
So we can unify not only
var
instances, but any types, and this makes our code simpler:
for i in person('Hillary'): print 'Hillary'
for j in person('Buddy')  : print 'Buddy'
person()
takes an object of any type as an argument, and can do unification using the general
unify()
function, without use of any extra variables.
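The generator-based unification scheme can be condensed into a runnable Python 3 sketch (illustrative: the book's `<<` operator is replaced here by a named `unify` method, and names are not the book's exact ones):

```python
# Sketch of backtracking unification via generators: binding a free
# variable yields once, then undoes the binding when the caller resumes.
class var:
    def __init__(self): self.bound = False
    def unify(self, val):
        if not self.bound:               # free: bind, yield, then unbind
            self.value, self.bound = val, True
            yield
            self.bound = False           # undo binding on backtracking
        elif self.value == val:          # bound: succeed only on same value
            yield

def getval(V):
    if isinstance(V, var) and V.bound: return V.value
    return V

def unify(a, b):                         # general unification
    av, bv = getval(a), getval(b)
    if isinstance(av, var):
        yield from av.unify(bv)
    elif isinstance(bv, var):
        yield from bv.unify(av)
    elif av == bv:
        yield

def person(V):
    for _ in unify(V, 'Chelsea'): yield
    for _ in unify(V, 'Hillary'): yield
    for _ in unify(V, 'Bill'):    yield

hits_bound   = sum(1 for _ in person('Hillary'))  # matching: 1 solution
hits_unknown = sum(1 for _ in person('Buddy'))    # no match: 0 solutions
X = var()
hits_free    = sum(1 for _ in person(X))          # enumeration: 3 solutions
print(hits_bound, hits_unknown, hits_free)        # 1 0 3
```

The same `person` relation thus acts both as a generator of all persons (free argument) and as a membership test (bound or plain value).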
\subsection{Unify as relation
}
As unify can both bind and check values, it implements a \term{logical relation}. Unification lets us bring logic-language capabilities into imperative languages like \py or any other language implemented on top of the GNU Dynamic environment. The iterator loop defined by
person(Who)
can bind
Who
to all known persons, or in other terms, to all entities described by the 1-ary
person/1
\term{relation}. If
Who
was bound at the start of unification, the loop will do \term{relation matching} with the selected value. Later we'll see more complex cases where unification lets us do \term{structural matching} for recursive data structures, including parsing of languages with complex ambiguous grammars, and \term{graph querying}.
As a sample, let's see a new relation
brother/2
between some persons:
def brother(Person,Brother):
    for i in unify(Person,'Hillary'):
        for j in unify(Brother,'Tony'): yield
        for j in unify(Brother,'Hugh'): yield
    for i in unify(Person,'Bill'):
        for j in unify(Brother,'Roger'): yield
V = var()
for i in brother('Hillary',V): print 'Hillary has brother',V.value
Hillary has brother Tony
Hillary has brother Hugh
And we can find all entities with the
brother/2
relation defined:
A,B = var(),var()
for j in brother(A,B): print B,'is brother of',A
<bound:Tony> is brother of <bound:Hillary>
<bound:Hugh> is brother of <bound:Hillary>
<bound:Roger> is brother of <bound:Bill>
\subsection{Joining relations}
We have the
brother
relation. If we add a
parent
relation for some persons, we can \term{infer} which persons have an uncle: a
person
has a parent
, and that parent
has a brother
.
def parent(Person,Parent):
    for i in unify(Person,'Chelsea'):
        for j in unify(Parent,'Hillary'): yield
    for i in unify(Person,'Chelsea'):
        for j in unify(Parent,'Bill'): yield
def uncle(Person,Uncle):
    Parent = var()
    for i in parent(Person,Parent):
        for j in brother(Parent,Uncle): yield
Person,Uncle = var(),var()
for i in uncle(Person,Uncle):
    print Person.value,'has uncle',Uncle.value
Chelsea has uncle Tony
Chelsea has uncle Hugh
Chelsea has uncle Roger
This is classical inference in Prolog:
brother(Person, Brother) :-
    Person = 'Hillary', (Brother = 'Tony' ; Brother = 'Hugh') ;
    Person = 'Bill',    Brother = 'Roger' .
parent(Person, Parent) :-
    Person = 'Chelsea', Parent = 'Hillary' ;
    Person = 'Chelsea', Parent = 'Bill' .
uncle(Person, Uncle) :-
    parent(Person, Parent), brother(Parent, Uncle) .
:- uncle(Person, Uncle),
   write(Person), write(' has uncle '), write(Uncle), nl.
\subsection{Variable chains}
This sample works fine
:
def square(Width,Height):            # square rectangle
    for i in unify(Width,Height): yield
for i in square(11,11):              # check 11 x 11
    print '11 x 11 is square'
W,H = var(),var()
for j in W << 22 :
    for k in square(W,H) :
        print W,H                    # yields 22 x 22
But how can we implement this case:
\begin{enumerate}[nosep]
\item square first (giving bound Width=Height)
\item unify(Width,33)
\end{enumerate}
W,H = var(),var()
for a in square(W,H):                # bind first
    for b in unify(W,33):            # unify with value
        print W, H
# <bound:<bound:33>> <bound:33>
# <bound:33> <bound:<bound:33>>
To do this, we must allow unify to bind two or more unbound variables into a \term{variable chain}, and then bind all of them to the same value
.
class var:                           # Prolog unifying variable
    def __init__(self,name):
        self.bound = False           # unbound variable
        self.name  = name
    def __repr__(self):              # dump in string form
        if self.bound : return '<%s:%s>' % (self.name,self.value)
        else          : return '<%s:unbound>' % self.name
# names for variables must be assigned
W,H = var('W'),var('H')
class var :
    def getval(self):
        if not self.bound:           # return unbound as is
            return self
        V = self.value               # else do loop until
        while isinstance(V, var):    # non-variable found
            if not V.bound: return V # or unbound var
            V = V.value              # scan var chain
        return V
def getval(V):                       # return value or unbound var
    if isinstance(V,var) :
        return V.getval()
    else: return V
class var :
    def __lshift__(self,val):        # unify operator <<
        if not self.bound:           # 1) unassigned
            self.value = getval(val) # get var
            if self.value == self: yield # can be itself
            else :
                self.bound = True    # bound
                yield
                self.bound = False   # drop binding
        else :
            for i in unify(self,val): yield
def unify(arg1,arg2):                # general unification
    arg1value = getval(arg1)         # get values
    arg2value = getval(arg2)         #
    A = B = False
    if isinstance(arg1value,var):    # arg1 is variable
        A = True
        for i in arg1value << arg2value: yield
    if isinstance(arg2value,var):    # arg2 is variable
        B = True
        for j in arg2value << arg1value: yield
    if not (A and B):                # args are non-vars
        if arg1value == arg2value: yield
for a in square(W,H): print W, H
# returns two variants of unknown variables: W->H and H->W
# W=H H
# W H=W
for a in square(W,H):                # bind first
    for b in unify(W,33):            # unify with value
        print W, H
# SWI Prolog returns only one: H = W, W = 33, but we get
# two possible variants as 33->W=H chains to 33->H, H->W
# W=H=33 H=33
# W=33 H=W=33
\subsection{Cutting inference}
If you need only one solution, you can cut the inference and save computational time
.
def anyBrother(Person,Brother):
    for i in brother(Person,Brother):
        yield ; break                # will find only 0 or 1 solution
B = var('brother')
for j in anyBrother('Hillary',B): print B
# brother=Tony
Prolog syntax:
anyBrother(Person, Brother) :- brother(Person, Brother), !.
We have a problem with cut and loop breaking: as you remember, any variable must be unbound after unification. If we just break the loop, we fail here. To fix this problem, we can use
try/finally
, which will unroll the binding:
class var:                           # Prolog unifying variable
    def __lshift__(self,val):        # unify operator <<
        if not self.bound:           # 1) unassigned
            self.value = getval(val) # get var
            if self.value == self: yield # can be itself
            else :
                self.bound = True    # bound
                try: yield           # fix for cut unbound
                finally :
                    self.bound = False # drop binding
        else :
            for i in unify(self,val): yield
\subsection{Negation via cut}
We can use cut to do negation
:
def noBrother(Person):
    Brother = var('B')
    # if any brother, cut inference
    for i in brother(Person,Brother): return
    yield                            # no brother found
for j in noBrother('Hillary'): print 'Hillary'
for j in noBrother('Chelsea'): print 'Chelsea'
# Chelsea
\subsection{Making lists}
In the
In the FORTH/ subdirectory you can see the sources of a bytecode compiler and virtual machine (bytecode interpreter), with an assembler written in flex/bison.

The sizes of these structures are defined by constants, but you can modify the code and use an expandable storage type like vector\note{it will be much slower}:

#define Msz 0x1000 /* bytes */
#define Rsz 0x100
#define Dsz 0x10

\F has a special CELL constant, which corresponds to the \textit{machine word size in bytes}:

#define CELL sizeof(int32_t)

\subsection{Memory}

extern uint8_t  M[Msz];   // memory
extern uint32_t Ip;       // instruction pointer
extern uint32_t Cp;       // compilation pointer (free heap)

Main memory contains everything; the region from Cp till the end of M[] is the free \F heap /Fheap/:

extern void     set(uint32_t addr, int32_t value);
extern uint32_t get(uint32_t addr);

If you set a cell value < 0x100 but read a byte at the same address, you must get the same value. On \term{little-endian} machines (x86) we could read/write cells using a (uint32_t*)&M[addr] pointer, but for portability we use these byte-shifting functions:
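As a cross-check of the little-endian layout these functions must implement, here is a hypothetical Python model; `set_cell`/`get_cell` are illustrative names for this sketch, not part of the C sources.

```python
# Little-endian byte-shifting cell access, modelled on the C set()/get().
# set_cell/get_cell are hypothetical Python names for this sketch.
Msz = 0x1000
M = bytearray(Msz)                 # byte-addressed main memory

def set_cell(addr, value):
    assert addr + 3 < Msz          # check memory bound
    for i in range(4):             # low byte first = little-endian
        M[addr + i] = (value >> (8 * i)) & 0xFF

def get_cell(addr):
    assert addr + 3 < Msz
    return sum(M[addr + i] << (8 * i) for i in range(4))

set_cell(0, 0x12345678)
print(hex(get_cell(0)))            # cells round-trip: 0x12345678
set_cell(4, 0x42)                  # a cell value < 0x100 ...
print(M[4])                        # ... is visible as the byte 0x42
```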
void set(uint32_t addr, int32_t value) {
    assert(addr+3 < Msz);           // check memory bound
    M[addr+0] = (value >> 0x00) & 0xFF;
    M[addr+1] = (value >> 0x08) & 0xFF;
    M[addr+2] = (value >> 0x10) & 0xFF;
    M[addr+3] = (value >> 0x18) & 0xFF;
}

uint32_t get(uint32_t addr) {
    assert(addr+3 < Msz);
    return M[addr+0] << 0x00
         | M[addr+1] << 0x08
         | M[addr+2] << 0x10
         | M[addr+3] << 0x18;
}

\subsection{Compilation (in terms of \F)}

extern uint32_t Cp;                 // compilation pointer (free heap)

In \F terms, \term{compilation} means \textit{adding bytes to the end of the vocabulary} --- in fact into the beginning of the heap, moving the heap bottom towards higher addresses. The \F standard has only the HERE word, which returns the address of the heap beginning (it really should have been named HEAP). For addressing we'll use the special Cp register.
extern void Cbyte  ( uint8_t);      // compile byte
extern void Ccell  (uint32_t);      // compile cell
extern void Cstring(char*);         // compile ASCIIZ string

These functions will be used in the assembler.
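The compile-pointer discipline behind these declarations can be sketched in Python first. The names below mirror the declarations, but this is an illustrative stand-in, not the book's code:

```python
# Python model of the Forth-style compilation pointer Cp:
# every Cxxx() appends bytes at Cp and advances it.
Msz  = 0x1000
M    = bytearray(Msz)       # main memory
CELL = 4                    # cell size in bytes (int32)
Cp   = 0                    # compilation pointer (free heap)

def Cbyte(b):               # compile one byte
    global Cp
    M[Cp] = b & 0xFF; Cp += 1
    assert Cp < Msz

def Ccell(c):               # compile one little-endian cell
    for i in range(CELL):
        Cbyte(c >> (8 * i))

def Cstring(s):             # compile ASCIIZ string (bytes + final 0)
    for ch in s.encode():
        Cbyte(ch)
    Cbyte(0)

Cbyte(0x01)                 # e.g. a one-byte opcode
Ccell(0)                    # address placeholder cell
Cstring('NOP')
print(Cp)                   # 1 + 4 + len('NOP') + 1 = 9
```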
void Cbyte(uint8_t b)  { M[Cp++] = b; assert(Cp < Msz); }

void Ccell(uint32_t c) { set(Cp,c); Cp += CELL; assert(Cp < Msz); }

void Cstring(char* s) {
    uint32_t L = strlen(s);
    assert(Cp+L+1 < Msz);           // compile length+1 bytes
    memcpy(&M[Cp],s,L+1); Cp += L+1;
}

\subsection{Vocabulary structure}

In \F terms, a \term{word} means some active data element, analogous to a function or procedure in mainstream languages. Variables and constants in \F are also words. It corresponds to a \term{word} in human languages --- a sequence of letters which means something. When you enter some \F code at the command line, the interpreter /INTERPRET/ searches for each word\note{delimited by space symbols} in the \term{vocabulary}, and executes it /EXECUTE/ if the search was successful.

A \term{vocabulary} is a container data structure, implemented using the Cxxx() functions /Cxxx/:

\begin{tabular}{l l l l}
LFA & cell          & Link Field Area       & link to previous word or 0 \\
AFA & byte          & Attribute Field Area  & flags, IMMED /IMMEDIATE/ \\
NFA & asciiz string & Name Field Area       & word name \\
CFA & bytecode      & Code Field Area       & executable bytecode \\
PFA & optional      & Parameters Field Area & in variables and constants \\
\end{tabular}

The last defined word must be marked somewhere:

// program entry point (addr of jmp parameter)
#define _entry  1
// last defined word LFA address
#define _latest (_entry+CELL)

int main(int argc, char *argv[]) {
    // ============ compile vocabulary header
    // jmp _entry : jump to last defined word
    Cbyte(opJMP); Ccell(0);
    // _latest : LFA of last defined word
    Ccell(0);

To compile a vocabulary header use:

map<string,uint32_t> SymTable;      // symbol table

void LFA() { uint32_t L = get(_latest); set(_latest,Cp); Ccell(L); }
void AFA(uint8_t b)   { Cbyte(b); }
void NFA(char* s)     { Cstring(s); }
void CFA(string name) { SymTable[name] = Cp; set(_entry,Cp); }

void Cheader(char* name) {          // compile header
    LFA(); AFA(0); NFA(name); CFA(name);
}

\subsection{Bytecode interpreter}

The bytecode interpreter runs after the assembler has finished its work:
int main() {
    ...                             // compile vocabulary header
    yyparse();                      // run compiler
    dump();                         // dump memory into .bin file
    VM();                           // run VM
}

Like any other computer, the interpreter implements a fetch/decode/execute loop over the commands in M[] memory, with the current command pointed to by the Ip instruction pointer register.

void VM() {
    for (;;) {                      // infinite loop
        uint8_t op = M[Ip++]; assert(Ip <= Cp);     // FETCH
        switch (op) {               // DECODE/EXECUTE
            case opNOP : nop();  break;
            case opBYE : bye();  break;
            case opJMP : jmp();  break;
            case opCALL: call(); break;
            case opRET : ret();  break;
            case opLIT : lit();  break;
            default    : printf("bad opcode\n\n"); abort();
        }
        printf("\n");
    }
}
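The same fetch/decode/execute skeleton fits in a few lines of Python. This is a sketch, not the book's interpreter: only nop and bye are modelled, with opcode values assumed to match the defines in the next subsection.

```python
# Minimal fetch/decode/execute loop over a byte-addressed program memory.
opNOP, opBYE = 0x00, 0xFF          # opcodes assumed from the C defines

M  = bytearray([opNOP, opNOP, opBYE])   # tiny compiled program
Ip = 0                             # instruction pointer
trace = []                         # executed command log

def VM():
    global Ip
    while True:
        op = M[Ip]; Ip += 1                     # FETCH
        if   op == opNOP: trace.append('nop')   # DECODE/EXECUTE
        elif op == opBYE: trace.append('bye'); return
        else: raise RuntimeError('bad opcode %#x' % op)

VM()
print(trace)                       # ['nop', 'nop', 'bye']
```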
\subsection{Control flow}

#define opNOP  0x00   // nop
#define opBYE  0xFF   // bye
#define opJMP  0x01   // jmp <addr>
#define opQJMP 0x02   // ?jmp <addr>
#define opCALL 0x03   // call <addr>
#define opRET  0x04   // ret
#define opLIT  0x05   // lit <value>

#define opNOP 0x00    // nop
void nop() { printf("nop"); }
The make scripts code for building the \term{emLinux} system is in /DLR/cross.

\fig{../tmp/cross_01.pdf}{height=\textheight}

$ cd cross ; make dirs

to create the directory structure:
\begin{tabular}{l l}
cross/Makefile & main Makefile \\
cross/gz/      & software packages source tarballs \\
cross/src/     & packages source code (temp/ramdisk) \\
cross/tmp/     & out of tree build (temp/ramdisk) \\
cross/hw/      & supported hardware platforms (KVM is the main virtual server system) \\
cross/arch/    & target architecture \\
cross/cpu/     & target CPU \\
\end{tabular}

\fig{../tmp/cross_02.pdf}{width=\textwidth}

# current directory (full path)
CWD = $(CURDIR)
# software packages source tarballs
GZ = $(CWD)/gz
# cross compiler tools
CROSS = $(CWD)/$(TARGET).cross
# target filesystem
ROOT = $(CWD)/$(TARGET)
BOOT = $(ROOT)/boot
# source code
SRC = $(CWD)/src
# off-tree build
TMP = $(CWD)/tmp

DIRS = $(GZ) $(SRC) $(TMP) $(CROSS) $(ROOT) $(BOOT)
.PHONY: dirs
dirs:
        mkdir -p $(DIRS)

To build you should use a build system with an SSD and a large amount of RAM. It is much better to use a ramdisk for source code and build/temp files.
tmpfs /home/user/DLR/cross/tmp tmpfs auto,uid=user,gid=users 0 0
tmpfs /home/user/DLR/cross/src tmpfs auto,uid=user,gid=users 0 0

and remount the ramdisks:

$ sudo mount -a ; make dirs
\begin{tabular}{l l l}
APP    &       & application name (= dynamic server) \\
HW     & hw/   & hardware platform short name \\
ARCH   & arch/ & architecture \\
CPU    & cpu/  & CPU \\
\hline
       &       & \term{Triplets}: the system where you... \\
BUILD  &       & ...build (x86\_64-pc-linux-gnu) \\
HOST   &       & ...run the tools (mingw32) \\
TARGET &       & ...run the resulting software (arm-linux-uclibceabihf) \\
\end{tabular}

\fig{../tmp/cross_03.pdf}{width=\textwidth}

\begin{tabular}{l l l l}
HW    & ARCH      & CPU     & description \\
\hline
KVM   & =BUILD    & x86\_64 & Linux workstation as virtualization server (or leased VDS) \\
Rpi   & armeabihf & BCM2835 & Raspberry Pi (any compatible model) \\
PC104 & i386      & i486sx  & PC104 board or any old PC computer \\
PC    & i386      & i686    & more or less modern PC \\
\end{tabular}

{\footnotesize At this point we don't look at modern PCs with i3..i7 processors and huge amounts of RAM for a dedicated Dynamic server farm. If you have enough money to invest into Dynamic-based hosting, it is your turn to write custom scripts to build it for i3+ computers. For now it is better to play with a heap of reused hardware you can buy for a few dollars, or Amazon EC2.}

APP  = dynamic
HW  ?= PC104

include hw/$(HW).mk
include cpu/$(CPU).mk
include arch/$(ARCH).mk

\subsection{Virtualization platforms}
\subsubsection{hw/KVM}

We'll use KVM as the default (virtualization) platform; see the Debian GNU/Linux manual on its configuration\note{https://debian-handbook.info/browse/stable/sect.virtualization.html}.

\subsection{x86 variants}
\subsubsection{hw/PC104}

CPU  = i486sx
ARCH = i386

# override CPU (will be used for TARGET)
CPU = i486
TARGET = $(CPU)-linux-uclibc

You can have some fun for a few tens of bucks with an old notebook: take one on eBay and play with it. Or you can buy an industrial PC104 embedded computer with some Vortex86 SoC like the Advantech PCM-4434. Don't buy x86 hardware with a CPU lower than i486sx: i386 CPU support was totally dropped from the Linux kernel many years ago, and we have a chance to lose it in the GCC toolchain as well. If you want some practical benefit from such an old notebook, you can do some light computations or electronics and mechanical design; then it is better to buy one with an i486dx processor, as it has an FPU (hardware floating point).

\subsubsection{hw/i686}

There is a huge amount of old computers you can reuse, and lots of new boards starting from Intel Atom and the compatible AMD Geode.

\subsubsection{hw/amd64}

Cheap and powerful SOHO servers, up to topmost i7 and low-end Xeon CPUs.

\subsection{Raspberry Pi, china clones}

Cheap, ultra compact and power effective, able to run from a battery power source.

\subsubsection{Raspberry Pi}

Large community, available off the shelf in local stores.

\subsubsection{Orange/Banana Pi}

\subsection{MIPS routers (MR3020): cheap and too small}

They have a small amount of RAM (starting from 32M), but embedded network interfaces, WiFi, and a USB host for a GSM modem and external storage. They can be used for network gateways, slow data storage on a USB flash drive, or hosting tiny web sites.
include mk/gz.mk

WGET = wget -P $(GZ) -c

.PHONY: gz gz_cross gz_core
gz: gz_cross gz_core
gz_cross:
        $(WGET) ftp://ftp.gmplib.org/pub/gmp/$(GMP_GZ)
        $(WGET) http://www.mpfr.org/mpfr-$(MPFR_VER)/$(MPFR).tar.bz2
        $(WGET) http://www.multiprecision.org/mpc/download/$(MPC_GZ)
        $(WGET) http://ftp.gnu.org/gnu/binutils/$(BINUTILS_GZ)
        $(WGET) http://gcc.skazkaforyou.com/releases/$(GCC)/$(GCC_GZ)
gz_core:
        $(WGET) http://www.uclibc.org/downloads/$(ULIBC).tar.xz
        $(WGET) http://busybox.net/downloads/$(BUSYBOX_GZ)
        $(WGET) http://www.kernel.org/pub/linux/kernel/v4.x/$(KERNEL_GZ)

\fig{../tmp/cross_04.pdf}{width=\textwidth}

include mk/version.mk
include mk/package.mk
include mk/gz.mk
include mk/unpack.mk

# automate source code unpack
$(SRC)/ : cd $(SRC) && zcat $< | tar x && touch $@
$(SRC)/ : cd $(SRC) && bzcat $< | tar x && touch $@

\fig{../tmp/cross_05.pdf}{height=\textheight}
\fig{../tmp/cross_06.pdf}{height=\textheight}

/cross/mk/version.mk:
\fig{../tmp/cross_08.pdf}{width=\textwidth}

\subsubsection{binutils: assembler and linker for \$TARGET platform}

CROSS_CFG = --target=$(TARGET) $(WITH_CCLIBS) \
    --with-sysroot=$(ROOT) --with-native-system-header-dir=/include

.PHONY: binutils
binutils: $(CROSS)/bin/$(TARGET)-as
# use one of the tool executables as the build-done flag
$(CROSS)/bin/$(TARGET)-as: $(SRC)/$(BINUTILS)/README
        rm -rf $(TMP)/$(BINUTILS) ; mkdir $(TMP)/$(BINUTILS) ; \
        cd $(TMP)/$(BINUTILS) ; $(SRC)/$(BINUTILS)/$(CCFG) $(CROSS_CFG) && \
        $(MAKE) && make install-strip

\fig{../tmp/cross_09.pdf}{width=\textwidth}

CCFG = configure --disable-nls --prefix=$(CROSS) \
    --docdir=$(TMP)/doc --mandir=$(TMP)/man --infodir=$(TMP)/info

\begin{tabular}{l l}
prefix & cross compiler toolchain install directory \$CROSS \\
target & cpu-[hw]-os-abi \\
\end{tabular}

WITH_CCLIBS = --with-gmp=$(CROSS) --with-mpfr=$(CROSS) --with-mpc=$(CROSS)
CCLIBS_CFG  = --disable-shared $(WITH_CCLIBS)

.PHONY: gcclibs
gcclibs: gmp mpfr mpc

.PHONY: gmp
gmp: $(CROSS)/lib/libgmp.a
$(CROSS)/lib/libgmp.a: $(SRC)/$(GMP)/README
        rm -rf $(BLD)/$(GMP) ; mkdir $(BLD)/$(GMP) ; \
        cd $(BLD)/$(GMP) ; $(SRC)/$(GMP)/$(CFG) $(CCLIBS_CFG) && \
        $(MAKE) && make install-strip
...

\subsubsection{gcc0: GNU compiler toolchain (only for kernel build)}

If you have problems with memory size, run make with \emph{limited make threads}:
$ make BUILD_CPU_NUMBER=2 gcc0

GCC_VER = 7.2.0
GCC     = gcc-$(GCC_VER)
GCC_GZ  = $(GCC).tar.xz

gz_cross:
        $(WGET) http://gcc.skazkaforyou.com/releases/$(GCC)/$(GCC_GZ)

GCC0_CFG = --without-headers --with-newlib --enable-languages="c"
.PHONY: gcc0
gcc0: $(SRC)/$(GCC)/README
        rm -rf $(BLD)/$(GCC) ; mkdir $(BLD)/$(GCC) ; \
        cd $(BLD)/$(GCC) ; $(SRC)/$(GCC)/$(CFG) $(CROSS_CFG) $(GCC0_CFG) && \
        $(MAKE) all-gcc && make install-gcc

\subsection{kernel: Linux kernel}

Before the kernel build, clean up the TMP directory if you use a ramdisk build:

$ make ramclean

.PHONY: ramclean
ramclean:
        rm -rf $(SRC)/* $(BLD)/*

KERNEL_VER = 4.14.9
KERNEL     = linux-$(KERNEL_VER)
KERNEL_GZ  = $(KERNEL).tar.xz

gz: gz_cross gz_core
gz_core:
        $(WGET) http://www.kernel.org/pub/linux/kernel/v4.x/$(KERNEL_GZ)

.PHONY: core kernel
core: kernel
kernel: $(SRC)/$(KERNEL)/README

Check the kernel is bootable:

$ make emu

\subsection{ulibc: base system library}
\subsection{busybox: core UNIX}
https://www.youtube.com/playlist?