Invocation and output of Thrases
First of all: as of version 1.9, thrases no longer uses the semicolon. The separator that divides the template into phrases is now ~~. The last version with the semicolon (1.11) remains available; its documentation is here.
Save thrases.py anywhere you can import from. Then you can tell your Python interpreter (Python 2.5):
from thrases import Template
x = Template(filename, incode='iso8859-1')
x.render(locals())
my_page_as_unicodeobj = x.out.getvalue()
incode defaults to None. If, and only if, you want x.render() to build unicode objects (available from x.out.getvalue(), see below), you must give here the encoding you used when saving the template 'filename' to disk. However, when thrases renders (parts of) websites, your answer to any HTTP request must be a string with a concrete encoding, and it is convenient if the result of x.render() already has that encoding. In that case, try to use that encoding for all input to thrases: not only for the template file, but also for the strings your calling code passes to the template. Thrases will then not build a unicode object but leave your encoding alone. Thrases works much faster with encoded strings. The bytestrings of Python 3 will be no problem for thrases (there are still two lambda expressions in the code; porting thrases to Python 3 has to wait some time).
Instead of 'locals()', any analogous dict of variable names and values will do. If you pass 'locals()', the template has full access to the variables therein. In projects where templates might be changed by external web designers, you had better build such a dict yourself, containing just the needed variables and their values. Note that code objects can be passed this way too. Thus there is no urgent need for Python's import statement in the mini Python that thrases can interpret.
The template given by filename (a file anywhere on your disk) can have any text format of your choice. You can also pass a string as the template: if thrases doesn't find a file called 'filename', it interprets that argument as a string or unicode object containing the template itself.
x.render() does not return anything; it writes to a StringIO accessible as x.out, or to a file descriptor passed to x.render() through the keyword argument fd. The latter brings a remarkable gain in speed and should also work with the sockets of HTTP servers.
When the constructor was given a file, the template can re-read that file with the method x.reinit(), which does so only if the file's modification time is newer than the time the template was constructed. The same is available as a keyword argument, x.render(locals(), fd=my_file, reinit=True); its default is False.
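A minimal sketch of building such a restricted dict. The helper pick_names() is hypothetical and not part of thrases; it only illustrates that the template sees nothing but the names you list.

```python
def pick_names(namespace, *names):
    """Return a new dict holding only the requested names from namespace."""
    return dict((name, namespace[name]) for name in names)

def build_context():
    title = 'Postcards'
    rows = [1, 2, 3]
    db_password = 'secret'   # must not reach the template
    return pick_names(locals(), 'title', 'rows')

context = build_context()
# With thrases one would then call something like: x.render(context)
```

The template can still read title and rows, but db_password never enters its namespace.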
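The reload decision can be pictured as a plain comparison of timestamps. The sketch below is an illustration only: needs_reinit() is a hypothetical stand-alone function, and the temporary file merely stands in for a template file.

```python
import os
import tempfile

def needs_reinit(path, constructed):
    """True if the file was modified after the moment 'constructed'."""
    return os.stat(path).st_mtime > constructed

# Demo with a throwaway file standing in for the template.
fd, path = tempfile.mkstemp()
os.close(fd)
constructed = os.stat(path).st_mtime
unchanged = needs_reinit(path, constructed)

# Pretend the file was edited one minute later.
os.utime(path, (constructed + 60, constructed + 60))
changed = needs_reinit(path, constructed)
os.remove(path)
```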
Syntax
Thrases reserves '~~' for its own use; you cannot use thrases if you need this sequence for anything else. There is no escaping mechanism (although you could easily do appropriate replacements before and after rendering). First, thrases does the text replacements for '#insert filename~~' (more below). Then the input is split at its double tildes, which have a meaning similar to the semicolon in Python or C. Thrases decides how to handle the parts by applying regular expressions and by testing their syntax with Python's compile(). For each of them, one of the following four actions is performed:
- python statements are executed and nothing is rendered,
- python expressions are evaluated and the result will get rendered,
- the empty phrase ~~~~ ends a block, 'dedenting' python by one level. This makes the template format-free,
- simple text is rendered to output completely unchanged (possibly decoded to unicode), preserving any whitespace characters.
All four cases can be seen in the example:
Hello, ~~if name is None or name=='shit': name='World'~~~~name~~ !
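How a phrase might be sorted into these cases can be sketched with compile() alone. This is a simplified illustration, not the logic of thrases itself: classify() is a hypothetical function, and syntax checks cannot settle every case. A phrase such as `Hello,` is syntactically a valid tuple expression, which is why the real code additionally wraps evaluation in try/except at render time.

```python
def classify(phrase):
    """Roughly sort one phrase into the cases described above."""
    stripped = phrase.strip()
    if phrase == '':
        return 'dedent'        # the empty phrase between two separators
    if stripped.startswith('#'):
        return 'comment'
    if not stripped:
        return 'text'          # whitespace only
    for mode, kind in (('eval', 'expression'), ('exec', 'statement')):
        try:
            compile(stripped, '<phrase>', mode)
            return kind
        except SyntaxError:
            pass
    return 'text'

template = '<b>Total:</b> ~~n = 3~~n * 2~~# a note~~ !'
kinds = [classify(p) for p in template.split('~~')]
```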
A fifth case for phrases is Python-style comments, which start with '#' (leading whitespace allowed) and extend to the next double tilde. '#insert filename~~' is not a comment, because it is not a complete phrase: it need not be preceded by a double tilde (more below).
There is an exception to the second case: function calls. A Python function call is assumed to be 'silent'. Thus str(float(2)) would not render anything. In such cases a workaround with an assignment must be used: ~~s=str(float(2))~~s~~ renders the value of s. This is somewhat ugly and might be improved in the future.
The mini Python understood by thrases is format-free. Not as completely as C is, but the whitespace around your phrases can have any shape you want. Inside the Python phrases, Python syntax applies, including the usual use of the backslash. The empty phrase, which dedents Python by one level, has no effect when no block has been opened by a conditional clause or def. Opening such a block needs nothing more than the conditional or def clause itself. Reaching 'negative' indentation levels is prevented: you could write '~~~~~~~~~~~~~~~~~~~~' to get back, in all probability, to the 'root' level.
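Which phrases count as a 'silent' call can be tried out with the pattern that the code listing at the end of this page compiles as rxFunc. The wrapper is_silent_call() is a hypothetical name used only for this sketch. Note that under this pattern a method call on an object, such as name.upper(), does not count as a plain function call.

```python
import re

# Same pattern as rxFunc in the listing: an identifier directly followed
# by a parenthesised argument list that runs to the end of the phrase.
RX_FUNC = re.compile(r'^[a-zA-Z_][a-zA-Z_0-9]*\(.*\)$')

def is_silent_call(phrase):
    """True if the phrase looks like a bare function call."""
    return RX_FUNC.match(phrase.strip()) is not None

silent = is_silent_call('str(float(2))')
plain_name = is_silent_call('s')
method_call = is_silent_call('name.upper()')
```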
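The bookkeeping of block levels can be sketched as follows. This is a deliberately reduced illustration under stated assumptions: to_python() and the HEADER pattern are hypothetical, only statements are handled (no text, no expressions), and colons inside slices or strings are not protected, whereas thrases tests candidate headers with compile().

```python
import re

# A block header at the start of a phrase: keyword ... colon.
HEADER = re.compile(
    r'\s*((?:def|for|if|while|else|elif|try|except|finally)\b[^:]*:)')

def to_python(template):
    """Turn a statements-only template into tab-indented Python source."""
    level, lines = 0, []
    for phrase in template.split('~~'):
        if phrase == '':
            level = max(level - 1, 0)   # empty phrase: dedent, never below root
            continue
        rest = phrase
        match = HEADER.match(rest)
        while match:                    # peel off (possibly nested) headers
            lines.append('\t' * level + match.group(1))
            level += 1
            rest = rest[match.end():]
            match = HEADER.match(rest)
        if rest.strip():
            lines.append('\t' * level + rest.strip())
    return '\n'.join(lines)

source = to_python('if n > 1: kind = "many"~~~~else: kind = "one"~~~~~~')
namespace = {'n': 3}
exec(source, namespace)
```

The surplus empty phrases at the end of the template show that the level never drops below the root.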
Inserting Files
As mentioned above, '#insert filename~~' is replaced literally by the content of the file filename. It can be written anywhere in the template. The file has to be given
- either by its full pathname in python notation
- or by a relative pathname. Where it is searched for depends on the type of input. If the template is a file, the starting directory for the search (see next) is the directory of that file. If it is a string from the calling code, it is the directory of the file containing that code.
The file to be inserted is searched for first in this starting directory and then, if not found, in its parent directories in ascending order. Thus a file containing function definitions, common titles or footers, or anything like this for a group of files can be saved in a directory above them and still be referenced by its basename only.
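The ascending search can be sketched like this. find_upwards() is a hypothetical function written for illustration; it follows the description above, not the exact loop in the code listing, and the temporary directory tree only serves as a demo.

```python
import os
import tempfile

def find_upwards(start_dir, basename):
    """Look for basename in start_dir, then in each parent directory."""
    directory = os.path.abspath(start_dir)
    while True:
        candidate = os.path.join(directory, basename)
        if os.path.isfile(candidate):
            return candidate
        parent = os.path.dirname(directory)
        if parent == directory:          # reached the filesystem root
            raise IOError(basename + ' not found')
        directory = parent

# Demo in a throwaway tree: root/header.htm and the empty root/a/b/
root = os.path.realpath(tempfile.mkdtemp())
os.makedirs(os.path.join(root, 'a', 'b'))
open(os.path.join(root, 'header.htm'), 'w').close()
found = find_upwards(os.path.join(root, 'a', 'b'), 'header.htm')
```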
#insert C:\\project\\templates\\filename~~
(for Windows here) is replaced completely by the bare content of 'C:\project\templates\filename'. Thus, in the frequent case that the file contains Python (like defs of thrases functions), its content should end with a double tilde. But portions of simple text can be inserted this way too. This is why '#insert ...~~' is not a complete phrase in thrases: it is 'open' to the front. You can do something like:
Sylvia's homepage is #insert C:\\project\\sites\\maintainers\\Sylvia~~ where you can find ...
and also the more normal:
On <a href="#insert C:\\project\\sites\\maintainers\\Sylvia~~">Sylvia's homepage</a> you can find ...
to paste a piece of simple text into the simple text of your template. This would not work with comments: if no double tilde (followed by arbitrary whitespace) precedes the '#', it and the following characters are rendered.
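The literal replacement can be sketched with a regular expression. The pattern and the function expand_inserts() are assumptions made for this illustration, and a dict stands in for the files on disk. A function is used as the replacement so that backslashes in the inserted content are not interpreted by re.sub.

```python
import re

# Assumed pattern: '#insert', whitespace, a name, then the double tilde.
RX_INSERT = re.compile(r'#insert\s+(.+?)~~')

def expand_inserts(template, read_file):
    """Replace every '#insert name~~' by whatever read_file(name) returns."""
    return RX_INSERT.sub(lambda m: read_file(m.group(1).strip()), template)

# A dict stands in for the file system in this demo.
files = {'sylvia_url': 'http://example.org/sylvia'}
page = expand_inserts(
    'On <a href="#insert sylvia_url~~">Sylvia\'s homepage</a> you can find',
    files.__getitem__)
```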
The python in thrases
This mini Python is complete except for commands and functions that affect namespaces. There is no "class" statement in thrases, no "import", and neither "exec" nor "eval". But you have "def", nested too, and the creation of objects by assignment. Not by iterable unpacking, however: you can unpack, of course, but the variables on the left side must already exist. For functions, the * syntax for passing arguments is not supported. In the body of every function declared with "def" you have the full syntax of thrases with all the actions described above.
Another main restriction is the scope of variables declared in the template: they are all global to the template (without needing any 'global my_variable' statements deeper in the execution stack). The exception is function arguments, which stay local as usual. All variables and functions you declare explicitly anywhere in the template's code are mounted on an instance self.usr, which is an attribute of your template object. Its class is empty (declared as 'class o: pass') and has no other purpose. Thus you are completely free in choosing your variable names and safe from any side effects on the variables of the calling code.
On error handling: the try/except mechanisms involved in choosing between executing, evaluating and rendering a phrase, as described above, will frequently make thrases render your Python expression literally to output if it is erroneous. The same holds for all Python statements. For exceptions raised while rendering, Python's standard messages have been slightly extended. On syntax errors, the last 20 lines before the line where the error is detected are printed from an intermediate representation of the template (q.exin). This is much easier than printing the related lines of the template at that point of processing; it must be said, though, that in case of a syntax error 20 lines are frequently not enough. On all other types of error, the last 10 rendered strings in the resulting deque (if existent) are printed additionally.
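How names get mounted on the usr instance can be sketched with the kind of regular expression the code listing builds for each declared name. The function mount_names() and the classes Namespace and FakeTemplate are hypothetical stand-ins for this illustration. Like the pattern it copies, the sketch does not protect names that occur inside string literals.

```python
import re

def mount_names(code, names, prefix='q.usr.'):
    """Prefix every free occurrence of the given names with 'q.usr.'."""
    for name in names:
        rx = re.compile(r'(.*[^.a-zA-Z_0-9])(' + re.escape(name) + r')(\W.*)')
        s = '@' + code + '@'           # sentinels, so the name is never at an edge
        while rx.search(s):
            s = rx.sub(r'\1' + prefix + r'\2\3', s)
        code = s[1:-1]
    return code

class Namespace(object):
    pass

class FakeTemplate(object):
    def __init__(self):
        self.usr = Namespace()

rewritten = mount_names('st = st[0:-1]', ['st'])

q = FakeTemplate()
exec(mount_names('total = 2 + 3', ['total']), {'q': q})
```

After the exec call the value lives on q.usr, not in the namespace of the caller.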
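Picking the lines to show for a syntax error can be sketched as a simple slice of the generated code. error_context() is a hypothetical function; the exact window that thrases prints may differ from this one.

```python
def error_context(generated, lineno, count=20):
    """Return up to 'count' lines of generated code ending at line 'lineno' (1-based)."""
    lines = generated.split('\n')
    end = min(lineno, len(lines))
    start = max(end - count, 0)
    return '\n'.join(lines[start:end])

generated = '\n'.join('line %d' % i for i in range(1, 31))
context = error_context(generated, 25, count=5)
```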
Example
An example from production code (as SciTE shows it; editors will have no problems with HTML formatting by thrases). It renders just a <div> element, which is destined to be integrated into an external site. Thus CSS must be used inline here, but with templating we can still avoid repetitions.
# -*- coding: iso8859-1 -*-
~~#insert auXion_header.htm~~
<div align="center">
<div style = "text-align: left;
padding:0px;
background:lightcyan;
width:80%; border:2px;
border-style:ridge;
border-color:#000080;
font-family:Georgia, Sylfaen, Times New Roman;">
<br>~~
st = line[c_ict['lang']].strip()~~
if st[-1]==',': st = st[0:-1]~~~~
<div style="text-align:center; font-size:17pt;">~~st~~</div><br>~~
if kurz1:<div ~~largeLine~~>~~kurz1~~</div>~~~~
addedlabel('Genaues Stempeldatum: ', format_date(line[c_ict['dtstamp']]))~~
addedlabel('Erhaltung (preservation): ', line[c_ict['zustand']])~~
if line[c_ict['ftsize']]:
sf = line[c_ict['ftarr']]~~
if sf in ('geteilt', 'ungeteilt'):
<div ~~Label~~>Format: </div>
<div ~~Data~~>~~line[c_ict['ftsize']]~~, ~~sf~~e Rückseite</div>~~~~
else:
addedlabel('Format:', line[c_ict['ftsize']])~~
addedlabel('Rückseite:',sf)~~~~~~
addedlabel('Verlag (publisher): ', line[c_ict['verlag']])~~
addedlabel('Künstler (artist): ', line[c_ict['artist']])~~<br>~~
if line[c_ict['english']]:<div ~~largeLine~~>~~line[c_ict['english']]~~</div>~~
addedlabel('Exact date of postmark:', format_date(line[c_ict['dtstamp']]))~~
sfts = line[c_ict['ftsize']]~~
if sfts=='klein': sfts='small'~~~~ elif sfts=='groß': sfts='large'~~~~~~
addedlabel('Format: ', sfts)~~
sfta = line[c_ict['ftarr']]~~
if sfta=='geteilt': sfta = 'divided'~~~~ elif sfta=='ungeteilt': sfta='for address only'~~~~
elif sfta.upper()=='RKP': sfta='no postcard'~~~~~~
addedlabel('Back', sfta)~~</div>~~~~</div></div>
The second line imports another thrases-formatted file:
Label = 'style ="float: left; padding: 4px 0px 0px 10px; width: 180px; "'~~
Label += 'font-family: Verdana; font-size: 10pt;"'~~
Data = 'style = " font-family: Sylfaen, Georgia; font-size: 13pt;"'~~
largeLine = 'style=" background: #f8d0d0; margin: 0px; width: 100%; text-align:center;'~~
largeLine += 'border-width: 3px 0px 3px 0px; border-style:solid; border-color: #ffffff;"'~~
largeLine += 'font-size:14pt; padding: 6px 0px;"'~~
def format_date(dte):
if len(dte)>4:
dte = dte[6:8]+'.'+dte[4:6]+'.'+dte[:4]~~
if dte[3]=='0': dte = dte[:3]+dte[4:]~~~~
if dte[0]=='0': dte=dte[1:]~~~~~~
return dte~~~~
def addedlabel(SSS, TTT):
if TTT: <div ~~Label~~>~~SSS~~ </div><div ~~Data~~>~~TTT~~</div>~~~~~~
(For those who try to understand this completely: format_date() is not used for the concrete div shown here, which does not come from a data set with a given "Genaues Stempeldatum" (exact date of postmark). It depends on assumptions about those data and can be neglected here. c_ict is a dict that maps column names in the underlying database to column numbers in ListView widgets. It is used throughout the project and makes adapting to changes in the database very easy.)
Thrases renders something like the following from this template and data sets describing historical picture postcards. It needs so many "ifs", and thus templating, because nearly all db fields could be empty. The function addedlabel() is central here. Its purpose is layout, so its best place is a template.
Ein Kind stellt eine brennende Kerze an ein Kruzifix im Schnee
x um 1930-40
Erhaltung (preservation):
I
Format:
groß, geteilte Rückseite
Verlag (publisher):
Selbstverlag M. Spötl, Schwaz
Künstler (artist):
M. Spötl
A child puts a burning candlelight to a crucifix in the snow
Format:
large
Back:
divided
Conclusion, code
The __init__() might possibly be slow, but the rendering itself is done very fast by thrases. I replaced a Mako template in production code with an equivalent for thrases. At that time a unicode object was returned, and still it seemed that thrases rendered faster. With encoded strings and rendering onto a file descriptor, thrases is by far the fastest templating engine in the world, I hope.
After constructing the template, x.exin, which x.render() passes to exec, contains essentially nothing but lines of the form q.out.write(...), with your Python expressions or constant strings as arguments, plus your Python statements literally. The only overhead in comparison to direct copying is that the variables created in the template are changed to q.usr.your_variable_name; these attribute lookups are the overhead. Thus Template.render() is very near to the theoretically possible maximum speed.
There is also a derived class TemplateForCSV. It is needed for rendering directly to a CSV file. The target format uses the semicolon as column separator and double quotes to protect it. Should the output need this protection, the outer double quotes have to be written to the file directly before and after the render call. The class is also an example of how to adapt q.wripy and q.wrirect (q is self) to escape characters.
Thrases will be my tool for making web pages and apps with Python for years to come; I cannot conceive of anything that does the job more easily. Thus it might undergo further modifications. Features that seem less final to me than the basic syntax will be documented in the project's wiki. There isn't much to discuss about such a small piece of software, so the wiki will probably stay neat and easy to browse. Please use it as the first address for bug reports or feature requests, and send email only after some days without a reaction from me.
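What such generated code looks like, and how it runs, can be sketched by hand. The string exin below is written manually for illustration and is not actual thrases output; FakeTemplate is a hypothetical stand-in for the template object, and the sketch is written for a current Python 3, where exec is a function.

```python
import io

class FakeTemplate(object):
    """Stand-in for the template object that the generated code calls 'q'."""
    def __init__(self):
        self.out = io.StringIO()

# Hand-written stand-in for the generated code; tabs mark the block level.
exin = (
    'q.out.write("""<ul>""")\n'
    'for item in items:\n'
    '\tq.out.write("""<li>""")\n'
    '\tq.out.write(item)\n'
    '\tq.out.write("""</li>""")\n'
    'q.out.write("""</ul>""")\n'
)

q = FakeTemplate()
exec(exin, {'q': q, 'items': ['a', 'b']})
html = q.out.getvalue()
```

Everything is a direct write call or a plain statement, which is why so little work is left for render time.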
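The escaping for this target format can be sketched in a few lines. csv_field() is a hypothetical helper that mirrors the replacements visible in the class: line breaks become spaces and double quotes are doubled, while the caller writes the outer quotes and the separator.

```python
def csv_field(text):
    """Escape one field for a semicolon-separated, double-quoted CSV file."""
    return text.replace('\n', ' ').replace('\r', ' ').replace('"', '""')

# The caller writes the protecting quotes and the separator itself.
row = '"' + csv_field('He said "hi"\nand left') + '";"' + csv_field('plain') + '"'
```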
It was great fun to make thrases. And here is the code. Good luck!
# Written by Joost Behrends
# and placed in the public domain
from StringIO import StringIO
from collections import deque
from inspect import currentframe, getouterframes
import re, os, sys, time

class TemplateError(Exception):
    """Exception raised by generated code from a template for thrases"""
    pass

class StringProtectingSplitter():
    """
    Designed for a split(), which splits only outside strings.
    Not a string method, because of unicode. Unlike split() a deque is returned.
    """
    r0 = 'u?""".*?"""'; r1 = "u?'''.*?'''"
    r2 = '[ru]?"[^"]*"'; r3 = "[ru]?'[^']*'"
    rxString = re.compile('(?:'+r0+'|'+r1+'|'+r2+'|'+r3+')')
    def __init__(q, s):
        q.splitchars = s
        q.wd = len(s)
    def split(q, sarg):
        Ivalls = [t.span() for t in q.rxString.finditer(sarg)]
        lix = 0; ix = sarg.find(q.splitchars)
        result = deque()
        while ix > -1:
            while Ivalls and ix >= Ivalls[0][1]: del Ivalls[0]
            if Ivalls and ix < Ivalls[0][0] or not Ivalls:
                result.append(sarg[lix:ix]); lix = ix+q.wd
            ix = sarg.find(q.splitchars, ix+q.wd)
        if lix < len(sarg): result.append(sarg[lix:])
        return result

rsID = '[a-zA-Z_][a-zA-Z_0-9]*'; rxID = re.compile(rsID)
rsID0 = '^'+rsID; rxID0 = re.compile(rsID0)
rxFunc = re.compile(rsID0 + '\(.*\)$')
rxAssignment = re.compile('(' +rsID0+ ')([^=]*=\\s*[^=\\s]+)+')
rxCondPhrase = re.compile('(?:def|for|if|while|else|elif|try|except|finally)(?:\\s+.*?)?:')
# def is in rxCondPhrase too: for correct splitting; MEMO: elif is "exclusive"
rxDeFrase = re.compile('^(?:def\\s+)('+rsID+')\\s*\(((?:\\s*'+rsID+'.*?)*)\)\\s*:')

def dummyfunc(*x): pass

class Template():
    """
    __init__ () reads its template from filename into a string
    and converts it to a representation of valid python code,
    called exin, which render() will pass to exec().
    Input will be ~~-split into phrases. Three things can happen to them:
    - The phrase can be transformed into the command writing it to output.
    This happens to non-python.
    - The phrase is provided for evaluation by python, then the result
    will get the same transformation. This happens to python expressions.
    - Python statements are written unchanged to exin.
    compile is used for syntax checking only.
    render() simply passes exin to exec(), which then evaluates the actual
    values of the args.
    """
    def reinit(q):
        if os.stat(q.filename).st_mtime > q.constructed:
            q.__init__(q.filename, q.incode)

    def wripy(q, s, lv, silent=False):
        for name in q.Vars.keys():
            if not name in q._protectedVars:
                s = '@'+s+'@'
                while q.Vars[name].search(s): s = q.Vars[name].sub('\\1q.usr.\\2\\3', s)
                s=s[1:-1]
        if silent: q.exin.write('\t'*lv + s +'\n')
        elif q.incode: q.exin.write('\t'*lv +'q.out.write('+unicode(s, q.incode)+')\n')
        else: q.exin.write('\t'*lv +'q.out.write('+ s +')\n')

    def wrirect(q, s, lv):
        if isinstance(s, str) and q.incode: s = unicode(s, q.incode)
        if s.find('"""') > -1 or s.endswith('"'):
            lix=0; ix=s.find('"')
            while ix > -1:
                q.exin.write('\t'*lv + 'q.out.write("""' +s[lix:ix]+ '""")\n')
                q.exin.write('\t'*lv + """q.out.write('"')\n""")
                lix=ix+1; ix=s.find('"', lix)
            # the rest after the last double quote must become a write call too,
            # otherwise it would land in exin as raw code
            if lix < len(s):
                q.exin.write('\t'*lv + 'q.out.write("""' +s[lix:]+ '""")\n')
        else:
            q.exin.write('\t'*lv + 'q.out.write("""' +s+ '""")\n')

    def __init__(q, filename, incode = None):
        q.incode = incode # needed in q.wrirect()
        q.filename = filename; q.constructed = time.time()
        # both only for reloading
        q.exin = StringIO()
        q.Vars = {}; q.subscribedFuncs = {}
        q.protectedVars = {}; q._protectedVars = []
        # _protectedVars is the sum of the lists of protectedVars.values()
        # for levels greater or equal than the actual
        ByColon = StringProtectingSplitter(':')
        lv=0
        if len(filename)>256 or not os.path.exists(filename):
            q.inp = filename
            q.tmplName = ' input string '
            q.templateInputIsFile = False
        else:
            f = open(filename, 'rb'); q.inp = f.read(); f.close()
            if q.incode: q.inp = q.inp.decode(incode)
            q.tmplName = filename # just for messages of exceptions
            q.templateInputIsFile = True
        # the name ends at the next '~~' (the former pattern still excluded
        # the semicolon of the old syntax and could run past the separator)
        rxInsert = re.compile('#insert\\s+(.+?)~~')
        for matching in rxInsert.finditer(q.inp):
            filename = matching.group(1)
            if not os.sep in filename:
                if q.templateInputIsFile: dir = os.path.dirname(q.tmplName)
                else: dir = os.path.dirname(getouterframes(currentframe())[1][1])
                if not filename in os.listdir(dir):
                    while os.sep in os.path.dirname(dir) or '/' in os.path.dirname(dir):
                        dir = os.path.dirname(dir)
                        if filename in os.listdir(dir): break
                        if os.sep not in os.path.dirname(dir) and \
                                '/' not in os.path.dirname(dir) or \
                                sys.platform == 'win32' and dir[1:3]==':\\':
                            raise IOError(filename + ' not found')
                            break
                filename = dir + os.sep + filename
            f = open(filename, 'r')
            q.inp = rxInsert.sub(f.read(), q.inp, 1); f.close()
        class o: pass
        q.usr = o()
        subqueue = deque(); noCond = False
        PhraseQueue = deque(q.inp.split('~~'))
        while len(PhraseQueue):
            phrase = PhraseQueue.popleft()
            thrase = phrase.strip() # t for 'test'
            if thrase.startswith('#'): pass
            elif not phrase:
                if lv: lv -= 1
                for i in q.subscribedFuncs.keys():
                    if lv <= i:
                        q.exin.write('\t'*lv +'q.usr.'+q.subscribedFuncs[i] +'='+ \
                            q.subscribedFuncs[i] +'\n')
                        del q.subscribedFuncs[i]
                        q._protectedVars = \
                            q._protectedVars[:len(q._protectedVars)-len(q.protectedVars[i])]
                        q.protectedVars[i] = []
                        # all this happens only 'once'
            elif thrase.startswith(('[',']','/','+','*','(',')','%','<','>','"',"'",'=')) \
                    or thrase=='':
                q.wrirect(phrase, lv)
            elif thrase == 'pass': q.wripy(thrase, lv, silent=True)
            elif thrase in ('break', 'continue') or \
                    thrase.startswith(('assert', 'del', 'raise')):
                q.wripy('try:', lv, silent = True)
                q.wripy(thrase, lv+1, silent = True)
                q.wripy('except:', lv, silent = True)
                q.wrirect(phrase, lv+1)
            elif thrase.startswith(('return', 'yield')):
                try:
                    compile('def t(): '+thrase, '', 'exec')
                    q.wripy(thrase, lv, silent=True)
                except:
                    q.wrirect(phrase, lv)
            elif rxFunc.match(thrase):
                try:
                    compile(thrase, '', 'exec')
                    q.wripy(thrase, lv, silent = True)
                except:
                    q.wrirect(phrase, lv)
            elif not noCond and thrase.find(':') > -1:
                # care for nested conditional statements
                subqueue = ByColon.split(phrase)
                while len(subqueue) > 1:
                    s = subqueue.popleft()
                    thrase = s.strip() + ':'
                    if thrase.startswith('else') or thrase.startswith('elif'):
                        for_test = 'if x: pass\n' + thrase + ' pass'
                    elif thrase.startswith('except'):
                        for_test = 'try:pass\n' + thrase + ' pass'
                    elif thrase.startswith('finally'):
                        for_test = 'try:pass\nexcept:pass\n' + thrase + ' pass'
                    else:
                        for_test = thrase + ' pass'
                    try:
                        compile(for_test, '', 'exec')
                    except:
                        subqueue.appendleft(s+':'+subqueue.popleft())
                    else:
                        q.tNP = rxDeFrase.match(thrase)
                        if q.tNP:
                            name = q.tNP.group(1)
                            setattr(q.usr, name, dummyfunc)
                            q.wripy(thrase, lv, silent=True)
                            q.Vars[name] = \
                                re.compile('(.*[^\.a-zA-Z_0-9])('+ name+ ')(\\W.*)')
                            q.protectedVars[lv] = q.tNP.group(2).split(',')
                            q.protectedVars[lv] = \
                                filter(lambda x:rxID0.search(x), q.protectedVars[lv])
                            q.protectedVars[lv] = \
                                map(lambda x:rxID0.search(x).group(0), q.protectedVars[lv])
                            q._protectedVars += q.protectedVars[lv]
                            q.subscribedFuncs[lv] = name
                            lv += 1
                            phrase = phrase[len(s)+1: ]
                        elif rxCondPhrase.match(thrase):
                            q.wripy(thrase, lv, silent=True); lv += 1
                            phrase = phrase[len(s)+1: ]
                        else:
                            subqueue.appendleft(s+':'+subqueue.popleft())
                            # cares for slices
                else:
                    PhraseQueue.appendleft(phrase)
                    noCond = True
                    continue # avoid 'noCond = False' at the end of the loop
            elif rxAssignment.match(thrase):
                try:
                    compile(thrase, '', 'exec')
                except:
                    q.wrirect(phrase, lv)
                else:
                    name = rxAssignment.search(thrase).group(1)
                    if name not in q._protectedVars:
                        q.wripy('q.usr.'+thrase, lv, silent=True)
                        if not hasattr(q.usr, name): setattr(q.usr, name, '')
                        q.Vars[name] = re.compile('(.*[^\\.a-zA-Z_0-9])('+ name+ ')(\\W.*)')
                    else:
                        q.wripy(thrase, lv, silent=True)
            else:
                try:
                    compile(thrase, '', 'eval')
                except:
                    q.wrirect(phrase, lv)
                else:
                    q.wripy('try:', lv, silent = True)
                    q.wripy('q.tNP = '+ thrase, lv+1, silent = True)
                    q.wripy('except:', lv, silent = True)
                    q.wrirect(phrase, lv+1)
                    q.wripy('else:', lv, silent = True)
                    q.wripy('q.tNP', lv+1)
            noCond = False

    def render(q, locs, fd = None, reinit = False):
        """
        The dict locs gives render() full access to the variables therein.
        To warn still more explicitly: This means, that code in the template
        can change the caller's locals(), when
        the argument for locs in the render call is locals().
        """
        if reinit and q.templateInputIsFile: q.reinit()
        if fd: q.out = fd
        else: q.out = StringIO()
        locs['q'] = q
        try:
            exec q.exin.getvalue() in locs
        except SyntaxError:
            import sys, traceback
            l = q.exin.getvalue().split('\n')
            t0, t1, dummy = sys.exc_info(); del dummy; t1 = repr(t1)
            message = '\n--<>-- ' + q.tmplName + ' --<>--' \
                + '\n--<>-- SyntaxError --<>--\n'
            try:
                i = int(re.findall('(\\d+)', t1, re.DOTALL)[0])
            except:
                message += t1
            else:
                ll = []; k = 20
                # start at the reported line, but never beyond the end of exin
                for j in range(min(i+1, len(l)-1), -1, -1):
                    if l[j].find('if q.out and q.out[-1]==";": q.out.pop()') == -1:
                        ll.insert(0, l[j])
                        k -= 1
                    if not k: break
                message += '\n'.join(ll[0:20-k])
            finally:
                raise TemplateError(message); return
        except:
            import sys
            t0, t1, dummy = sys.exc_info(); del dummy
            message = '\n--<>-- ' + q.tmplName + ' --<>--' \
                + '\n' + repr(t1) + '\n'
            raise TemplateError(message); return

class TemplateForCSV(Template):
    """ just overwrites wripy and wrirect - a model for other formats also """
    rxAmp = re.compile('&([^;]{5})')

    def wripy(q, s, lv, silent=False):
        for name in q.Vars.keys():
            if not name in q._protectedVars:
                s = '@'+s+'@'
                while q.Vars[name].search(s): s = q.Vars[name].sub('\\1q.usr.\\2\\3', s)
                s=s[1:-1]
        if silent:
            q.exin.write('\t'*lv + s +'\n'); return
        s = s.replace('\n', ' ').replace('\r', ' ').replace('"', '""')
        s = q.rxAmp.sub('&\\1', s)
        if q.incode: s = unicode(s, q.incode)
        q.exin.write('\t'*lv +'q.out.write('+ s +')\n')

    def wrirect(q, s, lv):
        s = s.replace('\n', ' ').replace('\r', ' ')
        s = q.rxAmp.sub('&\\1', s)
        if isinstance(s, str) and q.incode: s = unicode(s, q.incode)
        if s.find('"""') > -1 or s.endswith('"'):
            lix=0; ix=s.find('"')
            while ix > -1:
                q.exin.write('\t'*lv + 'q.out.write("""' +s[lix:ix]+ '""")\n')
                q.exin.write('\t'*lv + """q.out.write('""')\n""")
                lix=ix+1; ix=s.find('"', lix)
            # the rest after the last double quote must become a write call too
            if lix < len(s):
                q.exin.write('\t'*lv + 'q.out.write("""' +s[lix:]+ '""")\n')
        else:
            s = s.replace('"', '""')
            q.exin.write('\t'*lv + 'q.out.write("""' +s+ '""")\n')

