webgen is a tool which someone familiar with html and css can use to generate some or all of a website. It can generate many pages or html fragments which have a common structure.
It does so by using string substitution to replace label strings in template files with equivalent data strings. For a given pair of template and data files it can write the generated text to a single file or to a separate file for each line of data.
ahernp.com has been generated using webgen, though most of the pages have unique structures so the main benefit has been in creating the tables of items in my Library catalogue.
webgen can be downloaded from webgen.zip. The download includes a simple sample website, generated from three templates, which contains a homepage, a list of items and a separate page for each item. Python 3 is needed to run webgen.
To generate a series of list item tags from a template and data file:
Template |
---|
<li>#Name <a href="http://#URL" title="#Description"> #URL</a> #Description</li> |

Data |
---|
#Name,#URL,#Description |
Google,www.google.com,Search engine. |
Amazon,www.amazon.co.uk,Bookshop. |
Replace the labels in the template with equivalent text from the data file. That is, the label "#Name" is replaced with the string "Google" and so on. In this case the output is written to a single file:
Output |
---|
<li>Google <a href="http://www.google.com" title="Search engine."> www.google.com</a> Search engine.</li> |
<li>Amazon <a href="http://www.amazon.co.uk" title="Bookshop."> www.amazon.co.uk</a> Bookshop.</li> |
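The substitution shown above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code in webgen.py: the function names `fill_template` and `generate` are my own, and the data is read as comma-separated to match the example (the real files are tab-delimited).

```python
import csv
import io

TEMPLATE = '<li>#Name <a href="http://#URL" title="#Description"> #URL</a> #Description</li>'

# Comma-separated to match the example above.
DATA = """#Name,#URL,#Description
Google,www.google.com,Search engine.
Amazon,www.amazon.co.uk,Bookshop.
"""


def fill_template(template, labels, values):
    """Return a copy of template with each label replaced by its value."""
    text = template
    # Replace longer labels first so that a label which is a prefix of
    # another label cannot be substituted by mistake.
    pairs = sorted(zip(labels, values), key=lambda p: len(p[0]), reverse=True)
    for label, value in pairs:
        text = text.replace(label, value)
    return text


def generate(template, data_text, delimiter=","):
    """Return one filled-in copy of template per data record."""
    rows = list(csv.reader(io.StringIO(data_text), delimiter=delimiter))
    labels, records = rows[0], rows[1:]  # first row holds the labels
    return [fill_template(template, labels, record) for record in records]


output = generate(TEMPLATE, DATA)
print("\n".join(output))
```

Joining the copies gives the single-file output; keeping them apart gives one file per line of data.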
The master and data files used by webgen are in tab-delimited csv format.
Python 3 is needed to run webgen, so make sure that it is installed.
Run a Python script by passing it as an argument to Python.
The simplest way to run the script is to place a copy in the same directory as your master, template and data files. In Linux just type:
python3 webgen.py
(The Windows equivalent should be something like C:\Python31\python.exe webgen.py)
Prepare the master, template and data files in the directory where webgen will be run. There should be a subdirectory to which the generated html will be written. It should already contain any additional files needed to complete the site (css, image files, etc.).
The webgen script takes two optional parameters:
Parameter | Default Value | Notes |
---|---|---|
Master file name | master.csv | In current directory. |
Output directory name | site | Must already exist. |
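A script could pick up two optional parameters with these defaults roughly as follows. This is a sketch under assumptions: the function name `parse_args` is hypothetical, and I am assuming the parameters are positional and given in the order shown in the table (master file first, then output directory).

```python
import sys


def parse_args(argv):
    """Return (master_file, output_dir), falling back to the defaults."""
    # Assumption: positional parameters, master file name first.
    master_file = argv[1] if len(argv) > 1 else "master.csv"
    output_dir = argv[2] if len(argv) > 2 else "site"
    return master_file, output_dir


if __name__ == "__main__":
    master_file, output_dir = parse_args(sys.argv)
    print("Master file:", master_file)
    print("Output directory:", output_dir)
```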
The Master file has the following structure:
Column Name | Values | Notes |
---|---|---|
Level | none or integer | Level of output in site map hierarchy. |
SingleFileOutputYN | Y or N | Write out one or several output files. |
Template | Filename | Mandatory. |
DataFile | none or Filename | none if Template contains no labels. |
#FileName | none or Filename | none if multiple output files. |
#FileLabel | none or string | Label used in site map. |
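A master file with this structure can be read with Python's standard `csv` module. The sketch below is illustrative only: `read_master` is a hypothetical name, the two records are taken from the sample site, and the checks simply mirror the Values column above rather than whatever validation webgen itself performs.

```python
import csv
import io

# Two tab-delimited records borrowed from the sample site's master file.
MASTER_TEXT = (
    "Level\tSingleFileOutputYN\tTemplate\tDataFile\t#FileName\t#FileLabel\n"
    "1\tY\ttHomepage\tnone\tindex.html\tHome\n"
    "3\tN\ttWidgetDetail\tdWidgets.csv\tnone\tnone\n"
)


def read_master(text):
    """Parse master file text into a list of dicts keyed by column name."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    records = []
    for row in reader:
        # Assumed checks, based only on the table of allowed values.
        if not row["Template"]:
            raise ValueError("Template is mandatory")
        if row["SingleFileOutputYN"] not in ("Y", "N"):
            raise ValueError("SingleFileOutputYN must be Y or N")
        records.append(row)
    return records


records = read_master(MASTER_TEXT)
for record in records:
    print(record["Template"], "->", record["#FileName"])
```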
The webgen script generates html in three phases and then writes out all the generated files.
Read through the master file processing one record at a time.
If the DataFile name is "none" then the output is an unchanged copy of the Template. Otherwise, for each record in the DataFile create a copy of the Template with any Label text replaced by the corresponding data field contents.
If the SingleFileOutputYN flag is "Y" then all of these copies are concatenated. Otherwise, they are kept separate.
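The three cases described here (no DataFile, single output file, several output files) might look like this in Python. It is a sketch, not the webgen source: `generate_item` is a hypothetical name, data rows are assumed to be dicts mapping label to value, and joining the copies with a newline is my assumption since the separator is not stated.

```python
def generate_item(master, template, data_rows=None):
    """Return {file_name: text} for one record of the master file."""
    if master["DataFile"] == "none":
        # No labels to replace: output is an unchanged copy of the template.
        return {master["#FileName"]: template}

    copies = []
    for row in data_rows or []:
        text = template
        # Longest labels first, so one label cannot clobber a longer one.
        for label in sorted(row, key=len, reverse=True):
            text = text.replace(label, row[label])
        copies.append((row["#FileName"], text))

    if master["SingleFileOutputYN"] == "Y":
        # Assumption: copies are joined with a newline.
        return {master["#FileName"]: "\n".join(text for _, text in copies)}
    # Otherwise each copy is kept separate, named by the data record.
    return {name: text for name, text in copies}


rows = [
    {"#FileName": "a.html", "#FileLabel": "A"},
    {"#FileName": "b.html", "#FileLabel": "B"},
]
single = generate_item(
    {"DataFile": "d.csv", "SingleFileOutputYN": "Y", "#FileName": "all.txt"},
    "<p>#FileLabel</p>", rows)
several = generate_item(
    {"DataFile": "d.csv", "SingleFileOutputYN": "N", "#FileName": "none"},
    "<p>#FileLabel</p>", rows)
```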
The site map template is called "tSiteMap". Its contents vary but may include the following labels: #Level, #Name and #Label.
Here is an example where each entry in the site map is a paragraph:
tSiteMap |
---|
<p class='navlevel#Level'> <a href="#Name" title="#Label"> #Label</a></p> |
For each generated item where the value of Level in the Master file is not "none", a copy of the tSiteMap template is amended to replace the Labels:
Label | Replacement Value |
---|---|
#Level | Level in the Master file. |
#Name | #FileName from the Master file or the DataFile. |
#Label | #FileLabel from the Master file or the DataFile. |
The #FileName and #FileLabel values come from the Master file if the SingleFileOutputYN flag is "Y" (i.e. there is only one output file for the current item). Otherwise they come from the DataFile, one pair for each output file.
The completed site map output is added to the list of generated output with file name "sitemap.txt".
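The site map phase can be sketched as below, using the tSiteMap example from above. The names `site_map_entry` and `build_site_map` are hypothetical, the items are given as simple (level, file name, label) tuples, and joining the entries with a newline is an assumption.

```python
SITEMAP_TEMPLATE = """<p class='navlevel#Level'> <a href="#Name" title="#Label"> #Label</a></p>"""


def site_map_entry(template, level, name, label):
    """Return one site map entry with the three labels replaced."""
    return (template.replace("#Level", level)
                    .replace("#Name", name)
                    .replace("#Label", label))


def build_site_map(template, items):
    """Build the site map from (level, file_name, file_label) tuples."""
    entries = [
        site_map_entry(template, level, name, label)
        for level, name, label in items
        if level != "none"  # items with Level "none" are left out
    ]
    return "\n".join(entries)


items = [
    ("none", "common.txt", "Common elements"),
    ("1", "index.html", "Home"),
    ("2", "widgetlist.html", "Widgets"),
]
generated = {"sitemap.txt": build_site_map(SITEMAP_TEMPLATE, items)}
print(generated["sitemap.txt"])
```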
Read through all of the generated output looking for tags in the form <webgen>filename</webgen>. Replace any found with the contents of the matching "filename".
For example, the site map could be embedded on each web page to provide a navigation menu for the generated site (see Case Study below).
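One way to sketch this embedding step is with a regular expression over a dict of generated output. The function name `embed` is hypothetical, and two behaviours are assumptions of mine rather than documented webgen behaviour: a tag naming an unknown file is left in place, and the replacement is repeated a limited number of times so that embedded output which itself contains tags is also resolved.

```python
import re

WEBGEN_TAG = re.compile(r"<webgen>(.*?)</webgen>")


def embed(generated, max_passes=10):
    """Return a copy of generated with <webgen> tags replaced by file contents."""
    def replace_tag(match):
        name = match.group(1).strip()
        # Assumption: an unknown file name leaves the tag untouched.
        return generated.get(name, match.group(0))

    result = dict(generated)
    # Repeat so that tags inside embedded text are resolved as well;
    # the pass limit guards against files that embed each other.
    for _ in range(max_passes):
        updated = {name: WEBGEN_TAG.sub(replace_tag, text)
                   for name, text in result.items()}
        if updated == result:
            break
        result = updated
    return result


generated = {
    "sitemap.txt": "<p>Home</p>",
    "common.txt": "<div><webgen>sitemap.txt</webgen></div>",
    "index.html": "<body><webgen>common.txt</webgen></body>",
}
result = embed(generated)
print(result["index.html"])
```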
Information only.
Processing continues though some of the output data will be missing.
Processing halts after one of these conditions is encountered.
To illustrate how webgen can be used, a sample site is included with the webgen download.
Master file contents:
Level | SingleFileOutputYN | Template | DataFile | #FileName | #FileLabel |
---|---|---|---|---|---|
none | Y | tCommon | none | common.txt | Common elements |
1 | Y | tHomepage | none | index.html | Home |
none | Y | tWidgetSummary | dWidgets.csv | widgetSummary.txt | Widget Summary |
2 | Y | tWidgetList | none | widgetlist.html | Widgets |
3 | N | tWidgetDetail | dWidgets.csv | none | none |
The first record contains column names. Processing starts with the record below.
The Level column contains "none". This means that the output generated from this line will not be included in the site map. Usually this is because the output will not be a complete html page.
"Y" in the SingleFileOutputYN column means that the output generated from this line will all be written to a single file.
"tCommon" is the name of the Template:
tCommon |
---|
<div id="banner"> <img src="img/widgetLogo.png" /> </div> <div id="navmenu"> <webgen>sitemap.txt</webgen> </div> |
The line <webgen>sitemap.txt</webgen> is replaced by the contents of sitemap.txt after that has been generated.
In this case the DataFile name is "none". There are no Labels in tCommon to replace.
The generated output is written to "common.txt" (column #FileName).
Other templates can contain <webgen>common.txt</webgen> which will be replaced by the contents of this output file.
Column #FileLabel describes the data generated by this record and is used to label the entry for that data in the site map.
The Level value indicates where in the site map hierarchy the generated output should be placed. Any string can be used, with matching css written later.
The Template, DataFile and #FileName column contents indicate that template "tHomepage" is used to create "index.html".
"index.html" is labelled "Home" in the site map.
Template "tWidgetSummary" and DataFile "dWidgets.csv" are used to generate a single file called "widgetSummary.txt".
Inputs:
tWidgetSummary |
---|
<tr> <td><a href="#FileName">#FileLabel</a></td> <td><a href="#FileName"><img src="img/#SmallImg"></a></td> </tr> |

dWidgets.csv |
---|
"#FileName" "#FileLabel" "#Name" "#SmallImg" . . . |
"widget1.html" "First" "First Widget" "firstWidgetSmall.png" . . . |
"widget2.html" "Second" "Second Widget" "secondWidgetSmall.png" . . . |
"widget3.html" "Third" "Third Widget" "thirdWidgetSmall.png" . . . |
Output:
widgetSummary.txt |
---|
<tr> <td><a href="widget1.html">First</a></td> <td><a href="widget1.html"><img src="img/firstWidgetSmall.png"></a></td> </tr> |
<tr> <td><a href="widget2.html">Second</a></td> <td><a href="widget2.html"><img src="img/secondWidgetSmall.png"></a></td> </tr> |
<tr> <td><a href="widget3.html">Third</a></td> <td><a href="widget3.html"><img src="img/thirdWidgetSmall.png"></a></td> </tr> |
Generate a single output file called "widgetlist.html" from the template "tWidgetList". This will be a level-2 item in the site map, labelled "Widgets".
The template "tWidgetList" is essentially an html page wrapped around the contents of "widgetSummary.txt".
Generate multiple output files from template "tWidgetDetail" using data in "dWidgets.csv".
These will be level-3 items in the site map. The output file names and labels are contained in the first two columns of "dWidgets.csv".