webgen Manual

Table of Contents

  1. Overview
  2. Install
  3. Usage
  4. Messages
  5. Case Study

Overview

webgen is a tool which someone familiar with html and css can use to generate some or all of a website. It can generate many pages or html fragments which have a common structure.

It does so by using string substitution to replace label strings in template files with equivalent data strings. For a given pair of template and data files it can write the generated text to a single file or to a separate file for each line of data.

ahernp.com has been generated using webgen, though most of its pages have unique structures, so the main benefit has been in creating the tables of items in my Library catalogue.

webgen can be downloaded from webgen.zip. The download includes a simple sample website, generated from three templates, which contains a homepage, a list of items and a separate page for each item. Python 3 is needed to run webgen.

Process

  1. Read in a master file which contains a list of template and data files.
  2. Process each template-data file pair in turn to create html.
  3. Create a site map from a list of all the full html pages generated.
  4. Make a second pass through the generated files replacing references to other files with their contents.
  5. Write out all of the generated files.

Example

To generate a series of list item tags from a template and data file:

Template
<li>#Name <a href="http://#URL"
	title="#Description">
	#URL</a>
	#Description</li>

Data
#Name,#URL,#Description
Google,www.google.com,Search engine.
Amazon,www.amazon.co.uk,Bookshop.

Replace the labels in the template with equivalent text from the data file. That is, the label "#Name" is replaced with the string "Google" and so on. In this case the output is written to a single file:

Output
<li>Google <a href="http://www.google.com" 
	title="Search engine."> 
	www.google.com</a> 
	Search engine.</li>
<li>Amazon <a href="http://www.amazon.co.uk" 
	title="Bookshop."> 
	www.amazon.co.uk</a> 
	Bookshop.</li>
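
The substitution in this example can be sketched in a few lines of Python. This is only an illustration of the idea, not the code inside webgen.py; the function name fill_template is an assumption made for this sketch, and the data is given as lists rather than read from a file.

```python
def fill_template(template, labels, values):
    """Return a copy of template with each label replaced by its value."""
    # Replace longer labels first so that a short label cannot clobber part
    # of a longer one (a precaution added for this sketch; the manual does
    # not say in what order webgen applies its replacements).
    pairs = sorted(zip(labels, values), key=lambda pair: len(pair[0]), reverse=True)
    text = template
    for label, value in pairs:
        text = text.replace(label, value)
    return text


template = '<li>#Name <a href="http://#URL" title="#Description">#URL</a> #Description</li>'
labels = ["#Name", "#URL", "#Description"]
rows = [
    ["Google", "www.google.com", "Search engine."],
    ["Amazon", "www.amazon.co.uk", "Bookshop."],
]

# Single-file output: one filled copy of the template per data record.
output = "\n".join(fill_template(template, labels, row) for row in rows)
print(output)
```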

The master and data files used by webgen are in tab-delimited csv format.

Install

Python 3 is needed to run webgen, so make sure it is installed.

A Python script is run by passing its file name as an argument to the Python interpreter.

The simplest way to run the script is to place a copy in the same directory as your master, template and data files. In Linux just type:

	python3 webgen.py

(The Windows equivalent should be something like C:\Python31\python.exe webgen.py)

Usage

Prepare the master, template and data files in the directory where webgen will be run. There should be a subdirectory to which the generated html will be written. It should already contain any additional files needed to complete the site (e.g. css and image files).

The webgen script takes two optional parameters:

Parameter              Default Value  Notes
Master file name       master.csv     In current directory.
Output directory name  site           Must already exist.
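
As a rough sketch of how two optional parameters with these defaults could be handled, assuming they are given positionally in the order shown in the table (the manual does not state the order, and parse_args is a name invented for this sketch):

```python
DEFAULT_MASTER = "master.csv"
DEFAULT_OUTPUT_DIR = "site"


def parse_args(args):
    """Return (master_file, output_dir) from the arguments after the script name."""
    # Assumed order: master file name first, then output directory name.
    master_file = args[0] if len(args) > 0 else DEFAULT_MASTER
    output_dir = args[1] if len(args) > 1 else DEFAULT_OUTPUT_DIR
    return master_file, output_dir


# In a script the argument list would normally be sys.argv[1:].
for example in ([], ["other.csv"], ["other.csv", "out"]):
    master_file, output_dir = parse_args(example)
    print(f"Master file: {master_file}; Output directory: {output_dir}")
```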

Master File

The Master file has the following structure:

Column Name         Values            Notes
Level               none or integer   Level of output in site map hierarchy.
SingleFileOutputYN  Y or N            Write out one or several output files.
Template            Filename          Mandatory.
DataFile            none or Filename  none if Template contains no labels.
#FileName           none or Filename  none if multiple output files.
#FileLabel          none or string    Label used in site map.
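
A file with this structure can be read with Python's standard csv module. The sketch below assumes the first record holds the column names and that fields are separated by tabs; read_master is a hypothetical helper, and the text is held in a string so that the example runs without any files.

```python
import csv
import io

# Two records taken from the sample master file, tab-delimited.
MASTER_TEXT = (
    "Level\tSingleFileOutputYN\tTemplate\tDataFile\t#FileName\t#FileLabel\n"
    "1\tY\ttHomepage\tnone\tindex.html\tHome\n"
    "3\tN\ttWidgetDetail\tdWidgets.csv\tnone\tnone\n"
)


def read_master(handle):
    """Return the master records as a list of dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))


records = read_master(io.StringIO(MASTER_TEXT))
for record in records:
    print(record["Template"], "->", record["#FileName"])
```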

webgen process

The webgen script generates html in three phases and then writes out all the generated files.

Phase 1: Cardgen

Read through the master file processing one record at a time.

If the DataFile name is "none" then the output is an unchanged copy of the Template. Otherwise, for each record in the DataFile create a copy of the Template with any Label text replaced by the corresponding data field contents.

If the SingleFileOutputYN flag is "Y" then all of these copies are concatenated. Otherwise, they are kept separate.
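
The three cases in this phase (no data file, single output file, several output files) might be sketched as follows. The function generate and its arguments are assumptions made for illustration, not webgen's own code; output is returned as a dict mapping file name to text.

```python
def generate(template, labels, rows, single_file, file_name=None):
    """Return a dict mapping output file name to generated text.

    rows is None when the DataFile name is "none".
    """
    if rows is None:
        # No data file: the output is an unchanged copy of the template.
        return {file_name: template}

    copies = []
    for row in rows:
        text = template
        for label, value in zip(labels, row):
            text = text.replace(label, value)
        copies.append(text)

    if single_file:
        # SingleFileOutputYN is "Y": concatenate every copy into one file.
        return {file_name: "".join(copies)}

    # SingleFileOutputYN is "N": keep the copies separate, naming each one
    # after the #FileName field of its own data record.
    name_index = labels.index("#FileName")
    return {row[name_index]: text for row, text in zip(rows, copies)}


labels = ["#FileName", "#FileLabel"]
rows = [["widget1.html", "First"], ["widget2.html", "Second"]]
template = '<a href="#FileName">#FileLabel</a>\n'

single = generate(template, labels, rows, True, "widgetSummary.txt")
separate = generate(template, labels, rows, False)
print(sorted(single), sorted(separate))
```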

Phase 2: Build Site Map

The site map template is called "tSiteMap". Its contents vary but may include the following labels: #Level, #Name and #Label.

Here is an example where each entry in the site map is a paragraph:

tSiteMap
<p class='navlevel#Level'>
<a href="#Name" title="#Label"> 
#Label</a></p>

For each generated item whose Level in the Master file is not "none", a copy of the tSiteMap template is amended to replace the Labels:

Label   Replacement Value
#Level  Level in the Master file.
#Name   #FileName from the Master file or the DataFile.
#Label  #FileLabel from the Master file or the DataFile.

The #FileName and #FileLabel values come from the Master file if the SingleFileOutputYN flag is "Y" (i.e. there is only one output file for the current item). Otherwise they come from the DataFile.

The completed site map output is added to the list of generated output with file name "sitemap.txt".
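
A minimal sketch of this phase, assuming each generated item is described by a (level, file name, file label) tuple. The function build_site_map and the entries list are illustrative names; the template text is the tSiteMap example above.

```python
SITE_MAP_TEMPLATE = (
    "<p class='navlevel#Level'>\n"
    '<a href="#Name" title="#Label">\n'
    "#Label</a></p>\n"
)


def build_site_map(entries, template=SITE_MAP_TEMPLATE):
    """Return site map text for a list of (level, file_name, file_label)."""
    parts = []
    for level, file_name, file_label in entries:
        if level == "none":
            # Items whose Level is "none" are left out of the site map.
            continue
        parts.append(
            template.replace("#Level", level)
            .replace("#Name", file_name)
            .replace("#Label", file_label)
        )
    return "".join(parts)


entries = [
    ("none", "common.txt", "Common elements"),
    ("1", "index.html", "Home"),
    ("2", "widgetlist.html", "Widgets"),
]
site_map = build_site_map(entries)

# The site map joins the rest of the generated output under a fixed name.
generated = {"sitemap.txt": site_map}
print(site_map)
```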

Phase 3: Embed Generated Output

Read through all of the generated output looking for tags in the form <webgen>filename</webgen>.

Replace any found with the contents of the matching "filename".

For example, the site map could be embedded on each web page to provide a navigation menu for the generated site (see Case Study below).
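
The embedding step could be sketched with a regular expression, as below. Two points are assumptions of this sketch rather than documented webgen behaviour: embedded text is itself expanded, so that a page embedding common.txt also receives the site map embedded inside it, and a tag whose file is not found is left in place after a warning is printed.

```python
import re

EMBED_TAG = re.compile(r"<webgen>(.*?)</webgen>")


def expand(text, generated, owner, seen):
    """Replace each <webgen>name</webgen> tag in text with generated[name]."""
    def replace(match):
        wanted = match.group(1)
        if wanted not in generated:
            print(f"Wrn#07: In file {owner}, item to embed, {wanted}, not found.")
            return match.group(0)
        if wanted in seen:
            # Guard against files that embed each other: leave the tag as is.
            return match.group(0)
        return expand(generated[wanted], generated, owner, seen + (wanted,))

    return EMBED_TAG.sub(replace, text)


def embed_all(generated):
    """Return a new dict of generated output with all embed tags expanded."""
    return {
        name: expand(text, generated, name, (name,))
        for name, text in generated.items()
    }


generated = {
    "sitemap.txt": "MAP",
    "common.txt": "<nav><webgen>sitemap.txt</webgen></nav>",
    "index.html": "<webgen>common.txt</webgen><p>Hi</p><webgen>missing.txt</webgen>",
}
result = embed_all(generated)
print(result["index.html"])
```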

Write out all of the generated files

Messages

Informational

Information only.

Current Working Directory: Where webgen is running
    Directory where webgen looks for input files and the output directory.
Master file: MasterFile; Output directory: OutputDir
    Names of the master file and output directory.
Done, Number of Output files written.
    Confirmation that the webgen process has completed.

Warning

Processing continues though some of the output data will be missing.

Wrn#01: Template File templateFileName IOError (Exception)
    Template file not found.
Wrn#02: Number of Labels (Num Labels) does not match number of Data Items (Num Data Fields) in file Data File Name record Data Record
    Data file does not have the same number of fields in each record.
Wrn#03: Data Item not found Label in file Data File Name record Data Record IndexError (Exception)
    Data field not found.
Wrn#04: Data File Data File Name IOError (Exception)
    Data file not found.
Wrn#07: In file File Name, item to embed, Tagged File Name, not found. KeyError (Exception)
    Item to embed not found.

Error

Processing halts after one of these conditions is encountered.

Err#05: Master File File Name IOError (Exception)
    Master file not found.
Err#06: Site Map Template File Name IOError (Exception)
    Site map template not found.
Err#08: Output File Output File Name IOError (Exception)
    Problem writing an output file.
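
The difference between a warning and an error can be illustrated with a small file-reading helper. This is a sketch only: read_text, its parameters and the utf-8 encoding are assumptions, and the message layout simply follows the patterns listed above.

```python
import sys


def read_text(path, number, label, fatal=False):
    """Return the contents of path, or None if it cannot be read.

    A warning is printed and processing continues; with fatal=True an
    error is printed and the program halts.
    """
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except IOError as exc:
        kind = "Err" if fatal else "Wrn"
        print(f"{kind}#{number}: {label} {path} IOError ({exc})")
        if fatal:
            sys.exit(1)
        return None


# A missing template only produces a warning, so None comes back.
template = read_text("no_such_dir_webgen/tMissing", "01", "Template File")
print(template)
```

Calling the same helper with fatal=True, as might be done for the master file, would print an Err message and stop the run.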

Case Study

To illustrate how webgen can be used, a sample site is included with the webgen download.

Master file contents:

Level  SingleFileOutputYN  Template        DataFile      #FileName          #FileLabel
none   Y                   tCommon         none          common.txt         Common elements
1      Y                   tHomepage       none          index.html         Home
none   Y                   tWidgetSummary  dWidgets.csv  widgetSummary.txt  Widget Summary
2      Y                   tWidgetList     none          widgetlist.html    Widgets
3      N                   tWidgetDetail   dWidgets.csv  none               none

The first record contains column names. Processing starts with the record below.

Second Record

The Level column contains "none". This means that the output generated from this line will not be included in the site map. Usually this is because the output will not be a complete html page.

"Y" in the SingleFileOutputYN column means that the output generated from this line will all be written to a single file.

"tCommon" is the name of the Template:

tCommon
<div id="banner">
<img src="img/widgetLogo.png" />
</div>
<div id="navmenu">
<webgen>sitemap.txt</webgen>
</div>

The line <webgen>sitemap.txt</webgen> is replaced by the contents of sitemap.txt after that has been generated.

In this case the DataFile name is "none". There are no Labels in tCommon to replace.

The generated output is written to "common.txt" (column #FileName). Other templates can contain <webgen>common.txt</webgen> which will be replaced by the contents of this output file.

Column #FileLabel describes the data generated by this record and is used to label the entry for that data in the site map.

Third Record

The Level value is used to indicate where in the site map hierarchy the generated output should be placed. Any string can be used, with matching css written later.

The Template, DataFile and #FileName column contents indicate that template "tHomepage" is used to create "index.html".

"index.html" is labelled "Home" in the site map.

Fourth Record

Template "tWidgetSummary" and DataFile "dWidgets.csv" are used to generate a single file called "widgetSummary.txt".

Inputs:

tWidgetSummary
<tr>
<td><a href="#FileName">#FileLabel</a></td>
<td><a href="#FileName"><img src="img/#SmallImg"></a></td>
</tr>

dWidgets.csv
"#FileName"	"#FileLabel"	"#Name"	"#SmallImg" . . .
"widget1.html"	"First"	"First Widget"	"firstWidgetSmall.png" . . .
"widget2.html"	"Second"	"Second Widget"	"secondWidgetSmall.png" . . .
"widget3.html"	"Third"	"Third Widget"	"thirdWidgetSmall.png" . . .

Output:

widgetSummary.txt
<tr>
<td><a href="widget1.html">First</a></td>
<td><a href="widget1.html"><img src="img/firstWidgetSmall.png"></a></td>
</tr>

<tr>
<td><a href="widget2.html">Second</a></td>
<td><a href="widget2.html"><img src="img/secondWidgetSmall.png"></a></td>
</tr>

<tr>
<td><a href="widget3.html">Third</a></td>
<td><a href="widget3.html"><img src="img/thirdWidgetSmall.png"></a></td>
</tr>

Fifth Record

Generate a single output file called "widgetlist.html" from the template "tWidgetList". This will be a level-2 item in the site map, labelled "Widgets".

The template "tWidgetList" contains an html page wrapped around the contents of "widgetSummary.txt".

Sixth Record

Generate multiple output files from template "tWidgetDetail" using data in "dWidgets.csv".

These will be level-3 items in the site map. The output file names and labels are contained in the first two columns of "dWidgets.csv".
