Manual for psimpoll and pscomb

QUB | Archaeology and Palaeoecology | The 14Chrono Centre

Introduction

Background
Implementation
Installation
Colour

Background

psimpoll 4.25 and pscomb 1.03 are ANSI C programs that translate pollen and other stratigraphical data into PostScript page description files that can be `printed' on a device with a PostScript interpreter to produce a pollen or other stratigraphical diagram. The aims of these programs are to:

produce pollen diagrams of publishable quality;
enable the transmission of pollen diagrams on electronic media (diskettes, e-mail);
facilitate the use of numerical, multivariate analyses of pollen data by linking such analyses to the production of diagrammatic output as a matter of routine;
be highly portable, independent of any particular hardware.

Most pollen analysts today have access to computer facilities for carrying out, at least, basic calculations on their raw data and plotting pollen diagrams. Traditional requirements for the presentation of pollen data in graphical form make it difficult to use commercial graphics packages for the purpose, and most pollen analysts use software written specially for this use. For example, in the 1970s John Birks and Brian Huntley developed POLLDATA for calculation and graphical presentation of pollen data on the Cambridge IBM mainframe. This program was successfully transferred to other mainframes, and now has a PC version. More recently, a number of programs for PCs, notably TILIA and TILIAGRAPH (Grimm 1992) have come into general use in many pollen labs around the world (Davis 1993). I have seen and used several such programs (all my Ph.D. thesis work was done with POLLDATA) but over the last few years all my pollen data-crunching and plotting has been done with software that I have written myself: psimpoll and pscomb are the latest manifestations of that.

Commercial graphics packages cannot be used easily to plot pollen diagrams, but spreadsheets are well-suited to making calculations from raw pollen data. There is little to be gained from writing software for this aspect of pollen data-handling, and psimpoll provides none. The most inconvenient part of routine data-handling is the highly subjective process of constructing a pollen sum and calculating proportions from it. The process requires decisions about each taxon (in the main sum, or not), and for the entire dataset to be present (for the construction of the sum). Spreadsheets provide a convenient way for the analyst to do both. Results can be exported as a text file that may then be read by a plotting program directly, or after modification by a suitable text editor. psimpoll provides the first of two steps necessary to plot results: it reads in calculated pollen and ancillary data, annotated to indicate data type (percentages, concentrations, etc), performs certain non-routine analyses, and writes a PostScript page description file, containing the information needed for a PostScript interpreter to produce the pollen diagram. The second step is the passing of that file to the interpreter.
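
The pollen-sum step described above can be illustrated with a short sketch. It is not part of psimpoll, which deliberately leaves this calculation to a spreadsheet. The taxon names and counts are hypothetical, and expressing taxa outside the sum as a percentage of the main sum is an assumed convention, not one stated in this manual.

```python
def percentages(counts, in_sum):
    """Return each taxon's count as a percentage of the main pollen sum.

    counts: dict mapping taxon name -> raw count for one sample
    in_sum: set of taxon names the analyst has placed in the main sum
    """
    pollen_sum = sum(n for taxon, n in counts.items() if taxon in in_sum)
    if pollen_sum == 0:
        raise ValueError("main pollen sum is zero; cannot calculate percentages")
    # Every taxon, inside the sum or not, is divided by the same main sum
    # (an assumed convention; other conventions exist for excluded types).
    return {taxon: 100.0 * n / pollen_sum for taxon, n in counts.items()}


# Hypothetical sample: two tree taxa in the sum, one spore type outside it.
sample = {"Betula": 120, "Pinus": 80, "Sphagnum": 50}
result = percentages(sample, in_sum={"Betula", "Pinus"})
```

Here the main sum is 200, so Betula is 60%, Pinus 40%, and Sphagnum (outside the sum) 25%. Changing the membership of `in_sum` changes every percentage, which is why the whole dataset must be present when the sum is constructed.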

psimpoll reads data from up to four files supplied by the user, and may produce up to nine output files, including a configuration file that saves the current setup for use with subsequent runs. All these files (input and output) are plain text and are readable and modifiable by text editors. The user's input files consist of the main input file (essential), and up to three optional associated files with data on location of zones, radiocarbon ages, and sediment stratigraphy, using the notation of Troels-Smith (1955). psimpoll looks for associated files under default or user-provided file names, uses them if they are present, and carries on without them if they are not.

The format of output can be altered by options introduced within the main input file, and by options selected from a menu when psimpoll is run. Within the main input file, a two-character code identifies the data (e.g., pollen percentages, concentrations etc), all presented by depth, age, or sample numbers (e.g., for a surface sample dataset). Suitable editing of the dataset enables the user to mark particular samples, pollen types, or individual data values to be ignored, plot different pollen types at different scales, mark certain `types' as charcoal, rarefaction data, rate-of-change data, or to enable psimpoll to recognize a sequence of data for a summary diagram. psimpoll can also carry out rarefaction analysis, zonation, principal components analysis, independent splitting of taxa, rate-of-change analyses, and basic statistical description on suitable datasets. All diagram labelling can be modified, and all characters from European languages with Roman script, plus Greek, are available. Thus, although psimpoll runs in English, output can be tailored for other languages.

psimpoll will either run a dataset immediately using default options, or the user can change the defaults through a menu. Any changes can be saved for reuse in a configuration file. PostScript output is written to a file by default, but can be directed straight to a printer with a PostScript interpreter. Output includes sediment stratigraphy, pollen zones, and radiocarbon dates if the necessary associated files were present when psimpoll was run.

psimpoll can handle only one input dataset at a time. It is obviously desirable to be able to combine data from different datasets (sites) to produce combined plots. This facility is provided by pscomb (pronounced to rhyme with `tome', but `scum' will also do). pscomb reads data from PostScript files previously written by psimpoll, enables selection of particular pollen types or other output (e.g., depth axes, zonation, radiocarbon, or sediment columns), and combines them into a single output file. pscomb thus allows the simultaneous presentation on a common axis of taxa from many different datasets.

psimpoll is a straightforward plotting program. It works reasonably well, and has most features that a pollen analyst will want. It exists because I have found that it suits me to have available a program that I can tinker with: adding, modifying, and deleting features as necessary. I do not have the time to ensure that the program is completely bug-free, user-friendly, or documented to the standard that would be required for a marketable product. I fix problems as they occur, or when a colleague complains loudly enough. It copes with the kind of data generated in our group, but I have made no effort to include features that might be found useful by others.

psimpoll and pscomb are not guaranteed bug- or idiot-proof. I believe they work for all reasonable input, but I have not had a chance to check out all possible combinations of options. There is a degree of error checking, but it is up to users to get their datasets in order. Please notify me about examples of unsatisfactory output so I can prevent the same problem occurring in the future.

Implementation

psimpoll and pscomb have been installed on PCs with processors from 8086, 80286, 80386, 80486, and Pentium families, running under DOS, Windows 3.1, Windows 95, Windows 98, Windows NT, and Windows 2000, and under SCO UNIX, Linux, Sun and Silicon Graphics, and on Apple Mac Plus, Mac LCII, Performa and iMac.

Pollen data is normally collected by a Psion Organiser (Bennett 1990) and transferred to a computer. On PCs, the data may be manipulated using Microsoft Excel or Lotus 123 (or another spreadsheet), and text files suitable for input to psimpoll produced with a text editor. A DOS batch file is available that runs psimpoll sequentially with each of up to nine files, from any part of the file system (i.e., not necessarily in the same directory as the program). The command is pp, and it can (optionally) be followed by a list of up to nine filenames on the same line (pp filename1 filename2 etc).
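
On systems without the DOS batch file, the behaviour of pp can be approximated with a short script. The following is a minimal sketch, not the batch file itself. It assumes that the psimpoll executable is on the search path under the name `psimpoll` and that it accepts a data file name as its only argument; neither is stated in this manual. The function names are hypothetical.

```python
import subprocess

MAX_FILES = 9  # pp accepts up to nine filenames


def build_commands(filenames, program="psimpoll"):
    """Return one command (as an argument list) per data file, in order."""
    if len(filenames) > MAX_FILES:
        raise ValueError(f"at most {MAX_FILES} filenames are supported")
    # Assumption: psimpoll takes the data file name as its only argument.
    return [[program, name] for name in filenames]


def run_all(filenames, program="psimpoll"):
    """Run the program on each file in turn, stopping at the first failure."""
    for command in build_commands(filenames, program):
        subprocess.run(command, check=True)
```

Calling `run_all(["site1.dat", "site2.dat"])` (hypothetical file names) would process the two datasets one after the other. Because each name is passed through unchanged, full paths can be given for files elsewhere in the file system.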

On Macintosh computers, psimpoll should be installed within a `psimpoll' folder, and data files should be placed in the same folder. I have not yet figured out Mac folder and filename conventions to the extent of being able to advise on how to run the program using datasets in other folders.

PostScript output files can be printed under DOS on a LaserWriter with the command:

PRINT filename <Enter>

(answer COM1 if asked for a `list device')

Output files produced on Macintosh can be printed on a LaserWriter connected to an Apple computer using ShowPages.

PostScript files can be viewed using the Ghostscript and Ghostview family of programs in Windows, Apple, and Unix environments.
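
Where an on-screen viewer is not convenient, Ghostscript can also convert a PostScript file to PDF. The sketch below only builds the command line. It assumes that the Ghostscript executable is installed under the name `gs` (the name differs on some systems), and the file names are hypothetical.

```python
def ps_to_pdf_command(ps_file, pdf_file, gs="gs"):
    """Build a Ghostscript command that converts ps_file to pdf_file."""
    return [
        gs,
        "-dBATCH",            # exit when the input has been processed
        "-dNOPAUSE",          # do not wait for a keypress between pages
        "-sDEVICE=pdfwrite",  # write PDF rather than drawing on screen
        f"-sOutputFile={pdf_file}",
        ps_file,
    ]


# Hypothetical file names for a diagram written by psimpoll.
command = ps_to_pdf_command("site1.ps", "site1.pdf")
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the Python standard library to carry out the conversion.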

The main advantage of using a locally-written program such as psimpoll is that changes can be made in response to particular needs. Many of the options in psimpoll stem from requests for such features. If you want or need something that is not available now, ask.

Installation

IBM-compatibles, Apple Macintosh, Linux

psimpoll and pscomb can be obtained, free of charge, from Uppsala by anonymous ftp. Executable files and documentation are in the sub-directory `pub/psimpoll'. These files normally contain the current versions of programs and documentation. The programs are also available from the INQUA public archive at Wisconsin, and may be retrieved by anonymous ftp. DOS, Windows 95 etc, Linux, and Apple executable files are in subdirectory `/pub/inqua'. In either case, files should be downloaded by binary transfer, unpacked, and installed on a hard disc.

The same files may be accessed on the web at URL http://www.kv.geo.uu.se/psimpoll.html.

The versions of the program that run on window-based systems (Windows 3.1, Windows 95, Apple) use a `console' window that provides a means to run text-based programs within the windowed environment. This console usually has to be closed explicitly after the end of the program.

Other computers

Contact me (address and other details in the Preface). I will supply the ANSI C source code and you will then need to compile this with your favourite C compiler. It has been compiled successfully on AIX, Sun, and Silicon Graphics operating systems, in addition to those mentioned above.

Colour

The file psimpoll.COL must be available for colours to be used in diagrams. The location of the file depends on the system. For command line systems (DOS, Unix), place psimpoll.COL in the same directory as the data files. For window-based systems (Windows 3.1, Windows 95, Apple), place psimpoll.COL in the same directory as the program. If you edit this file on a computer with more than one psimpoll user, bear in mind that changes you make may affect other users.
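
The placement rule can be summarised as a small lookup. This is only an illustration of the rule as stated above, not code from psimpoll; the function name, the system labels, and the directory names in the example are hypothetical.

```python
from pathlib import Path

COLOUR_FILE = "psimpoll.COL"
COMMAND_LINE_SYSTEMS = {"dos", "unix"}
WINDOW_BASED_SYSTEMS = {"windows 3.1", "windows 95", "apple"}


def colour_file_path(system, data_dir, program_dir):
    """Return where psimpoll.COL should be placed for the given system."""
    key = system.lower()
    if key in COMMAND_LINE_SYSTEMS:
        # Command line systems: alongside the data files.
        return Path(data_dir) / COLOUR_FILE
    if key in WINDOW_BASED_SYSTEMS:
        # Window-based systems: alongside the program itself.
        return Path(program_dir) / COLOUR_FILE
    raise ValueError(f"no placement rule is given for system {system!r}")


# Hypothetical directories.
dos_path = colour_file_path("DOS", "pollen_data", "programs")
mac_path = colour_file_path("Apple", "pollen_data", "programs")
```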


Archaeology and Palaeoecology | 42 Fitzwilliam St | Belfast BT9 6AX | Northern Ireland | tel +44 28 90 97 5136
