Zutty developer guide

Hacking on Zutty

Compiling for development or debugging

If you plan to spend time hacking on Zutty, you should first compile a debug build of it. The purpose of this is to compile in trace logging output, as well as debugging facilities.

To configure the build for this development mode, add the option --debug to the configuration:

./waf configure --debug

Feel free to add any other options you usually use for the normal build. Confirm that you see this line in the output:

Debug build                              : yes

Then, compile Zutty as usual:

./waf [-j <JOBS>]

The build will produce an executable named zutty.dbg (as opposed to zutty for normal builds). This allows you to keep a normal build and a debug build from the same source side by side, which might prove useful while working on a debugging-intensive problem.

Normally, you will not want to install the development build system-wide. Run it from an existing terminal as build/src/zutty.dbg -v (plus any other options) to receive verbose debugging logs in the console.

As stated in the User guide, by default (if neither -quiet nor -verbose is given), Zutty will print errors and warnings, but not info messages. Those have to be enabled with -verbose, and in case of a debug build, that will enable Trace messages as well.

The Step Debugger

A debug build of Zutty has compiled-in support not just for generating lots of trace output, but also for a facility called the Step Debugger. This is a facility specifically built to enable close inspection of the virtual terminal in action.

The Step Debugger is initially dormant; it can be activated by pressing the "debug key", which is hardwired to PrintScreen (but the hook for it is only compiled in the debug build, so it does not affect the PrintScreen key in a normal build).

When the Step Debugger is activated, the Zutty process will suspend itself on the completion of every N-th escape sequence (i.e., when the state machine is set back to "normal text input"), with N cycled through 1, 10, 100 and zero (off) on each additional press of the PrintScreen key.

Move focus to the Zutty window, press PrintScreen, and confirm that you see this line in the log output:

T [vterm.icc: 40] *** DEBUG step=1

The step=1 communicates that going forward, Zutty will suspend itself on every completed escape sequence. To test this, type something (I am typing the letter k) and then hit Backspace. This will trigger the Debug Stop, and the log will look similar to this:

T [vterm.icc:366] pty write (mod=0): 'k' (1 bytes)
T [vterm.icc:400] pty read: 'k' (1 bytes)
T [vterm.icc:617] hideCursor []         p(0,48)  d(42,103)  mgn[0,42)  hmgn:0 [0,103)
T [vterm.icc:108] Inserted: 'k' (1 bytes)
T [vterm.icc:605] showCursor []         p(0,49)  d(42,103)  mgn[0,42)  hmgn:0 [0,103)
T [vterm.icc:389] pty write: '\x7f' (1 bytes)
T [vterm.icc:400] pty read: '\b\e[K' (4 bytes)
T [vterm.icc:617] hideCursor []         p(0,49)  d(42,103)  mgn[0,42)  hmgn:0 [0,103)
T [vterm.icc:914] csi_CUB []    p(0,49)  d(42,103)  mgn[0,42)  hmgn:0 [0,103)
T [vterm.icc:1144] csi_EL [0]   p(0,48)  d(42,103)  mgn[0,42)  hmgn:0 [0,103)
T [vterm.icc: 50] *** DEBUG STOP (step=1), 4 bytes since last:
        '\b\e[K' (4 bytes)
T [vterm.icc: 55] Issue 'kill -CONT 5429' or 'fg' to continue.

Following the trace log, we can see that first the keypress was sent to the shell (pty write), then the echoed character was read back from the shell (pty read), which triggered an insertion of k as text. The screen-updating is surrounded by hiding and showing the cursor. Then, as a consequence of the Backspace keypress, Zutty sends the DEL character (\x7f) to the shell, which echoes back \b\e[K: first \b, a backspace character, followed by \e[K, or a "CSI K" sequence in VT100-speak, which instructs Zutty to clear the terminal screen from the cursor to the end of line. Since this constitutes a finished escape sequence, this is the point where the Debug Stop occurs.

As part of processing the input stream from the terminal, Zutty executed the functions csi_CUB (interpreting \b) and csi_EL (interpreting \e[K). The former stands for Cursor Back, the latter for Erase Line – these are the names of the requested screen operations according to the VT100-series standards. The arguments parsed from the input escape sequences, effective at the time of the function call, are visible in square brackets. The single argument zero (shown as [0]) passed to the Erase Line routine is the implicit default; the actual escape sequence did not contain a numeric argument.

Also worth noting is that on each trace point (such as hideCursor, showCursor, csi_CUB, etc.) several state variables are printed. These include the current cursor position (0,49), the dimensions (42,103), and the scrolling area top and bottom margins [0,42) (the mismatched brackets denote that the top row of scrolling is inclusive of the scrolling range, while the bottom row is exclusive).
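The half-open margin notation can be modelled in a few lines. This is only an illustrative sketch; the type RowRange and its members are hypothetical names and are not taken from the Zutty sources.

```cpp
#include <cassert>

// Hypothetical model of a half-open row range [top, bottom), as printed in
// the trace output (e.g., mgn[0,42) on a terminal with 42 rows).
struct RowRange
{
   int top;     // first row inside the range (inclusive)
   int bottom;  // first row past the range (exclusive)

   bool contains (int row) const { return row >= top && row < bottom; }
   int size () const { return bottom - top; }
};
```

With this convention, the number of rows in the range is simply bottom - top, and a full-screen scrolling area on a 42-row terminal is written as [0,42).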

To resume Zutty, send a continue signal (SIGCONT) via kill, or just type fg into the terminal running Zutty. The latter is a neat trick that relies on the parent terminal detecting that the child process has been stopped, and putting it into the background.

When trying to understand how the virtual terminal processes a longer input sequence, it becomes tedious to step through each escape sequence one by one. Pressing PrintScreen will cycle through the number of debug steps to go between each stop: first 1, then 10, 100, then 0 (the Step Debugger is turned off), then back to 1, 10, 100, then off again, etc. While doing this, you will see this in the output log:

T [vterm.icc: 40] *** DEBUG step=10
T [vterm.icc: 40] *** DEBUG step=100
T [vterm.icc: 40] *** DEBUG step=0
T [vterm.icc: 40] *** DEBUG step=1
T [vterm.icc: 40] *** DEBUG step=10
T [vterm.icc: 40] *** DEBUG step=100
...
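
The cycling behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the code found in vterm.icc: the function names nextDebugStep and shouldStop, and the idea of counting completed sequences since the last stop, are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: advance the step count the way each press of the
// debug key does, cycling 0 (off) -> 1 -> 10 -> 100 -> 0 -> ...
inline uint32_t nextDebugStep (uint32_t step)
{
   switch (step)
   {
   case 0:  return 1;
   case 1:  return 10;
   case 10: return 100;
   default: return 0;
   }
}

// Hypothetical helper: decide whether to suspend once an escape sequence has
// completed. 'completed' counts the sequences finished since the last stop.
inline bool shouldStop (uint32_t step, uint32_t completed)
{
   return step != 0 && completed >= step;
}
```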

Automated testing

By their very nature, graphical terminal emulators are interactive programs. Therefore, testing them in an automated fashion (e.g., for regression testing) can be tricky.

The method we employ is suitable as a general means of automation because it is independent of the terminal under test: it does not require modifying the program, e.g., by implementing test hooks to inject events or report screen content. This allows us to include several established terminal emulators in the tests along with Zutty, to research the state of the art and see how different programs stack up against each other.

Prerequisites:

  • imagemagick (for convert & identify), wmctrl, xvkbd

    apt-get install imagemagick wmctrl xvkbd
    

On a high level, testing consists of these steps:

  • Start the terminal (the unit under test) as a subprocess and note its pid
  • Obtain its X window id:

    wmctrl -lp | grep <pid> | awk '{print $1}'
    
  • use xvkbd to send events to the window:

    xvkbd -window <id> -no-jump-pointer -text "\D3\{+1}\D3\{-1}\D3\{+Return}\D3\{-Return}\D3"
    

    Note the explicit keysym presses and releases, plus the interleaved delays. For reference, see: http://t-sato.in.coocan.jp/xvkbd/

  • Make a screenshot of the window via the window id

    xwd -nobdrs -id <id> | convert xwd:- png:- > <shot-name>.png
    
  • Generate a signature of the screen content, to be compared against a reference value:

    identify -verbose <shot-name>.png | grep signature | awk '{print $2}' | cut -c-32
    

    (We cut the hash in half to make it less unwieldy.) Having consistent hashes that depend only on the rendered pixel image is convenient, as we do not need to store the reference images themselves beside our test script.

The above steps are automated by some fairly straightforward bash scripts under the test/ subdirectory. These scripts all source the testbase.sh script, which constitutes the test library.

Caveats

While Zutty itself compiles and runs on Linux as well as BSD platforms, the test toolkit assumes a specific development environment (most notably: Linux platform with bash as the default shell, and the presence of a window manager with certain standard capabilities).

Please also note that the reference signatures of screen captures are valid for ImageMagick version 6 only; version 7 has altered the image signature hash algorithm, which breaks the hashes included in the tests.

Anatomy of a test script

Each executable script under test/ is an individually runnable test suite. It is written as a plain old bash script, sourcing the testbase.sh test code library and using its facilities. For example, truecolor.sh is a very simple test script to test support for setting color attributes to truecolor (24 bit) values. The full script is reproduced below:

#!/usr/bin/env bash

cd $(dirname $0)
source testbase.sh

IN "source truecolor_inc.sh\r"
SNAP truecolor_01 33a31e4d3b9fbe486c27b01764dc1823

The script starts by declaring itself as a shell script, then setting up its working directory to be the location of the script (a convenience to make relative file paths work in later parts of the script, independent of the location the script was invoked from). Then testbase.sh is sourced.

The actual test code is just two lines, starting with the commands IN and SNAP. These are invocations of functions defined in testbase.sh and execute in the environment set up by sourcing that file.

IN will send the specified string, as keyboard input, to the terminal under test. In our example, the shell running in the terminal will source the file truecolor_inc.sh that contains some setup code (not reproduced here) to make a certain pattern appear on the terminal. Note the trailing \r that will result in a virtual Enter keypress.

The subsequent SNAP will capture the resulting terminal window content under the name truecolor_01, generate a signature (hash) of it, and compare that with the supplied value. If there is a match, the output is verified to be correct; else a test failure is reported.

The snap name is used to save the captured window image under test/output/<profile>. This is useful for later inspection of test results. With the default profile (see Test profiles below), the output of this test will be saved as test/output/zutty/truecolor_01.png.

There are some other useful functions exported by the test framework, e.g.: CHECK_DEPS, CHECK_FILES, and WAIT_FOR_DOT_COMPLETE. If you encounter them in test script code, it is best to look directly in testbase.sh for their implementation.

Note that starting and stopping of the terminal under test is done as part of the test framework and nothing is explicitly written in the test script. See Test profiles below on how to control the details of this process.

Common test script options

Test scripts similar to the one shown above (building on testbase.sh) all take a uniform set of command line options. All of them are optional; the defaults below are in effect for any that are omitted. Syntax: --<arg-name>=<arg-value>; the value defaults to "yes" if omitted.

Option             Default
--------------------------
--ci-mode          no
--debug            no
--profile          zutty
--step             no
--update-sig       no

  • --step

    The --step option can be given without argument, in which case it will be equivalent to --step=yes, or given as --step=new. The former one will result in step mode, which will pause immediately after each snapshot is taken (the terminal under test still displaying this output, allowing visual inspection), and display a prompt:

    [S]tep / [N]ew only / [C]ontinue / [Q]uit (s/n/c/q) ?
    

    This allows the user to choose how to proceed:

    • S - continue stepping, i.e., stop after each snapshot
    • N - continue without stopping, except on new snapshots
    • C - continue without ever stopping again
    • Q - quit the test

    The new option is useful when developing a test suite. It will run the test script forward until a SNAP command without a verification hash is found.

  • --update-sig

    Enabling --update-sig will result in a prompt on a verification failure, i.e., when the SNAP command captures a screenshot with a different hash than the reference stored in the script:

    Update signature: [y]es / [N]o / [a]ll ?
    

    By answering Y here, the signature in the script will be updated. By answering A, all future differences will also be updated without a further prompt. This is useful when the behaviour of Zutty has changed in a way that alters its output, the new output has been established to be "more correct" than the previous one, and we want the tests to verify against this new output in the future.

    Use this with a great deal of caution. It is recommended to answer Y rather than A, and to carefully inspect each screen for correctness before answering each prompt.

  • --profile

    See Test profiles below.

  • --ci-mode

    The --ci-mode option sets up the test script to execute in an unattended manner, suitable for automated testing. Step mode is turned off (overriding --step), signature updates are turned off (overriding --update-sig), and the script is set up to immediately exit with a nonzero code on a verification failure.

  • --debug

    The --debug option sets the variable DEBUG to yes (instead of its default value no). This is used by the zutty test profile to ensure that the correct UUT executable (zutty vs. zutty.dbg) is launched. This option is currently not used by any other profile, nor for any other, more general purpose.
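
The option convention described in this section can be modelled as follows. The real parsing is done in bash by testbase.sh; this C++ sketch only illustrates the rules stated above (defaults, an omitted value meaning "yes", and --ci-mode overriding --step and --update-sig). The function name parseTestArgs and the Options type are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Illustrative model only; names are hypothetical.
using Options = std::map <std::string, std::string>;

inline Options parseTestArgs (int argc, const char* const* argv)
{
   // Defaults as listed in the table of common options.
   Options opts = {
      {"ci-mode", "no"}, {"debug", "no"}, {"profile", "zutty"},
      {"step", "no"}, {"update-sig", "no"}
   };
   for (int i = 0; i < argc; ++i)
   {
      std::string arg = argv [i];
      if (arg.rfind ("--", 0) != 0)
         continue; // not of the form --name[=value]
      std::string::size_type eq = arg.find ('=');
      bool hasValue = (eq != std::string::npos);
      std::string name = hasValue ? arg.substr (2, eq - 2) : arg.substr (2);
      opts [name] = hasValue ? arg.substr (eq + 1) : "yes";
   }
   // --ci-mode turns off the interactive features.
   if (opts ["ci-mode"] == "yes")
   {
      opts ["step"] = "no";
      opts ["update-sig"] = "no";
   }
   return opts;
}
```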

Correctness tests

The list of correctness tests (automatically run in sequence by The CI test script):

  • keys.sh: Keyboard input handling (see Key mapping sequences for further documentation). Note: This test might fail if your computer is configured to use a non-US keyboard.
  • nonascii.sh: Test displaying non-ASCII (double-width and other exotic) characters, based on the Emacs 'hello' file (M-x view-hello-file)
  • scrollback.sh: Scrollback (page history) support
  • title.sh: Setting the window title from within the terminal via escape sequences
  • truecolor.sh: True color support
  • utf8.sh: UTF-8 support (based on the UTF-8 decoder capability and stress test by Markus Kuhn)
  • vttest.sh: VTTEST screens. Note: This suite depends on a specific version of vttest, and will complain if the version found does not match. Just run the vttest install script mentioned by the error message, and you should be good to go.
  • wraptest.sh: Additional tests for verifying the correctness of last-column/pending-wrap subtleties. Note: This suite depends on the wraptest program (not bundled with the zutty source repository). Just run the install script mentioned by the error message you get on the first run, and you should be good to go.

Apart from running all the tests via The CI test script (which you should routinely run during development, and especially before sending a patch or opening a pull request), it is also possible to run any of the above tests manually. For example, to run the test automating a traversal of Vttest's menu system (and in case of running against Zutty, also verifying that results are as expected):

test/vttest.sh

Do not forget about the Common test script options above; those become useful during development (both of Zutty itself and the test suites). For example, to run the above test step-by-step (stopping at each image checkpoint):

test/vttest.sh --step

Font rasterization tests

A separate test fonts.sh exists that is very similar to the correctness tests above, but is not part of The CI test script. This renders a captured hardcopy of a terminal-intensive program (BpyTOP) exercising many features (chiefly, Unicode line drawing glyphs) that are helpful in spotting errors in the font loading and rasterization process employed by Zutty, and validates the result across a matrix of several font faces and sizes. Please read the commentary in the test script for the fonts required to run the test.

Performance tests

The performance of Zutty can be verified with the tests below. These are to be run manually (similar to how you run any of the Correctness tests on an ad-hoc basis). The goal of these tests is to get a handle on performance under repeatable circumstances.

Since everyone's hardware and systems are different, it does not make too much sense to compare numbers obtained by different people at different times. Rather, it is most useful to compare the results obtained on the same system (running the comparison tests after one another without any changes to the rest of the system), with the goal of comparing a proposed set of patches to the baseline of Zutty, or to compare Zutty with another terminal emulator.

Needless to say: when running the below scripts, your machine should be otherwise idle.

While correctness tests should ideally be run on both the debug build (do this by passing --debug to the test scripts) and the normal build, it does not make sense to run performance tests on the debug build. The results would be meaningless due to excessive logging and hitting other debug-only code paths.

Current performance tests:

  • cat_dict.sh: Arguably the dumbest possible performance test of any text terminal, this test consists of outputting a very long text file containing (mostly) very short lines with English words, one per line. This test will be repeated a number of times (check TIMES in the script) and the overall timing and data throughput will be computed at the end. Since the input does not contain any terminal controls (escape sequences), it is a measure of the raw incoming data rate the terminal can sustain while frequently forced to scroll/page its output. This load resembles one extreme end of the way a terminal can be used.
  • cat_vtscript.sh: This test generates load resembling the other extreme end of possible terminal usage, by outputting the stream of data written to the terminal in the course of the VTTEST cycle (see vttest.sh among the Correctness tests). However, instead of verifying the correctness of the generated screen output, here we are interested in the performance of processing the input stream heavy on all kinds of escape sequences. Screen updates are dominated by intra-screen rewrites and relatively little scrolling/paging activity is forced. Similar to the cat_dict.sh test, the input is fed into the terminal a number of times, and overall timing and throughput is measured and calculated.

The CI test script

The script test/run_ci.sh will run all automated Correctness tests in sequence with the --ci-mode option, stopping and exiting with a nonzero return code if any of them exits with an error, and concluding with a confirmatory message and zero return code otherwise.

You should always run this at a convenient time (it will occupy your screen for about 25 minutes) and observe the successful result before sending a patch with your changes. For extra confidence, build Zutty both with and without debug, and run the CI script both with and without the --debug option.

If you have changed anything related to font loading or rasterization (font.cc or fontpack.cc), make sure you run the Font rasterization tests as well. Please read the commentary in the test script for more context.

Test profiles

Given that our method of testing is independent of the terminal itself (meaning that we do not rely on any hooks or test instrumentation in the terminal itself), we can run the same tests against other terminals, too. This is useful for comparison and research purposes. The default profile is zutty, invoking the locally built Zutty executable as the unit under test.

Profiles are defined as shell include files under test/profiles, and can be invoked by passing the --profile option to the test script. For example, to run the VTTEST suite against xterm:

./test/vttest.sh [--step] --profile=xterm

Available profiles can be enumerated by looking in test/profiles/ (pass their name without the .sh extension to --profile, as above). The list is also printed by the test script in case an invalid profile name is given.

Each test profile contains a program invocation assigned to UUT_EXE, used to launch the terminal under test with the right arguments. The terminal must be configured so that its geometry is 80 characters wide and 24 rows tall.

Some other variables set up in the profile encode different capabilities to control the test scripts so that unsupported features are skipped (and the terminal does not encounter confusing escape sequences). Examples: MISSING_ANSWERBACK, MISSING_DSR, MISSING_SECONDARY_DA, SUPPORTS_VT52, SUPPORTS_VT220. Please check the script files for details.

Note that validation against the stored image hashes does not make sense unless preparations are made to ensure that the terminal settings (size, font, etc.) are perfectly identical. Even then, differences from the valid output stored for Zutty do not necessarily constitute bugs in any other terminal, unless the rendered screen content is visibly wrong. For this reason, auto-validation of the saved screens runs in a relaxed mode for all profiles other than zutty. This means that for terminals other than Zutty, matches are still prominently displayed (by a line reading MATCH in green), while non-matches are considered normal and indicated by DONE, followed by a metric of the image difference compared to the reference. The output image and the image generated for Zutty will be diffed, resulting in a difference image that highlights the differences in red.

Contribution guide

There is no commercial entity behind Zutty. It is a volunteer effort with extremely limited resources. Your contributions will be duly considered, but please discuss your plans with us before starting out. Otherwise, you run the risk of being ignored after having put time and effort into the preparation of a patch.

Not every aspect of what makes a good contribution can be codified (we believe having good taste plays an important role), but as a minimum, please keep the following in mind:

Respect existing coding style

Please keep your changes in line with the coding style of the existing codebase. In particular, observe the following rules:

  • Maximum source line width: 80 characters.
  • No tabs, only spaces.
  • Indentation: BSD (a.k.a. Allman) style with a width of three (3) spaces.
  • Extra space before opening parens.

We are not interested in any opinions or debates on whether this style is good or bad and whether you like it or not; it is simply what we use every day (at least when working with C++). If you are interested in contributing to the codebase, please format your proposed changes accordingly.

Think long and hard…

… about anything that involves resource ownership, anything related to architecture, anything that changes the user interface, and anything that might be expensive (negatively affecting performance).

Let existing standards (published programming manuals of DEC VT-series terminals) as well as de-facto standard implementations (xterm) and tests (VTTEST) be your guide when it comes to specifications of correctness (see Useful resources), along with the spirit and philosophy outlined in the project README for design and implementation questions specific to Zutty. If you are uncertain, feel free to ask.

Test your changes thoroughly

At a minimum, run all regression tests via The CI test script before pull-requesting any change.

If relevant to your change, run Performance tests with and without your changes to establish that the performance is maintained at its current level.

If you add or change user-visible functionality, please add or update the tests covering it.

Keep documentation up to date

If you change any functionality covered by existing documentation, or add anything that belongs in the same vein, please contribute appropriate documentation updates as well. Nobody will do documentation work for you.

Module breakdown

Zutty is written in modern C++, with the customary file extensions: <module>.h for the header, <module>.cc for the (optional) separately compiled implementation, and <module>.icc for the (even more optional) included implementation (inline and/or templatized code) of a certain module. (Strictly speaking, "module" is not a thing in C++, but I find it a useful concept, so there you go.)

A short rundown of the modules of Zutty:

  • base64: Base64 encoder and decoder, used by the OSC command for clipboard interaction.
  • base: Fundamental structures.
  • charvdev: The virtual character device that provides the "raw video memory" interface to the Vterm and contains/drives the OpenGL rendering pipeline.
  • font: FreeType-based font loader, mostly concerned with building an atlas texture for the CharVdev to load into graphics memory.
  • fontpack: Locates the font name's variants (regular, bold, …) under a search path and provides a unified point of contact to deal with all of them.
  • frame: The interface between Vterm and CharVdev, providing a rendering target to the former, and acting as a data source to the latter. It provides an abstraction based on character grid coordinates on top of the raw cell storage, supports efficient scrollback buffering, plus support for cheaply passing around the underlying cell storage via reference-counted pointers.
  • gl: Low level GL utils.
  • log: Logging facility.
  • main: Main module for top-level tasks such as instantiating the Fontpack, the Renderer and the Vterm; creating the X window; selecting, parameterizing and spawning the shell; and subsequently servicing events on the file descriptors, handling X events (mainly around the keyboard, mouse and selection) as well as feeding the stream of output bytes from the shell subprocess into the Vterm.
  • options: Unified handling and support for command line switches and X resource database entries (with the former taking precedence over the latter).
  • pty: Code for spawning a pseudo-terminal and communicating resize events to it.
  • renderer: The Renderer runs a separate thread to feed the CharVdev with Frames handed off by the Vterm.
  • selmgr: The Selection Manager contains code interfacing between the Vterm (which is completely agnostic of any windowing system) and the X Selection API.
  • utf8: Support for producing and consuming UTF-8-encoded Unicode code points.
  • vterm: The Vterm implements the Virtual Terminal itself. That is, it consumes a stream of bytes output by the shell. The Vterm interprets the stream of text destined for the screen, interspersed with escape sequences to control the terminal, and produces snapshots captured as Frame updates handed off to the Renderer.

The major modules in the architecture of Zutty are sufficiently interesting to have their own expanded sections that follow.

CharVdev (character virtual device)

The architectural centerpiece of Zutty is the emulation of a character-oriented video device addressable as a plain old array of character cells. The source module charvdev contains its implementation, encompassing the GPU-hosted OpenGL ES shaders, and the data structures to communicate with them. On the host side, the C++ code within CharVdev runs in a separate thread that does the rendering, as driven by the Renderer.

At the interface level, the CharVdev provides access to a linear array of CharVdev::Cell structures, each Cell having fields for the Unicode code point to be displayed, attributes (bold, italic, underline, inverse), and color (3 bytes each for foreground and background). A pointer to the Cells is obtained via a CharVdev::Mapping, which is a C++ wrapper object that allows idiomatic (RAII-style) safe access to the GL memory area backing the Cells residing on the GPU, and hides the underlying glMapBufferRange () / glUnmapBuffer () calls.

Two auxiliary properties baked into the shader-based rendering, the Cursor and the Rect defining the current selection, have setters provided on CharVdev. These will set GL uniform variables to their appropriate values.

The virtual video device, as driven by the array of Cells, is entirely implemented in the OpenGL ES shaders (GLSL code embedded into charvdev.cc), chiefly by the Compute Shader. The following subsections outline the processing and the data structures backing it at each stage.

Input character video memory area

The primary input to the OpenGL program of Zutty is a flat array of Cell structures. It is defined as a Shader Storage Buffer Object (GL SSBO), which means that the memory backing it is allocated by the GL system (ultimately by the graphics driver, preferably within GPU memory). Being an SSBO, this GL-backed object is cheap to frequently modify (on a frame-by-frame basis), as opposed to input textures that hold more permanent data.

The total length of the array is always equal to the terminal size (rows x cols in characters). The cells are addressed left to right, top to bottom. Each cell takes up 12 bytes, with 3 bytes currently unused (available for future extensions).
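
A 12-byte cell with the fields described above could be laid out as in the following sketch. The field names and the positions of the three unused bytes are assumptions made for illustration; consult charvdev.h for the actual definition. The helper cellIndex is likewise hypothetical and only shows the left-to-right, top-to-bottom addressing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Color { uint8_t red, green, blue; };

// Hypothetical layout: 2 + 1 + 3 + 3 bytes of payload, 3 bytes unused.
struct Cell
{
   uint16_t uc_pt;          // Unicode code point (16 bits)
   uint8_t bold : 1;        // attribute flags share a single byte
   uint8_t italic : 1;
   uint8_t underline : 1;
   uint8_t inverse : 1;
   uint8_t dwidth : 1;
   uint8_t dwidth_cont : 1;
   uint8_t unused0;
   Color fg;                // foreground color, 3 bytes
   uint8_t unused1;
   Color bg;                // background color, 3 bytes
   uint8_t unused2;
};
static_assert (sizeof (Cell) == 12, "each cell takes up 12 bytes");

// Hypothetical helper: array index of the cell at (row, col) on a grid
// that is nCols characters wide.
inline std::size_t cellIndex (uint16_t nCols, uint16_t row, uint16_t col)
{
   assert (col < nCols);
   return static_cast <std::size_t> (row) * nCols + col;
}
```

On an 80x24 terminal, the array under this layout holds 1920 cells (23040 bytes), and the bottom right cell sits at index 1919.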

By way of the CharVdev::Mapping, the application is able to obtain a client-side mapping to this area, allowing direct manipulations of its content.

Unicode to Atlas position mapping texture

Font rendering is implemented by a font atlas, which is a texture containing a bitmap of all font characters rasterized to a certain size. The atlas is a single image held in graphics memory, divided into character-size cells (measured by the chosen font face's pixel dimensions) on a rectangular grid. Having all characters pre-rasterized into a single 2D image is customary in OpenGL text rendering, and is highly beneficial for performance and memory reasons. Each font glyph supported by the font face has a pair of atlas coordinates, denoting the row and column of the grid cell with the chosen glyph.

Zutty goes a step further than most, and allows the application layer to communicate directly by writing Unicode code points to the input character video memory area. The translation from Unicode code point to atlas coordinates decouples the application from having to deal with this font-specific mapping, on a per-character basis, on the client side.

The Unicode to atlas position mapping is created and stored on initialization and font loading, and is read-only for the GL program. This is a 256x256 2D texture that maps all 16-bit Unicode code points to an atlas grid position. It is initialized with the GL data type GL_LUMINANCE_ALPHA (two channels), from an array with two 8-bit integers per texel (8 bits for each atlas grid coordinate).

This allows a direct lookup in the shader for any 16-bit Unicode code point, returning two bytes: one each for the atlas row and column.

If the value stored for atlas (row,col) is (0,0), that means there is no glyph for that code point in the font. As a measure of convenience, the font loader ensures that there is a blank glyph stored at that atlas location, so no special GLSL code is needed to handle this case.
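
The lookup can be modelled on the CPU side as a flat table with one two-byte entry per 16-bit code point. This is a sketch, not the GLSL code: the names AtlasPos and AtlasMap are hypothetical, and using the code point directly as the table index (high byte selecting the texture row, low byte the column) is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

struct AtlasPos { uint8_t row; uint8_t col; };

// CPU-side model of the 256x256 two-channel mapping texture.
class AtlasMap
{
public:
   void set (uint16_t codePoint, AtlasPos pos) { texels [codePoint] = pos; }

   // Code points without a glyph keep the zero-initialized entry (0,0),
   // which by convention refers to a blank glyph in the atlas.
   AtlasPos lookup (uint16_t codePoint) const { return texels [codePoint]; }

private:
   std::array <AtlasPos, 256 * 256> texels {};
};
```

Because unmapped code points resolve to the blank glyph at (0,0), the caller never needs a separate "glyph missing" branch.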

Atlas glyph texture

Once the atlas coordinates of the glyph to be drawn are known, the corresponding area of the atlas glyph texture is rendered onto the output image texture. The atlas glyph texture is a 2D image holding all the pre-rendered glyphs supported by the loaded font. Its dimensions are auto-computed based on the number of glyphs in the font and the glyph dimensions, to produce a pixel size as close to square as possible. This is necessary so that the row and column coordinates each fit into a single byte (the maximum number of characters rasterized from a font is 2^16 (65536), corresponding to the Unicode Basic Multilingual Plane).
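
One way to arrive at such near-square dimensions is a brute-force search over grid widths, sketched below. This is not necessarily how the font loader computes them; the function chooseAtlasGrid and its parameters (glyph count, glyph pixel width px and height py) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

struct AtlasGrid { uint32_t cols; uint32_t rows; };

// Hypothetical search: among all grids of at most 256 columns and 256 rows
// that can hold nGlyphs cells, pick the one whose pixel width and height
// differ the least.
inline AtlasGrid chooseAtlasGrid (uint32_t nGlyphs, uint32_t px, uint32_t py)
{
   assert (nGlyphs >= 1 && nGlyphs <= 65536);
   AtlasGrid best = {256, 256};
   int64_t bestDiff = -1;
   for (uint32_t cols = 1; cols <= 256; ++cols)
   {
      uint32_t rows = (nGlyphs + cols - 1) / cols; // ceiling division
      if (rows > 256)
         continue; // row coordinate would not fit into one byte
      int64_t w = int64_t (cols) * px;
      int64_t h = int64_t (rows) * py;
      int64_t diff = w > h ? w - h : h - w;
      if (bestDiff < 0 || diff < bestDiff)
      {
         bestDiff = diff;
         best = {cols, rows};
      }
   }
   return best;
}
```

For the maximum of 65536 glyphs, the only grid satisfying the one-byte constraint is 256x256, which is why a lopsided layout cannot be used.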

Texture encoding: 1 byte per texel, gray-scale (0 = black, 255 = white)

The atlas texture is stored as a 2D array with one layer for each font face loaded. The mapping from Unicode code point to atlas grid location is the same across fonts, and is determined by the primary font (loaded into texture array index 0). Each subsequent layer starts out as a copy of the primary atlas layer, with glyphs successively overwritten for each defined code point in the alternate font. This means that when referencing an alternate font, the shader does not have to care about whether the alternate font has a glyph for the given code point – if nothing else, the primary font's glyph will be present.
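
The copy-then-overwrite construction of an alternate layer can be sketched as follows. Glyph bitmaps are reduced to a single byte for brevity, and all names (Glyph, Layer, buildAltLayer) are hypothetical. Skipping code points that the primary font does not define is an assumption that follows from the mapping being fixed by the primary font.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using Glyph = uint8_t;              // stand-in for a rasterized glyph bitmap
using Layer = std::vector <Glyph>;  // one entry per atlas grid cell

// slotOf maps code point -> atlas slot, as fixed by the primary font.
// altGlyphs holds the glyphs that the alternate font actually defines.
inline Layer buildAltLayer (const Layer& primary,
                            const std::map <uint16_t, std::size_t>& slotOf,
                            const std::map <uint16_t, Glyph>& altGlyphs)
{
   Layer layer = primary; // start out as a copy of the primary layer
   for (const auto& entry : altGlyphs)
   {
      auto it = slotOf.find (entry.first);
      if (it != slotOf.end () && it->second < layer.size ())
         layer [it->second] = entry.second; // overwrite the defined glyph
   }
   return layer;
}
```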

Output image texture

The glyph-sized rectangle on the atlas glyph texture, as defined by the atlas coordinates, contains a gray-scale image of the character to be rendered. The destination of this rendering is an image, accumulating the output from all the compute shaders running in parallel, each rendering a single character cell in the terminal window.

The dimensions of this image texture are set according to the terminal window's character grid size (window size, minus split-character border area at the bottom and right edges). The output texture is rendered onto the viewport area using a quad in the most straightforward way. All the work of computing the terminal window content is done by the compute shader that sets color values of individual pixels in the output texture.

Double-width cells

To display CJK characters, the rendering pipeline was extended to support double-width cells. These cells are marked with the dwidth flag being set in the left cell, and the dwidth_cont (double-width continuation) flag set in the right cell. The latter flag is only used in the shader to short-circuit rendering, as its area will be rendered by the compute shader instance rendering the left cell.

The double-width font is loaded separately from the layered main font variants that share a common atlas mapping. This font, if present, has its own atlas glyph texture and atlas position mapping texture. Other than this overhead, everything is handled very much the same way.

Frame

The Frame is an abstraction on top of a cell array compatible with the one provided by the CharVdev, and provides access to cells based on screen grid coordinates. This access layer is used by Vterm (the virtual terminal implementation) to manipulate the cell storage that ultimately defines the screen content.

A Frame wraps a certain cell array and abstracts away the actual "physical" storage details of which cell (as defined by screen grid coordinates) is stored in which array slot (as defined by array index). The separation is necessary for efficient implementation of scrolling. This is based on the concept of ringbuffers (circular buffers) where upon appending to a data buffer, a write pointer is moved around in physical storage denoting the start of a logical page, instead of shifting all existing content (discarding the oldest bits) to make room for incoming new data. In this scheme, data in the buffer, once written, stays untouched until the write pointer wraps around (by which point this data is the oldest still contained in the buffer) and gets overwritten by the newest incoming data.

The physical storage underpinning the Frame contains an embedded ringbuffer for the scrolling area. The virtual terminal allows a scroll top and bottom to be set, and appending lines within these limits will have the effect of rotating the ringbuffer without physically moving already written data in it. The key Frame fields encoding the ringbuffer state are marginTop, marginBottom and scrollHead. These are all row numbers, so cell offsets are obtained by multiplying them by the number of characters in each row (nCols).

The base case: no scrollback history

To understand the memory layout of cells mapped to the screen grid, consider the below figure:

           0 --> +-----------------------+
                 |                       |
                 .          (1)          .
                 |                       |
                 +-----------------------+
   marginTop --> +-----------------------+    <
                 |                       |    <
                 .          (3)          .    <
                 |                       |    <
                 +-----------------------+    < scrolling
  scrollHead --> +-----------------------+    <   area
                 |                       |    <
                 .          (2)          .    <
                 |                       |    <
                 +-----------------------+    <
marginBottom --> +-----------------------+
                 |                       |
                 .          (4)          .
                 |                       |
                 +-----------------------+
       nRows -->

The storage can be conceptually divided into four consecutive areas. Area (1) between row 0 and marginTop (non-inclusive) is non-scrolling (it might be empty though, if marginTop is zero). The same applies to area (4) that begins with the row numbered marginBottom and extends to the bottom of the screen (this area is empty if marginBottom equals nRows).

The area beginning with row marginTop and ending just above, but not including marginBottom is the scrolling area. The current logical top of the scrolling area is marked by scrollHead, which is the first row of area (2). The scrolling area extends downwards to the last row above marginBottom, and logically continues with area (3) that starts with the row marginTop, ending with the last row above scrollHead.

When a new row is appended to the scrolling area, scrollHead is moved down, unless it would become equal to marginBottom, in which case it jumps back up to marginTop. In any case, newly written content will overwrite the row marked by the original position of scrollHead, which has conceptually dropped off the top of the scrolling area.

Due to this storage scheme, lines on the screen (logically numbered from 0 to nRows - 1) will not always be physically stored in consecutive order. The method Frame::unwrapCellStorage () resets (straightens out) this logical-to-physical mapping, which is necessary before the scrolling limits marginTop and marginBottom can be changed or the frame can be resized. In the reset state, scrollHead equals marginTop, which means area (2) fills the space between (1) and (4), while (3) is empty.
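The logical-to-physical mapping and the scroll step for the base case (no scrollback) can be sketched as below. The field names marginTop, marginBottom, scrollHead, nRows and nCols come from the text; the struct FrameLayout and its methods are hypothetical and only illustrate the arithmetic.

```cpp
// Hypothetical model of the ringbuffer state for the base case.
struct FrameLayout {
    int nRows;
    int nCols;
    int marginTop;
    int marginBottom;
    int scrollHead;

    // Map a logical screen row to the physical row where it is stored.
    int physRow(int y) const {
        if (y < marginTop || y >= marginBottom) {
            return y;  // areas (1) and (4) are never rotated
        }
        int row = scrollHead + (y - marginTop);  // start in area (2)
        if (row >= marginBottom) {
            row -= marginBottom - marginTop;     // continue in area (3)
        }
        return row;
    }

    // Offset of a cell in the cell array, given screen coordinates.
    int cellIndex(int x, int y) const { return physRow(y) * nCols + x; }

    // Scroll the scrolling area up by one line. Returns the physical row
    // that dropped off the top; it is reused for the new bottom line.
    int scrollUp() {
        const int recycled = scrollHead;
        if (++scrollHead == marginBottom) {
            scrollHead = marginTop;
        }
        return recycled;
    }
};
```

For example, with nRows = 10, marginTop = 2 and marginBottom = 8, a single scrollUp () moves scrollHead from 2 to 3; logical row 2 is then stored in physical row 3, and the bottom line of the scrolling area (logical row 7) lives in the recycled physical row 2. No cell data is moved.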

The complete truth: in the presence of scrollback

The above scheme is a special case with the number of off-screen lines (defined by the saveLines configuration value) being zero. In reality, the buffer allocated for the frame contains space for a total of (visible) nRows plus an additional (off-screen) saveLines rows' worth of cells. Apart from that, operational aspects of filling the buffer with data are exactly as described above: the scrolling area is delimited by marginTop and marginBottom, and scrollHead points to the logically top-most row within the scrolling area (moved downwards as the content at the top edge scrolls out of the window). Note that this architecture guarantees that the computational complexity of scrolling is O(1) with respect to the size of the scrollback buffer. Having a large scrollback buffer will certainly cost memory, but it will have zero impact on performance.

If there are no top/bottom margins set, the whole storage acts as a (potentially very large) scrolling area, of which only nRows rows are visible (available to the Vterm via screen-based coordinates) at any point in time:

                0 --> +-----------------------+    <
                      |                       |    <
                      .          (3)          .    < scrolling
                      |     [off-screen]      |    <   area
                      +-----------------------+    <
       scrollHead --> +-----------------------+    <   <--------- posY = 0
                      |                       |    <   < active      ...
                      .          (2)          .    <   <  area       ...
                      |                       |    <   <             ...
                      +-----------------------+    <   <--------- posY = nRows - 1
scrollHead + nRows -> +-----------------------+    <
                      |                       |    <
                      .          (2)          .    <
                      |     [off-screen]      |    <
                      +-----------------------+    <
nRows + saveLines -->

The second part of area (2) and the entirety of area (3) contain off-screen lines. The number of saved history rows with valid data is maintained in the private counter historyRows, so that scrolling the view up into any still unused, blank buffer area can be avoided.

The Vterm continues to "see" the active area as if it was the whole buffer, its topmost row having posY = 0. Naturally, the active area might wrap around the end of the buffer. In this case, it is the second part of area (3) that contains saved (off-screen) lines:

                0 --> +-----------------------+    <   <
                      |                       |    <   < active      ...
                      .          (3)          .    <   <  area       ...
                      |                       |    <   < (pt. 2)     ...
                      +-----------------------+    <   <--------- posY = nRows - 1
scrollHead + nRows -> +-----------------------+    <
(mod buffer length)   |                       |    <
                      .          (3)          .    < scrolling
                      |     [off-screen]      |    <   area
                      +-----------------------+    <
       scrollHead --> +-----------------------+    <   <--------- posY = 0
                      |                       |    <   < active      ...
                      .          (2)          .    <   <  area       ...
                      |                       |    <   < (pt. 1)     ...
                      +-----------------------+    <   <
nRows + saveLines -->

Note that internally, marginTop will be 0 and marginBottom will be nRows + saveLines (reflecting the physical extent of the buffer). The externally reported value of marginBottom will continue to be nRows, so Vterm can continue to use the margin settings in its own logic, mostly to establish whether the cursor position is within the margins.

In case the Vterm sets top/bottom margins (meaning that at least one of them differs from its reset value of 0 / nRows, respectively), the Frame will rearrange its storage so that the physical start of the buffer contains the active area with parts (1) to (4), and the rest contains the lines of saved history. This has the advantage that marginTop, marginBottom and scrollHead continue to be valid with the active area as a frame of reference (pun not intended), without having to apply offsets to them.

                0 --> +-----------------------+    <
                      |                       |    <  active
                      .          (1)          .    <   area
                      |                       |    <
                      +-----------------------+    <
        marginTop --> +-----------------------+    <    <
                      |                       |    <    <
                      .          (3)          .    <    <
                      |                       |    <    <
                      +-----------------------+    <    < scrolling
       scrollHead --> +-----------------------+    <    <   area
                      |                       |    <    <
                      .          (2)          .    <    <
                      |                       |    <    <
                      +-----------------------+    <    <
     marginBottom --> +-----------------------+    <
                      |                       |    <
                      .          (4)          .    <
                      |                       |    <
                      +-----------------------+    <
            nRows --> +-----------------------+
                      |                       |
                      .          (0)          .
                      |     [off-screen]      |
                      +-----------------------+
nRows + saveLines -->

The initial status of the above layout is when scrollHead equals marginTop, meaning that area (3) is empty. This is the layout that calling unwrapCellStorage () will set up, and is done every time prior to the frame being resized, or the top/bottom margins adjusted.

Exercising scrollback – defining what is visible

Given the above defined memory layouts, it is easy to see how the screen can be directed to display something other than the active area. The view position is defined by a single row offset viewOffset that can take values from 0 (display the active area) up to and including saveLines (display the top of history).

Essentially, when copying frame data into the CharVdev for display, viewOffset has to be subtracted from the start of the frame (scrollHead in case of no margins, or 0 if there are margins), wrapping around the buffer limits as necessary. Then, the data has to be copied from that starting point in order, following the numbering of data sections (0) through (4), until nRows rows have been copied.
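For the case without margins, the row selection can be sketched as follows. This is an illustration under stated assumptions: ScrollbackView and its methods are hypothetical names, and clamping the requested offset to historyRows is one plausible way to use that counter as described earlier. The case with margins is not covered here.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical model of the buffer when no top/bottom margins are set.
struct ScrollbackView {
    int nRows;        // visible rows
    int saveLines;    // off-screen rows allocated for history
    int scrollHead;   // physical row holding posY = 0 of the active area
    int historyRows;  // saved rows that actually contain data

    int bufRows() const { return nRows + saveLines; }

    // Keep the view from scrolling into unused, blank buffer area.
    int clampOffset(int requested) const {
        return std::max(0, std::min(requested, historyRows));
    }

    // Physical rows to copy for display, top to bottom, for a view offset.
    std::vector<int> visibleRows(int viewOffset) const {
        const int total = bufRows();
        // Subtract the offset from the frame start, wrapping around the buffer.
        const int start = ((scrollHead - viewOffset) % total + total) % total;
        std::vector<int> rows;
        rows.reserve(nRows);
        for (int i = 0; i < nRows; ++i) {
            rows.push_back((start + i) % total);
        }
        return rows;
    }
};
```

With nRows = 4, saveLines = 6 and scrollHead = 1, a viewOffset of 0 selects physical rows 1 to 4, while a viewOffset of 3 selects rows 8, 9, 0 and 1, wrapping around the end of the buffer.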

Renderer

The task of the Renderer is simple: run the rendering loop in a separate thread. This thread executes the CharVdev code, and is synchronized on frame updates published by the Vterm. On each update, a reference-counted copy of the Frame (using std::shared_ptr) is made. This ensures that the Frame is decoupled from the Vterm and the render thread can keep asynchronously working with it.
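One way to picture this handoff is a single-slot mailbox guarded by a mutex and a condition variable. This is a sketch of the general technique, not the actual Renderer code: FrameMailbox, waitNext and the stand-in Frame type are hypothetical.

```cpp
#include <condition_variable>
#include <memory>
#include <mutex>

// Stand-in for the real Frame; only a sequence number for illustration.
struct Frame {
    int seqNo;
};

// Single-slot mailbox: the producer overwrites, the consumer takes the latest.
class FrameMailbox {
public:
    // Called by the producer (the Vterm side) whenever a new frame is ready.
    void update(std::shared_ptr<const Frame> frame) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            next_ = std::move(frame);  // an unconsumed older frame is dropped
        }
        ready_.notify_one();
    }

    // Called by the render thread: block until a frame is available, then
    // take it. The returned pointer keeps the frame alive independently of
    // whatever the producer does afterwards.
    std::shared_ptr<const Frame> waitNext() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return next_ != nullptr; });
        std::shared_ptr<const Frame> taken;
        taken.swap(next_);
        return taken;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::shared_ptr<const Frame> next_;
};
```

If two updates arrive before the render thread wakes up, only the later one is drawn, which matches the sampling behavior described for the rendering loop.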

The rendering loop blocks on the GL program that does the actual drawing of the frame content (CharVdev::draw ()), and synchronizes the delivery of new frames with screen refreshes (ultimately via calling eglSwapBuffers ()). This mechanism ensures that no work is wasted on rendering frames so frequently that not all of them would be shown on the screen. In effect, the renderer samples the "next frame" (as updated in Renderer::update ()) at the screen refresh rate and delivers it to the screen. The refresh rate is usually either 30 Hz (low-spec hardware or high-resolution screens) or 60 Hz (average laptops).

Vterm (virtual terminal)

The Vterm module is the actual virtual terminal implementation. It consumes a stream of incoming characters containing printable characters to place on-screen as well as control characters and escape sequences to interpret according to the relevant standards and specifications, and alter the screen content.

The stream of input characters is read from the pseudoterminal, and is written to the pty slave by an inferior (child) process spawned by Zutty. This process is most commonly running a shell program (unless Zutty was instructed otherwise).

Parsing and interpretation of the input is fairly straightforward and implemented (on the highest conceptual level) by a state machine that reads input character-by-character, moves across states (Normal, Escape, CSI, etc.) and calls the registered refresh handler to deliver an updated Frame to the renderer at appropriate moments.
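A heavily reduced sketch of such a state machine is shown below. It only distinguishes the three states named above and merely records what it sees; MiniParser and its fields are hypothetical, and the real Vterm handles many more states, control characters and sequence types.

```cpp
#include <string>

enum class State { Normal, Escape, CSI };

// Toy parser: collects printable text and the most recent CSI sequence.
struct MiniParser {
    State state = State::Normal;
    std::string text;     // characters that would be placed on screen
    std::string lastCsi;  // parameter and final bytes of the last CSI sequence
    int completed = 0;    // number of escape sequences fully consumed

    void feed(unsigned char ch) {
        switch (state) {
        case State::Normal:
            if (ch == 0x1b) {
                state = State::Escape;  // ESC starts a sequence
            } else {
                text.push_back(static_cast<char>(ch));
            }
            break;
        case State::Escape:
            if (ch == '[') {
                state = State::CSI;     // ESC [ introduces a CSI sequence
                lastCsi.clear();
            } else {
                state = State::Normal;  // treat as a two-byte escape
                ++completed;
            }
            break;
        case State::CSI:
            lastCsi.push_back(static_cast<char>(ch));
            if (ch >= 0x40 && ch <= 0x7e) {
                state = State::Normal;  // final byte ends the sequence
                ++completed;
            }
            break;
        }
    }
};
```

Feeding the bytes of "A", ESC, "[31m", "B" leaves the parser back in the Normal state with text equal to "AB" and lastCsi equal to "31m". The return to Normal is also the point at which the Step Debugger described earlier counts a completed escape sequence.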

An architecturally noteworthy detail is that the Vterm is completely separated from both the rendering machinery and also from input methods. This is intentional and lends a high degree of portability to the Vterm implementation.

Useful resources

  • xterm(1): The manual page for xterm
  • ctlseqs: The control sequences implemented by xterm
  • VTTEST: VT compatibility test program homepage
  • VT100ug: VT100 User Guide
  • VT102ug: VT102 User Guide
  • VT220rm: VT220 Programmer Reference Manual
  • VT420rm [pdf]: VT420 Programmer Reference Manual
  • VT520rm [pdf]: VT520/VT525 Video Terminal Programmer Information
  • DEC STD 070 [pdf]: DEC-internal standards document going into more technical detail than user-oriented product manuals.
  • VT500-series parser: A parser for DEC’s ANSI-compatible video terminals (not used by Zutty, but interesting!)