COBRA Reference Manual
Code Browser and Analysis Tool



Principle of Operation


Cobra uses a lexical analyzer to scan the source code in the files given as arguments on the command line. It then builds a data structure that can be used for querying that source code either interactively or with predefined scripts.

The internal data structure that Cobra builds is a simple linked list of lexical tokens, annotated with some basic information and links to other tokens, for instance to identify matching pairs of parentheses, brackets and braces. The tool does not attempt to parse the code, which means that it can handle a broad range of possible inputs. Despite the simplicity of the data structure, the tool can be remarkably powerful in quickly locating complex patterns in a code base to assist in peer review, code development, or basic structural code analysis.
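
To make the shape of such a data structure concrete, here is a minimal Python sketch of a doubly linked token list in which bracket tokens carry a link to their partner. It is an illustration only and is not taken from Cobra's source: the names Token, tokenize, and match, as well as the simplistic regular expression used for splitting tokens, are assumptions made for this example.

```python
import re

class Token:
    """One lexical token in a doubly linked list (hypothetical layout)."""
    def __init__(self, text, line):
        self.text = text      # the token text
        self.line = line      # source line number, starting at 1
        self.prev = None      # previous token in the sequence
        self.next = None      # next token in the sequence
        self.match = None     # partner bracket, if this token is one of a pair

CLOSERS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(CLOSERS.values())
# Very rough tokenizer: identifiers, integers, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")

def tokenize(source):
    """Return the head of a linked token list for the given source text."""
    head = tail = None
    open_stack = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for text in TOKEN_RE.findall(line):
            tok = Token(text, lineno)
            tok.prev = tail
            if tail is None:
                head = tok
            else:
                tail.next = tok
            tail = tok
            if text in OPENERS:
                open_stack.append(tok)
            elif text in CLOSERS and open_stack and open_stack[-1].text == CLOSERS[text]:
                opener = open_stack.pop()
                opener.match, tok.match = tok, opener
            # Unbalanced brackets are simply left without a partner link.
    return head

def to_list(head):
    """Collect the linked tokens into a Python list, in order."""
    out = []
    while head is not None:
        out.append(head)
        head = head.next
    return out
```

No parsing is involved: the sketch only scans tokens and pairs brackets with a stack, so input that would not compile still yields a usable token list.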

There are several ways to write queries. We can use:

    • Interactive queries (overview below, or see index),
    • Inline programs (described separately),
    • Standalone checkers (described separately).

Interactive queries are written in a simple command language that can support the most frequent types of searches. When more complex queries need to be handled, requiring anything other than a sequential scan of the token sequence, inline Cobra programs offer a more powerful alternative. For still more complex queries that require the construction of more elaborate data structures, the same infrastructure can be used to write standalone checkers in C that can be linked with the Cobra front-end.

Interactive query commands are by default applied to all tokens in the input sequence, optionally using parallel threads of execution where that can improve performance. The queries can be used to set, move, remove, extend, stretch, or inspect user-defined mark points in the code.

If directed to do so with command-line option -cpp, Cobra will use the standard gcc compiler to preprocess the source code before the data structure is built. This can be useful to make sure that the effects of macros and include files are taken into account when locating patterns of interest in the code. By default no preprocessing is done, which makes it possible to also query code bases that cannot be preprocessed or compiled, for instance because the required directives are unknown, or because the code is syntactically flawed.

Use cobra -help to see a list of all current command-line options that the tool recognizes. The synopsis below summarizes Cobra's interactive query command language. Details of each command are given in separate pages that are linked from this overview.

New Features in Version 2.0

The current Version 2.0 of the Cobra tool is an extension of the original, as distributed by JPL. The original release counted 9,796 lines, while the current version counts 20,831 lines of code. Other than bug fixes, the new features include:
  • A richer scripting language for writing inline programs, supporting arbitrary variables, associative arrays, recursive functions, and concurrency control.

  • Parallel processing of input files, which shortens the startup time especially for larger code bases.

  • Faster processing of queries. For instance, checking for empty else statements in 18.2 million lines of code took 10 seconds with the original version and takes about 2.5 seconds with the current Version 2.0 (in both cases using a single CPU core).

  • An extended library of predefined scripts and script libraries (from about 48 to about 81, including checkers for the Misra guidelines, the Power of 10 rules, and the JPL Coding Standard).

  • Online manual pages for the interactive commands, and for the scripting language used in inline programs.

  • Parameters on def...end scripts.

  • New commands, including cpp, terse, track, map.

  • A rethinking and normalization of command-line options.

Overview of Commands


Most commands can be abbreviated to a single letter, which is convenient for scripting and compact query formulation. Below is a complete index of all interactive query commands that the tool recognizes, listed in alphabetical order, with their shorthand (if available), the unabbreviated form, and a brief explanation. Each command is linked to a manual page with a more complete description.
	a append	append an additional source file to the data structure
	b back		move marks back
	c contains	retain mark only if there is a match within the associated range
	e extend	retain mark only if it is followed by the given tokens
	i inspect   	show the lexical tokens for a given source line
	j jump		move mark to the other end of a range
	m mark		mark (or match) tokens if they match a pattern
	n next		move marks forward
	r reset		clear all marks and user-defined ranges
	  re		mark matches of a token pattern expression
	< restore	restore all marks using a set 1..3
	> save		save all current marks and ranges in a set 1..3
	s stretch	set a range, starting at marked tokens, up to the specified pattern
	u undo		undo the last change made
	  unmark        unmark tokens matching a pattern
	w with		restrict marks to tokens matching an additional constraint
Displaying things:
	d display	show source code context for marked tokens
	h history	show the command history
	l list		list marked tokens
	p pre		show preprocessed source code context for marked tokens
	t track		start or stop redirection of d/l/p output into a named file
	=		print a user-defined string with a value for each marked token
Other commands:
	B		browse the source text of files (see also F and G)
	cfg		show the control flow graph for a given function
	context		show callers and callees for a given function
	cpp             enable or disable the use of preprocessed code
	default		set a command to be executed after an empty command
	fcg		show the function call graph, or a path in the fcg
	fcts		show names of all defined functions
	ff		show the source text for a specific function
	ft		show the source text for a specific structure definition
	F		list open files (see also B and G)
	G		search (grep) open files for a pattern (see also F and B)
	map		map token text to user-defined types
	ncore           change the number of cores to use
	q		(quit) terminate the session
	silent           enable or disable silent output
	terse           enable or disable terse output
	?		list all commands with a brief summary of use
	!		shell escape, to execute a system command (e.g., !date)
	.		(dot) read in the script file specified as an argument
	:		(colon) execute a named script (often the : is unnecessary)
	def n() ...end  define a named command script
	%{...%}         execute an inline program script
A description of expressions and types can be found at these pages:
	patterns	pattern expressions, used in combination with m, w, or =
	types		predefined token types
	qualifiers      qualifiers that can be used with some commands

Synopsis

The main command that can be used to define new markings is:
	m mark		mark tokens if they match a pattern
Three commands can be used to move existing marks:
	b back		move marks back
	n next		move marks forward
	j jump		move mark to the other end of a range, e.g., {}, (), or []
The only command to associate a new range with a token (additional to predefined ranges) is:
	s stretch	set a range, starting at marked tokens, up to the specified pattern
And, finally, four commands can be used to down-select (remove) existing marks:
	e extend	retain mark only if it is followed by the given tokens
	c contains	retain mark only if there is a match within the associated range
	m & pattern	retain mark only if it also matches pattern
	w with		retain mark only if an additional constraint is met
where pattern can be a name, a type, a regular expression, or a Cobra pattern expression.

The symbol $$ can be used to refer to the currently matched token, e.g., as set with a mark command, in next, back, extend, contains, or stretch commands.
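
The effect of the mark, move, and down-select commands can be modeled in a few lines of Python. The sketch below is a simplified model of the semantics summarized above, not Cobra's implementation and not its command syntax: marks are represented as a set of token positions, patterns are reduced to exact token text, and all function names are hypothetical. How jump treats a mark that sits on a token without a range is an assumption here (the model drops such marks).

```python
CLOSERS = {")": "(", "]": "[", "}": "{"}

def match_pairs(tokens):
    """Map each bracket position to the position of its partner."""
    pairs, stack = {}, []
    for i, text in enumerate(tokens):
        if text in CLOSERS.values():
            stack.append(i)
        elif text in CLOSERS and stack and tokens[stack[-1]] == CLOSERS[text]:
            j = stack.pop()
            pairs[i], pairs[j] = j, i
    return pairs

def mark(tokens, text):
    """Model of 'mark': mark every token whose text equals the pattern."""
    return {i for i, t in enumerate(tokens) if t == text}

def move_next(tokens, marks):
    """Model of 'next': move every mark one token forward."""
    return {i + 1 for i in marks if i + 1 < len(tokens)}

def move_back(tokens, marks):
    """Model of 'back': move every mark one token back."""
    return {i - 1 for i in marks if i > 0}

def jump(pairs, marks):
    """Model of 'jump': move a mark to the other end of its range.
    Assumption: marks on tokens without a range are dropped."""
    return {pairs[i] for i in marks if i in pairs}

def also_matches(tokens, marks, text):
    """Model of 'm & pattern': keep a mark only if its token also matches."""
    return {i for i in marks if tokens[i] == text}

def extend(tokens, marks, text):
    """Model of 'extend': keep a mark only if the given token follows it."""
    return {i for i in marks if i + 1 < len(tokens) and tokens[i + 1] == text}

def contains(tokens, pairs, marks, text):
    """Model of 'contains': keep a mark only if its range holds a match."""
    kept = set()
    for i in marks:
        if i in pairs:
            lo, hi = sorted((i, pairs[i]))
            if text in tokens[lo + 1:hi]:
                kept.add(i)
    return kept

# Example: find each 'else' that is directly followed by ';', which is one
# plausible reading of an "empty else statement".
tokens = "if ( x ) y ( ) ; else ; if ( z ) { w ( ) ; } else { v ( ) ; }".split()
pairs = match_pairs(tokens)
empty_else = extend(tokens, mark(tokens, "else"), ";")
print(sorted(empty_else))  # [8]
```

Each step takes the current set of marks and returns a new one, which mirrors how a sequence of interactive commands first sets marks and then moves or narrows them.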


Last Update: 8 May 2017