Table of Contents



Introduction


Calling the utilities


awk . . . . . . . . . . . . . string processing language

basename . . . . . . . . . . . . . extract base part from pathname of a file or a directory

cal . . . . . . . . . . . . . display the calendar for a month or a year

cat . . . . . . . . . . . . . concatenate files

cb . . . . . . . . . . . . . C beautifier

cmp . . . . . . . . . . . . . file binary comparison

comm . . . . . . . . . . . . . look for common lines in two files

cp . . . . . . . . . . . . . copy files and directories

cut . . . . . . . . . . . . . cut out columns or fields from files

df . . . . . . . . . . . . . statistics on disk usage

diff . . . . . . . . . . . . . compare files or directories

dtree . . . . . . . . . . . . . display tree structure of directories

du . . . . . . . . . . . . . display space each directory takes

ech . . . . . . . . . . . . . echo

ed . . . . . . . . . . . . . line editor

expand . . . . . . . . . . . . . expands tabs to blanks

find . . . . . . . . . . . . . find files with certain properties and execute commands on each

grep . . . . . . . . . . . . . search for patterns in files

head . . . . . . . . . . . . . display the head of one or several files

join . . . . . . . . . . . . . relational join of two files

ls . . . . . . . . . . . . . lists files and directories

make . . . . . . . . . . . . . update files

more . . . . . . . . . . . . . text files browser

mv . . . . . . . . . . . . . moves files and directories

od . . . . . . . . . . . . . octal (or hexadecimal) dump

opts . . . . . . . . . . . . . Set default options for The Berkeley Utilities

paste . . . . . . . . . . . . . merge files as columns of a single file

rederr . . . . . . . . . . . . . redirect error output of commands

rm . . . . . . . . . . . . . remove files and directories

sed . . . . . . . . . . . . . stream editor

sort . . . . . . . . . . . . . sort files

split . . . . . . . . . . . . . split a file into smaller pieces

tail . . . . . . . . . . . . . display the end of a file

tee . . . . . . . . . . . . . pipe connection and derivation

touch . . . . . . . . . . . . . update file timestamp

tr . . . . . . . . . . . . . translate stdin to stdout

unexpand . . . . . . . . . . . . . compresses to tabs runs of blanks and tabs

uniq . . . . . . . . . . . . . weed out or find repeated lines

wc . . . . . . . . . . . . . count words and lines

which . . . . . . . . . . . . . find which version of a program is active

xstr . . . . . . . . . . . . . extract character strings from C programs


Appendix: Regular Expressions


















© Copyright OPENetwork and PMC 1989-2001. All rights reserved.

Introduction -

The Berkeley Utilities are a set of UNIX-like utilities for MS-DOS. They were developed by P.M.C. (a software company from Paris, France) for its internal use, because no available package covered the same needs. It is not as complete a set of UNIX commands as can be found in some other packages (e.g. MKS), but it contains some useful utilities you won't find elsewhere (e.g. cb, xstr). At the time they were written, there was no port of the GNU utilities to MS-DOS either. The Berkeley Utilities have been maintained because we make constant use of them (for instance they now work with long filenames under WIN95/98, and df can understand multi-gigabyte partitions). Some functional advantages over other sets of utilities are described in the next paragraph. Compared to the GNU utilities, our utilities have the advantage of being much smaller (20K on average instead of 100K). Compared to the MKS utilities, they have the advantage that each utility is self-contained (it can run separately without any other support file, and each utility contains a help screen).

The particular advantages of our utilities come from our design goals:


·   Since we had to rewrite the UNIX utilities for MS-DOS, we decided to do them right: you will find every useful option you ever had on any UNIX system, and often new options which make sense and increase the power of the package. People who have been using our package often wish, when going (or coming back) to UNIX, that our extra options would work there (we are considering alleviating their suffering by porting our commands to UNIX!). If you are a UNIX user, look at our extra options on cp, mv, ls and others; you will see what we mean!


·   We also decided to use the advantages of working in MS-DOS when they exist, e.g. the use of video attributes to make displays clearer.


·   Since we believe that using combinations of UNIX commands to do small (or big) jobs is a powerful way to work, which we wish to teach to others, we are also aiming our package at all PC users, and have made a special effort to provide on-line help and tutorial information. You will learn a new way to work, and that knowledge will stay useful (at least until UNIX dies, which won't be tomorrow).


·   We wanted our utilities to be self-contained, and able to be used individually without going through any installation procedure. Each utility is self-contained without any extra files needed, and has a help screen (usually enough to get by, except for the more ambitious utilities such as awk and make).

At this stage, if you are new to UNIX, we recommend that you go through our User's Manual, and use our integrated help. We recommend the following books to help you along:

-
``The UNIX Programming Environment'', by Kernighan and Pike.
-
For awk, ``The AWK Programming Language'', by Aho, Kernighan and Weinberger.

If you are a UNIX wizard, look in our man pages, try everything and enjoy (man refers to the reference manual, which in UNIX is available on-line by running a program called man).


Jean MICHEL

Calling the utilities -



Command line handling:


All The Berkeley Utilities can apply transformations to the command line, doing their best to emulate the behavior of shells under UNIX, insofar as MS-DOS allows it:

·
Arguments prefixed with `-' are options. An option is defined by the character following the `-' and may or may not take a parameter. As in UNIX, the options are case sensitive, which means that the command ls -t has a different meaning than the command ls -T. When the option takes a parameter, the parameter may be mandatory, in which case it may be separated from the option by spaces, or it may be optional, in which case it must follow the option immediately. An option standing alone may be followed by (bundled with) another option without separating space and `-' character, e.g., if s and x are options without parameters, those options may be bundled as: -sx. Some options are ``boolean flags'' (that is, it makes sense to turn them on or off). These options can then usually be turned off by giving them followed by a `-', e.g., -s-x means take option x but turn off s (this is especially useful in connection with an initial command line - see below).
·
Options -?, -H, and, in case it has no other significance, option -h, are taken to be a call for help: a message describing the utility usage syntax is sent to standard output with a short explanation of the semantics of the arguments and of the other options.
·
An option of ``--'' is taken to end the list of options; any further arguments beginning with `-' will not be interpreted as options. Contrary to usual UNIX behaviour, options may appear anywhere on the command line (they do not have to be grouped at its beginning). This behaviour may be changed by giving the option `-!'; then the first non-option argument will end the options (this is particularly useful in conjunction with setting an initial command line; see below).
·
``-'' standing alone is not usually taken to be an option, but to be an argument standing for standard input (stdin). We adopt standard UNIX and MS-DOS terminology: a program takes its input from stdin (usually the terminal, but it may be redirected with ``<''), sends its normal output to stdout (which can be redirected with ``>''), and sends its error output to stderr (which cannot be redirected under MS-DOS, unless you use our utility rederr).
·
Arguments starting with a ``$'' are assumed to be environment variables, and are replaced by their value: e.g., if the autoexec.bat file included the assignment:
      set INCLUDE=c:\include
``c:\include'' will be substituted for ``$INCLUDE''. If the variable is immediately followed by characters allowed in identifiers, curly brackets must be used: if the environment variable ``MAKEFLAGS'' has been assigned the value ``ei'', ``eid'' will be substituted for ``${MAKEFLAGS}d''. The following variations are also understood:
-
${x-z} stands for the value of environment variable x if x has been defined, else for the string z. E.g., anticipating the paragraph on command substitution below, ${HOME-`cd`} stands for the value of environment variable HOME if HOME has been defined, and otherwise for the name of the current directory.
-
${x=z} is like the previous case, but in addition if x had not been defined, it is now assigned value z until the utility ends.
-
${x?z} stands for the value of environment variable x if x has been defined, else message z is sent to standard output, and the utility aborts.
-
${x+z} stands for z if x has been defined, else for the empty string.
·
Another substitution mechanism, called command substitution, is applied to arguments surrounded by backquotes (`). First, the text inside the backquotes is executed as a command, then its standard output is inserted in the command line, after substituting spaces for embedded newlines, and after stripping trailing newlines.
·
Finally, an argument including a ``*'', a ``?'' or a ``['' is taken to be a file specification pattern, and expansion is applied: its place is taken by the list of actual files whose name matches the pattern according to the following rules:
-
The star ``*'' stands for any number (0 included) of characters except ``\'', ``/'' and ``.''.
-
The question mark ``?'' stands for exactly one character other than ``\'', ``/'' and ``.''.
-
One or more characters surrounded by ``[ ]'' stand for exactly one of the surrounded characters. If the character following ``['' is ``!'', the form stands for exactly one character not in the set of the characters between the ``!'' and the ``]''. Inside the ``[]'', after the possible initial ``!'', a sequence like a-p stands for the set of characters whose ASCII code is between those of a and p.
-
A ``.'' ending the pattern is ignored.
-
``/'' and ``\'' are both acceptable as directory delimiters in a path name.
-
Finally, the sequence ``//'' matches any number of consecutive directory names in a pathname.
Note:
The sequence ``\\'' was also used in earlier versions of The Berkeley Utilities for the same purpose. This has been made obsolete, in order to avoid conflict with the names of network drives.
For example,
      [c-e]://[!a]*.c
means all the files whose filename starts with a letter different from a, which have an extension of c, and which are anywhere (``//'' means ``any number of directories below the root'') on hard disk c:, d: or e:.


NB1:   

Those patterns are similar in some ways to regular expressions. But there is no closure operator, ``?'' plays the part of ``.'', ``*'' is equivalent to ``[^.\]*'', in character classes the complement operator is ``!'' instead of ``^''.

NB2:   

Divergences from UNIX file specification patterns: under UNIX, ``.'' has no special properties except as the first character of a filename. Under MS-DOS, a file or directory name may contain at most one ``.'', and a name without a ``.'' designates the same file as the same name with the ``.'' appended. Moreover a name can have at most 8 characters before a ``.'' and 3 characters after it.

NB3:   

Divergences from MS-DOS: a name with fewer than 8 characters before an explicit or implicit ``.'' is not supposed to be completed to 8 characters by spaces. Thus the behavior of ``?'' differs from MS-DOS expansion, where it sometimes ends up standing for one or no character. On the other hand, patterns where an initial ``*'' is followed by non-special characters are handled as you would expect (as in UNIX), whereas MS-DOS sees no difference from ``*'' alone. For instance, The Berkeley Utilities see *A.* as all the files whose filename ends with an A, while MS-DOS does not distinguish *A.* from *.*.

NB4:   

Be aware that patterns can only stand for existing files. Since the syntax of cp and mv is different from that of copy and rename, commands such as

copy *.bin *.obj
have no equivalent with The Berkeley Utilities. Nevertheless, an equivalent result may be obtained by combining some command repetition mechanism (such as the MS-DOS for command) with the basename utility:
for %i in (*.bin) do cp %i `basename %i .bin=.obj`
(note that %i must be replaced by %%i in a batch file).
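
The ``$'' substitution forms listed earlier in this section closely resemble POSIX shell parameter expansion. The sketch below runs under a POSIX shell, not under The Berkeley Utilities, and only illustrates the defined/undefined logic of ${x-z}, ${x+z} and ${x=z}; DEMO_HOME is a hypothetical variable name, and it is an assumption that the utilities behave identically in every corner case. ${x?z} is left out because it aborts the caller.

```shell
#!/bin/sh
# Illustrates the POSIX sh analogues of ${x-z}, ${x+z} and ${x=z}.
# DEMO_HOME is a made-up variable used only for this demonstration.
expand_demo() {
    unset DEMO_HOME
    echo "${DEMO_HOME-/fallback}"   # undefined: the default string is used
    echo "${DEMO_HOME+defined}"     # undefined: expands to the empty string
    echo "${DEMO_HOME=/assigned}"   # undefined: assigns the value, then expands to it
    echo "${DEMO_HOME-/fallback}"   # now defined: the assigned value wins
    echo "${DEMO_HOME+defined}"     # now defined: expands to the alternative
}
expand_demo
```

The third line is the only one with a side effect: after it, the variable keeps its value for the rest of the function, which mirrors the manual's ``until the utility ends''.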





Installation and Video Attributes:


The installation is an easy procedure: just copy the files from the distribution disks to a subdirectory on your hard disk, then place that directory on the path before the one which contains MS-DOS. In recent WIN95/98 systems you may have trouble doing that since WIN95/98 will automatically prepend its COMMAND subdirectory to the path. On such systems, to use the Berkeley find and more the best way is to rename the same-name commands in WINDOWS/COMMAND to a different name. Since ECHO, a built-in command, cannot be renamed, we named ours ech.

The Berkeley Utilities will work independently of each other and without any installation. They are easily configurable with the help of the supplied program opts.exe. This program sets an initial command line for any utility. For instance

opts rm "-i -r"
would set the initial command line of rm to -i -r, which means that on any future call to rm, the options -i -r will be prepended to the actual command line. This could be used to give default arguments, in addition to setting default options. The mechanism is such that the part of the command line thus given does not count in the DOS limit of 128 characters for the command line and may be arbitrarily long.

The Berkeley Utilities use different video attributes in order to highlight the key parts of their output. Most of The Berkeley Utilities use 3 attributes, but some (more, for instance) use many more. Those attributes may be selected globally for all utilities by assigning values to the environment variable ``VATTR'' in the following way:
      set VATTR=attribute0-attribute1-attribute2-...-attributen
attributei being the middle part of an ANSI `Set Graphic Rendition' escape sequence, i.e., a sequence of the form:
       <Esc>[p1;p2;...;pk m
stripped of the initial ``<Esc>['' and of the final ``m''. The parameters pi are as described in MS-DOS reference manuals, for instance:

set VATTR=44;36;1-44;33;1-44;35;1-40;36;1-40;31;1-40;37;1-42;30-42;33;1
selects 8 attributes:

bright cyan on blue, bright yellow on blue, bright magenta on blue, bright cyan on black, bright red on black, white on black, black on green, bright yellow on green.



The following example:
      set VATTR=0-1-4-7;1-7-7;4
selects 6 attributes for a monochrome screen (hard to find nowadays):

normal on black, highlighted on black, underlined black, grey on white, black on white, inverse underlined.

The attributes can be set for a single utility using the `-@' option, which takes as argument a string following the same syntax as VATTR. The usual way to proceed would be to set this option in an initial command line via opts. The command `opts -e' may be used to edit the initial command line interactively and has support for selecting attributes from a menu.
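
To make the VATTR syntax concrete, the following sketch splits a VATTR-style string on ``-'' and prints the escape sequence each attribute stands for, writing the escape character as the literal text <Esc> so the result stays readable. The function name vattr_list is hypothetical, and the sketch assumes a POSIX shell and awk; it says nothing about how the utilities parse VATTR internally.

```shell
#!/bin/sh
# vattr_list: hypothetical helper that expands a VATTR-style value into
# one "Set Graphic Rendition" sequence per line (ESC shown as "<Esc>").
vattr_list() {
    printf '%s\n' "$1" |
        awk -F- '{ for (i = 1; i <= NF; i++) printf "<Esc>[%sm\n", $i }'
}
vattr_list '0-1-4-7;1-7-7;4'
```

Applied to the monochrome example, this prints six sequences, one per attribute, which matches the count given in the text.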

When the utilities' output is redirected to a file, attributes are normally not output. Nevertheless, if the option `-&' is given, output to a file and to the terminal is treated the same way. This is especially useful if the output is piped to a browser which can emulate ansi.sys, such as more. For instance, to look at leisure at strings found, type

  grep -& thing *.c |more
The method of assigning the value ANSI to the environment variable FATTR, which was used in version 1 of The Berkeley Utilities, is now obsolete (the above method is better since it can be controlled on each use).

There is a way to tell the utilities not to use ANSI attributes: just do ``set VATTR=NO''.



Command spawning from The Berkeley Utilities :


In many cases (command substitution, make methods, ``!'' commands of ed and more, the -exec predicate of find, etc.), utilities have to spawn some other command. The normal way to do this is to spawn a subshell, where under MS-DOS the shell to be spawned is given by the value of the environment variable ``COMSPEC''. Actually, whenever possible, the commands are spawned directly without a subshell intervening. This is useful for the utilities which need to know the exit status of spawned commands (make, find, ...), since the standard MS-DOS shell (command.com) does not make this information available. A command can be spawned directly if it is not an internal MS-DOS command and does not use pipes (|). Otherwise the command is spawned via a subshell and will always be assumed to have succeeded.

awk - String-processing language



Synopsis:                       awk [-Fc ] -f program [files ]


or                                   awk [-Fc ] "program " [files ]




Description:


If you are not already familiar with awk, it is strongly recommended that you read the excellent 1988 Addison-Wesley book ``The AWK Programming Language'', by Aho, Kernighan and Weinberger who gave the language its ``awk''ward name. The following is not intended to serve as a tutorial for the language.

The awk program to execute is in the file specified as argument to the -f option, or is the first argument on the command line if there is no -f option. The file arguments processed by the program are considered as a sequence of records separated by record-separator characters, each record itself being a sequence of fields separated by field-separator characters. By default, the record-separator is the newline, so records are consecutive lines of the file, and the field-separator is the space. These defaults may be changed as will be seen below. If the field-separator is the space, as a special convention the < Tab > and the newline are also field-separators (this is specific to the space). An ``awk'' program consists of a sequence of pairs ``condition { actions }''. For each record in each file argument which matches the condition, the corresponding actions are executed. A missing condition is considered to match every record, and a missing action is equivalent to the action {print}, which prints the current record.
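
As a small illustration of records, fields and ``condition { actions }'' pairs, the sketch below prints the record number, the field count and the first field of every record that has at least three fields. It is written for a POSIX shell and a POSIX awk (an assumption made for ease of testing); under MS-DOS the program text would be enclosed in double quotes as in the Synopsis. The function name field_demo is hypothetical.

```shell
#!/bin/sh
# field_demo: one condition (NF >= 3) paired with one action.
# The third sample record uses a tab, which counts as a separator
# when the field-separator is the default space.
field_demo() {
    awk 'NF >= 3 { print NR, NF, $1 }'
}
printf 'alpha beta gamma\none two\nx\ty z w\n' | field_demo
```

The second record has only two fields, so its condition is false and nothing is printed for it.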

The actions are written in a language whose syntax is similar to that of the language C, but whose semantics are quite different: Variables can hold numeric or string values, or be arrays, but there are no declarations. A variable may hold either numeric or string values; the conversion between these is automatically performed in any context where it is necessary; numeric values are floating-point numbers. On the other hand, the first occurrence of a variable decides whether it will be an array or a scalar (according to whether it is indexed in this first occurrence), and its nature (scalar or array) then stays the same for the rest of the program. Array indices may be any scalar value, which provides a kind of associative memory. Operators are those of the C language when they make sense. Structured programming constructs are available as in C by using the keywords for, while, if and else. A variant of for is provided which loops over an associative array. The language contains a few built-in variables and functions. The conditions are built using boolean operators from relational expressions and regular expressions (look in the Appendix for a definition of regular expressions; the regular expressions currently do not have the alternation operator, they will be extended in a later version). In addition a condition may be a pair of conditions as described above, separated by a comma. Such a condition holds between the first line satisfying the first condition and the next line satisfying the second condition, and again between such pairs of lines until the end of the file.
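
The associative-memory idea can be sketched as follows: an array indexed by a string accumulates a numeric total per key, with no declaration and with string-to-number conversion happening implicitly. The sketch assumes a POSIX shell and awk; count_by_key and the sample keys are hypothetical names chosen for the illustration.

```shell
#!/bin/sh
# count_by_key: sums the second field per value of the first field.
# "count" becomes an array because its first occurrence is indexed.
count_by_key() {
    awk '
        { count[$1] += $2 }
        END {
            print "apples", count["apples"]
            print "pears", count["pears"]
        }
    '
}
printf 'apples 3\npears 2\napples 4\n' | count_by_key
```

The totals are printed from the END block because they are only complete once the last record has been read.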



Formal Grammar of awk:


(In the documentation which follows, ``iff'' is an abbreviation for ``if and only if'').





< program >
       := < begin > < body > < end >



< begin >
       := BEGIN {  < actions > }

               BEGIN is a special condition which declares < actions > to perform before starting to read the first file argument.


       |nothing

               i.e. no initial < actions > .



< body >
       := < body > < action-condition >


       | < body > < action-condition > < terminator >


       |nothing

               The < body > of the program is a sequence of < action-condition > s, separated by ``;'' or newlines. The < body > is executed by applying successively each < action-condition > to each record of each file.



< end >
       := END {  < actions > }

               END is a special condition which declares the < actions > to perform after processing the last record of the last file.


       |nothing

               i.e. no final < actions > .

< action-condition >
       := < pattern >

               Print each record which matches the < pattern > .


       | < pattern > {  < block > }

               For each record matching the  < pattern > , execute actions in < block > .


       | < pattern > , < pattern >

               Wait for a record matching the first < pattern > , then print each record until the next record matching the second < pattern > , and so on.


       | < pattern > , < pattern > {  < block > }

               Wait for a record matching the first < pattern > , then execute the < block > of actions for each record until the next record matching the second < pattern > , and so on.


       |{  < block > }

               For each record, execute actions in < block > .
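
A pair of patterns separated by a comma is easiest to see on a small input. The sketch below, assuming a POSIX shell and awk, prints every record from one marker line to the next closing marker, both included; the marker names and the function name between_markers are hypothetical.

```shell
#!/bin/sh
# between_markers: a "pattern , pattern" form with no block,
# so each record in the range is simply printed.
between_markers() {
    awk '/^BEGIN_DATA/,/^END_DATA/'
}
printf 'header\nBEGIN_DATA\n1\n2\nEND_DATA\nfooter\n' | between_markers
```

If another BEGIN_DATA line appeared later in the input, printing would start again from there.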



< pattern >
       := < regular-expression >

               A record matches the < pattern > iff it matches the < regular-expression > .


       | < match >


       | < relational-expression >


       | < composed-pattern >

< composed-pattern >
       := < pattern > || < pattern >

               Alternation: a record matches the < composed-pattern > if it matches one of the two < pattern > s.


       | < pattern > && < pattern >

               Conjunction: a record matches the < composed-pattern > if it matches both < pattern > s.


       |! < pattern >

               Negation: a record matches the < composed-pattern > if it does not match the < pattern > .


       |( < composed-pattern > )

               Grouping.



< block >
       := < block > < statement >


       |nothing

                < block > is a sequence of < statement > s, executed by successively executing each < statement > . A break, continue, next or exit statement may stop execution before the end of the < block > .



< statement >
       := < simple-statement > < terminator >


       |if ( < condition > ) < statement > else < statement >

               If the < condition > is true the first < statement > is executed, else the second one.


       |if ( < condition > ) < statement >

               The < statement > is executed if the < condition > is true.


       |while ( < condition > ) < statement >

               While the < condition > evaluates to true, execute the < statement > .


       |for ( < variable > in < variable > ) < statement >

               The second < variable > must be an array, and then for each element of that array the < statement > is executed, with the first < variable > set to the value of that element.


       |for ( < simple-statement > ; < condition > ; < simple-statement > ) < statement >

               Execute the first < simple-statement > , then loop on the sequence: evaluate the < condition > , if true execute the < statement > , then execute the second < simple-statement > .


       |for ( < simple-statement > ;; < simple-statement > ) < statement >

               Identical to the above form except the condition is always true. This loop can be exited only by a break, next, or exit.


       |break < terminator >

               Get out of the current loop (the innermost one if several loops are embedded).


       |continue < terminator >

               Go directly to the next iteration through the current loop.


       |{  < block > }

               Execute the < block > (see the definition of a < block > above).


       |next < terminator >

               The next statement causes the current record to be abandoned, the next record to be read and execution to resume at the beginning of the program body.


       |exit < expression > < terminator >


       |exit < terminator >

               The exit statement is equivalent to the end of the last file. If an expression follows exit, it is evaluated and its value is used as the return code from awk.
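
The statements above can be combined as in the following sketch, which assumes a POSIX shell and awk: next skips comment records, exit stops processing as if the input had ended, and a C-style for loop sums the fields of every other record. The function name sum_fields and the ``stop'' keyword are hypothetical.

```shell
#!/bin/sh
# sum_fields: prints the sum of the fields of each record,
# skipping records that start with "#" and stopping at "stop".
sum_fields() {
    awk '
        /^#/ { next }
        $1 == "stop" { exit }
        {
            total = 0
            for (i = 1; i <= NF; i++)
                total += $i
            print total
        }
    '
}
printf '# comment\n1 2 3\n10 20\nstop\n5 5\n' | sum_fields
```

The record after ``stop'' is never read, so only two sums are printed.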



< condition >
       := < expression >

               As in the C language, the < condition > is true iff the < expression > evaluates to a non-zero value.


       | < relational-expression >


       | < match >


       | < composed-condition >

< composed-condition >
       := < condition > || < condition >


       | < condition > && < condition >


       |! < condition >


       |( < composed-condition > )

               The syntax of < condition > s is very similar to that of the < pattern > s. Note that, in contrast to C, an expression is meaningful as a condition but the converse is not true.

< simple-statement >
       :=print < list > < redirection > < expression >

               The items of the < list > as well as the final < expression > are evaluated as character strings; then the items are printed, separated by the output field-separator (variable OFS) to the file whose name is the value of the final < expression > (this file is created if non-existent). If the file did exist, the text replaces its contents, except that if < redirection > is ``>>'', the text is appended to the file.


       |print < list >

               Same as above, the output file being stdout.


       |print < redirection > < expression >


       |print

               If print has no arguments, $0 (the current record) is printed.


       |printf < list > < redirection > < expression >


       |printf < list >

               As print, but the first item in the list is interpreted as a character string to yield a format, which is used to print the other items, with the same conventions as in the C  printf function.


       | < expression >
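
As an illustration of printf, the sketch below formats the second field of each record with two decimals, using the same conversion specifications as the C printf function. It assumes a POSIX shell and awk and writes to stdout only, leaving the redirection forms aside; format_prices is a hypothetical name.

```shell
#!/bin/sh
# format_prices: the first item of the list is the format string,
# the remaining items are the values it consumes.
format_prices() {
    awk '{ printf "%s costs %.2f\n", $1, $2 }'
}
printf 'tea 2.5\ncoffee 3\n' | format_prices
```

Unlike print, printf adds no output record-separator, which is why the format ends with an explicit \n.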



< expression >
       := < expression > < term >

                < expression > and < term > are evaluated to character strings and catenated.


       | < term >


       | < value > = < term >


       | < value > += < term >


       | < value > -= < term >


       | < value > *= < term >


       | < value > /= < term >


       | < value > %= < term >

               Assignment operators, which have the same meaning as the corresponding operators in the C language.



< term >
       := < value >


       |( < expression > )


       | < term > + < term >


       | < term > - < term >


       | < term > * < term >


       | < term > / < term >


       | < term > % < term >


       |+ < term >


       |- < term >

               Dyadic and monadic operators, which have the same meaning and syntax as in C.


       |++ < value >


       |-- < value >


       | < value > ++


       | < value > --

               Pre- and post-increment and decrement operators, as in C.


       | < function > ( < expression > )


       | < function > ()


       | < function >

               Where < function > is one of the intrinsic functions of awk (see the list of these functions below). If there is no argument, by default $0 (the current record) is used.


       |getline

               getline reads the next record and returns it ($0) as its value, without breaking the program flow as next does.


       |sprintf < list >

               The first item in < list > is taken to be a format string. Similar to the sprintf of the standard C library.


       |substr ( < expression > , < expression > , < expression > )

               Returns the sub-string of the first < expression > which starts at the position specified by the second < expression > , and whose length is at most the value of the third expression.


       |substr ( < expression > , < expression > )

               Returns the terminal substring of the first < expression > which starts at the position specified by the second < expression > .


       |split ( < expression > , < variable > , < expression > )

               Sets < variable > to an array whose elements are the substrings obtained by splitting the first string < expression > wherever the separator occurs; the separator is the first character of the second string < expression > . Returns the number of elements of that array.


       |split ( < expression > , < variable > )

               Like the previous form, but using as separator the field-separator character specified by the built-in variable FS.


       |index ( < expression > , < expression > )

               Returns an integer, the position of the first occurrence of the second string < expression > as a substring of the first one; returns 0 if there is no occurrence.
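
The string functions split, substr and index can be tried together on a single record. The sketch assumes a POSIX shell and awk, and the sample record is invented for the illustration; string_demo is a hypothetical name.

```shell
#!/bin/sh
# string_demo: splits the record on ":", then shows both forms of
# substr and the position returned by index (counting from 1).
string_demo() {
    awk '{
        n = split($0, part, ":")
        print n, part[1], part[n]
        print substr($0, 1, 3), substr($0, 5)
        print index($0, "/")
    }'
}
printf 'usr:/home/usr:sh\n' | string_demo
```

Positions are counted from 1, so the ``/'' found by index is at the same position at which the two-argument substr starts.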



< value >
       := < variable >


       | < variable > [ < expression > ]

                < variable > must be an array, or must be mentioned here for the first time. < expression > must evaluate to a scalar value.


       | < field >


       |number

               A number is a floating-point number written as a sequence of digits, with an optional decimal point and exponent.


       |string

               A string constant is a sequence of characters between double quotes `"'. The ``\'' character may be used to quote the next character, which makes it possible to specify characters that could not otherwise be put in the string:

\\: A \.

\": A double quote ".

\n: A newline.

\t: A < Tab > .



< field >
       :=$ < expression >

                < expression > must evaluate to a non-negative integral value. $0 is the current record, and cannot occur on the left of an assignment operator. $n where n  != 0 represents the n th field, and can be assigned to as any other.



< function >
       :=length

               The function length gives back the length of its argument ($0 by default) interpreted as a character string.


       |log

               logarithm function.


       |int

               floor function.


       |exp

               exponential function.


       |sqrt

               square root function.

               These functions interpret their argument ($0 by default) as numbers, and return what their name implies.



< variable >
       :=NF

               The variable NF holds the number of fields of the current record.


       |NR

               NR holds the ordinal number of the currently processed record.


       |FS

               FS holds the field-separator character (this character is taken from the first character of the value of FS interpreted as a string). By default this character is the space, unless the option -F has been given.


       |RS

               RS holds the record-separator character, which by default is the newline. If ``RS'' is an empty string, the records will be separated by an empty line.


       |OFS

               OFS holds the output field-separator character which, by default, is the space.


       |ORS

               ORS holds the output record-separator character which, by default, is the newline.


       |OFMT

               OFMT holds the default output format for numbers which, by default, is ``%.6g''.


       |FILENAME

               Holds the current filename.


       |identifier

               An identifier is a sequence of letters, digits and ``_'', not beginning with a digit, and not one of the names of built-in functions and variables. Variables are initialized to the empty string (i.e. this is the value they have when used before being assigned to).
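The roles of FS, OFS and OFMT can be illustrated outside awk. The following Python sketch is only an approximation: the helper name double_numbers is hypothetical, and it assumes that every occurrence of the separator character delimits a field, which the text above does not specify for runs of separators. It mimics an action that doubles each numeric field and prints the record.

```python
def double_numbers(record, fs=" ", ofs=" ", ofmt="%.6g"):
    """Split a record on FS, double the numeric fields, rejoin with OFS.

    Hypothetical helper: it only mimics how FS, OFS and OFMT interact.
    """
    sep = fs[0]                      # only the first character of FS is used
    out = []
    for field in record.split(sep):
        try:
            # a computed number is printed through the OFMT format
            out.append(ofmt % (float(field) * 2))
        except ValueError:
            out.append(field)        # a non-numeric field is kept as a string
    return ofs.join(out)

print(double_numbers("pi:3.14159265358979", fs=":;"))
```

Only the ``:'' of the two-character FS value is used, and the doubled value is printed with six significant digits because of the default OFMT.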

< regular-expression >
       :=/re/

               Look at the Appendix for the syntax of regular expressions.



< match >
       :=( < match > )


       | < expression > ~ < regular-expression >

               True iff < expression > matches < regular-expression > .


       | < expression > !~ < regular-expression >

               True iff < expression > does not match < regular-expression > .

< relational-expression >
       := < expression > == < expression >


       | < expression > != < expression >


       | < expression > >= < expression >


       | < expression > <= < expression >


       | < expression > > < expression >


       | < expression > < < expression >


       |( < relational-expression > )

               These operators have the same meaning as in the C language.



< list >
       :=( < list > )


       | < list > , < expression >


       | < expression >



< redirection >
       := >


       |>>



< terminator >
       :=;


       |newline

Unlike in C, newlines are significant, since they can mark the end of a statement; they are nevertheless allowed after if(...), else, while(...), and for(...). Outside of character string constants and regular expressions, ``#'' signals the beginning of a comment, and the rest of the line is ignored.



Option:


The option -Fc changes the default field-separator character to c. If c is ``t'', it is understood as < Tab > .



Examples:



·   To count the number of lines of a file (same as wc -l file):

awk "END{print NR}" file


·   To print a file, each line prefixed with its line number:

awk "{print NR, "'$'"0}" file
or more reasonably, place the following line in a separate awk program:

{print NR, $0}


·   To print all lines of a file which exceed 79 characters:

awk "length > 79" file


·   To print all lines of a file containing December in French or English (equivalent to ``grep \<[Dd][eé]c file''):

awk "/\<[Dd][eé]c/" file


·   To find files in the current directory dated between the 21st and the 31st of December:

ls -T | awk "$1 ~ /Dec/ && $2>20{print $4}"
Let us follow how example 5 works. First, it is equivalent to running awk on the output of ls -T1 (the option -1 of ls is implied in case of a pipe). A typical line of that file looks like:
Dec 25 21:07 c:\bin\awk.exe
So when processing the file, $1 is the month (here ``Dec''), $2 is the day (here ``25''), $3 is the hour (or the year for files more than 6 months old), (here ``21:07''), $4 is the filename (here ``c:\bin\awk.exe''). ``$1 ~ /Dec/'' selects lines for december, and ``$2 > 20'' selects amongst those the ones whose day is greater than 20 (the operator ``>'' forces the second field to be interpreted as a number). The action for selected lines is to print the fourth field, i.e. the filename.



·   To count the number of files dated from each month (this example uses an associative array):

ls -T | awk -f count
where count contains
$1~/Jan/{n["January"]++}
$1~/Feb/{n["February"]++}
$1~/Mar/{n["March"]++}
$1~/Apr/{n["April"]++}
$1~/May/{n["May"]++}
$1~/Jun/{n["June"]++}
$1~/Jul/{n["July"]++}
$1~/Aug/{n["August"]++}
$1~/Sep/{n["September"]++}
$1~/Oct/{n["October"]++}
$1~/Nov/{n["November"]++}
$1~/Dec/{n["December"]++}
END{ for (m in n)
     { if (n[m] > 1) NUM="s"
       else NUM=""
       print m ":",n[m],"file" NUM
     }
   }
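For readers more at ease with a general-purpose language, the last two examples can be restated in Python. This is a sketch under assumptions, not a description of how awk works internally: the function names and the sample listing lines (other than the awk.exe line quoted above) are hypothetical, fields are split on blanks, and a non-numeric field is read as 0 as in standard awk.

```python
import re
from collections import Counter

MONTHS = {"Jan": "January", "Feb": "February", "Mar": "March",
          "Apr": "April", "May": "May", "Jun": "June",
          "Jul": "July", "Aug": "August", "Sep": "September",
          "Oct": "October", "Nov": "November", "Dec": "December"}

def to_number(text):
    """Read a field as a number; non-numeric text counts as 0."""
    try:
        return float(text)
    except ValueError:
        return 0.0

def late_december(lines):
    """Counterpart of: $1 ~ /Dec/ && $2>20 {print $4}"""
    selected = []
    for line in lines:
        fields = line.split()            # $1 is fields[0], $2 is fields[1], ...
        if (len(fields) >= 4 and re.search("Dec", fields[0])
                and to_number(fields[1]) > 20):
            selected.append(fields[3])
    return selected

def count_by_month(lines):
    """Counterpart of the count program; a Counter plays the associative array."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        for abbrev, name in MONTHS.items():
            if fields and re.search(abbrev, fields[0]):
                counts[name] += 1
    # same wording as the END block: plural "s" only when the count exceeds 1
    return ["%s: %d file%s" % (m, n, "s" if n > 1 else "")
            for m, n in counts.items()]

listing = [
    "Dec 25 21:07 c:\\bin\\awk.exe",
    "Dec 03 10:15 c:\\bin\\cat.exe",     # hypothetical line
    "Nov 30 1992 c:\\bin\\cut.exe",      # hypothetical line
]
print(late_december(listing))
print(count_by_month(listing))
```

The order in which ``for (m in n)'' visits the months is not stated in this manual; the Python version simply reports them in order of first appearance.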



Error Messages:


can't open `xxx'
The program file, or an argument file, or a redirection file could not be opened.
error in program
syntax error
lexical error
Errors found in the awk program.
xxx is not an array
The variable after ``in'' in the 2nd form of a ``for'' loop is not an array.
can't set $0
$0 has occurred on the left of an assignment operator (= += -= *= /= %=).
funny variable xxx
illegal arithmetic operator
illegal assignment operator
illegal boolean operator
illegal function type
illegal jump type
illegal relational operator
illegal statement
illegal transformation to statement
illegal reference to array xxx
An array has been referenced in a context where a normal variable was expected.
newline in string
A string constant started with ``"'' has not been closed before the end of the line.
newline in regular expression
A regular expression started with ``/'' has not been closed before the end of the line.
regular expression: missing `]'
A character class opened with ``['' in a regular expression has not been closed before the end of the line.
not enough arguments in printf(xxx)
printf or sprintf does not have the number of arguments required by its format.
trying to access field n
The expression following a ``$'' has a value which does not correspond to the number of a field of the current record.
unexpected break, continue or next
A break, continue, or next has been found at the topmost program level.
too many output files n
The number of files to which output may be redirected is currently limited to 10.
out of memory
format item xxx... too long
record `xxx' has too many fields
record `xxx' too long
string xxx... too long to print
string too long
yacc stack overflow
Various resources have been exhausted.



Portability:


New features of awk introduced in UNIX version V.3 are not yet implemented.


basename - give base part of a pathname



Synopsis:                       basename file


or                                   basename file [... file ] suffix


Extracts the `filename' part from a full pathname.



Description:


In the first form, basename strips from a pathname logical unit and directory specifications. In the second form basename performs this operation on all its arguments except the last, which is interpreted as a suffix and stripped from the filename arguments which end with it. If this suffix has the form s1=s2, all arguments ending with s1 will have this final s1 replaced by s2.



Examples:


C:>basename c:\bin\abc.exe
abc.exe
C:>basename c:\bin\abc.exe c:\bin\other.bak .exe
abc
other.bak
C:>basename c:\bin\abc.exe c:\bin\other.bak .exe=.c
abc.c
other.bak.c
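The behaviour shown in these examples can be summarised by a short Python sketch. It is an illustration only, not the actual implementation: the function name is hypothetical, and the sketch follows the sample output above, where other.bak becomes other.bak.c, by assuming that s2 is appended even to a name that does not end with s1.

```python
def basename(path, suffix=None):
    """Strip drive and directory parts, then apply an optional suffix rule.

    suffix is either "s1" (strip s1) or "s1=s2" (strip s1, then add s2).
    Assumption taken from the sample output: s2 is added in every case.
    """
    # everything up to the last ':' or '\' is the drive or directory part
    cut = max(path.rfind(":"), path.rfind("\\"))
    name = path[cut + 1:]
    if suffix:
        old, _, new = suffix.partition("=")
        if old and name.endswith(old):
            name = name[:-len(old)]
        name += new
    return name

for arg in ("c:\\bin\\abc.exe", "c:\\bin\\other.bak"):
    print(basename(arg, ".exe=.c"))
```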



Notes:


basename is particularly useful in conjunction with the ``command substitution'' performed by The Berkeley Utilities.

For instance, to rename all files ending in .bin to .com you may use the for command of MS-DOS as follows:

for %i in (*.bin) do mv %i `basename %i .bin=.com`

And to move to directory target all C source files such that there exists an executable with the same name:

mv `basename *.exe .exe=.c` \target



See Also:


find.


cal - Display the calendar for a month or a year



Synopsis:                       cal [[month number ] year number ]


Prints the calendar for a given month of a given year, or if the month is omitted, for all months of a given year; if given with no arguments, gives the calendar of current month.

Year may be between 1 and 9999; month must be between 1 and 12.



Notes:


To learn something about the history of England, try cal 9 1752.

Here is the output of cal 7 1993

      July   1993
Su Mo Tu We Th Fr Sa
             1  2  3    
 4  5  6  7  8  9 10    
11 12 13 14 15 16 17    
18 19 20 21 22 23 24    
25 26 27 28 29 30 31    


cat - concatenate files



Synopsis:                       cat file [... file ]




Description:


cat writes the concatenation of all the argument files, one after the other, on stdout. If no argument has been given, or for each occurrence of the argument ``-'', cat takes its input from stdin.



Examples:



·   The following two lines are equivalent:

C:>cat abc
C:>type abc


·   A way to add two lines to the beginning and one line to the end of a text file without using an editor:

C:>cat - autoexec.bat - >autoexec.new
set include=c:\msc\include;c:\msc\include\sys
set lib=c:\msc\lib
^Z
set temp=c:\tmp
^Z
C:>mv autoexec.new autoexec.bat
^Z represents < Control-Z > which informs MS-DOS that an end of file was entered from the console.



Bugs:


Since the same buffer is used for input and output, if one of the files being concatenated is also used as stdout , the contents of the file will be destroyed. In order to append file2 at the end of file1, type:

cat file2 >> file1



See Also:


cp, mv, more.


cb - C beautifier



Synopsis:                       cb [options ] [input file [output file ] ]


cb takes as input a C source file, and rewrites it according to the options specified on the command line.



Description:


By default, cb works on stdin and stdout. cb beautifies a C source file according to your programming style, organizing especially the output of blocks. Preprocessor commands and declarations outside of a function are not changed.



Options:


The following options are available on the command line:

-in
n is an integer giving the indentation used after keywords. By default, 2.
-c
Same level of indentation for a closing curly bracket and its corresponding keyword. By default, the closing curly bracket is on the same level as the opening one.
-s
The statement immediately following a keyword will appear on the same line. By default, it appears on the following line, beginning at the next level of indentation.
-o
The opening curly bracket appears on the same line as its corresponding keyword. By default, it appears on the next line.
-O
The opening curly bracket is on the line following the keyword, beginning at the next level of indentation. By default, it keeps the same level as the keyword.
-n
An else or a while immediately following a closing curly bracket appears on the same line. By default, every closing curly bracket is followed by a newline.
-r
The first statement inside of a block appears on the same line as the opening curly bracket.
-R
The statements inside of a block appear on the same level of indentation as the opening curly bracket. By default, they begin at the next level.



Examples:


Let us have a look at the output of 3 cb options working on the same input file.

C:>cat fopenp.c
FILE *fopenp (path, fn, fnb, mode)
register char *path;char *fn, *fnb, *mode;
{   ...
    if  (*fn == '\\') if ( fd = fopen(fn, mode))
     { strcpy(fnb, fn);return (fd);}
    else return(NULL);
    do { while  ( *path != ';')*dst++ = *path++;
    if  (dst != fnb) *dst++ = '\\';
      while  (*dst++ = *src++);
    if  (fd = fopen (fnb, mode))return (fd);}
    while  (*path++);return (NULL);}
Default option:
C:>cb fopenp.c
gives as output:
FILE *fopenp (path, fn, fnb, mode)
register char *path;char *fn, *fnb, *mode;
{   ...
    if  (*fn == '\\')
      if  ( fd = fopen(fn, mode))
      {
        strcpy(fnb, fn);
        return (fd);
      } 
      else 
        return(NULL);
    do 
    {
      while  ( *path != ';')
        *dst++ = *path++;
      if  (dst != fnb)
        *dst++ = '\\';
      while  (*dst++ = *src++);
      if  (fd = fopen (fnb, mode))
        return (fd);
    } 
    while  (*path++);
    return (NULL);
}
Options -o and -n joined:
C:>cb -on fopenp.c
gives as output:
FILE *fopenp (path, fn, fnb, mode)
register char *path;char *fn, *fnb, *mode;
{   ...
    if (*fn == '\\')
      if ( fd = fopen(fn, mode)) {
        strcpy(fnb, fn);
        return (fd);
      } else
          return(NULL);
    do {
      while ( *path != ';')
        *dst++ = *path++;
      if (dst != fnb)
        *dst++ = '\\';
      while (*dst++ = *src++);
      if (fd = fopen (fnb, mode))
        return (fd);
    } while (*path++);
    return (NULL);
}
Options -r and -s joined:
C:>cb -rs fopenp.c
gives as output:



FILE *fopenp (path, fn, fnb, mode)
register char *path;char *fn, *fnb, *mode;
{   ...
    if  (*fn == '\\') if  ( fd = fopen(fn, mode))
    { strcpy(fnb, fn);
      return (fd);
    } 
    else  return(NULL);
    do 
    { while  ( *path != ';') *dst++ = *path++;
      if  (dst != fnb) *dst++ = '\\';
      while  (*dst++ = *src++);
      if  (fd = fopen (fnb, mode)) return (fd);
    } 
    while  (*path++);
    return (NULL);
}



Error Messages:


else not following an if
unbalanced curly brackets
Either message may be printed in case of a syntax error. But be careful: cb is not a syntax analyzer!



Bugs:


If you ask for an output with both the options -r and -o, you won't get exactly what you expect: if there are nested blocks, the shift to the right of the output would very soon get unreadable.



Portability:


All options are enhancements (UNIX version cannot be configured).


cmp - compare binary files



Synopsis:                       cmp [options ] file1 file2 [offset1 ] [offset2 ]


cmp compares file1 (starting at byte offset1 if given) to file2 (starting at byte offset2 if given). A file given as `-' is taken to be standard input. Wildcards can be used to specify two files. By default, nothing is printed if files are identical; byte and line number of first difference are given otherwise. If one file is identical to some initial part of the other, it is reported.



Options:


-l
long mode: all differences are reported (not only the first one).
-s
silent mode, nothing reported in any case. Only the exit status indicates the result of the comparison:

      0 for identical arguments
      1 for differences
      2 for errors.
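The comparison and its exit status can be pictured with the following Python sketch. It is a simplified model, not the code of cmp: the function name is hypothetical, the inputs are byte strings already read into memory, and counting bytes and lines from the given offsets is an assumption.

```python
def cmp_bytes(data1, data2, offset1=0, offset2=0):
    """Return (status, byte, line) for two byte strings.

    status is 0 for identical inputs and 1 for differences; byte and line
    are the 1-based position of the first difference, or None when one
    input is just an initial part of the other (or when both are equal).
    """
    a = data1[offset1:]
    b = data2[offset2:]
    line = 1
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return 1, i + 1, line
        if x == 0x0A:                # a newline starts the next line
            line += 1
    if len(a) != len(b):
        return 1, None, None         # one input is a prefix of the other
    return 0, None, None

print(cmp_bytes(b"abc\ndef", b"abc\ndxf"))
```

Status 2 (errors, for instance a file that cannot be opened) is outside the scope of this sketch.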



See Also:


diff.


comm - Look for common lines in two files



Synopsis:                       comm [options ] file1 file2


comm works on two already sorted files, and writes its result to stdout.



Description:


The default is to give the lines common to the two files.



Options:



Two options are allowed on the command line:


-1
Asks comm to give as output the lines which are only in file1.
-2
Asks comm to give as output the lines which are only in file2.



Examples:


C:\>ls -1 \util\src >files.c
C:\>ls -1 a:util\src >files.a
C:\>comm files.c files.a
Since, by default, the output of ls is sorted alphabetically, comm gives the list of files which belong to both subdirectories.
C:\>comm -2 files.a files.c
lists the files which appear only in the subdirectory C:\util\src .
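Because both inputs are already sorted, the three outputs described above can be obtained in a single pass. The Python sketch below illustrates that idea under assumptions: the function name, the mode names (standing for no option, -1 and -2) and the sample file names are hypothetical, and nothing here is taken from the actual source of comm.

```python
def comm(lines1, lines2, mode="common"):
    """Walk two sorted lists in parallel.

    mode is "common" (no option), "only1" (option -1) or "only2" (option -2).
    """
    i = j = 0
    common, only1, only2 = [], [], []
    while i < len(lines1) and j < len(lines2):
        if lines1[i] == lines2[j]:
            common.append(lines1[i])
            i += 1
            j += 1
        elif lines1[i] < lines2[j]:
            only1.append(lines1[i])   # cannot appear later in lines2
            i += 1
        else:
            only2.append(lines2[j])   # cannot appear later in lines1
            j += 1
    only1.extend(lines1[i:])          # leftovers belong to one file only
    only2.extend(lines2[j:])
    return {"common": common, "only1": only1, "only2": only2}[mode]

files_c = ["awk.c", "cat.c", "cut.c"]      # hypothetical content of files.c
files_a = ["cat.c", "cut.c", "diff.c"]     # hypothetical content of files.a
print(comm(files_c, files_a))
print(comm(files_a, files_c, "only2"))
```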



See Also:


diff, sort, uniq.


cp - Copy files and directories



Synopsis:                       cp [options ] source target


or                                   cp [options ] file|dir ... file|dir dir


cp copies files or directories matched by the pathnames given as argument.



Description:


There are two forms of the command:

·   short form: there are only two arguments, and furthermore both arguments consist of one file, or both of one directory, or the second argument is a new name. The first argument is copied over the second (target).
·   long form: the last argument is a directory (target) and all other arguments are copied to that target directory.
Watch out when using wild cards (like file.*), as the target (the last argument) must expand to at most one name.



Options:



The possible options on the command line are:


-r
Allows cp to copy (and possibly overwrite) non-empty directories (if not given, only empty directories are copied or overwritten).
-m
When copying directories, merges the source with the target (instead of overwriting the target).
-v
Gives on stdout a report on copied files.
-f
Do not ask confirmation before overwriting read-only files (by default the authorization of the user is asked).
-i
Asks confirmation before overwriting any file or directory.
-I
Asks confirmation before copying any file or directory. This option implies the -i option.


When the options -i or -I are given, the only answers allowed are:

n:
continue, do not overwrite or copy.
q:
leave.
g:
(go) stop asking questions.
y:
overwrite or copy.
s:
answer valid only for a directory. Overwrite or copy without asking further confirmations for files or sub-directories of this directory.


Examples:


cp -rvm a:dbaseiv c:\
Adds the contents of directory dbaseiv on diskette a: to the hard disk c:. If this directory already exists on c:, the files in it whose names do not conflict with a name in a:dbaseiv are not overwritten (option -m). The actions performed are reported (option -v).



Notes:


MS-DOS's COPY is capable of preventing the copying of a file over itself in simple cases, but will fail in more complicated cases (and trash the file):

C:\>COPY top.map top.map
File cannot be copied onto itself
        0 file(s) copied
C:\>COPY t*.map top.map
1 file copied.
cp does not make that kind of mistake.


cut - cut out selected fields



Synopsis:                       cut -clist [files ]


or                                   cut -flist [options ] [files ]


cut cuts out columns or fields from each line of the files given as arguments, according to the options specified on the command line. If no files are specified, or if the file name is -, cut works on stdin.



Description:


cut looks at every line of files and copies to the standard output only the fields (option -f) or characters (option -c) specified in the argument list. list must immediately follow the option (no space allowed). list is a comma-separated list of integers or integer ranges, given in increasing order. A range is specified by a - as in 8-12. A - not preceded by a number means that the range begins with the first character or field. A - not followed by a number means that the range ends at the end of the line, with the last character or field.
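The list syntax can be made concrete with a small Python sketch. It is only a model of the rules just stated: the function names are hypothetical, no error checking is done, and the option -s is not modelled (a line without the delimiter is returned unchanged).

```python
def parse_list(spec):
    """Turn a list such as "-3,8-12,20-" into (first, last) pairs.

    last is None for a range that runs to the end of the line.
    """
    ranges = []
    for item in spec.split(","):
        if "-" in item:
            low, high = item.split("-", 1)
            first = int(low) if low else 1       # "-n" starts at position 1
            last = int(high) if high else None   # "n-" runs to the end
        else:
            first = last = int(item)
        ranges.append((first, last))
    return ranges

def cut_chars(line, spec):
    """Model of option -c: keep the listed character positions."""
    return "".join(line[first - 1:last] for first, last in parse_list(spec))

def cut_fields(line, spec, delim="\t"):
    """Model of option -f: keep the listed fields, delimited by delim."""
    if delim not in line:
        return line                              # copied just as it is
    fields = line.split(delim)
    picked = []
    for first, last in parse_list(spec):
        picked.extend(fields[first - 1:last])
    return delim.join(picked)

print(parse_list("-28"))
print(cut_chars("apples 12 \\kilos", "3-10"))
print(cut_fields("apples 12 \\kilos", "2", "\\"))
```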



Options:


One of the two following options must appear on the command line:

-clist
list represents character positions; each integer is the position of a character on the line: for instance, the list -28 asks cut to copy the first 28 characters of every line of files to the standard output.
-flist
list represents field positions; each integer is the position of a field on the line. Fields are delimited by a special character (see option -d). If no delimiting character appears on one line, this line will be copied just as it is to the standard output, unless option -s has been given.
-dc
Take c as the delimiting character. By default the fields are delimited by tabs.
-s
Do not output lines containing no delimiters.



Examples:



C:\>cat junk
apples 12 \kilos
raisins 14 \pounds
oranges 23 \units

C:\>cut -c3-10 junk
ples 12
isins 14
anges 23

C:\>cut -f2 -d\ junk
kilos
pounds
units



See Also:


paste.


df - Displays space left on various drives



Synopsis:                       df [drive specifications ]


Shows statistics for the space used on all hard disk drives (default).

Optionally, a list of drives, including floppies, may be given as an argument. A drive may be specified as a single letter or as a single letter followed by ``:''. Ranges are allowed.



Examples:



·   ask for space left on floppy a: and hard drive e:

 df a e:


·   ask for space left on c:, d:, and e:

 df c:-e:
or
 df c-e
Here is the output which might be given by the above command:
drive  total bytes  bytes used  (%)   bytes free  (%)  cluster size
  C:      33462272    24475648 73.1%     8986624 26.8%       2048
  D:     314613760   295624704 93.9%    18989056  6.0%       8192
  E:      83247104    81633280 98.0%     1613824  1.9%       8192
Total    431323136   401733632 93.1%    29589504  6.8%


diff - Compare files or directories



Synopsis:                       diff [options ] f1 ... fn


diff compares files or directories. If the argument - is given, stdin is used.



Description:


When there are two arguments: if they are binary files, diff just tells whether they differ; if they are text files, diff reports, in a format similar to an ed script, which lines must be changed to make f1 identical to f2. diff gives no report when the files are identical, unless the option -s described below has been given. If f1 and f2 are both directories, they are first sorted, and then diff shows the files which appear in only one of them, and gives a report on files or subdirectories with the same name. If there are more than two arguments, the last one (fn) must be a directory, and for each other argument fi, diff compares fi with the file of the same name in fn. If the option -r described below has not been given, diff reports common subdirectories, even if they are equal.



Options:



The possible options are:

-t
Consider all files as binary, i.e. just tell if they differ.
-h n
(half-hearted) Use a faster algorithm which also requires less memory for big files, but which is less precise and may give spurious results or no result at all (while the usual algorithm is guaranteed to find the minimum necessary set of lines to change). The optional number n is the maximum number of lines that a single difference can be (resynchronisation is on 3 identical lines); the default n is 200, and making it bigger may make diff -h work in more situations at the cost of slower execution.
-r
Tells diff to work recursively on subdirectories.
-s
Give also a report on identical files.
-b
Ignore final whitespace (blanks and tabs) at the end of a line and consider as equal any other non-empty sequence of whitespace characters when comparing lines.


Only one of the following options may be given at once:
-e
Gives a true ed script.
-f
Give an inverted script.
-cn
Give n lines of context around each difference. By default, 3 lines are given.
-Dname
Useful mostly when dealing with C source files. Gives on stdout a new file which has #ifdef's such that it will compile as f2 if headed with #define name and as f1 otherwise. The result might not compile if there were already #ifdef's within the differences.



Examples:


We show the output given by various options on the following two files:

C:>cat a.c
#define LINT_ARGS
#include <stdio.h>
main(){
printf("hello, world!");
}
C:>cat b.c
#include <stdio.h>
main(){
printf("hello, world!");
exit(0);
}
Default behavior:
C:>diff a.c b.c
<<< a.c  and b.c  differ >>>
1d0
< #define LINT_ARGS
4a4
> exit(0);
Option ``conditional compilation'':
C:>diff -Dx a.c b.c
#ifndef x
#define LINT_ARGS
#endif /* x */
#include <stdio.h>
main(){
printf("hello, world!");
#ifdef x
exit(0);
#endif /* x */
}
Option ``ed script'':
C:>diff -e a.c b.c
4a
exit(0);
.
1d
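Note that the ed script lists the last change first (4a before 1d): applying changes from the bottom of the file upwards keeps the line numbers of the remaining commands valid. The Python sketch below reproduces this ordering. It relies on difflib from the Python standard library, whose matching algorithm is not the one used by diff, and the function name ed_script is hypothetical.

```python
import difflib

def ed_script(old_lines, new_lines):
    """Build ed commands turning old_lines into new_lines, last change first."""
    script = []
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    # opcodes come in file order; reversing them emits bottom-up commands
    for tag, i1, i2, j1, j2 in reversed(matcher.get_opcodes()):
        if tag == "equal":
            continue
        if tag == "insert":
            script.append("%da" % i1)            # append after line i1
        else:
            # ed line numbers start at 1; a single line needs no range
            addr = str(i1 + 1) if i2 - i1 == 1 else "%d,%d" % (i1 + 1, i2)
            script.append(addr + ("d" if tag == "delete" else "c"))
        if tag != "delete":
            script.extend(new_lines[j1:j2])      # text entered in insert mode
            script.append(".")                   # a lone dot ends insert mode
    return script

a_c = ["#define LINT_ARGS", "#include <stdio.h>", "main(){",
       'printf("hello, world!");', "}"]
b_c = ["#include <stdio.h>", "main(){",
       'printf("hello, world!");', "exit(0);", "}"]
print("\n".join(ed_script(a_c, b_c)))
```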



Bugs:


The number of lines per file is limited to about 15000 unless option -h is given, in which case there is no limit.



Portability:


The options -r, -s and -c are only found in BSD 4.xx. Option -t is an enhancement.



See Also:


comm, ed, sed.


dtree - display tree structure of directories



Synopsis:                       dtree [option ] [pathname ]




Description:


dtree displays the tree structure formed by subdirectories of the directory given as argument on the command line. By default (when no argument has been given) dtree works on the current directory. Video attributes are used to enhance the display of different levels in the hierarchy.



Option:


The option -a also lists the files in each subdirectory.



Examples:


C:>dtree \
\windows\pif\
\msc\include\sys\
     lib\
\games\
\dos\


Here we have represented different video attributes by different fonts.



Portability:


The use of different video attributes to highlight key parts of the output is an enhancement.



See Also:


ls -RM


du - estimate file space usage



Synopsis:                       du [option ] [pathnames ]




Description:


du reports the disk space used by each argument, in kilobytes. By default (when no argument has been given) du reports on the current directory. Video attributes are used to enhance the display of different levels in the hierarchy.

When a directory is encountered, subdirectories within it are reported on recursively, and then a total printed for that directory.



Options:


-lnn
Set the level of detail to nn; that is, only print on the report those directories whose distance to the top is less than nn (the space of all subdirectories is still accounted for).
-s
Only print the grand total for each argument (equivalent to -l0).



Portability:


The option -l gives more control than UNIX versions of du. The use of different video attributes to highlight key parts of the output is an enhancement.



See Also:


ls -RMU


ech - echo



Synopsis:                       ech [-n ] arg1 arg2 ... argn




Description:


ech echoes its arguments, separated by a space, to stdout and adds an end-of-line after the last argument. ech may be used to find out how The Berkeley Utilities interpret command line arguments.



Option:


The option -n tells ech not to add an end-of-line (\n) character after the last argument.



Examples:


C:\>ech Hello
Hello
C:\>ech $PATH
\bin;\util;\dos
C:\>ls *.dat
C:\                    3 entries       123456 bytes
abc.dat     def.dat     ghi.dat		
C:\>ech *.dat
abc.dat def.dat ghi.dat
C:\>cd \tc\include
C:\TC\INCLUDE>ech  .
c:\tc\include
C:\TC\INCLUDE>ech  ..
c:\tc



Portability:


This command is called echo in UNIX systems, but since ECHO is also an internal command of MS-DOS, we had to give it a different name.


ed - Text editor



Synopsis:                       ed [options ] [file ]


ed edits file if given as argument; file becomes the currently remembered filename (see below; more precisely ed simulates the command ``e file'' described below). If no file argument has been given, the edited buffer starts empty with no current filename.



Description:


Regular expressions are used within ed to specify line addresses and to specify part of lines (in the s command). Please consult the Appendix for more information about regular expressions.



Options:


The possible command-line options are:

-s
``Silent'': suppresses printing of a character and line count for commands e, r and w, of diagnostics when using e and q on a modified buffer, and of the prompt ! for the command !command. Keeps the inscrutable form of error messages of UNIX's ed, that is error messages consisting of a simple ``?''.
-p string
Specifies a prompt string that will be used by ed in command mode.
-f file
Takes the ed script (the sequence of commands to be executed) from file.



Addresses:


Individual lines of the file to edit are specified by addresses built as follows:


.
Represents the current line (which is usually the last line affected by a command).
$
Represents the last line of the edited file.
n
Represents the n th line of the file (n is an integer; counting starts from 1. As a special convention, 0 sometimes represents a place before the first line of the file).
'x
Represents the line addressed by the label x, where x is a lower-case letter (these labels are created by the command k; see below).
/pattern/
Represents the first line from `.' matched by pattern (a regular expression). The search goes forward in the file, and at the end of the file wraps back to the beginning, until a match is found or until the search goes back up to and including its starting line.
?pattern?
Like /pattern/ except that the search goes backwards.
+number
-number
An address followed by + or - followed by a decimal number means that the computed address must be increased (or decreased) by that number of lines. The + sign may be omitted if the preceding address was nonempty. An address starting with + or - is computed with respect to the current line. If no number is given after the + or - the number 1 is taken by default. In addition, several + or - can be given. For instance `++' is the same as `.+2'.
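The offset rules can be checked with a short Python sketch. It covers only a subset of the address forms (`.', `$', a number, and + or - offsets); labels, patterns and the omitted + sign are left out. The function name resolve and its parameters are hypothetical, and the code is not taken from ed.

```python
import re

def resolve(address, current, last):
    """Resolve a simplified ed address to a line number.

    address: optional base ('.', '$' or digits) followed by any number of
    '+', '-', '+n' or '-n' offsets. current and last are the numbers of
    the current line and of the last line of the buffer.
    """
    m = re.fullmatch(r"(\.|\$|\d+)?((?:[+-]\d*)*)", address)
    if m is None:
        raise ValueError("unsupported address: " + address)
    base, offsets = m.group(1), m.group(2)
    if base is None or base == ".":
        line = current               # offsets alone are relative to `.'
    elif base == "$":
        line = last
    else:
        line = int(base)
    for sign, digits in re.findall(r"([+-])(\d*)", offsets):
        step = int(digits) if digits else 1      # a bare + or - means 1
        line += step if sign == "+" else -step
    return line

print(resolve("++", 10, 50), resolve(".+2", 10, 50))
```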



Commands:


The ed commands take 0, 1, or 2 addresses. When 2 addresses are given, they are usually separated by a comma. When two addresses are separated by a semicolon, the current line (`.') is set to the first address and only then the second computed. Several such addresses can be given and then the last two are used for the command. For commands taking two addresses, the second address must always specify a line after the first one in the buffer, and the pair of addresses identifies the range of lines between the two addresses. A command which usually requires n addresses (n = 1 or 2) and has been given fewer addresses assumes default addresses. When one address has been given to a two-addresses command, that address is taken as default for the second address. Finally ``%'' is equivalent to the address pair 1,$. All commands are given below preceded with the specification of their default addresses within brackets.


The first three commands below put ed in insert mode. In that mode any characters entered by the user are taken as text and no command is recognized, except that the character `.' given as the only character on its line exits insert mode and returns to command mode.


[.] a
< text >
.
Appends entered < text > just after the addressed line. The address 0 means the beginning of the file. `.' is set to the last inserted line (to the addressed line if there was no < text > entered).
[.,.] c
< text >
.
Deletes the addressed lines and replaces them by entered < text > . `.' is set to the last entered line (to the next one if no line was entered).
[.] i
< text >
.
Inserts entered < text > just before the addressed line. `.' is set to the last entered line (to the addressed line if no line was inserted).


Other commands:


[.,.] dDeletes the addressed lines. `.' is set to the line after the last deleted line (if the last line of the buffer was deleted then `.' is set to the new last line).
 efile
e!command
In the first form, this command replaces the content of the edited buffer with those of file. `.' is set to the last line read. If no file name has been given, the currently remembered filename, if any, is read. Otherwise file becomes the remembered filename for future e, r, w and f commands. In the second form command is sent to MS-DOS to be executed, and its output (stdout) is read and replaces the contents of the buffer. In that case the remembered filename is not changed. In both forms, if the contents of the buffer have been modified since the last w command, the e command must be confirmed by repeating it.
 Efile
E!command
This command is just like e, except that no confirmation is asked in case of buffer modifications since the last w command.
 ffileIf file is given, this command changes to file the currently remembered filename. Otherwise f just prints on stdout the currently remembered filename.
[1,$] g/pattern/listFirst, all lines containing an occurence of pattern are marked. Then `.' is successively set to each of these lines and the list of commands entered is executed. The list of commands may extend over several lines if each of them, excepted the last, ends with a \. The commands a, c and i are allowed and insert mode is escaped either by a solitary dot (.) or by a line not ending with \. Commands g and v are not allowed in the list of executed commands. An empty list is equivalent to the p command.
[.,.+1] jJoins consecutive lines specified by the addresses (suppressing intervening newline characters).
[.] kx``Labels'' with x the addressed line. x must be a lower-case letter. 'x can then be used to address that line. `.' is left unchanged.
[.,.] l
Prints the addressed lines ``visibly'': nonprintable characters such as `tab' or `newline' are represented, as in C, by mnemonics, and other nonprintable characters are represented by their octal code. In addition, lines longer than the screen width are folded. l may be added as a flag to any command except e, f, r and w, and then has the effect of printing the new `.' after execution of that command.
[.,.] ma
Moves the addressed lines to just after the line addressed by a. Address 0 is allowed for a, meaning before the first line. `.' is set to the new position of the last moved line.
[.,.] n
Prints the addressed lines, each preceded by its line number and a tab. `.' is set to the last printed line. n may be added as a flag to any command except e, f, r and w, and then has the effect of printing the new `.' after execution of that command.
[.,.] p
Prints the addressed lines. `.' is set to the last line printed. p may be added as a flag to any command except e, f, r and w, and then has the effect of printing the new `.' after execution of that command.
 P
The prompt in command mode is set to * the first time this command is executed. Each subsequent use of P toggles the prompt between * and empty.
 q
Leaves ed without saving the buffer. This command must be confirmed by giving it twice if the buffer has been modified since the last w command.
 Q
Leaves ed; does not ask for confirmation even if the buffer has been changed.
[$] r file
r !command
In the first form, inserts the contents of file in the buffer just after the addressed line. If no file name is given, the remembered filename is used. Otherwise, file becomes the currently remembered filename only if it was the first name given since entering ed. The address 0 is allowed, meaning before the first line. The number of lines and characters read is printed, and `.' is set to the last line read. In the second form, command is sent to MS-DOS to be executed and its output (stdout) is read into the buffer. In that case the remembered filename is not changed.
[.,.] s/pattern/repl/
s/pattern/repl/g
s/pattern/repl/n
Performs substitutions on the addressed lines containing the pattern. Depending on the flag, the first occurrence (no flag given), all occurrences (with the g flag) or the n-th occurrence (with a numeric flag n) of the pattern in each of these lines is replaced by the string repl. Any character other than a space or a newline can be used as the delimiter for the pattern and the replacement. `.' is set to the last line where a substitution occurred. Several characters have a special meaning in repl: & represents the part of the line which matched the pattern, and \n, where n is a single digit, represents the part of the line matched by the n-th sub-regular expression (delimited in pattern by \( and \)). If repl consists only of the character `%', it is replaced by the value it had in the last s command. The special meaning of &, \ and % can be escaped by preceding them with another \. It is possible to replace a line by several lines by putting newlines in repl; each of these must be preceded by a \, so that repl consists of several lines, all but the last ending in a \. This is not allowed within a g command.
[.,.] ta
Copies the addressed lines to just after the line addressed by a. `.' is set to the last copied line. Address 0 is allowed for a.
 u
Undoes the last command which modified the buffer, i.e. the last command among a, c, d, g, i, j, m, r, s, t and v.
[1,$] v/pattern/list
This command is just like g, except that the list of commands is executed on the lines containing no match of the pattern.
[1,$] w file
w !command
In the first form, the addressed lines are written to file. If no file name is given, the currently remembered filename is used. Otherwise, file becomes the currently remembered filename only if it was the first name given since entering ed. The number of lines and characters written is printed. `.' is left unchanged. In the second form, command is sent to MS-DOS to be executed, its standard input (stdin) being a file consisting of the addressed lines. In that case the remembered filename is not changed.
[$] =
Prints the line number of the addressed line. `.' is not changed.
 !command
Sends command to MS-DOS to be executed. If the first character of the command is !, it is replaced by the last command executed by another ! command in ed. `.' is left unchanged.
[.+1]
An address alone on a line is equivalent to the command p. A < CR > alone on a line is equivalent to the command `.+1p'.
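
The replacement rules of the s command (the & and \n characters in repl, and the choice between the first, the n-th or all occurrences) can be illustrated by a minimal Python sketch. This is not the code of ed. The names translate_repl and ed_substitute are hypothetical, the pattern is assumed to be written in Python's regular expression syntax (groups are written ( ) rather than \( \)), and the `%' shorthand is not handled.

```python
import re

def translate_repl(repl):
    """Turn an ed-style repl string into a Python re template (illustrative)."""
    def literal(c):
        # A backslash must be doubled to stay literal in a Python template.
        return "\\\\" if c == "\\" else c

    out = []
    i = 0
    while i < len(repl):
        ch = repl[i]
        if ch == "\\" and i + 1 < len(repl):
            nxt = repl[i + 1]
            if nxt in "0123456789":
                out.append("\\g<" + nxt + ">")   # \n: n-th sub-expression
            else:
                out.append(literal(nxt))         # escaped &, \ or newline
            i += 2
        elif ch == "&":
            out.append("\\g<0>")                 # &: the whole matched text
            i += 1
        else:
            out.append(literal(ch))
            i += 1
    return "".join(out)

def ed_substitute(line, pattern, repl, flag=None):
    """Apply s/pattern/repl/ to one line; flag is None, "g" or a number string."""
    regex = re.compile(pattern)
    template = translate_repl(repl)
    if flag == "g":
        return regex.sub(template, line)         # every occurrence
    nth = 1 if flag is None else int(flag)
    for count, match in enumerate(regex.finditer(line), start=1):
        if count == nth:                         # only the n-th occurrence
            return line[:match.start()] + match.expand(template) + line[match.end():]
    return line                                  # fewer than n occurrences

print(ed_substitute("foo bar foo", "foo", "[&]"))
print(ed_substitute("foo bar foo", "foo", "[&]", "g"))
print(ed_substitute("john smith", r"(\w+) (\w+)", r"\2, \1"))
```

With these definitions, the three calls print `[foo] bar foo`, `[foo] bar [foo]` and `smith, john` respectively.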


expand - Expands tabs to blanks in character files



Synopsis:                       expand [-tabsize] [-tab1,tab2,...] file(s)


Expands tabs to blank characters in the character files given as arguments, and prints the result to the console (stdout). If no file arguments are given, or if one of them is ``-'', the corresponding input is taken from the console (stdin).



Description:


By default tab stops are put every 8 characters. If the option -tabsize is given, they are put instead every tabsize characters.

If instead the option -tab1,tab2,... is given, tab stops are put at columns tab1, tab2, etc. (origin 0).
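
The two ways of placing tab stops can be sketched in Python as follows. This is only an illustration of the rules above, not the code of expand. The function name expand_tabs is hypothetical, and the behaviour for a tab found beyond the last listed stop (it is replaced here by a single blank) is an assumption, since the description does not specify it.

```python
def expand_tabs(line, tabsize=8, stops=None):
    """Replace each tab in line by blanks up to the next tab stop.

    tabsize: distance between regular tab stops (the -tabsize option).
    stops:   explicit list of stop columns, origin 0 (the -tab1,tab2,... option).
    """
    out = []
    col = 0                                   # current output column, origin 0
    for ch in line:
        if ch != "\t":
            out.append(ch)
            col += 1
            continue
        if stops is None:
            # Regular stops: next multiple of tabsize strictly after col.
            next_col = (col // tabsize + 1) * tabsize
        else:
            later = [s for s in stops if s > col]
            # Assumption: past the last listed stop, a tab becomes one blank.
            next_col = min(later) if later else col + 1
        out.append(" " * (next_col - col))
        col = next_col
    return "".join(out)

print(repr(expand_tabs("ab\tc", tabsize=4)))
print(repr(expand_tabs("a\tb\tc", stops=[4, 10])))
```

The first call pads `ab` to column 4, and the second places `b` at column 4 and `c` at column 10.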



See Also:


unexpand.


find - find files with certain attributes and execute commands on each



Synopsis:                       find pathname-list predicate


find searches for files matching predicate, down in the directory hierarchy below each argument of the pathname-list, or by default, below the current directory.



Description:


predicate is made of primary predicates, which are keywords preceded by a - and followed by 0, 1 or more arguments, and combined with logical operators. The operators are, in order of increasing precedence:

· logical or, represented by the argument -o appearing between two predicates.
For example, -name *.bak -o -name *.tmp is true for each file whose extension is .bak or .tmp.
· logical and, which is implicitly represented by the juxtaposition of two predicates.
For example, -name *.bak -mtime 0 is true for each file whose extension is .bak, and which has been created or modified during the last 24 hours.
· the negation, which is represented by the argument ! preceding a predicate.
For example, ! -name *.bak is true for each file whose extension is not .bak.
· Arguments consisting of parentheses are used to group predicates, changing the default order of precedence of the operators.
For example, ( -name *.bak -o -name *.tmp ) -mtime 0 is true for each file whose extension is .bak or .tmp, and which has been created or modified during the last 24 hours. Since parentheses are just normal arguments on the command line, they must be preceded and followed by at least one space.

By default, directories are looked at before their subdirectories and files. The end of pathname-list (i.e. the beginning of predicate) is indicated by the first argument beginning with ``-'' or ``(''.
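
The effect of the precedence rules can be illustrated by a small evaluator written in Python. It is a sketch under several assumptions, not the parser used by find: the function name evaluate is hypothetical, only the -name and -type primaries are handled (-type f stands in for -mtime 0 so that no timestamps are needed), names are compared without regard to case as is usual under MS-DOS, and any number of predicates may be chained with -o or by juxtaposition.

```python
from fnmatch import fnmatchcase

def evaluate(args, name, is_dir=False):
    """Evaluate a list of predicate arguments for one file (illustrative)."""
    pos = 0

    def peek():
        return args[pos] if pos < len(args) else None

    def take():
        nonlocal pos
        if pos >= len(args):
            raise ValueError("missing argument")
        tok = args[pos]
        pos += 1
        return tok

    def parse_or():                     # lowest precedence: -o
        value = parse_and()
        while peek() == "-o":
            take()
            rhs = parse_and()           # always parsed, so arguments are consumed
            value = value or rhs
        return value

    def parse_and():                    # juxtaposition binds tighter than -o
        value = parse_term()
        while peek() not in (None, "-o", ")"):
            rhs = parse_term()
            value = value and rhs
        return value

    def parse_term():                   # ! binds tightest
        if peek() == "!":
            take()
            return not parse_term()
        return parse_primary()

    def parse_primary():
        tok = take()
        if tok == "(":
            value = parse_or()
            if take() != ")":
                raise ValueError("missing )")
            return value
        if tok == "-name":
            # Assumption: case-insensitive match, as for MS-DOS file names.
            return fnmatchcase(name.lower(), take().lower())
        if tok == "-type":
            kind = take()
            if kind not in ("f", "d"):
                raise ValueError("unknown file type: " + kind)
            return (kind == "d") == is_dir
        raise ValueError("unknown predicate: " + tok)

    result = parse_or()
    if pos != len(args):
        raise ValueError("unexpected argument: " + args[pos])
    return result

grouped = ["(", "-name", "*.bak", "-o", "-name", "*.tmp", ")", "-type", "f"]
ungrouped = ["-name", "*.bak", "-o", "-name", "*.tmp", "-type", "f"]
print(evaluate(grouped, "old.bak", is_dir=True))
print(evaluate(ungrouped, "old.bak", is_dir=True))
```

For a directory named old.bak, the grouped form is false because -type f applies to both names, whereas the ungrouped form is true because -type f is only attached to -name *.tmp.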



Syntax of ``predicate'':


The syntax of predicate may be described by the following formal grammar (in the description, ``iff'' stands for ``if and only if'' and ``|'' stands for ``or''):



< predicate >
       := < conjunctive >

                < predicate > is true iff < conjunctive > is true.


       | < conjunctive > -o < conjunctive >

                < predicate > is true iff one of the two < conjunctives > is true.



< conjunctive >
       := < term >

                < conjunctive > is true iff < term > is true.


       | < term > < term >

                < conjunctive > is true iff the two < terms > are true.



< term >
       := < primary predicate >

                < term > is true iff < primary predicate > is true.


       |! < primary predicate >

                < term > is true iff < primary predicate > is not true.



< primary predicate >
       := ( < predicate > )

                < primary predicate > is true iff < predicate > is true.


       | < primary predicate >

               One of the following predicates defined by keywords:



< primary predicate >
       := -name pattern

                < primary predicate > is true iff the name of the current file is matched by pattern. pattern may contain wild-cards, which are expanded according to the usual rules for filename arguments (for a precise description, look in the general section of the documentation).


       |-perm permission

                < primary predicate > is true iff the file has the given permission. Two values can be specified for permission:

r: true for a read-only file.

w: true for a writable file.


       |-type filetype

                < primary predicate > is true iff the file is of the given type. Two values can be specified for filetype:

f: true for an ordinary file.

d: true for a directory.


       |-size value

                < primary predicate > is true iff the size of the file (given in kilobytes) matches the given value. Three forms are recognized for value; n below is an integer:

n : true for files whose size is exactly n kilobytes.

+n : true for files whose size is more than n kilobytes.

-n : true for files whose size is less than n kilobytes.


       |-mtime value

                < primary predicate > is true iff the file has been modified a number of days ago matching the given value. Three forms are recognized for value; n below is an integer:

n : true for files modified exactly n days ago.

+n : true for files modified more than n days ago.

-n : true for files modified less than n days ago.


       |-newer filename

                < primary predicate > is true iff the current file has been created or modified more recently than filename.


       |-exec command

               The command is sent to MS-DOS to be executed, where command is a sequence of arguments ending with a ``;''. If one of the arguments is {}, this argument is replaced by the current filename. The resulting < primary predicate > is true iff the executed command returns an exit status of 0 (success). For example,
      -exec grep -sw signal {} ;
is true for the files which contain at least one occurrence of the word signal.


       |-ok command

               Like ``-exec'', but command is echoed to the terminal before execution and the user is asked whether it should be executed. If the answer is negative, < primary predicate > is false. For example,
      -ok cat {} ;
asks whether the current file should be copied to the terminal. If the answer is positive, the file is printed and the following predicates are then applied to it. If the answer is negative, find goes on to the next file.


       |-print

               This < primary predicate > is always true, and causes the current path-name to be printed on the standard output.


       |-depth

               This < primary predicate > is always true, and forces the directories to be looked at after their files or sub-directories.
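
The three forms of value accepted by -size and -mtime (n, +n and -n) follow the same comparison rule, which can be sketched in Python as below. The names matches_value and age_in_days are hypothetical, and counting the age of a file in completed 24-hour periods is an assumption suggested by the -mtime 0 example, not a statement about how find computes it.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60

def matches_value(actual, value):
    """Compare an integer quantity with a value of the form n, +n or -n."""
    if value.startswith("+"):
        return actual > int(value[1:])      # +n: more than n
    if value.startswith("-"):
        return actual < int(value[1:])      # -n: less than n
    return actual == int(value)             # n: exactly n

def age_in_days(mtime, now):
    """Assumption: age counted in completed 24-hour periods."""
    return int((now - mtime) // SECONDS_PER_DAY)

now = time.time()
modified_one_hour_ago = now - 3600
print(matches_value(age_in_days(modified_one_hour_ago, now), "0"))
print(matches_value(age_in_days(modified_one_hour_ago, now), "+3"))
```

Under these assumptions, a file modified one hour ago matches -mtime 0 but not -mtime +3.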



Beware: operators ( !, -o, (, ) ) must be separated by one or more spaces from the predicates, argumen