The Caboteria / Tech Web / TechNotes / UnixNotes (revision 13)


I had a problem once trying to ssh from work to my machine at home. Everything worked fine until I left the connection idle for a while. When I hit a key I got

Read from remote host www.caboteria.org: Connection reset by peer
Connection to www.caboteria.org closed.

Setting ssh's KeepAlive parameter on the client and server didn't seem to do much good. I found a patch that helped, though:

http://www.icl.isc.tohoku.ac.jp/~hgot/sources/openssh-watchdog.html

The "contributed" patch at the bottom is close to the version that ships with Debian Potato, and although it gives errors when you try to apply it it's pretty small and relatively easy to cut-n-paste by hand.


This appeared in the RISKS digest:

Date: Thu, 22 Nov 2001 23:50:39 +0300
From: Diomidis Spinellis 
Subject: Risks of the space character in Unix filenames                    

The root of the problem reported in the "Glitch in iTunes Deletes Drives  
(Solomon, RISKS-21.74)" article is the default way the Unix shell handles 
filenames with embedded spaces.  Although a space can legally appear in a 
Unix filename, such an occurrence is not usual; Unix filenames tend to be
terse, often even shorter than a single word, (e.g. "src", "doc", "etc",   
"bin") so they can be swiftly typed.  A number of more recent and supposedly
user-friendly operating systems like the Microsoft Windows family, and, I   
understand, the MacOS, use longer and more descriptive file names         
("Documents and Settings", "Program Files").  Many of these filenames
contain spaces; the ones I listed are by default used by Windows 2000 as the
location to store user data and application files (the equivalent of
/home/username and /bin under Unix).

As Unix-style tools and relevant applications are increasingly ported to run
under Windows (see for example [1, 2, 3] and my Windows outwit tool suite
described in [4]) or natively run under Mac OS X, problems and associated   
Risks arise.  The main reason is that some often-used Unix shell constructs
fail when applied to filenames containing a space character.  Unfortunately,
these constructs appear in many existing programs, and even in the writings
of the original system developers, who, in all fairness, could not have    
foreseen how their tools would have been used 25 years after their          
conception.        

Technically, the problem manifests itself when field splitting (the process
by which the shell splits input into words) is naively applied on the output
of an expansion that generates filenames with embedded spaces.  Consider the
following example, appearing on page 95 of one of the classic texts on Unix
programming [5]:        
  for i in ch2.*
    echo $i:
    diff -b old/$i $i
    echo
  done

The above code will compare all files matching the ch2.* pattern in the
current directory with copies presumably stored in the directory called
"old".  Consider what will happen when the code is applied to a file called
"ch2.figure 3.dot" (notice the space between the word figure and the "3").
The shell variable i will be set to the correct filename, but then the shell
will execute the "diff" command with the following argument list
(customarily passed to C programs in the argv array):
  argv[0] = "diff"
  argv[1] = "-b"
  argv[2] = "old/ch2.figure"
  argv[3] = "3.dot"
  argv[4] = "ch2.figure"
  argv[5] = "3.dot"

and diff will complain
  diff: extra operand
as more than two filenames were passed as arguments.  This happens,
because words are expanded by most Unix shells in the following order:
  1. Parameter (including variable) expansion, command substitution.
  2. Field splitting.
As a result, the variable $i is first expanded into "ch2.figure 3" and
then the result is split into fields for further processing or for
passing them as arguments to a command.

The most common dangerous constructs that can appear in step 1 are variable
references (e.g. $PATH, $word) and commands inside backquotes (e.g. `find
. -type f -name 'ch2.*'`).  These dangerous constructs are quite common,
appearing among other places in the original article describing the Bourne
shell [6] (for i in * do if test -d $d/$i [...]), in other scripts in the
reference of the original example [5 p. 141, 143], and even in quite recent
work by the same authors [7, p. 149].  It is also prevalent in existing
operating system tools; I counted 43 occurrences of one suspicious pattern
("$*") in a NetBSD source tree, 8 in a FreeBSD command path, and 49 in the
shell scripts of a Mandrake Linux distribution.  The Unix world is
definitely not ready to deal with filenames containing the space character.

Avoiding this problem is not trivial.  A radical solution would be to change
the value of the shell's "internal field separator" (IFS) variable.  This
variable contains the characters that shell uses to split words.  Its
default value is ".  This solution however would break
more things than it would fix, since most scripts expect words to be
separated by spaces.  As an example the construct "A='ls -l';$A" would not
work.  The most practical solution is to manually enclose variables inside
double quotes when using them in contexts where only a single word is
normally expected.  The shell will still expand the variable inside the
quotes, but will treat the result as a single word.  Thus the offending part
in the original example should have been written as:
  diff -b "old/$i" "$i"
In addition, whenever a shell script uses the variable $* to obtain the
values of all parameters passed to a script, the $* variable should be
replaced by the variable $@, again inside double quotes.  Thus the
common code pattern
  for arg in $*
should be written as
  for arg in "$@"
Interestingly, Kernighan and Pike were aware of the $* problem and the above
solution since 1984; they aptly characterize the "$@" solution as "almost
black magic" [5 p. 161].

Still, these changes will not correctly handle filenames with embedded
whitespace returned from a command substitution.  In this case, temporarily
changing the IFS variable before executing a command may be the only
feasible solution.  The following example illustrates this approach:
  # Save original IFS
  OFS="$IFS"
  # Set IFS to newline
  IFS='
  '
  # The find command might output filenames with spaces
  wc -l `find . -type f`
  # Restore original IFS
  IFS="$OFS"

By searching existing shell scripts for the patterns I described and
applying the suggested changes most problems can be solved.  Other scripting
languages like Tcl and, to a lesser extend, Perl may also have problems
dealing with filenames with spaces.  Similar approaches (appropriate quoting
in Perl "eval" blocks and use of the "list" command in Tcl) can be used to
avoid these problems.

References

[1] David G. Korn. Porting Unix to Windows NT. In Proceedings of the
USENIX 1997 Annual Technical Conference, Anaheim, CA, USA, January 1997.
Usenix Association.
[2] Geoffrey J. Noer. Cygwin32: A free Win32 porting layer for UNIX
applications. In Proceedings of the 2nd USENIX Windows NT Symposium,
Seattle, WA, USA, August 1998. Usenix Association.
[3] Stephen R. Walli. OPENNT: UNIX application portability to Windows NT
via an alternative environment subsystem. In Proceedings of the USENIX
Windows NT Symposium, Seattle, WA, USA, August 1997. Usenix Association.
[4] Diomidis Spinellis. Outwit: Unix tool-based programming meets the
Windows world. In USENIX 2000 Technical Conference Proceedings, pages
149-158, San Diego, CA, USA, June 2000. Usenix Association.

[5] Brian W. Kernighan and Rob Pike. The UNIX Programming Environment.
Prentice-Hall, 1984.
[6] S. R. Bourne. The UNIX shell. Bell System Technical Journal,
57(6):65-84 July/August 1978.  (Also appears in volume 2 of the Unix
Programmer's Manual and in AT & T, UNIX System Readings and
Applications, volume I. Prentice-Hall, 1987.)
[7] Brian W. Kernighan and Rob Pike. The Practice of Programming.
Addison-Wesley, 1999.

Diomidis Spinellis - http://www.dmst.aueb.gr/dds/
Athens University of Economics and Business (AUEB)


I like the Sawfish window manager, and my second-favorite theme is STPflat, by rowan@stasis.org. See http://stasis.org/~rowan/graphics.html. If only he had used the Windows "standard" button order...


Info on diskless workstations:

http://www.ltsp.org/ - the linux terminal server project, folks who have a Linux distro that turns an old PC into a diskless X-terminal. It worked very well for me on an old Pentium 133.

http://lists.debian.org/debian-devel/2001/debian-devel-200104/msg00647.html I wasn't able to get reliable performance using the user-space nfs server (many "stale NFS handle" errors in apt-get, probably caused by moving files and trying to use the same handle) but it seems to be reliable using the kernel nfs server.

If you're using NFS you should sweep your filesystems periodically looking for old files named .nfs*. These get left behind when a process has a file open, another process deletes it, and the first process crashes. See http://www.sunmanagers.org/archives/1998/0229.html

See also http://syslinux.hackerdojo.com/pxe.php if you've got a motherboard with a PXE bios.

http://www.securitysage.com/guides/postfix_uce.html has detailed info on how to set up the Postfix MTA to filter spam.

-- TobyCabot - 17 Oct 2001-28 Mar 2002

Edit | Attach | Print version | History: r51 | r15 < r14 < r13 < r12 | Backlinks | Raw View | Raw edit | More topic actions...
Tech.UnixNotes moved from Tech.UnixTips on 17 Feb 2004 - 22:01 by Main.guest - put it back
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding The Caboteria? Send feedback