[om-list] Re: shell

Fri Oct 5 10:02:02 EDT 2001

Mark

    Using Cygwin sounds too complicated, and I'm guessing it will slow down
my program on Windows.  I'm already afraid of it being too slow, so ... ?

    Do any of the three ways of compiling my script make the end product
into one program, one executable, as opposed to making the script into a
program which still calls other programs (the utilities)?  Is there no
(easy) way to combine the source code of the utilities and the "source code"
of the shell and scripts, to make one executable?

    Can you give me some idea of how much speed difference there might be
among the three things I'm talking about: one executable, a compiled
shell script running utilities, and an interpreted shell script running
utilities?

    I guess if writing a compiler is too hard, we could manually write a C
program that did the same thing as the compiled shell script would do,
right?

tomp

----- Original Message -----
From: "Mark Butler" <butlerm at middle.net>
To: "Tom and other Packers" <TomP at burgoyne.com>
Cc: "OM List" <om-list at onemodel.org>; "Ben (h) Oman" <ben at pdapla.net>
Sent: Thursday, October 04, 2001 3:44 PM
Subject: Re: shell

Tom and other Packers wrote:

>     I plan on programming and compiling a bunch of small utilities.  Then
> I will write a shell that will follow a script to co-ordinate the running
> of multiple utilities, including other shell scripts.  I want to be able
> to write a low-level shell script with the same basic input/output
> functionality as any of the lower-level utilities.  A low-level script
> would have the following properties:  (1) The data piped into the first
> utilities in this script would be piped into it from some other
> utility/script, as though it (the shell script) were just another
> utility.  (2) The data produced by the last utilities in this script
> would be output from the script as though it were a utility.  (3) The
> script could be run like a utility from a higher-level shell script,
> including the use of parameters and return values.
>
>     Is this not a logical continuation of the original design?  Were you
> assuming as much, or is this harder to do than I think?

First of all, keep in mind that a script needs to be distinguished from a
process or thread executing the script.  A shell, by definition, starts
subprocesses and communicates with those processes.  If the sub-processes
are written in shell language, instead of in compiled code, what you are
really doing is creating a new shell process and telling it to run a certain
shell script.  You can do this as many levels deep as you want.  Unix has a
"fork()" system call that duplicates a process very efficiently for this
exact purpose.
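
A minimal sketch of that fork-and-run pattern in C, assuming a POSIX system
with a shell at /bin/sh.  The function names run_script and
count_output_bytes are made up for illustration; they are not part of any
existing shell or of the design under discussion.  The first function starts
a new shell process for a piece of script text and collects its return
value; the second also connects the child's standard output to a pipe, which
is how a parent can treat a script like any other utility.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `script` in a new shell process and return its
   exit status, or -1 if the child could not be started or did not exit
   normally. */
int run_script(const char *script)
{
    pid_t pid = fork();              /* duplicate the current process */
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: become a shell that runs the given script text. */
        execl("/bin/sh", "sh", "-c", script, (char *)NULL);
        _exit(127);                  /* only reached if execl() failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Hypothetical helper: run `script` with its standard output connected to
   a pipe, and return the number of bytes it wrote, or -1 on error. */
long count_output_bytes(const char *script)
{
    int fd[2];
    if (pipe(fd) < 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {
        close(fd[0]);                /* child does not read from the pipe */
        dup2(fd[1], STDOUT_FILENO);  /* child's stdout now feeds the pipe */
        close(fd[1]);
        execl("/bin/sh", "sh", "-c", script, (char *)NULL);
        _exit(127);
    }
    close(fd[1]);                    /* parent keeps only the read end */
    char buf[4096];
    long total = 0;
    ssize_t n;
    while ((n = read(fd[0], buf, sizeof buf)) > 0)
        total += n;
    close(fd[0]);
    waitpid(pid, NULL, 0);
    return n < 0 ? -1 : total;
}
```

Because the script text can itself start another shell, a call such as
run_script("sh -c 'exit 4'") nests one level deeper and still hands the
return value back to the top.  Each level costs one more process, which is
where the overhead of deeply nested scripts comes from.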

Unfortunately, all this system-level code is not very OS independent.  If
you want to write portable code that also runs on Windows, I suggest you
download and install Cygwin from sources.redhat.com.  Cygwin is a Unix-style
environment for Windows that can compile programs written for the POSIX API.
The resulting executables can be run on any Windows machine with just one
extra DLL.

>     My secondary concern:
>
>     I'm guessing that after I've written a few levels of scripts, the
> high-level shell script would run rather sluggishly.  Do you think so?  In
> these cases, once we've written a high-level script that should be run
> many times, but which is too slow as an interpreted script, would it be
> possible (and reasonably easy) to write a compiler to convert that *same
> script* into something that executes faster?  How could this be done?  Is
> there a way to merge the script and the source code of the shell and the
> utilities in such a way that it can be compiled into one homogeneous
> executable?

Three ways:

  1. Write an MT translator that translates the script into bytecode
     before execution, either ahead of time (Java) or at runtime (Perl)
  2. Write an MT -> C translator, compile the result
  3. Expand on (1) by writing an internal Just In Time compiler

(1) is the standard Java strategy; the Perl interpreter does the same thing
    at runtime.
(2) was used by the initial C++ implementations (cfront).
(3) is used by modern Java virtual machines, and is very difficult.
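
To make strategy (1) concrete, here is a toy sketch in C of the
translate-then-execute split.  It does not implement MT: the source language
here is an invented stand-in (postfix arithmetic such as "2 3 + 4 *"), and
the names translate, execute, and eval_postfix are hypothetical.  The point
is only that the text is parsed once into an instruction array, and the run
loop afterwards never looks at the text again.

```c
#include <ctype.h>
#include <stdlib.h>

enum op { OP_PUSH, OP_ADD, OP_MUL };
struct instr { enum op op; long arg; };

/* Translate step: turn source text into bytecode.  Returns the number of
   instructions, or -1 on a syntax error or if `code` is too small. */
int translate(const char *src, struct instr *code, int max)
{
    int n = 0;
    while (*src) {
        if (isspace((unsigned char)*src)) { src++; continue; }
        if (n == max)
            return -1;
        if (isdigit((unsigned char)*src)) {
            char *end;
            code[n].op = OP_PUSH;
            code[n].arg = strtol(src, &end, 10);
            src = end;
        } else if (*src == '+' || *src == '*') {
            code[n].op = (*src == '+') ? OP_ADD : OP_MUL;
            code[n].arg = 0;
            src++;
        } else {
            return -1;               /* unknown character */
        }
        n++;
    }
    return n;
}

/* Execute step: a plain dispatch loop over the bytecode.  Returns 0 and
   stores the value in *result, or returns -1 on a malformed program. */
int execute(const struct instr *code, int n, long *result)
{
    long stack[64];
    int sp = 0;
    for (int i = 0; i < n; i++) {
        switch (code[i].op) {
        case OP_PUSH:
            if (sp == 64)
                return -1;
            stack[sp++] = code[i].arg;
            break;
        case OP_ADD:
        case OP_MUL:
            if (sp < 2)
                return -1;
            sp--;                    /* pop right operand, update left in place */
            if (code[i].op == OP_ADD)
                stack[sp - 1] += stack[sp];
            else
                stack[sp - 1] *= stack[sp];
            break;
        }
    }
    if (sp != 1)
        return -1;
    *result = stack[0];
    return 0;
}

/* Convenience wrapper: translate, then execute.  Returns -1 on any error
   (valid programs in this toy language never produce a negative value,
   barring overflow). */
long eval_postfix(const char *src)
{
    struct instr code[128];
    long result;
    int n = translate(src, code, 128);
    if (n < 0 || execute(code, n, &result) < 0)
        return -1;
    return result;
}
```

Strategy (2) would replace execute with a step that prints equivalent C
source for each instruction and hands it to a C compiler; strategy (3)
would generate machine code in memory instead.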

- Mark