MOBAGWHO - TCP/IP Internetworking With `gawk'


3.8 MOBAGWHO: a Simple Mobile Agent

There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.
C. A. R. Hoare

A mobile agent is a program that can be dispatched from a computer and transported to a remote server for execution. This is called migration, which means that a process is started on another system and runs independently of its originator. Ideally, it wanders through a network while working for its creator or owner. In places like the UMBC Agent Web, people are quite confident that (mobile) agents are a software engineering paradigm that enables us to significantly increase the efficiency of our work. Mobile agents could become the mediators between users and the networking world. For an unbiased view of this technology, see the remarkable paper Mobile Agents: Are they a good idea?.¹

When trying to migrate a process from one system to another, a server process is needed on the receiving side. Several implementations come to mind, depending on the kind of server process:

HTTP can be used as the protocol for delivery of the migrating process. In this case, we use a common web server as the receiving server process. A universal CGI script mediates between migrating process and web server. Each server willing to accept migrating agents makes this universal service available. HTTP supplies the POST method to transfer some data to a file on the web server. When a CGI script is called remotely with the POST method instead of the usual GET method, data is transmitted from the client process to the standard input of the server's CGI script. So, to implement a mobile agent, we must not only write the agent program to start on the client side, but also the CGI script to receive the agent on the server side.
The PUT method can also be used for migration. HTTP does not require a CGI script for migration via PUT. However, with common web servers there is no advantage to this solution, because web servers such as Apache require explicit activation of a special PUT script.
Agent Tcl pursues a different course; it relies on a dedicated server process with a dedicated protocol specialized for receiving mobile agents.

Our agent example abuses a common web server as a migration tool. So, it needs a universal CGI script on the receiving side (the web server). The receiving script is activated with a POST request when placed into a location like /httpd/cgi-bin/PostAgent.sh. Make sure that the server system uses a version of gawk that supports network access (Version 3.1 or later; verify with `gawk --version').

     
     #!/bin/sh
     MobAg=/tmp/MobileAgent.$$
     # direct script to mobile agent file
     cat > $MobAg
     # execute agent concurrently
     gawk -f $MobAg $MobAg > /dev/null &
     # HTTP header, terminator and body
     gawk 'BEGIN { print "\r\nAgent started" }'
     rm $MobAg      # delete script file of agent

By making its process id ($$) part of the unique file name, the script avoids conflicts between concurrent instances of the script. First, all lines from standard input (the mobile agent's source code) are copied into this unique file. Then, the agent is started as a concurrent process and a short message reporting this fact is sent to the submitting client. Finally, the script file of the mobile agent is removed because it is no longer needed. Although it is a short script, there are several noteworthy points:

Security: There is none. In fact, the CGI script should never be made available on a server that is part of the Internet because everyone would be allowed to execute arbitrary commands with it. This behavior is acceptable only when performing rapid prototyping.
Self-Reference: Each migrating instance of an agent is started in a way that enables it to read its own source code as input data (the receiving script names the agent file both after -f and as the file to process) and use the code for subsequent migrations. This is necessary because it needs to treat the agent's code as data to transmit. gawk is not the ideal language for such a job. Lisp and Tcl are more suitable because they do not make a distinction between program code and data.
Independence: After migration, the agent is not linked to its former home in any way. By reporting `Agent started', it waves “Goodbye” to its origin. The originator may choose to terminate or not.
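
The receiving script above deletes the agent file right after starting the agent in the background, so in principle the file could disappear before the agent has opened it. The following is a minimal sketch of a variant that closes this window; it is not the original script. It assumes that `mktemp' is available, and the function name receive_agent is hypothetical. It is no more secure than the original.

```shell
#!/bin/sh
# Hypothetical variant of PostAgent.sh (a sketch, not the original script).
AWK=$(command -v gawk || command -v awk)   # prefer gawk, fall back to awk

receive_agent() {
  MobAg=$(mktemp /tmp/MobileAgent.XXXXXX) || return 1
  cat > "$MobAg"                 # agent source arrives on standard input
  # Run the agent in the background and delete its file only after it
  # has finished. All output of the subshell is discarded so that the
  # web server does not wait for it.
  ( "$AWK" -f "$MobAg" "$MobAg"; rm -f "$MobAg" ) > /dev/null 2>&1 &
  # Empty header section, terminator and body, as in the original script.
  printf '\r\nAgent started\n'
}
```

When installed as a CGI script, the file would end with a call to receive_agent. The temporary file now lives exactly as long as the agent runs on this host.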

The originating agent itself is started just like any other command-line script, and reports the results on standard output. By letting the name of the original host migrate with the agent, the agent that migrates to a host far away from its origin can report the result back home. Having arrived at the end of the journey, the agent establishes a connection and reports the results. This is the reason for determining the name of the host with `uname -n' and storing it in MyOrigin for later use. We may also set variables with the -v option from the command line. This interactivity is only of importance in the context of starting a mobile agent; therefore this BEGIN pattern and its action do not take part in migration:

     
     BEGIN {
       if (ARGC != 2) {
         print "MOBAG - a simple mobile agent"
         print "CALL:\n    gawk -f mobag.awk mobag.awk"
         print "IN:\n    the name of this script as a command-line parameter"
         print "PARAM:\n    -v MyOrigin=myhost.com"
         print "OUT:\n    the result on stdout"
         print "JK 29.03.1998 01.04.1998"
         exit
       }
       if (MyOrigin == "") {
          "uname -n" | getline MyOrigin
          close("uname -n")
       }
     }

Since gawk cannot manipulate and transmit parts of the program directly, the source code is read and stored in strings. Therefore, the program scans itself for the beginning and the ending of functions. Each line in between is appended to the code string until the end of the function has been reached. A special case is this part of the program itself. It is not a function. Placing a similar framework around it causes it to be treated like a function. Notice that this mechanism works for all the functions of the source code, but it cannot guarantee that the order of the functions is preserved during migration:

     
     #ReadMySelf
     /^function /                     { FUNC = $2 }
     /^END/ || /^#ReadMySelf/         { FUNC = $1 }
     FUNC != ""                       { MOBFUN[FUNC] = MOBFUN[FUNC] RS $0 }
     (FUNC != "") && (/^}/ || /^#EndOfMySelf/) \
                                      { FUNC = "" }
     #EndOfMySelf
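
To see what the scanner collects, the same rules can be run over a small stand-in file. The following is a sketch under stated assumptions: the sample file content and the final listing rule are hypothetical and only serve the demonstration.

```shell
#!/bin/sh
AWK=$(command -v gawk || command -v awk)   # prefer gawk, fall back to awk

# A tiny stand-in for an agent source file (hypothetical content).
sample=$(mktemp)
cat > "$sample" <<'EOF'
BEGIN { print "not collected" }
function hello() {
  print "hi"
}
END {
  hello()
}
EOF

# The scanner rules from above, followed by a rule that lists the keys
# of MOBFUN once the whole file has been read.
collected=$("$AWK" '
/^function /                     { FUNC = $2 }
/^END/ || /^#ReadMySelf/         { FUNC = $1 }
FUNC != ""                       { MOBFUN[FUNC] = MOBFUN[FUNC] RS $0 }
(FUNC != "") && (/^}/ || /^#EndOfMySelf/) { FUNC = "" }
END { for (f in MOBFUN) print f }
' "$sample" | sort)
rm -f "$sample"
echo "$collected"
```

The keys listed are END and hello(). The BEGIN rule of the sample is skipped, which mirrors the fact that the start-up BEGIN pattern of the agent does not take part in migration.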

The web server code in A Web Service with Interaction was first developed as a site-independent core. Likewise, the gawk-based mobile agent starts with an agent-independent core, to which application-dependent functions can be appended. What follows is the only application-independent function needed for the mobile agent:

     
     function migrate(Destination, MobCode, Label) {
       MOBVAR["Label"] = Label
       MOBVAR["Destination"] = Destination
       RS = ORS = "\r\n"
       HttpService = "/inet/tcp/0/" Destination
       for (i in MOBFUN)
          MobCode = (MobCode "\n" MOBFUN[i])
       MobCode = MobCode  "\n\nBEGIN {"
       for (i in MOBVAR)
          MobCode = (MobCode "\n  MOBVAR[\"" i "\"] = \"" MOBVAR[i] "\"")
       MobCode = MobCode "\n}\n"
       print "POST /cgi-bin/PostAgent.sh HTTP/1.0"  |& HttpService
       print "Content-length:", length(MobCode) ORS |& HttpService
       printf "%s", MobCode                         |& HttpService
       while ((HttpService |& getline) > 0)
          print $0
       close(HttpService)
     }

The migrate function prepares the aforementioned strings containing the program code and transmits them to a server. A consequence of this modular approach is that the migrate function takes some parameters that aren't needed in this application, but that will be in future ones. Its mandatory parameter Destination holds the name (or IP address) of the server that the agent wants as a host for its code. The optional parameter MobCode may contain some gawk code that is inserted during migration in front of all other code. The optional parameter Label may contain a string that tells the agent what to do in program execution after arrival at its new home site. One of the serious obstacles in implementing a framework for mobile agents is that it does not suffice to migrate the code. It is also necessary to migrate the state of execution of the agent. In contrast to Agent Tcl, this program does not try to migrate the complete set of variables. The following conventions are used:

Each variable in an agent program is local to the current host and does not migrate.
The array MOBFUN shown above is an exception. It is handled by the function migrate and does migrate with the application.
The other exception is the array MOBVAR. Each variable that takes part in migration has to be an element of this array. migrate also takes care of this.

Now it's clear what happens to the Label parameter of the function migrate. It is copied into MOBVAR["Label"] and travels alongside the other data. Since travelling takes place via HTTP, records must be separated with "\r\n" in RS and ORS as usual. The code assembly for migration takes place in three steps:

Iterate over MOBFUN to collect all functions verbatim.
Prepare a BEGIN pattern and put assignments to mobile variables into the action part.
Transmission itself resembles GETURL: the header with the request and the Content-length is followed by the body. In case there is any reply over the network, it is read completely and echoed to standard output to avoid irritating the server.
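
The three steps can be observed without a network connection by printing the request to standard output instead of sending it through the socket. This is only a sketch: the sample value in MOBVAR and the one-line function standing in for the collected code are hypothetical.

```shell
#!/bin/sh
AWK=$(command -v gawk || command -v awk)   # prefer gawk, fall back to awk

request=$("$AWK" 'BEGIN {
  MOBVAR["Label"] = "demo"           # hypothetical sample state
  MobCode = "function f() { }"       # stands in for the collected functions
  MobCode = MobCode "\n\nBEGIN {"
  for (i in MOBVAR)
    MobCode = MobCode "\n  MOBVAR[\"" i "\"] = \"" MOBVAR[i] "\""
  MobCode = MobCode "\n}\n"
  # Print to standard output what migrate would send through the socket.
  ORS = "\r\n"
  print "POST /cgi-bin/PostAgent.sh HTTP/1.0"
  print "Content-length:", length(MobCode) ORS
  printf "%s", MobCode
}')
printf '%s\n' "$request"
```

The body that follows the empty line is itself a valid gawk program: the collected functions first, then a BEGIN action that restores MOBVAR on the receiving host.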

The application-independent framework is now almost complete. What follows is the END pattern that is executed when the mobile agent has finished reading its own code. First, it checks whether it is already running on a remote host or not. In case initialization has not yet taken place, it starts MyInit. Otherwise (later, on a remote host), it starts MyJob:

     
     END {
       if (ARGC != 2) exit    # stop when called with wrong parameters
       if (MyOrigin != "")    # is this the originating host?
         MyInit()             # if so, initialize the application
       else                   # we are on a host with migrated data
         MyJob()              # so we do our job
     }

All that's left to extend the framework into a complete application is to write two application-specific functions: MyInit and MyJob. Keep in mind that the former is executed once on the originating host, while the latter is executed after each migration:

     
     function MyInit() {
       MOBVAR["MyOrigin"] = MyOrigin
       MOBVAR["Machines"] = "localhost/80 max/80 moritz/80 castor/80"
       split(MOBVAR["Machines"], Machines)           # which host is the first?
       migrate(Machines[1], "", "")                  # go to the first host
       while (("/inet/tcp/8080/0/0" |& getline) > 0) # wait for result
         print $0                                    # print result
       close("/inet/tcp/8080/0/0")
     }
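
Each entry in MOBVAR["Machines"] combines a host name and a port, separated by a slash, so that it can be appended directly to "/inet/tcp/0/" to form a gawk special file name (local port 0 lets the system choose any free port). A small sketch of this step, run without opening a connection; the variable name List is hypothetical:

```shell
#!/bin/sh
AWK=$(command -v gawk || command -v awk)   # prefer gawk, fall back to awk

service=$("$AWK" 'BEGIN {
  List = "localhost/80 max/80 moritz/80 castor/80"
  n = split(List, Machines)             # split on whitespace
  print n, "/inet/tcp/0/" Machines[1]   # the name that migrate would open
}')
echo "$service"
```

This prints the number of hosts followed by the special file name for the first destination.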

As mentioned earlier, this agent takes the name of its origin (MyOrigin) with it. Then, it takes the name of its first destination and goes there for further work. Notice that this name has the port number of the web server appended to the name of the server, because the function migrate needs it this way to create the HttpService variable. Finally, it waits for the result to arrive. The MyJob function runs on the remote host:

     
     function MyJob() {
       # forget this host
       sub(MOBVAR["Destination"], "", MOBVAR["Machines"])
       MOBVAR["Result"]=MOBVAR["Result"] SUBSEP SUBSEP MOBVAR["Destination"] ":"
       while (("who" | getline) > 0)               # who is logged in?
         MOBVAR["Result"] = MOBVAR["Result"] SUBSEP $0
       close("who")
       if (index(MOBVAR["Machines"], "/") > 0) {   # any more machines to visit?
         split(MOBVAR["Machines"], Machines)       # which host is next?
         migrate(Machines[1], "", "")              # go there
       } else {                                    # no more machines
         gsub(SUBSEP, "\n", MOBVAR["Result"])      # send result to origin
         print MOBVAR["Result"] |& "/inet/tcp/0/" MOBVAR["MyOrigin"] "/8080"
         close("/inet/tcp/0/" MOBVAR["MyOrigin"] "/8080")
       }
     }
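
Note that sub treats its first argument as a regular expression, so a dot in a fully qualified host name would match any character. For the short host names used here this makes no difference. The following sketch shows one way to remove an entry literally instead; the helper remove_literal and the sample list are hypothetical and not part of the original agent.

```shell
#!/bin/sh
AWK=$(command -v gawk || command -v awk)   # prefer gawk, fall back to awk

remaining=$("$AWK" '
# Hypothetical helper: delete the first literal occurrence of needle in s.
function remove_literal(s, needle,    p) {
  p = index(s, needle)
  if (p == 0) return s
  return substr(s, 1, p - 1) substr(s, p + length(needle))
}
BEGIN {
  list = "aXb/80 a.b/80"
  # sub("a.b/80", "", list) would delete "aXb/80", the leftmost regex match.
  print remove_literal(list, "a.b/80")
}')
echo "$remaining"
```

Only the entry that matches character for character is removed, and the other host stays in the list.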

After migrating, the first thing to do in MyJob is to delete the name of the current host from the list of hosts to visit. Now, it is time to start the real work by appending the host's name to the result string, and reading line by line who is logged in on this host. A very annoying circumstance is the fact that the elements of MOBVAR cannot hold the newline character ("\n"). If they did, migration of this string would not work because the string would not obey the syntax rules for a string in gawk. SUBSEP is used as a temporary replacement. If the list of hosts to visit holds at least one more entry, the agent migrates to that place to go on working there. Otherwise, we replace each SUBSEP with a newline character in the resulting string, and report it to the originating host, whose name is stored in MOBVAR["MyOrigin"].
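
The SUBSEP substitution can be tried in isolation. In this sketch, two hypothetical lines stand in for the output of `who'; they are collected into a single string that contains no newline, and the newlines are restored only at the end, as the agent does on the last host.

```shell
#!/bin/sh
AWK=$(command -v gawk || command -v awk)   # prefer gawk, fall back to awk

decoded=$(printf 'alice tty1\nbob tty2\n' | "$AWK" '
{ Result = Result SUBSEP $0 }        # collect lines without using "\n"
END {
  if (index(Result, "\n") > 0)       # the string is free of newlines here
    print "newline found"
  gsub(SUBSEP, "\n", Result)         # restore newlines at the journey end
  print Result
}')
printf '%s\n' "$decoded"
```

Because every collected line is preceded by SUBSEP, the restored text starts with an empty line, just like the result that the agent reports.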

Footnotes

[1] http://www.research.ibm.com/massive/mobag.ps