DIY Agents

You too can create simple autonomous software agents. Armed only with standard programming techniques, Mark Harman covers the ground.

I believe that there is a pressing need for the creation of autonomous software agents that are tailored to an individual’s tastes and requirements. These agents can sift the vast stores of available information for relevant data, or simply keep a watchful eye upon developments. They can also share the administrative load that the information society places upon humans. Perhaps agents will finally help us to achieve the long sought after goal of the paperless office.

An autonomous software agent is simply a program that is continually executing, or being periodically executed, and that gathers information (from the Internet, or elsewhere) to report back to its owner. Agents can be as simple or as sophisticated as the problem requires. Many very effective agents can be implemented without a great degree of programming sophistication. We simply need to create a smooth interface between the Internet, other humans, and the program that implements the agent, thus allowing it to process information on behalf of its owner.

Let’s look at how to build some simple agents using very basic programming skills based on facilities readily available from the Unix and Linux operating systems. None of these agents are particularly involved, nor will they provide facilities that are not already available ‘off the shelf’. However, I hope to show that they are strikingly easy to implement and that readers will be encouraged to create their own particular agents.

Waking up and executing

The first step to take in setting up an agent is to find an appropriate machine on which to execute it. This will become the agent’s life support system; switch it off and the agent will die. (It is possible to write agents that move from one computer to another, but we shall not be doing that in this article. This sort of mobile agent is more difficult to write and requires co-operation (willing or otherwise) on the part of the machine’s owner.)

We then need a mechanism for ensuring that the agent runs regularly. Of course, we could simply let the agent be a continually executing program, but this is a little wasteful of resources. It would be better to execute the agent periodically. There will be different mechanisms for achieving this on different systems. On Unix it is particularly easy – we can use cron.

The cron daemon wakes up every minute and runs whatever jobs are due. Each user has a crontab (cron table) that describes which programs are to be executed by cron and when. The important point here is that cron allows us to execute a program at regular intervals. For example, given a program p, we can specify that p is to be executed every five minutes by typing crontab -e. This puts us in an editor, in which we should enter the line */5 * * * * p. The first five fields (separated by white space) denote the minute, hour, day of month, month of year, and day of week; the rest of the line is the command to run. An asterisk denotes ‘every’ and /5 adds a step of five, so the entry */5 * * * * means ‘every five minutes’. (Some older versions of cron do not accept the step syntax, in which case the minutes can be listed explicitly as 0,5,10 and so on.)
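
As a rough illustration of the crontab format, the following sketch writes a one-line crontab to a scratch file so that it can be inspected before it is installed. The script path $HOME/robot/pingagent and the scratch file name are assumptions made for this example, and the */5 step syntax is assumed to be supported by the local cron.

```shell
# Write a candidate crontab to a scratch file so it can be checked first.
# $HOME/robot/pingagent is a hypothetical location for the agent script.
crontab_file=${TMPDIR:-/tmp}/agent.crontab
cat > "$crontab_file" <<'EOF'
# minute  hour  day-of-month  month  day-of-week  command
*/5       *     *             *      *            $HOME/robot/pingagent
EOF

# Show what would be installed.
cat "$crontab_file"

# To install it (this replaces the user's existing crontab):
#   crontab "$crontab_file"
```

The here document is quoted so that $HOME is written literally into the file and is expanded later, when cron hands the command to the shell.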

To remove our crontab entries we type crontab -r. This will be necessary if we want to ‘switch off’ our agent.

Ping

It’s often nice to know when a machine becomes available. For example, if one of the machines on your network goes down for some reason, it would be nice to receive an automatic notification when the machine comes back online.

A simple way to find out if a machine is online is to use fping (a variant of ping). This is a Unix command that takes as its parameter the name of a machine (one which could be anywhere on the Internet). If the machine is up and running then fping returns true as its exit status and false otherwise. We can test the exit status of a command execution using the if then else fi shell construct. So, we could write an agent to test if the machine we are interested in ‘comes online’. The only remaining problem is how to get the agent to notify us when the machine does reach this state.
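
The exit status test described above can be sketched as a small shell function. The function name check_host is an assumption made for this example, and the probe command is passed in as a parameter so that the same code can be tried with fping, ping, or a stand-in such as true or false.

```shell
# check_host PROBE HOST: run PROBE against HOST and report the result.
# PROBE is any command whose exit status is zero when the host is up.
check_host() {
    probe=$1
    host=$2
    if "$probe" "$host" >/dev/null 2>&1
    then
        echo "$host is up"
    else
        echo "$host is down"
    fi
}

# Typical use, assuming fping is installed:
#   check_host fping machine.co.uk
```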

We could arrange for the agent to display a message on our screen or to ring the bell on our terminal or some such device, but these approaches have the drawback that the agent will need to keep track of where we are. A better solution is to use email. We will therefore give the agent a username and let it send email. In doing this we are making a subtle shift from thinking of the agent as a program that runs on the machine to thinking of it as a first-class citizen of the network – a bona fide user, just like ourselves.

To implement the agent we simply write a shell script that tests the machine in which we’re interested (let’s suppose this is called machine.co.uk) and emails us when the machine comes online. This small shell script is shown in Listing 1. The mail message is sent to the user mark, and it is quoted in the text of the shell script itself, using the here document operator <<. We can now use cron to execute the script at regular intervals, making it into a simple agent.
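
Run from cron, a script of this kind reports every time it finds the machine up, not only at the moment the machine comes back. One possible refinement, sketched here under assumptions, is to remember the result of the previous run in a small state file and to report only on a change from down to up. The function name and the state file are hypothetical, and an unknown previous state is treated as down.

```shell
# report_if_came_online PROBE HOST STATEFILE
# Prints a message only when HOST was down on the previous run and is up now.
# If STATEFILE does not exist yet, the previous state is assumed to be "down".
report_if_came_online() {
    probe=$1
    host=$2
    statefile=$3
    previous=down
    if [ -f "$statefile" ]; then
        previous=$(cat "$statefile")
    fi
    if "$probe" "$host" >/dev/null 2>&1; then
        current=up
    else
        current=down
    fi
    # Remember this result for the next run.
    echo "$current" > "$statefile"
    if [ "$previous" = down ] && [ "$current" = up ]; then
        echo "$host has come online"
    fi
}
```

In a real agent the final echo would be replaced by the mail command and here document used in Listing 1.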

Accessing the Internet via an agent

To achieve its full potential as an information gatherer and filter, an agent will have to be capable of surfing the Internet. This might sound quite hard to achieve. However, using Unix shell script and lynx (a very simple, text-based, browser) it can be quite easy.

Using lynx will allow us to exploit the Unix concept of a pipe to feed data from the web into files or programs on our local system under the control of our agent. The command lynx -source -dump url copies the source of the web page located at url to the screen. Using redirection we can copy this to a file with lynx -source -dump url > filename. Our agent can use this command to access the source of any page it chooses.
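
To show the pipe in action, here is a minimal sketch that feeds the source of a page into sed and prints the page title. The FETCH variable and the function name page_title are assumptions made for this example; FETCH defaults to the lynx command given above, and can be set to cat so that the pipeline can be tried on a local file without a network connection.

```shell
# FETCH is the command used to retrieve a page. It defaults to lynx;
# setting FETCH=cat lets the function be tried on a local file.
FETCH=${FETCH:-lynx -source -dump}

# page_title URL: pipe the page source into sed and print the title text.
# Only a title that sits on a single line is found.
page_title() {
    $FETCH "$1" |
        sed -n 's/.*<[Tt][Ii][Tt][Ll][Ee]>\(.*\)<\/[Tt][Ii][Tt][Ll][Ee]>.*/\1/p'
}

# Typical use:
#   page_title http://www.unl.ac.uk/~mark/welcome.html
```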

Web pages continually change. Indeed most good web pages will necessarily change frequently, as any good page will need to be kept up-to-date. It will often be useful to know as soon as a change is made to a particular page, so that we can immediately check the new information. It would be a nuisance if we had to maintain a bookmarked list of such useful pages and check them ourselves; it would be far better to have an agent do the leg work for us.

We can use lynx to access the source code of the web page and compare this with the code for the page last time we looked at it. Listing 2 shows a shell script that does just this.

In the script, the shell variable $HOME is used to ensure that files are stored in the home directory of the script’s owner. If we’ve allocated the agent a username, then this will be the agent’s home directory, otherwise it will be our own home directory. In either case, a subdirectory called robot is assumed to exist. As it stands the shell script is not robust, but it does the job and illustrates the ease with which simple agents can be established.

The assignment to PATH is required to ensure that only system versions of the standard commands are used, rather than any local alternatives that may exist in the file system. The shell command A || B, for commands A and B, executes A and if it fails, goes on to execute B. So test -e file fails if file does not exist. The next line therefore tests to see if the file data.html exists and, if it does not, creates it (from the web page http://www.unl.ac.uk/~mark/welcome.html).
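
The same test -e and || idiom can be wrapped in a small function, sketched below. The name ensure_baseline is an assumption made for this example; it runs the given command to create the file only when the file does not already exist.

```shell
# ensure_baseline FILE COMMAND...: if FILE is missing, run COMMAND
# and store its output in FILE. If FILE exists, do nothing.
ensure_baseline() {
    file=$1
    shift
    test -e "$file" || "$@" > "$file"
}

# Typical use, mirroring the URL monitor:
#   ensure_baseline "$HOME/robot/data.html" \
#       lynx -source -dump http://www.unl.ac.uk/~mark/welcome.html
```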

Having established that the data file exists, we download the latest version of the web page and use an if construct to compare the new and old versions of the page. If the two pages are different then we send an email message to the user mark. The text of the mail message is produced as a here document, using <<. An alternative would have been to create a file containing a message to be sent to the owner of the agent and pipe the contents into the mail command. Having sent the message the agent updates its local copy of the web page using the cp command.

If we run the agent program regularly using cron, it will send an email to the user mark every time the page http://www.unl.ac.uk/~mark/welcome.html changes.

A simple mail server

An agent is an autonomous program, running continually on our behalf. Like any good apprentice, we shall want the agent to be capable of operating without command from us. However, we shall also want to be able to send the agent the odd command, altering its behaviour. We therefore need an interface mechanism that will allow us to send commands to the agent. A simple way to achieve this is to use email. After all, we have already used email as the medium through which the agent sends its findings to us.

This is not the only way to communicate with an agent. An alternative would be to use an HTML form to enter the data, which would then be sent to a cgi script using the post mechanism. However, the agent would then need to be sent the command by the cgi script (perhaps by email). To keep things simple, we shall simply send email direct to the agent.

In order for the agent to read email there is a little more work to do. The first step is to find out where new mail is stored. On my Linux system, supposing that the agent’s username is mark, mail sent to the agent will be stored in /var/spool/mail/mark. This file will exist even if there is no new mail, in which case it is empty. To read the new mail for user mark, we therefore simply have to read the contents of the file. Having read the new mail, we shall copy an empty file to /var/spool/mail/mark so that the next time the agent code is executed it will only consider mail messages that it has yet to process.
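
The read-then-empty step might be sketched in shell as follows. The function name drain_spool and the idea of a separate working copy are assumptions made for this example. Note that any mail delivered between the copy and the truncation would be lost, so a more careful agent would need some form of locking.

```shell
# drain_spool SPOOL WORK: copy the current contents of SPOOL to WORK,
# then empty SPOOL so that the same messages are not processed twice.
# Caveat: mail arriving between the two steps is lost (no locking here).
drain_spool() {
    spool=$1
    work=$2
    cp "$spool" "$work" && : > "$spool"
}

# Typical use, with a hypothetical working file:
#   drain_spool /var/spool/mail/mark "$HOME/robot/inbox"
```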

To use email to send messages (commands) to the agent, we shall need to work out a simple way to code up a command. One way this can be done is to put commands on the subject line of the email. This will be easy to locate, as the subject line is prefixed in the mail file by the word Subject:.
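
As a small illustration, the commands can be pulled out of a mail file with sed and awk. The function name subject_commands is an assumption made for this example; it prints the first word of each subject line, one per message.

```shell
# subject_commands MAILFILE: print the first word that follows
# "Subject:" on each subject line in MAILFILE.
subject_commands() {
    sed -n 's/^Subject:[[:space:]]*//p' "$1" | awk '{ print $1 }'
}

# Typical use:
#   subject_commands /var/spool/mail/mark
```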

To see how this might work, we shall consider a simple agent that consists entirely of reading and sending email – a mail server. This is a mechanism for achieving asynchronous virtual conferencing. That is, any user may send a message to all users via the mail server. The users are therefore able to take part in a ‘virtual conference’. The communication is asynchronous because the sending and receiving of email is asynchronous. To use an email server, we send an email to the server and the server sends the message on to all users registered with the virtual conference. (Notice that this approach to mailing a list of recipients is superior to that of maintaining a simple distribution list, because it separates the process of joining and leaving the list from the process of sending messages to each registered recipient. It also centralises the distribution list, facilitating sharing among all users, and allows usage statistics to be collected at a central point.)

We don’t have to use shell script to implement an agent; we could use any language. For simple agents, shell script is often the most suitable notation. For more complex agents it is usually more appropriate to use a programming language like C. An outline of a simple mail server program is implemented as a C program in Listing 3.

In this program, several auxiliary functions are used to process the text of incoming mail messages and to send out mail messages (these are not shown but are all simply string processing functions and email commands). The program scans the incoming mail for the From: line, which contains the sender’s email address. This is stored in the string user, so that the agent can reply to the sender and store the sender’s email address in the database of registered users. Next the agent scans for the Subject: line, that contains the command. Three commands are understood: sub, to register the user as a subscriber to the conference, unsub to remove the user from the conference, and send to post a message to all users registered with the conference. Each of these is implemented by an auxiliary function. Finally, if none of the three commands is found on the subject line, then the agent replies to the sender, indicating that the command was not understood. (The C function system allows us to pass a string to the operating system to be executed as a normal OS command.)
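
The auxiliary functions are not shown, but the command handling can be illustrated with a shell sketch that keeps the registered users in a flat file, one address per line. The function name handle_command and the file layout are assumptions made for this example, and the send branch only prints what it would do in place of forwarding a message.

```shell
# handle_command LIST USER COMMAND
# LIST is a file holding one subscriber address per line.
handle_command() {
    list=$1
    user=$2
    command=$3
    touch "$list"
    case $command in
        sub)
            # Add the user unless already registered.
            grep -qxF -- "$user" "$list" || echo "$user" >> "$list"
            ;;
        unsub)
            # Keep every line except the user's address.
            grep -vxF -- "$user" "$list" > "$list.tmp" || true
            mv "$list.tmp" "$list"
            ;;
        send)
            echo "would forward the message to: $(tr '\n' ' ' < "$list")"
            ;;
        *)
            echo "command not understood: $command"
            ;;
    esac
}
```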

Once we have implemented a simple mail server like this, we could easily add many features to it. For example, we could add a command who, to which the mail server responds by emailing the sender with a list of registered users. We could allow the agent to maintain lists of registered users for several conferences and maintain a hierarchy of such groups. We could implement a voting mechanism among the registered users. We could implement a ‘moderator’ system, whereby each conference had an owner to whom all messages were passed for acceptance before being forwarded to the wider group of registered users. All these facilities can be implemented with the existing communication mechanism.

Life with software agents

We have only considered a few simple examples of agents, showing what can be achieved with a little bit of code, an internet connection, and a machine that can be left running continually. Perhaps this kind of agent barely deserves the title because it does not display a high degree of intelligence; it is true that much of the research work concerned with the development of agent technology is simply a branch of artificial intelligence. However, I firmly believe that very simple agents such as the ones we’ve considered here will become useful and widespread as the Internet expands. Even with this simple approach, we could construct agents that are very adept at supporting internal and external commercial interaction. Moreover they can be constructed, maintained, and developed quickly and cheaply using existing technology and without additional staff training. See the box Other possible agents.

The ability to run programs regularly, combined with commands that allow programs to send and receive email and to access web pages, allows us to write simple software agents using straightforward programming techniques. As they reside on a single machine and use email to communicate with the owner, such agents are, in effect, playing the role of an automated user of the system. More complex agents would require more complex programming, but not new techniques. The added complexity is simply that required to pattern match and perform more sophisticated string processing.

Again, agent technology is currently in an embryonic state. However, the rapid growth of the Internet and our increasing reliance upon it as a source of information will demand that we adapt to life with software agents. In future, good agent programmers may be as highly sought after as good network engineers are today.

Mark Harman is director of research and acting head at the School of Informatics and Multimedia Technology in the University of North London (http://www.unl.ac.uk/~mark). He can be contacted via e-mail at m.harman@unl.ac.uk or by post to Mark Harman, Project Project, School of Informatics and Multimedia Technology, University of North London, Holloway Road, London N7 8DB.

Thanks are due to Sebastian Danicic and Ross Paterson who provided the original insight that led to this article.

Listing 1 – A simple script to report when a machine comes online.

# Simple script to report (via e-mail)
# when a machine comes online
if fping machine.co.uk
then mail mark <<'end'
machine.co.uk has come online
end
fi

Listing 2 – A simple URL monitor.

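# check if URL has changed
PATH=/bin:/usr/bin
# fetch a first copy if we have never seen the page before
test -e "$HOME/robot/data.html" ||
    lynx -source -dump http://www.unl.ac.uk/~mark/welcome.html > "$HOME/robot/data.html"
# fetch the current version of the page
lynx -source -dump http://www.unl.ac.uk/~mark/welcome.html > "$HOME/robot/newdata.html"
# cmp -s is silent and succeeds only when the two files are identical
if ! cmp -s "$HOME/robot/data.html" "$HOME/robot/newdata.html"
then
    mail mark <<'end'
The page I'm monitoring has changed.

Yours sincerely,
The Robot.
end
    # remember the new version for the next run
    cp "$HOME/robot/newdata.html" "$HOME/robot/data.html"
fi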

Listing 3 – Outline of a simple mail server.

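#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Auxiliary functions, not shown. These signatures are assumed. */
char *GetName(FILE *mail);
void subscribe(const char *user);
void unsubscribe(const char *user);
void sendmail(void);
void Unknown(const char *command, const char *user);

int main(void)
{
    FILE *mail;
    char aword[100] = "";
    const char *user = "";

    mail = fopen("/var/spool/mail/mark", "r");
    if (mail == NULL)
        return 1;
    while (!feof(mail)) {
        /* skip forward to the From: line and note the sender */
        do fscanf(mail, "%99s", aword);
        while (!feof(mail) && strncmp(aword, "From:", 5));
        if (!strncmp(aword, "From:", 5))
            user = GetName(mail);
        /* skip forward to the Subject: line; the next word is the command */
        do fscanf(mail, "%99s", aword);
        while (!feof(mail) && strncmp(aword, "Subject", 7));
        if (!strncmp(aword, "Subject", 7)) {
            fscanf(mail, "%99s", aword);
            if (!strncmp(aword, "sub", 3))
                subscribe(user);
            else if (!strncmp(aword, "unsub", 5))
                unsubscribe(user);
            else if (!strncmp(aword, "send", 4))
                sendmail();
            else
                Unknown(aword, user);
        }
    }
    fclose(mail);
    /* empty the mailbox so processed messages are not seen again */
    system("cp emptymail /var/spool/mail/mark");
    return 0;
}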

Other possible agents

As with all programs, the aim is to take the drudgery out of processing large amounts of data quickly and correctly. In the new business world, the web will act as an interface to the organisational database, and all employees will be able to communicate with one another via email. This creates a new programming paradigm in which programs have access to all data (potentially throughout the world) using a common interface and have the ability to communicate with all users.

Locating and reporting on dead links

The task of managing a web site in which data is continually updated has become a full-time job. This is recognised by all organisations that are serious about the web. The role of managing a web site is likely to rely increasingly upon agent technology. For example, a simple agent could roam the site, following only links that remained inside the organisation’s domain, and build a list of dead links, ie links that point to pages, both internal and external, that have become unobtainable. The agent could then email the (internal) page owner and ask that the links be updated. If the page remained uncorrected for a long period, then the agent would inform the web manager who would decide what action needed to be taken. This agent is simply a variation on the URL monitor theme that we implemented in Listing 2.
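
A first step towards such an agent is to extract the links from a page. The following is only a sketch under assumptions: the function name extract_links is hypothetical, and it handles only anchor tags that sit on one line with a double-quoted href attribute.

```shell
# extract_links FILE: print the href value of each anchor tag in FILE.
# Each "<" is turned into a newline so that every tag starts a line,
# then sed keeps the lines that look like an anchor with an href.
extract_links() {
    tr '<' '\n' < "$1" |
        sed -n 's/^[aA][[:space:]][^>]*[hH][rR][eE][fF]="\([^"]*\)".*/\1/p'
}

# Typical use on a page saved by the URL monitor:
#   extract_links "$HOME/robot/data.html"
```

Each link printed could then be fetched in turn, as in the URL monitor, and reported if the fetch fails. How a failed fetch is detected depends on the browser's behaviour and would need checking, and relative links would first have to be resolved against the address of the page.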

Data collection

The role of collecting data in an organisation typically falls upon an individual who becomes responsible for chasing their co-workers for information. This is a particularly debilitating process as it requires repeated reminders and checking and storing of data. In the past a well organised ‘data collector’ would have a database in which to enter the data. This database would produce reports of what was still required, thereby facilitating the ‘chase’. Using agent technology, such a person can set up an agent that checks the date, chases individuals for information, stores the information received in a database, and simply sends the human ‘chaser’ status reports. This is simply a (much enhanced) version of the mail server agent.

Data mining

Many organisations have individual employees adding data to web pages in an uncontrolled and unstructured manner. However, these web pages provide a useful resource to the organisation as they represent readily available data. We could design an agent that scanned a particular web site looking for information that is likely to be useful. In this case, the power of the agent would depend critically upon the sophistication of the pattern matching algorithms it relies upon. This is where existing artificial intelligence work starts to play a role in the development of agent technology (although a lot could be achieved simply by using the Unix egrep utility). The basic mechanism for this agent is simply that which we used to implement the URL monitor.
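
As a hint of how far plain pattern matching can go, this sketch scans a saved page for lines that appear to contain an email address. It uses grep -E, which accepts the same extended regular expressions as egrep. The function name find_addresses and the pattern itself are assumptions made for this example; the pattern is deliberately loose.

```shell
# find_addresses FILE: print each line of FILE, prefixed by its line
# number, that appears to contain an email address.
find_addresses() {
    grep -E -n '[[:alnum:]._-]+@[[:alnum:].-]+\.[[:alpha:]]+' "$1"
}

# Typical use on a page saved by the URL monitor:
#   find_addresses "$HOME/robot/data.html"
```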

I would be very interested to hear from readers about applications to which they think agents could be applied and also of those already working within readers’ organisations (see my email address).

(P)1997, Centaur Communications Ltd. EXE Magazine is a publication of Centaur Communications Ltd. No part of this work may be published, in whole or in part, by any means including electronic, without the express permission of Centaur Communications and the copyright holder where this is a different party.

EXE Magazine, St Giles House, 50 Poland Street, London W1V 4AX, email editorial@dotexe.demon.co.uk