Objective:

Objective:

Design and implement an architecture for an on-line game that handles scalability of large number of participants.

Things we agree:

During our brief exchange we managed to agree on a methodology and possible outcome of the work to be undertaken.

There are problems with the multicast model that require some innovation. However rather than undertake the traditional approach of conceptualising a solution and use a simulation to produce results, we will build an application to test the ideas and drive the requirements. The end application will be field tested on the testbed connecting 200 households.

The chosen application is a networked game due to its different requirements on different types of data streams, this approach will avoid adopting any of the traditional closed world assumptions that reduce the problem domain, thereby allowing optimisations. The end result should be general enough to allow change of context without total re-design and implementation.

In terms of design there are two different perspectives to consider:

Network infrastructure. This involves basically all the communication support for the data streams. The main focus will be on the multicast model. Combination and modification of existing solution will be explored along with the concepts put forward by RAMA. Of particular interest is what proves to be better in terms of routing; a bi-directional tree (Simple Multicast) or per source (EXPRESS).

Application. This involves everything else. Most of the stuff is strictly application domain, however there are other things that are network related, such as the namespace.

Considering the scope to problem to tackle it is hoped that some of the building blocks may be used from external resources. Therefore, working code for PGM, EXPRESS, SM, PIM will be accessible. Currently the proposed basis for the game will be MiMaze.

Naturally, it goes without saying that the work done will be published.

Areas to Cover:

The following topics are things we have discussed but a little more elaborate. Should a particular area be of interest then it may expanded into further detail. The ideas are not organised or structured yet, since we are discussing them.

Game Elements

The architecture of the game should include the following elements:

Client. The client application resides on remote hosts and provides the "window" to the world. It should be designed and implemented in such a way that it is irrelevant if it is a human user or an agent controlling the client.

Most Virtual Environments (VE) applications use models based on dead-reckoning. We should consider the same but enhance the model to have states and behaviours, rather than model just dynamics. This might reduce data traffic, since all you have to send is what triggers the behaviour state change. This does not necessarily imply the need for total reliability.

World Server (WS). The WS is responsible for all the off-line functionality of the environment. It guarantees the persistency of the contents of the world, both in terms of geometry and behaviours. Another responsibility is the management of users and the security mechanisms. There could be other set of services but it is not the focus neither relevant to the project.

Portal. The portal is basically a TP to alleviate the load on the WS. Since its functionality may confined within a process, it may be run on the same machine as the WS. As the number of users increases it would be advisable to have the portals on independent nodes of the network.

Group Manager (GM). The GM relates to the common AOIM described in the literature. It remains open to discussion if it should be part of the client application, thus enforcing distribution of group management (real hard) or if it should be an independent element running on dedicated machines thereby centralising the management but within a hierarchy. I am biased to the latter approach.

Grouping

There are different ways of grouping and it seems that there is no unique solution, so maybe the best approach is to combine both the static and dynamic grouping mechanisms.

The static grouping would be spatial based. The traffic generated would be low-bandwidth and probably a shared routing scheme like SM could be used.

The dynamic grouping would be focus/interest based. The traffic generated would most likely be high bandwidth and the routing would be per sender (EXPRESS) or shared (SM).

So taking the example of the room with people where we both hold a private conversation. Everyone in the room belongs to the same static grouping, so every client is aware where people are and has LOD of their behaviours. In addition, we both subscribe to same logical focus/interest group where either there would be one multicast port for both (SM) or one per sender (EXPRESS).

Interesting issues arise when considering accessibility rights. Security policies could be implemented on the dynamic grouping. So a person could only join a conversation if they are supposed to.

Dissemination of Data

We started an interesting discussion here but never finished. You state that no reliability should be applied to the communication model, while I say otherwise. I understand that errors occur in bursts and that sometimes the application may not afford the times related to retransmission.

I advocate that there are different types of data in the VE. Some do not have temporal memory so what you say is true. The newer state replaces the older one, so there is no point in asking for retransmission, since that information will become obsolete. I totally agree with you, however there are other data such as control and events that require reliability.

Thinking of the door example. If you disregard its behaviour and rely on updates then you do not require reliability. However you waste more bandwidth and need dead-reckoning. Now if you think that the door has a behaviour and there is an event that triggers the behaviour then you save bandwidth and do not need dead-reckoning. I do not see the problem of retransmitting using a scheme like PGM. If the transmission times are too high then most likely the user will generally experience high latency in its application so it does not make a difference.

I propose a compromise. Each entity that may generate traffic has a behaviour model with state machine. This model is based on events that trigger state transition, resulting in sending a packet with the information. But the model does send periodically current state info, however the frequency is adjustable according to how the network is behaving.

So in the example of the door, lets consider a simple state machine with two states: opened and closed. There is an event generated everytime during state transition. The behaviour upon receiving the event and taking into consideration current state, may go through a certain movement sequence. However state info is sent at a certain frequency that updates the model. If the intial event was lost then the door verifies that it has changed state and updates acoordingly. The frequency of transmission is decided by the application. Now if packets are being delivered then the frequency of the updates is very low, otherwise it is high.

We could refine the model so it could measure the pattern of state transition and let that influence the frequency of the updates.

This idea is a generalisation of the dead-reckoning model to include all forms of behaviours, instead of just movements.

Multicast Routing

I am aware that the effort behind Simple Multicast (SM) has evolved towards RAMA, which now encompasses EXPRESS. We have great interest of comparing both and see if either one eases the problems of scalability.

The mask in the EXPRESS/SM protocol is a very attractive property which allows some filtering in the network at the routers. How this is done I have not found information yet. I am re-reading the available information and will discuss it further.

You proposed the idea of using PGM to help the pruning of the sub-trees within a routing tree, without regard to reliability. However the focus of PGM is reliability, otherwise it is better not to use it. Both SM and EXPRESS require state at routers for the combination of address/mask so why use PGM except when reliability is necessary?

The idea of using PIM-SM for building the routing tree seems attractive however I need to look further into it.

Namespace

Current applications completely disregard the network when considering their namespaces. It would be interesting how to integrate application with network information into the namespace, in particular grouping data.

Also of interest, is how this could be tied up with address allocation.

It is necessary to come up with something that may have generalised use but without performance penalties. An option like Limbo and Linda is not viable.

PGM

I understood that you do not believe in the need for reliability but PGM is focused on semi-reliability and without that requirement the protocol is useless. You said you are interested in the pruning capabilities of subtrees. However this is done because of its state management, but since this will be a requirement in both SM and EXPRESS, I do not see the need for PGM except if you agree to have semi-reliability. It would be interesting to do a proper study of how PGM recovers from transmission failures and verify the associated timings.

VR Protocols

This is sometime to consider but it may too early without an idea of the architecture and overall functionality.

WorkPlan:

There is not much in terms of workplan for the time being except that the first 6 weeks should be dedicated to the following tasks:

Design the architecture to support the application. Integrate into the design the topics above. Build small programs to test ideas.

Analyse the EXPRESS and SM code. Build a simple program to use both. Test the existing code.

Test viability of doing project in Java. Compare performance of C and Java. In theory there should be no difference and currently I am more productive in Java than C/C++. The APIs that are in other languages could be wrapped in Java using the JNI.

Analyse the MiMaze source code. Study its design with the perspective of extending and customising. Depending upon ease of extension, may integrate SM/EXPRESS.

The above tasks are be carried out in parallel, although some have precedence over others.

Suggestions:

I wonder if it were possible to use the work I have been doing with Jamboo and the Datastreams? Maybe I should present it to you upon arrival, no?