DATA STREAMS

OVERVIEW

When analysing current Virtual Environment (VE) toolkits, several shortcomings in overall architectural design become apparent. Each toolkit nevertheless presents particular strengths and corresponding weaknesses, which clearly indicates that no single solution is possible at this time. The vision shared by many of a unified matrix of cyberspace is only possible with standardisation, either de jure or de facto. However, the many solutions that proliferate in the field clearly demonstrate its immaturity and the need for further research before the rules of the game can be drafted.

The objective of the data streams package is to provide a flexible framework that corresponds to the OSI transport layer while adopting the tenets of Application Level Framing (ALF), thus bringing network awareness to the application and turning the network capabilities into manageable components. The use of Java as the implementation language facilitates deployment and interoperability, as the protocols may be built at load time when a host joins or creates a session.

There exist other packages that attempt to provide the desired flexibility; however, they approach the problem in terms of well-defined layers, as portrayed in the OSI model. The result is a package that may be based on object-oriented design but presents itself as a black box to the application, which ultimately is not what is desired. Consequently the existing solutions fall into a major fallacy: they interpret inherent characteristics of the network as problems to be solved. This approach fails because latency is a property of the network and not a problem that can be solved with any amount of programming. The approach taken in the data streams package is therefore to provide the application with the awareness necessary to handle network properties that have an undesired impact upon users, such as latency. Towards this end a set of small and flexible components is defined, exposing the network and delegating to the application the responsibility of handling the data.

Figure 1 - Overview of the Streams package

The figure above provides an overview of the streams package where five subpackages exist:

Each of the above subpackages is briefly described in the remaining sections.

The main principle of any communication is that data is passed along a virtual channel from one host to either a single host or a group of hosts, depending upon the communication model. Connectors exist at the end-points of the virtual channel, and the transport protocols provide a service according to their design. The virtual channel is hereafter referred to as a stream.

The TCP-based streams rely on a transport protocol that already provides a well-defined and mature delivery service. There may be variations, but the core idea is to provide a FIFO service for data. The streams package merely enhances this by delegating to the application the responsibility of handling the data. This is achieved either through an event delegation model or through direct protocol definition. In the former, received data is placed in a data repository and an event is dispatched to the appropriate StreamListeners; the latter allows the application to manipulate the protocol stack in the stream directly.

The UDP-based streams do not provide any delivery service, thus requiring the developer to define the protocol stack accordingly. Communication between the application and the network components may be done either through the delegation model or through direct protocol definition, as mentioned previously. Since only one protocol stack exists per stream, the transport and application protocols must be combined in that stack should direct protocol definition be adopted.

It is worth singling out the DataBuffer, since it provides the backbone of the streams, whether they are based on TCP or UDP. The DataBuffer holds a specific number of bytes, but its strength resides in the operations it provides for manipulating those bytes, along with accessor methods for different data types. Also important is lazy deserialisation, meaning that it is possible to extract specific parameters at designated places.
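The idea of typed access at designated places can be illustrated with a minimal sketch built on java.nio.ByteBuffer. The method names, the offset-based layout and the length-prefixed string encoding are assumptions made for illustration only; they are not the DataBuffer interface of this design.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of a fixed-capacity buffer with offset-based, lazy field access. */
final class DataBuffer {
    private final ByteBuffer bytes;

    DataBuffer(int capacity) {
        this.bytes = ByteBuffer.allocate(capacity);
    }

    // Absolute writes: each value is placed at a designated offset.
    DataBuffer putInt(int offset, int value) { bytes.putInt(offset, value); return this; }
    DataBuffer putDouble(int offset, double value) { bytes.putDouble(offset, value); return this; }

    // Assumed encoding: two-byte length followed by UTF-8 bytes.
    DataBuffer putString(int offset, String s) {
        byte[] raw = s.getBytes(StandardCharsets.UTF_8);
        bytes.putShort(offset, (short) raw.length);
        for (int i = 0; i < raw.length; i++) {
            bytes.put(offset + 2 + i, raw[i]);
        }
        return this;
    }

    // Lazy deserialisation: only the requested field is decoded.
    int getInt(int offset) { return bytes.getInt(offset); }
    double getDouble(int offset) { return bytes.getDouble(offset); }

    String getString(int offset) {
        int length = bytes.getShort(offset) & 0xFFFF;
        byte[] raw = new byte[length];
        for (int i = 0; i < length; i++) {
            raw[i] = bytes.get(offset + 2 + i);
        }
        return new String(raw, StandardCharsets.UTF_8);
    }

    int capacity() { return bytes.capacity(); }
}
```

With such a layout a receiver interested only in the name field could call getString(12) and leave the remaining bytes untouched.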

Connectors

This package contains the classes that handle the end-points of the streams as illustrated in the figure below:

Figure 2 - Overview of connectors package

Essentially, a connector holds information about the local CommPort and the remote connector of the stream. The information concerning the remote connector may correspond to a single host or to one or more groups, hence the existence of the UnicastConnector and GroupConnector respectively.

The GroupConnector is intended for streams with multicast connectors, where it is possible to join several different groups on the same connection. Masks are used for both sending and receiving data. When receiving, filters may be applied locally should the low-level API not support masking schemes such as Simple Multicast.
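A local receive filter of this kind might look like the sketch below. It assumes that each joined group is identified by a small integer index and that the mask is a 64-bit word; both are illustrative assumptions, as is the class name GroupMaskFilter.

```java
/** Hypothetical local receive filter for a connector that has joined several groups. */
final class GroupMaskFilter {
    // Bit i set means that data tagged with group index i is accepted.
    private long receiveMask;

    void join(int groupIndex) { receiveMask |= bit(groupIndex); }

    void leave(int groupIndex) { receiveMask &= ~bit(groupIndex); }

    boolean accepts(int groupIndex) { return (receiveMask & bit(groupIndex)) != 0; }

    private static long bit(int groupIndex) {
        if (groupIndex < 0 || groupIndex >= Long.SIZE) {
            throw new IllegalArgumentException("group index out of range: " + groupIndex);
        }
        return 1L << groupIndex;
    }
}
```

The receiving side would consult accepts() before handing data any further, discarding data for groups that are not of interest.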

Protocols

This package defines the elements necessary to use and define protocols, regardless of whether they belong to the application or the transport domain. The core class is the ProtocolStack, which encompasses all the protocols that operate sequentially upon one DataBuffer at a time. If the protocols of the ProtocolStack are exclusively for transport purposes, then communication with the application is most likely based upon the event delegation model. Otherwise the protocols from the application domain operate in sequence after the ones responsible for transport services.

The protocols that comprise a ProtocolStack handle data in a single direction, so a given stack is used either for sending or for receiving. Consequently a stream that handles both input and output requires a separate ProtocolStack for each direction. The protocols are created from the ProtocolRepository.
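The sequential, single-direction behaviour described above can be sketched as follows. For brevity a plain byte array stands in for the DataBuffer; the handle() method, the Direction enum, the convention that null drops a message, and the two sequence-number protocols are all assumptions introduced for illustration. The ProtocolRepository is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical contract: transform a message, or return null to drop it. */
interface Protocol {
    byte[] handle(byte[] data);
}

enum Direction { SEND, RECEIVE }

/** Applies its protocols in order to one message at a time, in one direction only. */
final class ProtocolStack {
    private final Direction direction;
    private final List<Protocol> protocols = new ArrayList<>();

    ProtocolStack(Direction direction) { this.direction = direction; }

    Direction direction() { return direction; }

    ProtocolStack add(Protocol protocol) { protocols.add(protocol); return this; }

    byte[] process(byte[] data) {
        byte[] current = data;
        for (Protocol protocol : protocols) {
            current = protocol.handle(current);
            if (current == null) {
                return null; // a protocol dropped the message
            }
        }
        return current;
    }
}

/** Sending side: prepends a one-byte sequence number. */
final class SequenceSender implements Protocol {
    private byte next;

    public byte[] handle(byte[] data) {
        byte[] out = new byte[data.length + 1];
        out[0] = next++;
        System.arraycopy(data, 0, out, 1, data.length);
        return out;
    }
}

/** Receiving side: strips the header and drops an immediate duplicate. */
final class SequenceReceiver implements Protocol {
    private int last = -1;

    public byte[] handle(byte[] data) {
        int sequence = data[0] & 0xFF;
        if (sequence == last) {
            return null;
        }
        last = sequence;
        return Arrays.copyOfRange(data, 1, data.length);
    }
}
```

A stream handling both input and output would hold one stack built with Direction.SEND and another built with Direction.RECEIVE, each with its own protocol instances.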

Although a ProtocolStack is not expected to change at run time, it should be possible to reconfigure it if necessary. This would require a supporting control protocol.

Figure 3 - Overview of protocols package

An application may prefer direct protocol intervention, thus avoiding the event delegation model at the expense of assuming more responsibility within the streams package. The application must then subclass the respective Protocol for either sending or receiving data. Although this mechanism is similar to the callback pattern, it is more efficient and exposes the network through components, so that the application not only handles the data but may also react to the current status of the network. This approach reduces the overall toll of latency; however, careful design for concurrency is required.
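As a rough illustration of direct protocol intervention, the sketch below shows an application-defined receiving protocol that applies position updates itself and also notices when an update arrives later than expected. ReceiveProtocol, onData(), the eight-byte update layout and the gap threshold are hypothetical names and choices, not part of this design; the methods are synchronized because the stream and the application may run on different threads.

```java
import java.nio.ByteBuffer;

/** Hypothetical base class for the receiving direction. */
abstract class ReceiveProtocol {
    /** Called by the stream for every arriving message, with its arrival time. */
    abstract void onData(byte[] data, long arrivalMillis);
}

/** Application protocol: handles the data directly and reacts to arrival gaps. */
final class PositionProtocol extends ReceiveProtocol {
    private final long maxGapMillis;
    private long lastArrival = -1;
    private int x;
    private int y;
    private boolean late;

    PositionProtocol(long maxGapMillis) { this.maxGapMillis = maxGapMillis; }

    @Override
    synchronized void onData(byte[] data, long arrivalMillis) {
        // Network status: was this update later than the application tolerates?
        late = lastArrival >= 0 && arrivalMillis - lastArrival > maxGapMillis;
        lastArrival = arrivalMillis;
        // Assumed layout: two big-endian 32-bit integers, x then y.
        ByteBuffer update = ByteBuffer.wrap(data);
        x = update.getInt();
        y = update.getInt();
    }

    synchronized int x() { return x; }
    synchronized int y() { return y; }
    synchronized boolean wasLate() { return late; }
}
```

An application could use wasLate() to decide, for instance, whether to trust the last known position or to fall back on local prediction.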

 

DataStreams

This package is dedicated to the streams based on TCP services.

Figure 4 - Overview of datastreams package

There are two main distinct types of DataStreams as illustrated in the above figure:

The DataStream may be used for the model where the roles of sender and receiver are not interchangeable. To support both roles it is necessary to use the DataInputStream, where another ProtocolStack exists with the protocols to handle reception of data. The CommReader consists of a Thread that receives data concurrently; however, its activation is not compulsory. Concurrent reception is made possible through the receiver interface, thereby allowing non-blocking calls. Alternatively, a developer may use blocking calls by invoking the corresponding receiving method directly, bypassing the CommReader.
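The two reception styles can be contrasted in a short sketch. A BlockingQueue plays the role of the socket, a java.util.function.Consumer stands in for the receiver interface, and the class and method names (InputSide, receive, startReader) are assumptions made for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

/** Hypothetical stand-in for the receiving side of a stream. */
final class InputSide {
    private final BlockingQueue<byte[]> source;

    InputSide(BlockingQueue<byte[]> source) { this.source = source; }

    /** Blocking style: the caller waits until data is available. */
    byte[] receive() throws InterruptedException {
        return source.take();
    }

    /** Optional reader thread: pushes each message to the receiver, so the caller never blocks. */
    Thread startReader(Consumer<byte[]> receiver) {
        Thread reader = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    receiver.accept(source.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop quietly when asked to
            }
        }, "CommReader");
        reader.setDaemon(true);
        reader.start();
        return reader;
    }
}
```

In this sketch the two styles should not be mixed on the same stream, since a direct receive() call would compete with the reader thread for incoming data.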

The streams package encourages the ALF approach, thus promoting the delegation of direct data handling to the application. However, a delegation model based on events is also permitted through the use of the DataRepository.

Should direct protocol definition be adopted, the application has total responsibility for handling the data across the stream. If the event-based model is used, however, it is necessary to use the DataBuffer, which corresponds to the messages stored in the DataRepository. These messages are retrieved by the application at a later stage, after the event has been caught and processed. The DataRepository may be regarded as similar to the TupleSpace of Linda and Limbo.
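The store-then-notify sequence of the event-based model can be sketched as below. A byte array stands in for the DataBuffer, and the integer key, the dataArrived() callback and the deposit() and retrieve() methods are hypothetical names chosen for illustration rather than the interface of this design.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical listener: told that a message is available, not handed the message itself. */
interface StreamListener {
    void dataArrived(int key);
}

/** Hypothetical repository: keeps messages until the application collects them. */
final class DataRepository {
    private final Map<Integer, byte[]> messages = new HashMap<>();
    private final List<StreamListener> listeners = new ArrayList<>();
    private int nextKey;

    synchronized void addListener(StreamListener listener) { listeners.add(listener); }

    /** Called by the stream on reception: store first, then raise the event. */
    int deposit(byte[] message) {
        int key;
        List<StreamListener> snapshot;
        synchronized (this) {
            key = nextKey++;
            messages.put(key, message);
            snapshot = new ArrayList<>(listeners);
        }
        // Listeners are notified outside the lock so they may call retrieve() freely.
        for (StreamListener listener : snapshot) {
            listener.dataArrived(key);
        }
        return key;
    }

    /** Called by the application at a later stage; removes and returns the message, or null. */
    synchronized byte[] retrieve(int key) { return messages.remove(key); }
}
```

The application may note the key inside dataArrived() and call retrieve() only when it is ready to process the message, which is where the decoupling from the network comes from.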

PacketStreams

This package is aimed at the streams based upon packets.

Figure 5 - Overview of packetstreams package

The design principles of the packetstreams package are very similar to those of the datastreams package; however, there are subtle differences due to the dissimilarity of the streams. The PacketStream may only send data and towards that end has a single PacketResource. When receiving, the PacketInputStream requires the existence of two important elements:

Naturally both the PacketPool and PacketQueue may be subclassed to support specific policies in their operation. So in the case of the PacketQueue, which is a FIFO queue, it is possible to have the following variations:
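As a minimal sketch of how a policy might be expressed through subclassing, the code below defines a plain FIFO queue and one bounded variant. The method names and the drop-oldest policy are hypothetical, chosen purely for illustration; DropOldestPacketQueue is not a variation taken from this design.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical baseline: an unbounded FIFO queue of received packets. */
class PacketQueue {
    protected final Deque<byte[]> packets = new ArrayDeque<>();

    synchronized void enqueue(byte[] packet) { packets.addLast(packet); }

    /** Returns the oldest packet, or null if the queue is empty. */
    synchronized byte[] dequeue() { return packets.pollFirst(); }

    synchronized int size() { return packets.size(); }
}

/** Hypothetical policy: a bounded queue that discards the oldest packet when full. */
class DropOldestPacketQueue extends PacketQueue {
    private final int capacity;

    DropOldestPacketQueue(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    @Override
    synchronized void enqueue(byte[] packet) {
        if (packets.size() == capacity) {
            packets.pollFirst(); // make room by dropping the stalest packet
        }
        packets.addLast(packet);
    }
}
```

A policy of this kind favours fresh data over complete data, which may suit state updates that supersede one another.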

Observations

This document is just a first draft to validate principles and ideas, and it sorely needs further work and maturing. The ideas can only be worked through further by beginning implementation, and that only makes sense once the principles have been validated, to avoid expending unnecessary effort.

Should the idea be validated, it then becomes necessary to implement a few protocols to evaluate performance and flexibility. Most of the classes should also be extended to provide specific functionality, as is the case with the PacketQueues.

Most likely the names currently used in the datastreams package will need to be changed to avoid confusion with the identically named classes of the Sun I/O streams.

The need for both an event delegation model and direct protocol definition remains an open issue. The event-based approach cleanly decouples the application from the network; however, the cost of the overhead may not be attractive. The direct protocol approach is more in line with ALF but compromises the decoupling to an extent. For now supporting both is possible, but design decisions will favour the direct protocol definition approach.

References need to be added to support the claims made, which in turn requires extending the document further.