
ECOSIM Telematics Applications Project:
Deliverable D04.02

Specification of Electronic Interfaces
Bandwidth and Latency Analysis (updates)

Author(s) : H.Hazewinkel, C. Hoffmann, R. Grech-cini, CEO Programme
Measurements: P. Maersk-Moller, CEO Programme





Synopsis
Programme name: Telematics Application Programme
Sector: Environment
Project acronym: ECOSIM
Contract number: EN 1006
Project title: Ecological and environmental monitoring and simulation system for management decision support in urban areas
Deliverable number: D04.02
Deliverable title: Specification of Electronic Interfaces
    Bandwidth and Latency Analysis (updates)
Deliverable version number: 0.3
Work package contributing to deliverable: 4
Nature of the deliverable: Online Working Document
Dissemination level: Limited to Project Participants
Contractual date of delivery: PM20 (August 1997)
Actual date of delivery: PM20 (online)
    10 November 1997 (hardcopy)
Author: H.Hazewinkel, C. Hoffmann, R. Grech-cini
    CEO Programme
Project technical co-ordinator: Dr. Kurt Fedra, Environmental Software & Services GmbH
    tel: +43 2252 633 050
    fax: +43 2252 633 059
    E-mail: kurt@ess.co.at





Executive Summary:

ECOSIM proposes the use of a centralised modelling centre - referred to as the ECOSIM main server - which can be accessed remotely via HTTP or locally via X Windows clients. Internet connectivity, latency and throughput are network constraints that determine, among other factors, the rate at which data can be delivered. They therefore ultimately determine how long the server needs to send information and how long the end-user needs to receive it.

During the limited measurement period the Internet connections between the three validation sites were poor. The latency code repeatedly failed to provide benchmark results because connections were broken. In the light of this unreliability, using the Internet for the HTTP-based communication between servers appears inappropriate. However, it should be kept in mind that ongoing schemes to improve bandwidth, such as TEN-34, and the shift to persistent connection support under HTTP should improve the situation.





1. INTRODUCTION 

ECOSIM will be a support system to investigate and forecast pollution levels in urban areas. By allowing the effects of industrial developments or new roads to be quickly and easily examined, public authorities can use ECOSIM to ensure that their urban plans fully consider environmental impacts. ECOSIM will also be able to forecast pollution levels - typically over the next 24 hour period.

ECOSIM will be based on sophisticated computer models such as those which allow ozone levels to be calculated from road traffic emissions. It combines these models with up-to-date measurements of current meteorological conditions and pollution by linking directly into local databases and pollution measurement stations. ECOSIM also links the various domains such as surface water, coastal water and air to ensure that as complete a picture as possible can be predicted of environmental conditions. ECOSIM makes use of very high performance computers whenever it needs to and uses the latest methods in handling maps and similar data to ensure that its results can be easily translated into practical measures for pollution control.

Currently there are a number of data modelling centres around Europe which process information of particular types. ECOSIM proposes the use of a centralised modelling centre - referred to as the ECOSIM main server - which can be accessed remotely via HTTP or locally via X Windows clients. During the test phase three public servers will provide day-to-day feedback to the ECOSIM developers and will use and operate their system on current environmental problems. This will require operational data traffic between the different sites, and thus fast access to the various data sets will have to be ensured.

2. BACKGROUND

The ECOSIM system is a distributed system which can use, among other means of data transport, the Internet for its communication. The following excerpts from the architecture document emphasise the distributed, Internet-based nature of the proposed system:

"Based on its client-server architecture the server communicates with the
various distributed information resources either through its internal functions and
local file system, or through the HTTP protocol to access distributed resources
like databases, monitoring systems, or simulation models."

"This communication is based on the public HTTP protocol, and can be based on the
Internet or dedicated connections (such as ISDN phone lines) for the
physical communication layer."

http://www.ess.co.at/ECOSIM/architecture.html 

Since the communication will be based on HTTP, it follows that benchmarking of the communication over the Internet must be based on the TCP/IP stack. TCP/IP is a commonly used Internet protocol suite; it is an open standard, freely available and platform independent.

"HTTP currently runs over TCP, but could run over any connection-oriented
service. The interpretation of the protocol below in the case of a sequenced packet service (such as DECnet(TM) or ISO TP4) is that the request
should be one TPDU, but the response may be many."

[http://www.w3.org/pub/WWW/Protocols/HTTP/AsImplemented.html]

The larger the amount of data and the more important the speed of data transfer becomes, the more significant it is to understand issues such as connectivity, latency and throughput. Thus the analysis of these issues has become an integral component of the ECOSIM deliverable.

"… Implementation constraints will also be identified (e.g. Processing throughput,
latency, communication bandwidths). These will form the basis of subsequent
design, development integration and test activity."

[ECOSIM, Project Programme Version 1.1]

In the current work-programme the analysis of implementation constraints for the ECOSIM project is being performed between the three ECOSIM test sites of the demonstration stage. The sites are located at Berlin (Germany), Athens (Greece) and Gdansk (Poland).

 

 

Figure 1: Distribution of ECOSIM test sites

  

The local test sites are currently set-up as follows:

                        Germany                Greece                       Poland
Machine name            flash.first.gmd.de     aix.meng.auth.gr             boss.zie.pg.gda.pl
LAN                     Ethernet (10Mb/s),     n/a at time of measurement   n/a at time of measurement
                        Partial ATM 155Mb/s
Connection to Internet  X.25, 2Mb/s            n/a at time of measurement   256Kb/s

Table 1: Set-up of test sites

Currently a machine in Thessaloniki is being used as the ECOSIM-test-site in Greece and no connection to the intended ‘Athens’ site exists. It is expected that the site will be connected to the Internet via an X.21 leased line supporting 64-128Kb/s.

 

3. DEFINITIONS

 

Internet connectivity, latency and throughput are network constraints that determine, among other factors, the rate at which data can be delivered. They therefore ultimately determine how long the server needs to send information and how long the end-user needs to receive it.

  •  Connectivity determines whether a connection between sites can or cannot be established.

  •  Latency is defined as the time interval starting when the last bit of the input frame reaches the input port and ending when the first bit of the output frame is seen on the output port (see RFC1242, ftp://ftp.ripe.net/rfc/rfc1242.txt). The current definition being used in this document is "the time required to establish a TCP connection".

  • Throughput is defined as the number of bits of user data transferred per second. User data means only the data being communicated, not the control data used during the delivery. In a wider sense throughput is also referred to as bandwidth.

Further information on throughput and latency issues on the Internet can be found in the so-called "Requests for Comments" (RFCs). These RFCs are memos for use by the Internet community. In particular RFC1944 and RFC1242 relate to network terminology and benchmark methodologies.

 ftp://ds.internic.net/rfc/rfc1242.tx, ftp://ds.internic.net/rfc/rfc1944.txt

 

4. THE MEASUREMENT REQUIREMENTS

 

This section describes the foreseen architecture of the ECOSIM system at a high level of abstraction. The ECOSIM architecture can be described as a network model into which the measurement requirements of the project are fitted.

4.1 The ECOSIM architecture

The ECOSIM architecture represents a distributed application composed of a number of entities which are connected to each other through a communication service. Each of these architecture elements can be implemented independently and differently, as long as they all obey the requirements set for communication with other elements. The schematic architecture is illustrated in Figure 2.

Figure 2: The ECOSIM architecture

 Applications and an HTTP-service are illustrated in Figure 2. The applications depict the 3 test-servers of the ECOSIM project during its test-phase and are connected, for exchanging messages with each other, through an HTTP-service. Figure 2 also illustrates an example of message exchanges between two applications. While messages are exchanged at the application level it is important to note that the messages are transported via the HTTP-service. In turn, the HTTP-service is the service implemented by the HTTP-protocol. Therefore, the above described ECOSIM architecture can be decomposed into smaller units of work. However, since the current deliverable does not define the applications it only decomposes the HTTP-service into HTTP-protocol entities and the TCP-service.

The decomposition is illustrated in Figure 3, where every horizontal level contains the same kind of functionality. In the vertical direction, however, the upper units make use of the capabilities provided by the lower elements. This type of structuring divides functionality into highly abstracted atomic units of work.

 

Figure 3: Decomposition of the HTTP-service

In addition, Figure 3 depicts the HTTP-service as composed of HTTP-protocol entities which implement the HTTP protocol and use the underlying TCP-service. The HTTP-protocol entities exchange HTTP-PDUs (Protocol Data Units) by using the underlying service. Since the ECOSIM project uses HTTP and the Internet, the underlying service is the TCP-service.

4.2 The Measurement System

The system to measure bandwidth and latency is placed above the TCP-service, in the same position as the ECOSIM architecture, as illustrated in Figure 4. This measurement system will then simulate the traffic generated by the application and transmitted by the HTTP-protocol entities. The results of these simulations will be collected and processed in order to determine the throughput and latency over the Internet.

 

Figure 4: Measurement System

Figure 4 shows the measurement entities on top of the TCP-service and the HTTP-protocol entities. The measurement entities will connect with each other via the TCP-service in order to measure the time of connection set-up (latency) and the amount of data transported per time-frame (throughput). To obtain reliable values it is foreseen that the measurement entities should run over a longer period of time than in the current testing.

 

 5. METHODOLOGY

 

This document aims to define a method for obtaining the constraints, namely the available connectivity, latency and throughput between the three main ECOSIM servers. The connectivity, latency and throughput measurements are currently made after a fixed waiting period which follows the completion of the previous measurement. The work performed so far illustrates preliminary results on which a detailed analysis will be based. In addition, the results will serve to establish a more solid and better analysis on certain interdependencies between connectivity, latency and throughput.

Figure 5 illustrates the prototype of the measurement entity which at the same time implements the methodology to measure the required parameters. The design is made with a so-called main-loop, which invokes the measurements implemented in the libraries and then furnishes the results towards a measurement processing capability. The speed with which the main-loop runs is determined by the time between two consecutive measurements. The measurement processor is needed to convert the data into readable and interpretable results.

 

 Figure 5: The functional design of the measurement entity

5.1 Connectivity

The quality of connectivity is determined by the possibility of building connections between sites. It can be assessed in a variety of ways, the simplest of which is the ping command. It should be noted that time-outs inherent in TCP/IP on a failed connection can cause misleading results in latency measurements. In addition, under certain network circumstances it might not be possible at all to open a connection between the source and destination hosts, or a connection may be lost during data transfer.
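
A connectivity check can be sketched as follows. This is a minimal illustration in Python, not the code used for the measurements: the function names and the 5 second time-out are assumptions. A failed attempt is recorded as a failure instead of being allowed to appear as a very long latency value.

```python
import socket


def can_connect(host, port, timeout_s=5.0):
    """Return True if a TCP connection to (host, port) opens within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, name lookup
        # failures and time-outs.
        return False


def connectivity_ratio(outcomes):
    """Fraction of successful attempts in a list of True/False outcomes."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0
```

Collecting the outcome of every attempt and reporting the ratio gives a simple figure for connectivity that is kept separate from the latency statistics.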

5.2 Latency

In this document latency is defined as "the time required to establish a TCP-connection". Within the prototype the latency is measured once per cycle of the main-loop. However, RFC1944 states that "the test (for latency) must be repeated at least 20 times with the reported value being the average of the recorded values". It should be noted that no guidance is given on the frequency of these tests and that the use of the mean seems to be at odds with the median normally used with network data of this type. Since HTTP sets up and then tears down a connection on every request, the latency is of direct relevance to the performance of the system. Later developments of persistent connections, also known as keep-alive, should reduce the latency effect.

The latency measurement is implemented as follows:

      •  Zero Timer
      • DNS Lookup
      • Reverse Lookup
      • Start Timer
      • Open Discard Port on host B
        (implicit wait for Acknowledge)
      • Stop Timer
      • Store timer value in T1
      • Tlatency=T1
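
The steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the prototype's code: the function name, the default time-out and the use of port 9 (the standard TCP discard service) are assumptions, and the reverse lookup step is omitted.

```python
import socket
import time

DISCARD_PORT = 9  # standard TCP discard service; assumed to be enabled on host B


def measure_latency(host, port=DISCARD_PORT, timeout_s=30.0):
    """Return the seconds needed to open a TCP connection, or None on failure."""
    try:
        # Name resolution happens before the timer starts, as in the step list.
        family, socktype, proto, _, sockaddr = socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM)[0]
    except OSError:
        return None
    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout_s)
    try:
        start = time.perf_counter()         # Start Timer
        sock.connect(sockaddr)              # returns once the connection is open
        return time.perf_counter() - start  # Stop Timer; Tlatency = T1
    except OSError:
        return None                         # broken or timed-out connection
    finally:
        sock.close()
```

Returning None for a failed attempt keeps TCP time-outs out of the latency statistics.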

5.3 Throughput

 In this document the throughput is referred to as "the amount of data which can be transferred in a certain period of time". With reference to throughput it should be noted that the measurements currently underestimate the throughput since the time taken for the acknowledgement, on completion of the transfer, is not compensated for. The amount of underestimation is a function of the duration of transfer (to destination) and time for the receipt of acknowledgement. However, since 400Kbits of user data is being used to measure bandwidth the effect is negligible.

The bandwidth measurement is implemented as follows:

      •  Zero Timer
      • DNS Lookup
      • Reverse Lookup
      • Start Timer
      • Open Discard Port on host B
      • Stop Timer
      • Store timer value in T1
      • Start Timer
      • Write data of size (S) to open port on host B
        (Linger for Acknowledge)
      • Stop Timer
      • Store timer value in T2
      • Ttransfer_duration=dT=T2-T1
      • Bandwidth=S/Ttransfer_duration
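
The bandwidth steps can be sketched in the same way. The sketch below is an assumption-laden illustration, not the prototype: it reads the timer as a cumulative stopwatch (so that T2-T1 is the transfer duration), expects an already resolved (ip, port) address, and replaces the "linger" step by waiting until the discard service closes the connection. The function name and defaults are hypothetical.

```python
import socket
import time


def measure_bandwidth(address, size_bits=400_000, timeout_s=60.0):
    """Return (T1 in seconds, bandwidth in bits/s) for one connect-and-write cycle.

    address is an already resolved (ip, port) pair of a discard service.
    """
    payload = b"\0" * (size_bits // 8)       # S bits of user data
    start = time.perf_counter()              # Start Timer
    with socket.create_connection(address, timeout=timeout_s) as sock:
        t1 = time.perf_counter() - start     # T1: connection set-up
        sock.sendall(payload)                # write data of size S
        sock.shutdown(socket.SHUT_WR)        # signal the end of the data
        while sock.recv(4096):               # wait until the peer has closed
            pass
        t2 = time.perf_counter() - start     # T2: set-up plus transfer
    duration = t2 - t1                       # Ttransfer_duration
    if duration <= 0:
        raise ValueError("transfer too short to time")
    return t1, size_bits / duration          # Bandwidth = S / Ttransfer_duration
```

Because the wait for the peer is included in T2, this sketch shares the underestimation discussed above.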

 

6. RESULTS

 

This section describes some preliminary results of testing the measurement entity, which have been very useful for evaluating the measurement entity and method. These results should not be used to draw final conclusions on connectivity, latency and throughput, since during the measurements for this document there were times when connections could not be established or failed to complete. An assessment of the connectivity and reliability of connections will therefore have to be an area for further analysis.

6.1 Latency

The results of Table 2 illustrate that there is a great difference between the average (as suggested in RFC1944) and the median as used in the literature. The reason for this can be understood when inspecting the results for latency in more detail. Measurements from Greece to Germany and from Greece to Poland were not taken because of software issues.

in seconds   Germany -> Poland   Germany -> Greece   Poland -> Germany   Poland -> Greece
Average      4.4                 5.1                 1.4                 2.5
Median       0.8                 0.4                 0.7                 1.1

Table 2: Measurements between ECOSIM test sites

 

Figure 6: Measurements from Germany to Poland

While more than 75% of the measurements illustrated in Figure 6 are below 2 seconds in duration, a notable number of measurements last more than 10 seconds, with groupings around 13 and 37 seconds and a maximum of 86 seconds. This apparent grouping of the results at the long duration end of the graph (> 10 seconds) clearly affects the calculation of the average. The median, as used elsewhere in the literature, should thus be the preferred measure in future work. The causes of this grouping at the long duration end, which cannot be explained in terms of network latency alone, are currently under further investigation.
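
The effect of a long-duration group on the average can be illustrated with a small Python example. The sample values below are hypothetical and chosen only to mimic the shape described above; they are not measured data.

```python
from statistics import mean, median

# Hypothetical connection set-up times in seconds (NOT measured values):
# most are short, two fall into a long-duration group.
samples = [0.6, 0.7, 0.8, 0.8, 0.9, 1.0, 1.2, 1.5, 13.0, 37.0]

print(f"mean   = {mean(samples):.2f} s")
print(f"median = {median(samples):.2f} s")
```

For these ten values the mean is 5.75 s while the median is 0.95 s: two long measurements are enough to move the mean well away from the typical value, whereas the median is unaffected.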

6.2 Throughput

The maximum bandwidth of 96Kbits/s was found from Germany to Greece. The lowest bandwidth, only 200bits/s, appeared from Poland to Greece. It is therefore clear that a homogeneous network basis does not exist between the three sites. Since the distributed functionality of the servers relies on the network characteristics, the inhomogeneous network basis has serious implications for the system architecture. Measurements from Greece to Germany and from Greece to Poland were not taken because of software issues.

  

In Kbits/s   Germany -> Poland   Germany -> Greece   Poland -> Germany   Poland -> Greece
Average      29.5                32.7                24.0                8.9
Median       27.2                29.7                25.9                6.7
Std.Dev.     17.0                20.0                8.5                 7.8
Minimum      2.2                 0.9                 1.4                 0.2
Maximum      67.0                96.0                39.6                27.0

 

Table 3: Throughput Measurements between ECOSIM test sites

 

 

Obviously 200bits per second is an extremely low throughput. Combined with a median latency of over a second, the Poland/Greece connection would appear to be unsuitable for the transfer of significant quantities of data via the Internet or for interactive use. While the statistics allow general conclusions to be drawn on average connectivity, it is important to investigate the temporal characteristics (peaks/valleys) of the available bandwidth.

In the following section bandwidth measurements taken between the 2nd and the 5th of December ‘96 are illustrated for the following connections: "Poland to Greece", "Germany to Greece" and "Germany to Poland". Since different numbers of samples were taken for different connection paths and the measurements were not synchronised, separate graphs are shown. The bandwidth measurements were taken in Kbytes/sec and then converted to Kbits/sec.

Figure 7 shows the bandwidth measurements over a 29 hour period beginning on the 2nd of December 1996. The periodic pattern that can be seen reflects Internet traffic load. Not surprisingly, the best time for data transfer is around four o’clock in the morning (flash.first.gmd.de time). However, due to time gaps in the measurements and other not yet determined factors, a further examination over a longer period of time will be required to draw further conclusions.

Figure 7: Results of bandwidth measurements Poland-Greece

Figures 8 and 9 show connection bandwidths from Germany to Greece and to Poland. It is interesting to note that occasionally extremely high bandwidths are achieved to both destination sites. However, time-outs and bandwidth instability make the results difficult to interpret, and further measurements over longer, more representative periods of time will be required.

 

Figure 8: Results of bandwidth measurements Germany-Greece

 

 

Figure 9: Results of bandwidth measurements Germany-Poland

7. SCHEDULING OF MEASUREMENTS

The bandwidth and latency measurements are currently made after a fixed waiting period which follows the completion of the previous measurement. The start time of each measurement therefore depends on the duration of the previous one, and thus on the network bandwidth. Such a scheduling dependency might cause unforeseen effects that decrease the validity of the measured values.

One possible scheduling algorithm would be to take measurements at regular intervals, and either to ensure that the maximum measurement time does not exceed the chosen interval or to take into account the effect of multiple measurements running at the same time. Unfortunately this type of regular sampling might also be problematic, since other regular processes within the system might have a contaminating effect. Another possibility would be to take measurements at random intervals whilst avoiding knock-on effects of previous measurements. This however remains an issue for discussion.
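
The two scheduling options can be sketched as follows. This is a hypothetical Python illustration, not part of the measurement entity: the function names are assumptions, and the choice of exponentially distributed gaps is only one way of realising "random intervals", which the discussion above leaves open.

```python
import random


def regular_schedule(start_s, interval_s, count):
    """Start times at fixed intervals, independent of measurement duration."""
    return [start_s + i * interval_s for i in range(count)]


def random_schedule(start_s, mean_gap_s, min_gap_s, count, rng=None):
    """Start times separated by random gaps of at least min_gap_s seconds."""
    if mean_gap_s <= min_gap_s:
        raise ValueError("mean_gap_s must be larger than min_gap_s")
    rng = rng or random.Random()
    times, t = [], start_s
    for _ in range(count):
        times.append(t)
        # The minimum gap leaves room for the previous measurement to finish;
        # the exponential part (an assumption) keeps the start times from
        # locking onto other periodic processes.
        t += min_gap_s + rng.expovariate(1.0 / (mean_gap_s - min_gap_s))
    return times
```

For example, regular_schedule(0, 60, 4) yields start times of 0, 60, 120 and 180 seconds, whereas random_schedule(0, 60, 20, 4) yields four start times whose gaps are at least 20 seconds and 60 seconds on average.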

8. CONCLUSION

During the limited measurement period the Internet connections between the three sites were poor. The latency code repeatedly failed to provide benchmark results because connections were broken. In the light of this unreliability, using the Internet for the HTTP-based communication between servers appears inappropriate. However, it should be kept in mind that ongoing schemes to improve bandwidth, such as TEN-34, and the shift to persistent connection support under HTTP should improve the situation.


© Copyright 1995-2002 by:   ESS   Environmental Software and Services GmbH AUSTRIA